
WO2024127004A1 - Imaging method and imaging device - Google Patents

Imaging method and imaging device

Info

Publication number
WO2024127004A1
WO2024127004A1 (application PCT/GB2023/053207)
Authority
WO
WIPO (PCT)
Prior art keywords
image
display
imaging device
gesture
autostereoscopic image
Prior art date
Application number
PCT/GB2023/053207
Other languages
English (en)
Inventor
Atma Heerah
Original Assignee
Temporal Research Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Temporal Research Ltd filed Critical Temporal Research Ltd
Publication of WO2024127004A1

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/041Digitisers, e.g. for touch screens or touch pads, characterised by the transducing means
    • G06F3/042Digitisers, e.g. for touch screens or touch pads, characterised by the transducing means by opto-electronic means
    • G06F3/0425Digitisers, e.g. for touch screens or touch pads, characterised by the transducing means by opto-electronic means using a single imaging device like a video camera for tracking the absolute position of a single or a plurality of objects with respect to an imaged reference surface, e.g. video camera imaging a display or a projection screen, a table or a wall surface, on which a computer generated image is displayed or projected
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/0304Detection arrangements using opto-electronic means
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/261Image signal generators with monoscopic-to-stereoscopic image conversion
    • H04N13/268Image signal generators with monoscopic-to-stereoscopic image conversion based on depth image-based rendering [DIBR]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/204Image signal generators using stereoscopic image cameras
    • H04N13/239Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance

Definitions

  • This disclosure relates to an imaging method and an imaging device.
  • this disclosure relates to the displaying of an image that includes a feature which appears in a region above a display, and the detecting of a gesture that is performed in the region above the display.
  • Touchless human interfaces have been recognised as more sanitary for users; however, adoption has been limited due to cost, integration and retrofitting issues.
  • current naked eye 3D solutions require specialist software applications to generate images and user interfaces. Due to the complexity of the 3D calculations, the user-facing images have to be calculated offline and then recalled in the correct sequence in response to the user input.
  • a touchless device that can be utilised as a drop-in replacement for a touchscreen is not available.
  • Integral Images are continuous parallax images that can be recorded for display with a 2D image and an optical array and viewed with the naked eye.
  • the optical array may be a sheet of small lenses (micro-lens array), a pinhole array, or an alternative optical arrangement that performs the image reconstruction.
  • the 2D images contain information about a volume of space including horizontal and vertical parallax. Current commercial applications of InIm are limited to high cost specialised areas such as microscopy.
  • FIG. 1B shows a conventional integral imaging arrangement 10.
  • the arrangement 10 is applicable to image display or image pickup of a 3D object 11.
  • the arrangement 10 shows a microlens array (MLA) 12 positioned with respect to a screen (13, 14).
  • the screen serves as a 2D display 13 which has pixels which emit light (e.g., red, green and blue light for a colour display), such that this light from the display 13 is directed by the MLA 12 to provide the image of the 3D object 11.
  • the screen serves as a camera 14 which has sensors which are sensitive to light (e.g., red, green and blue light), such that the light is directed by the MLA 12 from the 3D object 11 to the camera 14.
  • Reconstructed images are viewable in lobes with the main frontal lobe normally having a viewing angle of 30-35 degrees.
  • the MLA 12 and the screen (13, 14) are separated by a barrier.
  • the barrier shown in FIG. 1B is to prevent flipping or sudden image changes and can be costly to manufacture on a micro level.
  • Stereo camera approaches generally consist of identifying the objects and then using disparity (relative displacement) to estimate depth.
  • Horizontal and vertical dimensions are estimated using triangulation.
  • Identification of foreground objects in images can be done by comparison with a background image and may include tracking image differences or motion to establish disparity.
  • the following description deals mainly with the location of objects near the cameras once they have been identified as objects. However, the arrangement does inherently make identification of such objects easier.
  • a possible depth estimation technique is to use stereo cameras in parallel configurations or slightly angled towards each other. However, this depth estimation is only considered viable in the far region where disparity reduces with distance. At close distances with angled cameras, near and far objects can be confused.
  • Disclosure is presented of methods and implementations allowing the construction of a device that is a drop-in replacement for conventional touchscreens.
  • Existing applications can be used with the human interface floated above the screen.
  • 2D images can be floated at a set distance above the screen with a touchless pointer function coplanar with the floating image. All images fed to the device are automatically and seamlessly converted into integral images for display in 3D space.
  • an imaging method comprising: displaying to a user an autostereoscopic image that includes a feature which appears in a region above a display; detecting a gesture performed by the user in the region above the display; and updating the feature in response to the gesture.
  • the autostereoscopic image is an integral image which is configured to be displayed using an optical array.
  • the feature comprises a virtual button; the gesture comprises a selection of the virtual button; and the updating comprises providing an indication of the virtual button being selected.
  • the updating includes an animation of the autostereoscopic image, wherein the feature appears to move in the region above the autostereoscopic image. This can give the user the indication that a virtual button is being pressed.
  • the method further comprises generating an autostereoscopic image from a non-autostereoscopic image in real time, enabling video display.
  • the method further comprises using a lookup table to maximise processing efficiency.
  • the method further comprises modifying the autostereoscopic image in response to sensor data to present more information and enhance the user experience in a three dimensional way.
  • the method further comprises providing a feedback signal which coincides with the gesture being detected.
  • the method further comprises generating an autostereoscopic image from a non-autostereoscopic image.
  • the method further comprises: identifying a foreground of the image; defining a region in the foreground of the image that is occupied by a moving object; and interpreting whether the moving object is providing the gesture.
  • the method further comprises detecting a change in size and/or a change in disparity of the moving object, in order to define the region occupied by the moving object.
  • disclosure is provided of a method for detecting a gesture, the method comprising: identifying a foreground of the image; defining a region in the foreground of the image that is occupied by a moving object; and interpreting whether the moving object is providing the gesture.
  • identifying the foreground of the image comprises using a number of reference lines to identify a bounded volume in front of a detection unit.
  • the method further comprises performing near field focussing to distinguish between a first object that occupies the foreground of the image and a second object that occupies a background of the image.
  • the method further comprises detecting a change in size and/or a change in disparity of the moving object, in order to define the region occupied by the moving object.
  • in the method, interpreting whether the moving object is providing the gesture comprises determining whether the moving object can serve as a pointer.
  • in the method, interpreting whether the moving object is providing the gesture comprises performing object location by referencing data that is stored in a look up table or a memory.
  • Disclosure is provided of a combination of the first aspect (which may include any optional feature thereof) and the second aspect (which may include any optional feature thereof).
  • disclosure is provided of a program which, when executed by a computer, causes the computer to perform a method according to the first aspect or the second aspect.
  • disclosure is provided of a computer-readable storage medium storing a program according to the third aspect.
  • an imaging device comprising: a display configured to display to a user an autostereoscopic image that includes a feature which appears in a region above a display; a detection unit configured to detect a gesture performed by the user in the region above the display; and an update unit configured to update the feature in response to the gesture.
  • the autostereoscopic image is an integral image; and the display comprises an optical array configured to display the integral image.
  • the optical array has a flat surface which is configured to face the user.
  • the optical array covers a portion of the display.
  • the optical array does not cover all of the display.
  • a first portion of the display is covered by the optical array, and a second portion of the display is not covered by the optical array.
  • the portion of the display which is covered by the optical array can be used to display a three dimensional image and/or a two dimensional image.
  • the portion of the display which is not covered by the optical array can be used to display a two dimensional image.
  • the imaging device further comprises at least two image sensors configured to capture a stereoscopic image of an object in the region; wherein the detection unit is configured to detect the gesture by identifying a disparity in the stereoscopic image of the object that has been captured by the at least two image sensors.
  • the display is configured to display a virtual keypad.
  • the imaging device further comprises a card reader configured to read information that is stored on a card that is presented by the user.
  • the imaging device further comprises a feedback unit configured to provide a feedback signal which coincides with the gesture being detected by the detection unit.
  • the imaging device further comprises means to generate an autostereoscopic image from a non-autostereoscopic image.
  • the imaging device further comprises means to generate an autostereoscopic image from a non-autostereoscopic image in real time enabling video display and seamless system integration.
  • the imaging device further comprises a lookup table to maximise processing efficiency and reduce power consumption compared to established calculation methods.
  • the imaging device further comprises means to modify the autostereoscopic image in response to sensor data making the device more user friendly, intuitive and increasing productivity compared to alternative human interfaces.
  • a device configured to detect a gesture
  • the device comprising: an identification unit configured to identify a foreground of the image; a defining unit configured to define a region in the foreground of the image that is occupied by a moving object; and an interpreting unit configured to interpret whether the moving object is providing the gesture.
  • an imaging method comprising generating an autostereoscopic image from a non-autostereoscopic image in real time enabling video display.
  • the imaging method further comprises a look up table.
  • the imaging method further comprises modifying the autostereoscopic image in response to sensor data.
  • an imaging device comprising means to generate an autostereoscopic image from a non-autostereoscopic image in real time enabling video display.
  • the imaging device further comprises a look up table.
  • the imaging device further comprises means to modify the autostereoscopic image in response to sensor data.
  • a solution is disclosed which does not require barriers, because the problem is reduced by keeping the gap g to a minimum.
  • the disclosed software can also reduce this problem.
  • viewing angle improvement is unnecessary for privacy reasons.
  • the system is inherently more secure than normal keypads as someone standing next to the user will see a different lobe to the user and find it difficult to establish the key being pressed.
  • This disclosure relates to a way of providing the general public with displays that make use of Integral Imaging (InIm). This is achieved using displays that include mass produced precision parts, with these displays being coupled with innovative software. These displays allow images to be displayed that appear to be in front of the display. This is referred to as Float Photon (trade mark) InIm technology, which includes the following advantages:
  • a true optical model is generated in space, with the result that the image is autostereoscopic (no glasses).
  • FIG. 1A shows the front view of a typical microlens array
  • FIG. 1B shows a conventional integral imaging arrangement
  • FIG. 2 shows an improved integral imaging arrangement
  • FIGs. 3A-D show views from different angles of a display configured to display a stereoscopic image
  • FIG. 4 provides a schematic arrangement of an imaging device
  • FIG. 5 shows two parallel image sensors embodied as cameras arranged to create a stereoscopic image, together with details of depth estimation and disparity;
  • FIG. 6 shows an arrangement for the detection of near field objects coexisting with a 3D image floating above a display, together with details of foreground images and foreground disparity;
  • FIG. 7A shows a conventional InIm capture or display setup
  • FIG. 7B shows an example of how the object distances are calculated
  • FIG. 8A is a flow chart showing an imaging method that is implemented
  • FIG. 8B is a flow chart showing a pointer detection method that is implemented
  • FIG. 8C is a flowchart showing the real time conversion methodology of non-autostereoscopic images to autostereoscopic images; and
  • FIGs. 9A-C provide some possible details of an imaging device, with FIG. 9A showing a hardware topology; FIG. 9B showing a software topology; and FIG. 9C showing some applications of the imaging device.
  • Disclosure is provided of a user being presented with an autostereoscopic image.
  • FIG. 2 shows an improved integral imaging arrangement 100, for which the autostereoscopic image is an Integral Image (InIm).
  • a floating image 1 is shown which has been created using an array of microlenses (MLA) 2 to project a 2D image which is displayed by a display panel 3.
  • the 2D image may appear to the user as a 3D image.
  • the user is presented with floating image 1, which includes a feature that appears to occupy a region above the display 3 (i.e., the 3D image appears to be coming out of the screen towards the user).
  • a distinctive feature is that the 2D image is not binocular, i.e. separated into areas for left and right eye viewing.
  • the integral imaging arrangement 100 further includes cameras 4 which monitor the region in which the floating image 1 is displayed.
  • the cameras 4 are configured to detect movement in this region, such as a gesture that is performed by the user.
  • the combination of displaying an Integral Image (InIm) with finger detection allows user interaction, which is achieved by updating the stereoscopic image based on the gesture, so that the 3D feature appears to be moved by the user performing the gesture.
  • Existing mid-air control interfaces are not intuitive and not suitable for secure systems where privacy is important.
  • a Float Photon Interface (FPI) is introduced that uses cameras for pointer/mouse functionality, resulting in a touchless screen, as shown in FIG. 2.
  • the near field image in front of the screen is captured, and close foreground objects and their distance from the screen are automatically recognised using the disparity between left and right images.
  • Such touchless methodology is applicable to both 2D and 3D displays.
  • Multi-touch response is possible, as is button animation to show downward movement in response to a button press.
  • the micro-lens array (MLA) 2 has a flat face which faces the user, giving a surface that can easily be dusted, and allowing a touchscreen to coexist with the float photon interface.
  • This example shows the use of a micro-lens array 2, although the disclosure extends to the use of any optical component 2 which confers the functionality of projecting a 2D image which is being displayed.
  • Innovation is found in the touchless mouse functionality, which uses foreground/background separation, and in the mid-air button animation, which is achieved by using the finger position to alter the constructed image.
  • a variety of ways are well known for the construction and animation of a stereoscopic image, for example processing left and right images separately using standard techniques before combining them into a viewing format.
  • the animation of the stereoscopic image provides the user with feedback which indicates that the gesture has been detected.
  • the imaging device 100 includes a feedback unit which can provide a feedback signal that coincides with the gesture being detected.
  • Possible feedback units include a speaker configured to provide an audio signal, and a haptic device configured to provide a haptic signal.
  • Advantages of these methods include computational and mechanical simplicity. Wide angled cameras can be located behind a standard flat bezel without raised parts around the edge of the screen as would be required by an infra-red or other beam interruption solution operating in front of the screen.
  • Tests show that a floating keypad around 25mm above the display is sufficient for users to avoid contact with the display surface. Furthermore, users have a natural incentive not to press excessively and will be reminded with user feedback when the selection is registered.
  • FIGs. 3A-D show views from different angles of a display 3 configured to display an autostereoscopic image 1.
  • the autostereoscopic image shows a floating cube 1 that appears to be at around 20 mm above the display 3.
  • FIGs. 3A-D illustrate the cube 1, showing full horizontal and vertical views from four different angles.
  • the virtual buttons can be arranged to provide a virtual keyboard or a virtual keypad.
  • the display 3 may display one or more virtual button which can be selected by the user by performing a gesture of pressing the button with their finger or a stylus.
  • a user typically indicates button selection by performing a press gesture by moving their finger towards the display, and then moving their finger away from the display.
  • Selection of a virtual button is confirmed to the user by the display presenting an animation, which shows the button being pressed.
  • the animation indicates to the user that a selection gesture is being performed. While the finger is moving towards the display, the button appears to move from a first position that is further away from the display, to a second position that is closer to the display. While the finger is moving away from the display, the button appears to move from the second position that is closer to the display, back to the first position that is further away from the display. Accordingly, the image is updated to indicate the virtual button being selected.
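  • By way of illustration, the sketch below (Python; the constant values and function names are assumptions for this example, not taken from the disclosure) maps a detected fingertip height above the display to the rendered height of a floating button, so that the button appears to follow the finger down and a selection is registered at a set depth:

```python
# Hypothetical sketch: map fingertip height above the display to the floating
# button's apparent height, with a threshold at which the press is registered.

BUTTON_REST_MM = 25.0      # assumed rest height of the floating button above the display
BUTTON_PRESSED_MM = 5.0    # assumed height at which the press is registered

def button_height(finger_z_mm: float) -> float:
    """Height at which the button should be rendered.

    While the finger is above the rest height the button stays put; as the finger
    descends the button follows it, and it never goes below the pressed height.
    """
    return min(BUTTON_REST_MM, max(BUTTON_PRESSED_MM, finger_z_mm))

def is_pressed(finger_z_mm: float) -> bool:
    """A selection is registered once the finger reaches the pressed height."""
    return finger_z_mm <= BUTTON_PRESSED_MM
```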
  • a secure application such as a card payment reader typically includes a numeric keypad.
  • An InIm keypad will typically have keys 4, 5 and 6 arranged horizontally, and keys 9, 6 and 3 and the decimal point arranged vertically.
  • FIG. 4 shows a schematic arrangement of an imaging device 100.
  • a floating image 1 is shown which is created using an optical component 2 to image a flat screen display 3.
  • the imaging device 100 includes application software 5, a display image processor 6, and a front image processor 7.
  • an input/output (I/O) unit 8 is used to provide a connection with any number of live feeds 9 and/or any number of memory storage devices 10.
  • the illustrated components can be implemented as hardware, firmware, software, or a mixture of all three.
  • the floating image 1 provides a 3D image of a 2D integral image which is produced by the display 3.
  • the floating 3D image 1 appears at varying depths to illustrate animation, with a push gesture being depicted by the user's finger moving towards the display.
  • a number of ways are well-known for creating integral images including using third party modelling tools such as Autodesk 3ds Max (registered trademark), however, the method described in Fig. 8C represents the fastest and most efficient way of creating integral images from non-integral images.
  • the floating 3D image 1 is generated from hardware, firmware and software components, with the user pushing a floating button that animates downwards with the finger press.
  • the optical component 2 is arranged in front of the display 3.
  • the optical component 2 is typically a lens array (such as a microlens array (MLA)) or pinhole array that is configured to construct the 3D image 1 from the integral image (InIm) that is displayed by the display 3.
  • the flat screen display 3 may or may not include a touchscreen to display integral images (InIms) that are generated by the application 5, the display image processor 6, the front image processor 7, or any combination of the three.
  • the touchscreen can be integrated into the optical component (e.g., the microlens array (MLA)) 2.
  • Two or more cameras 4 are positioned to capture a near field view of the volume around the floating image 1. Captured content is fed from the cameras 4 to the front image processor 7 for processing.
  • the content can include stereoscopic or depth information.
  • the cameras 4 can be rotated or repositioned to an optimum angle, either electronically or mechanically.
  • the cameras have a focal length and a depth of field that can also be varied optically or electronically.
  • the application 5 determines the overall function that the device 100 performs for the user, for example, the device 100 serving as a credit card reader.
  • the application 5 determines what functionality is to be performed by the display image processor 6, the front image processor 7, and the I/O unit 8.
  • the application 5 is in control of configuring the display 3 to present to the user an integral image that is based on input that is received from the front image processor 7 or via the I/O unit 8.
  • the display image processor 6 is configured to process images for the display 3 based on instructions received from the application 5.
  • the source image information can come from the front image processor 7 directly or from a live feed 9 and /or a memory 10 via the I/O connectivity unit 8.
  • One process is the conversion of real time (video rate) 2D images to InIms that are perceived as 2D images that float a set distance from the display 3.
  • Another is the real time conversion of 2D images plus depth (2DD) to InIms that are perceived as 3D floating images.
  • An additional process can alter the InIm generated in real time in response to input from the front image processor 7. This allows part of the constructed image to vary in depth enabling button animation in response to a push event.
  • the front image processor 7 is configured to process images received from the cameras 4 in order to implement a touchless human interface, by identifying objects close to the optical component 2 and calculating their 3D position. This near field technique has advantages over existing visual tracking and artificial intelligence (AI) camera solutions because of its simplicity.
  • the front image processor 7 can also process 2D and 3D scene images and video for conversion into InIms by the display image processor 6 or direct transmission to the live feeds 9 and/or the memory storage 10. Connectivity is achieved using the input/output (I/O) unit 8, which can implement an internal software process or USB, wireless or other established interfacing techniques.
  • Live stream input from one or more live feed 9 is where real time video (2D, 2DD or 3D) or an image is streamed into the display image processor 6 for processing.
  • Live stream output to one or more live feed 9 is where real time video (2D, 2DD or 3D) or an image is streamed out having been processed by the display image processor 6.
  • a memory 10 that is used for storage can be solid state or non-solid state, and can be used for storing programs, processed and source images and videos.
  • Fig. 7A shows a conventional InIm capture or display setup.
  • the volume (74) made up of voxels (75) is captured using the optical component (72) that includes a lens array in front of an image sensor (71).
  • the components can be hardware or software or a combination of both.
  • the value of each voxel at a particular location in the volume is denoted by S(x,y,z).
  • the value of each pixel in the image sensor or display is denoted by T(XI,YI).
  • S(x,y,z) and T(XI,YI) are typically red, green and blue (RGB) values.
  • ray optics is used to trace each voxel location through each lens (73) to a location on the image sensor (71).
  • Each lens (73) has a corresponding pixel area (76) on the image sensor (71).
  • All voxel locations are defined in the InIm because the InIm is not resolution limited and is independent of display resolution.
  • a given volume and micro-lens specification allows a look up table to be generated offline. This table can be used in real time operations as ray tracing calculations are eliminated as illustrated in Fig. 8C.
  • the InIm is resolution limited by the display panel, and down sampling is required. This can be, but is not limited to, selecting the nearest voxel by rounding or truncation, or an interpolation of surrounding voxels.
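  • The offline lookup-table generation described above can be sketched as follows (Python; the simple pinhole-lens model, the parameter names and nearest-pixel down sampling are assumptions made for this illustration rather than the patent's exact method):

```python
# Hypothetical sketch: build an offline voxel -> pixel lookup table for an InIm display.
# A simple pinhole model is assumed: each ray from a voxel passes through a lens centre
# and lands on the sensor/display plane located a distance 'gap' behind the lens array.

from typing import Dict, List, Tuple

def build_lut(voxels: List[Tuple[float, float, float]],
              lens_centres: List[Tuple[float, float]],
              gap: float,
              pixel_pitch: float,
              width: int,
              height: int) -> Dict[Tuple[float, float, float], List[Tuple[int, int]]]:
    """Map each voxel location S(x, y, z) to the display pixels T(XI, YI) it reaches."""
    lut: Dict[Tuple[float, float, float], List[Tuple[int, int]]] = {}
    for (x, y, z) in voxels:
        hits: List[Tuple[int, int]] = []
        if z <= 0:                                # only voxels above the display plane
            lut[(x, y, z)] = hits
            continue
        for (lx, ly) in lens_centres:             # trace through each lens centre
            # Similar triangles: the ray through the lens centre lands at an offset
            # scaled by gap / z behind the lens.
            px = lx + (lx - x) * gap / z
            py = ly + (ly - y) * gap / z
            xi = round(px / pixel_pitch)          # nearest-pixel down sampling
            yi = round(py / pixel_pitch)
            if 0 <= xi < width and 0 <= yi < height:
                hits.append((xi, yi))
        lut[(x, y, z)] = hits
    return lut
```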
  • the image presented by the display includes a foreground and a background.
  • the foreground region is in the vicinity of the display.
  • the camera is configured to detect a moving object. If the foreground is occupied by the moving object, then a region of the image is defined that includes the moving object. The definition of this region is based upon detecting a change in the size and/or a change in the disparity of the moving object.
  • the image detected by the cameras is interpreted to assess whether the moving object is providing a gesture.
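  • As a minimal sketch of this foreground detection (Python with NumPy; the thresholds and helper names are illustrative assumptions, not values from the disclosure), successive frames can be differenced to locate a moving object, and a growing bounding box can be taken as a cue that the object is approaching the cameras:

```python
# Hypothetical sketch: detect a moving foreground object by frame differencing
# and use the change in its apparent size as an approach cue.

import numpy as np

DIFF_THRESHOLD = 30     # per-pixel intensity change treated as motion (assumed value)
MIN_PIXELS = 200        # ignore motion involving fewer changed pixels than this (assumed)

def moving_region(prev_frame: np.ndarray, frame: np.ndarray):
    """Return the bounding box (x0, y0, x1, y1) of the moving region, or None."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16)) > DIFF_THRESHOLD
    ys, xs = np.nonzero(diff)
    if xs.size < MIN_PIXELS:
        return None
    return xs.min(), ys.min(), xs.max(), ys.max()

def is_approaching(prev_box, box) -> bool:
    """A box that grows between frames suggests the object is moving towards the cameras."""
    if prev_box is None or box is None:
        return False
    prev_area = (prev_box[2] - prev_box[0]) * (prev_box[3] - prev_box[1])
    area = (box[2] - box[0]) * (box[3] - box[1])
    return area > prev_area
```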
  • FIG. 5 shows two parallel cameras arranged to create a stereoscopic image, together with details of depth estimation and disparity.
  • the parallel cameras have wide viewing angles of 2 × θo.
  • Objects 51-54 are situated in front of the cameras. Object 54 is in the foreground whilst the others are in the background.
  • object 54 obscures object 53 in Image L and object 51 in Image R. If objects 51 and 53 are similar to object 54, then the viewer does not know whether object 54 or objects 51 and 53 are being viewed.
  • Image L and Image R are combined along the parallel camera centre lines to give the disparity image from which the disparity can be calculated using standard processing. It can be seen that an object moving from the object 52 position in the background to the object 54 position in the foreground will increase the disparity.
  • the new approach rotates the disparity reference lines inwards by θi. Images L and R remain the same; however, the combined disparity image has a new characteristic. This time an object moving from the object 52 position to the object 54 position has decreasing disparity down to a null point (zero disparity) and then increasing disparity as it gets closer to the cameras.
  • FIG. 6 shows an arrangement for the detection of near field objects coexisting with a 3D image floating above a display.
  • FIG. 6 further shows details of foreground images and foreground disparity.
  • the cameras are angled to make the disparity reference lines coincident with the camera centre lines to maximise the detection volume and reduce the maximum disparity for close objects thereby minimising the ultra-sensitivity at close distances.
  • the cameras have a viewing angle of around 80 degrees, as is typical for small low-cost cameras, and a depth of field set to focus on the foreground area only.
  • the 3D image is not registered by the cameras as it is assumed the light from the display constructing the 3D image is projected away from the cameras.
  • Object 55 has been added in the 3D displayed volume. Both objects 54 and 55 are in the foreground qualification regions of Images L and R. Objects 51 and 53 are not shown in Images L and R for clarity. In any event, the possibility that objects 51 and 53 could give a false foreground reading is greatly diminished because the background is out of the cameras' depth of field and is therefore defocused and of lower contrast. The Foreground Disparity shows the disparities for objects 54 and 55, enabling depth detection in the foreground space as an object moves from outside the 3D volume to inside.
  • Table 1 allows object location determined by disparity calculation to be validated using the object’s relative motion. This together with the near field camera setup increases the overall reliability of the method and reduces false readings.
  • the location of an object such as object 55 can be determined by triangulation.
  • the x, y and z object distances can be calculated using the known parameters of camera spacing and field of view angles.
  • FIG. 7B shows an example of how the x, y and z object distances can be calculated.
  • the cameras have fields of view of 90 degrees.
  • A and B are the centres of the camera lenses and C is the null point.
  • A is the reference 0, 0, 0 position.
  • the spacing between the cameras, lAB, is commonly referred to as the Interocular Distance, but in this case the camera spacing is not intended to replicate human eye separation as is typically used in current stereo cameras.
  • the generalised example shown is for an object P at an elevated position vertically and located at Xp, Yp and Zp, which is similar to the location of object 55.
  • Images L and R show the camera images at the camera’s resolution.
  • the relative positions of object P in these images will be used to determine the real world distances Xp, Yp and Zp.
  • UHL and VVL are the left camera's horizontal and vertical resolutions, and UHR and VVR are the right camera's.
  • the position of object P in Images L and R is given by UPL, UPR and VPL. Its position can also be expressed as a ratio of the maximum dimension.
  • the object P position is derived from UPL/UH, UPR/UH and VPL/VV. This approach makes the calculation independent of camera resolution providing there is a minimum resolution that can meet the required granularity of the design.
  • Disparities can be expressed as proportions of the horizontal and vertical camera resolutions making the calculation of the location of an object of interest independent of image sensor resolution and simpler. This lends itself to real time implementations with look up tables. In this way, finger detection above a display can be implemented and then initiate and process real time 3D image changes to facilitate functions such as button press animation with finger movement.
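  • Equations 1 to 6 are not reproduced here; the sketch below shows one way such a triangulation could be written for two parallel-axis cameras, using normalised image positions in the spirit of UPL/UH, UPR/UH and VPL/VV (Python; the geometry, sign conventions and parameter names are assumptions for illustration, not the patent's exact equations):

```python
# Hypothetical sketch: locate an object from its normalised positions in two
# parallel camera images, assuming pinhole cameras with a known field of view.

import math

def object_position(u_left: float, u_right: float, v_left: float,
                    baseline_mm: float, fov_deg: float = 90.0):
    """Estimate (Xp, Yp, Zp) of an object in millimetres.

    u_left, u_right and v_left are the object's horizontal/vertical image positions
    expressed as fractions of the image size (0..1, with 0.5 at the image centre and
    v measured from the top). The left camera sits at the origin, the right camera at
    +baseline_mm along x, and both optical axes point along +z towards the object.
    """
    half = math.radians(fov_deg) / 2.0
    # Angle of the object from each camera's optical axis (positive towards +x).
    alpha_l = math.atan((u_left - 0.5) * 2.0 * math.tan(half))
    alpha_r = math.atan((u_right - 0.5) * 2.0 * math.tan(half))
    # Depth follows from the difference of the two ray slopes (the disparity).
    denom = math.tan(alpha_l) - math.tan(alpha_r)
    if abs(denom) < 1e-9:
        raise ValueError("zero disparity: object too distant to locate")
    z = baseline_mm / denom
    x = z * math.tan(alpha_l)                      # measured from the left camera centre
    y = z * (0.5 - v_left) * 2.0 * math.tan(half)  # same vertical FOV assumed; up is +y
    return x, y, z
```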
  • Pointer calibration can be used as an alternative or supplementary to determining the x,y,z locations of objects with trigonometry.
  • reference objects are placed throughout the volume of interest and their actual locations compared with the calculated positions to produce an error correction value applicable to objects found at or around the space occupied by the reference object.
  • a high resolution map can be created for use as a look up table for accurate pointer operation.
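  • The calibration map described above could be built along the following lines (Python; the data layout and nearest-reference lookup are assumptions made for this illustration):

```python
# Hypothetical sketch: build an error-correction map from reference objects placed at
# known positions, then apply the nearest stored correction to new measurements.

from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]

def build_correction_map(reference_points: List[Tuple[Vec3, Vec3]]) -> Dict[Vec3, Vec3]:
    """reference_points is a list of (actual_position, calculated_position) pairs."""
    corrections: Dict[Vec3, Vec3] = {}
    for actual, calculated in reference_points:
        offset = tuple(a - c for a, c in zip(actual, calculated))
        corrections[calculated] = offset          # keyed by where the system 'saw' it
    return corrections

def correct(position: Vec3, corrections: Dict[Vec3, Vec3]) -> Vec3:
    """Apply the correction of the nearest reference point to a measured position."""
    nearest = min(corrections,
                  key=lambda ref: sum((p - r) ** 2 for p, r in zip(position, ref)))
    offset = corrections[nearest]
    return tuple(p + o for p, o in zip(position, offset))
```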
  • the 3D position of an object P entering the near field space can be determined using a minimum of three variables
  • FIG. 8A is a flow chart showing an imaging method S100 that is implemented.
  • in step S101, an image is sourced for display by the display 3.
  • This image can be sourced in a number of ways including from stored memory, communications, image processing or from a combination of one or more sources in accordance with the arrangement shown in FIG. 4.
  • Step S102 determines if the image is suitable for display based on the requirements of the application software 5. If it is, the image is displayed in step S104; otherwise the image, or the designated part, is first converted to an autostereoscopic image in step S103 using the topologies of FIGs. 9A and 9B. Thus a stereoscopic image can be displayed to the user. Disclosure is provided of autostereoscopic images being generated from non-autostereoscopic images. Disclosure is also provided of an autostereoscopic image being displayed, which includes a feature which appears in a region above the display 3.
  • in step S105, the cameras 4 are set up with parameters such as frame rate, focal length, aperture, field-of-view, resolution and orientation angle. This allows the pointer operation to be optimised for the required operational conditions.
  • a detection can be made of any gesture that is performed by the user in the region above the display.
  • This detection method S200 is detailed in the flow chart FIG. 8B.
  • in step S106, images captured by the cameras 4 are monitored, and any objects of interest are detected. This can be done by comparing successive video frames, establishing differences of interest and using motion with respect to the cameras as shown in Table 1.
  • in step S107, the horizontal and vertical centre positions are determined for any objects that are detected in the images that are captured by the cameras 4.
  • One method of calculating the centre positions is to take the average of the maximum and minimum points occupied by the object for both axes. This method works well for objects with a uniform shape such as a finger.
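  • The centre-position calculation described above reduces to a few lines (Python sketch; the point format is an assumption for illustration):

```python
# Hypothetical sketch: centre of an object as the mid-point of its extremes on each axis.

from typing import Iterable, Tuple

def centre(points: Iterable[Tuple[int, int]]) -> Tuple[float, float]:
    """Average of the maximum and minimum occupied positions on each axis."""
    xs, ys = zip(*points)
    return (max(xs) + min(xs)) / 2.0, (max(ys) + min(ys)) / 2.0
```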
  • in step S108, an assessment is made of whether any objects have been detected in the foreground of the image that has been captured. If one or more objects have been detected in the foreground of the image, then the method progresses to step S109. On the other hand, if no objects have been detected in the foreground of the image that has been captured, then the method returns to step S106.
  • in step S109, the location of the object is determined. This can be done by calculating the location of the object, or alternatively by looking up the location of the object.
  • in step S110, an assessment is made of whether any object is in the 3D image volume. If one or more objects are in the 3D image volume, then the method progresses to step S111. On the other hand, if no object is in the 3D image volume, then the method returns to step S106.
  • in step S111, an assessment is made of whether a system response is to be carried out. Examples of system responses are connection to a website, operation of a mechanical device such as a door, powering down a device, or user feedback such as an audible beep. If a system response is to be carried out, then the method progresses to step S112. On the other hand, if no response is to be carried out, then the method returns to step S106.
  • in step S112, the system carries out the response or designated instruction.
  • in step S113, an assessment is made of whether a display response is to be performed. If a display response is to be performed, then the method progresses to step S114. On the other hand, if no display response is to be performed, then the method returns to step S106.
  • in step S114, the display response is performed.
  • the 2D and/or 3D image that is displayed by the display 3 is changed or animated by returning to step S101.
  • a button may change colour and/or move downwards with the user's finger. Accordingly, the image is updated in response to the gesture.
  • FIG. 8C is a flowchart showing the real time conversion methodology of S103.
  • Step S103.1 determines the format of the input image. This could be a standard 2D image, a 2D image with depth information, a point cloud 3D image or other 3D format.
  • Step S103.4 sets the volume parameters based on User Setup information from S103.3.
  • the user may specify additional parameters such as depth as is typically required for purely 2D source images.
  • the user may also specify modifying parameters such as magnification factors and positional offsets.
  • Another example is depth inversion, used for changing pseudoscopic images to orthoscopic (normal) images.
  • the modifying parameters could also come from the sensors in S103.2, such as an accelerometer, gyroscope or magnetometer.
  • the volume parameters can be modified to tilt and/or rotate the 3D image towards the user in response to non-optimum orientation of the screen.
  • Another example is using thermometer data to change the 3D size of a feature in response to changes in temperature.
  • Step S103.5 translates the source image based on the volume parameters from S103.4 to produce an object image volume in voxels that is to be described in subsequent steps as an integral image.
  • In step S103.6, the 3D volume from S103.5 is mapped onto the 2D display surface using a lookup table to determine equivalent locations.
  • Step S103.7 sends the voxel value, typically specified as red, green and blue data, for use at its equivalent integral image 2D location, provided the location has not been previously used by another voxel in the 3D volume. Duplication can occur, for instance, where one voxel is lined up behind another by line of sight. Such instances enter step S103.8 for voxel/pixel processing. Step S103.8 makes decisions on how mapping conflicts are resolved based on the User Setup in S103.3, resulting in a use or not-use outcome. The user may have prioritised voxels nearest the viewer, in which case the depth of the voxel will determine whether the voxel is mapped. Another use case example is where the user wishes to smooth rounding errors and uses a voxel value interpolated from similarly mapped or surrounding voxels.
  • Step S103.9 assembles the integral image for display on a voxel by voxel or pixel by pixel basis. The mapping is repeated until the target integral image is complete.
  • step S103 is repeated at the frame rate, e.g. 60 fps. Note that this process can be done without any calculations.
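  • The mapping loop of steps S103.6 to S103.9 could look like the sketch below (Python with NumPy; the lookup-table format follows the earlier sketch and the nearest-to-viewer rule is one of the user-setup options mentioned above, so this is an illustration rather than the disclosed implementation):

```python
# Hypothetical sketch: assemble an integral image from a voxel volume using a
# precomputed voxel -> pixel lookup table, resolving mapping conflicts in favour
# of the voxel nearest the viewer.

from typing import Dict, List, Tuple
import numpy as np

Voxel = Tuple[float, float, float]            # (x, y, z); larger z = nearer the viewer
RGB = Tuple[int, int, int]

def assemble_inim(volume: Dict[Voxel, RGB],
                  lut: Dict[Voxel, List[Tuple[int, int]]],
                  width: int, height: int) -> np.ndarray:
    image = np.zeros((height, width, 3), dtype=np.uint8)
    depth = np.full((height, width), -np.inf)  # depth of the voxel currently mapped
    for voxel, rgb in volume.items():
        for (xi, yi) in lut.get(voxel, []):
            # Keep the pixel only if this voxel is nearer the viewer than the one
            # already mapped there (conflict resolution, as in step S103.8).
            if voxel[2] > depth[yi, xi]:
                depth[yi, xi] = voxel[2]
                image[yi, xi] = rgb
    return image
```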
  • the real time method shown in Fig. 8C is equally applicable to a range of 3D display solutions such as pin hole and lenticular display systems.
  • the detection of the gesture is implemented using a pointer method.
  • the features of this pointer method are as follows.
  • FIG. 8B is a flow chart showing a pointer detection method S200 that is implemented.
  • S200 shows a method of calculating pointer position at run time; however, the equations referred to can be used to generate a look up table that can be stored in memory to achieve the same result.
  • Step S201 inputs the camera parameters once the cameras are set up in S105. These parameters form the basis of the pointer location.
  • Step S202 tracks an object’s size and motion to determine if it is an object of interest in accordance with Table 1. In near field detection this is an object satisfying the foreground conditions. If there is an object of interest then the method progresses to S203 otherwise step S202 is repeated.
  • Step S203 gets the disparity parameters and the relative motion Mp of the object of interest P before proceeding to step S204.
  • Step S204 confirms the location of object P in the foreground region using disparity according to FIG. 6. If this is not confirmed the method returns to S202 otherwise S200 progresses to S205 and S206 where the left and right horizontal object angles are calculated using Equations 2 and 3 respectively.
  • the horizontal angles θPL and θPR are used in S207, S208 and S209 to calculate the object P coordinates Xp, Yp, Zp using Equations 1, 6 and 4.
  • Step S210 uses the difference between the current and previous locations of object P to determine the relative motion of P and compares this to Mp retrieved from S203. If there is consistency then the method progresses to S212 otherwise an error is detected and handled in step S211. Valid coordinates are output for use by the system in S212 before returning to the monitoring step S202.
  • the display 3 is used to present both a 2D image and a 3D integral image.
  • the 3D arrangement can present to the user a plurality of virtual buttons which are arranged to form a virtual keypad or a virtual keyboard.
  • the MLA 2 may be applied to the bottom half of the screen, so that for the card reader application the device 100 can present a 3D image, such as a floating keypad, in a secure customer area.
  • disclosure is provided of the MLA covering a portion of the display area, which allows the 3D image to be presented in this portion.
  • the MLA 2 can cover the whole screen and still present a 2D image, because the height and depth of each voxel is settable in software.
  • This disclosure covers credit card readers that comply with EMVCo certification (registered trade mark). Innovation is to be found in the use of the disclosed device 100 for keypad entry, health and security systems.
  • the keypad can be located behind a piece of glass or window with the keypad being operable in front of the glass in another room or outside. This is inherently more secure as the entry system can be accomplished with no external wiring or mechanical attachment.
  • Disclosure is provided of software that has been developed to model any 3D full colour image, icon or video, real world or graphic for 3D visualisation in space without glasses.
  • FIG. 9A illustrates that the hardware topology of the imaging device 100 may include a microlens array 2, an OLED (organic light emitting diode) display 3, a first camera (camera 1) and a second camera (camera 2), sensors, input/output (I/O) units 8, one or more processor, one or more memory, and a voltage regulation unit.
  • Possible I/O units may include any combination of an HDMI port, a USB 1 port, a USB 2 port, a USB 3 port, and a Wi-Fi receiver (registered trade marks).
  • the device 100 is shown including a central processing unit (CPU), a field-programmable gate array, and firmware.
  • FIG. 9B illustrates that the software topology of the imaging device 100 may include applications, libraries & a web based interface, a hardware abstraction layer, drivers, and a kernel.
  • FIG. 9C shows some applications of the imaging device 100.
  • Applications are available for different sizes of display 3.
  • Applications that implement a small sized display include access systems, lift controls, and appliances.
  • Applications that implement a medium sized display include fuel station devices, parking meters, ATMs (automated teller machines), and vending machines.
  • Applications that implement a large sized display include airport check-in devices, kiosks, ordering systems, point of sale devices, self check out devices, and healthcare devices.
  • the algorithm, software and optical development that has been disclosed confers advantages over expensive holographic and light field solutions, and utilises standard components.
  • disclosure is provided of a human interface that is presented in space, so that there is no user contact and therefore no surface pathogen transmission.
  • the Float Photon (trade mark) Interface eliminates cleaning agents, waste and human intervention whilst simultaneously increasing throughput and being carbon negative.
  • the disclosed examples can be realised by a computer or a system or apparatus. These examples can be implemented by any device configured to execute instructions, or any dedicated hardware that is capable of carrying out all or a portion of the functionality. Disclosure is provided of hardware (e.g., a processor such as a central processing unit (CPU) or a microprocessor unit (MPU)) configured to read out and execute a program recorded on a memory device to perform the functions of the disclosed examples.
  • the program is provided to the computer for example via a network or from a recording medium of various types serving as the memory device (e.g., a computer-readable medium such as a non-transitory computer-readable medium).
  • the steps of the disclosed methods may be performed in any suitable order, or simultaneously where possible.
  • An imaging method comprising: displaying to a user an autostereoscopic image that includes a feature which appears in a region above a display; detecting a gesture performed by the user in the region above the display; and updating the feature in response to the gesture.
  • the autostereoscopic image is an integral image which is configured to be displayed using an array of microlenses.
  • the updating includes an animation of the autostereoscopic image, wherein the feature appears to move in the region above the display.
  • the imaging method according to any preceding clause, further comprising: providing a feedback signal which coincides with the gesture being detected.
  • the imaging method according to any preceding clause, further comprising: generating an autostereoscopic image from a non-autostereoscopic image.
  • a method for detecting a gesture comprising: identifying a foreground of the image; defining a region in the foreground of the image that is occupied by a moving object; and interpreting whether the moving object is providing the gesture.
  • identifying the foreground of the image comprises using a number of reference lines to identify a bounded volume in front of a detection unit.
  • a program which, when executed by a computer, causes the computer to perform a method according to any preceding clause.
  • An imaging device comprising: a display configured to display to a user an autostereoscopic image that includes a feature which appears in a region above a display; a detection unit configured to detect a gesture performed by the user in the region above the display; and an update unit configured to update the feature in response to the gesture.
  • the autostereoscopic image is an integral image
  • the display comprises an array of microlenses configured to display the integral image
  • the array of microlenses has a flat surface which is configured to face the user.
  • the imaging device according to any one of clauses 17 to 20, further comprising: at least two cameras configured to capture a stereoscopic image of an object in the region; wherein the detection unit is configured to detect the gesture by identifying a disparity in the stereoscopic image of the object that has been captured by the at least two cameras.
  • the imaging device according to any one of clauses 17 to 22, further comprising: a card reader configured to read information that is stored on a card that is presented by the user.
  • the imaging device according to any one of clauses 17 to 23, further comprising: a feedback unit configured to provide a feedback signal which coincides with the gesture being detected by the detection unit.
  • the imaging device according to any one of clauses 17 to 24, further comprising: means to generate an autostereoscopic image from a non-autostereoscopic image.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

An imaging method and an imaging device (100) are disclosed. A display (3) can be used to display an autostereoscopic image to a user. The autostereoscopic image includes a feature which appears in a region (1) above the display (3). A detection unit (4) detects a gesture performed by the user in the region (1) above the display. An update unit is configured to update the feature in response to the gesture.
PCT/GB2023/053207 2022-12-13 2023-12-13 Procédé d'imagerie et dispositif d'imagerie WO2024127004A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB2218713.2 2022-12-13
GB2218713.2A GB2627426A (en) 2022-12-13 2022-12-13 An imaging method and an imaging device

Publications (1)

Publication Number Publication Date
WO2024127004A1 true WO2024127004A1 (fr) 2024-06-20

Family

ID=84974845

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2023/053207 WO2024127004A1 (fr) 2022-12-13 2023-12-13 Procédé d'imagerie et dispositif d'imagerie

Country Status (2)

Country Link
GB (1) GB2627426A (fr)
WO (1) WO2024127004A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110320969A1 (en) * 2010-06-28 2011-12-29 Pantech Co., Ltd. Apparatus for processing an interactive three-dimensional object
WO2012128399A1 (fr) * 2011-03-21 2012-09-27 Lg Electronics Inc. Dispositif d'affichage et procédé de commande associé
US20170054971A1 (en) * 2014-02-17 2017-02-23 Samsung Electronics Co., Ltd. Electronic device and operation method therefor
US20190147665A1 (en) * 2016-07-16 2019-05-16 Hewlett-Packard Development Company, L.P. Gesture based 3-dimensional object transformation

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105224138B (zh) * 2015-10-22 2019-04-19 京东方科技集团股份有限公司 悬浮触控显示装置
CN106324848B (zh) * 2016-10-31 2019-08-23 昆山国显光电有限公司 显示面板及其实现悬浮触控和裸眼3d的方法
CN107980116B (zh) * 2016-11-22 2021-04-06 深圳市柔宇科技股份有限公司 悬浮触控感测方法、悬浮触控感测系统及悬浮触控电子设备
CN107589884A (zh) * 2017-07-18 2018-01-16 朱小军 一种3d立体显示交互方法及智能移动终端
CN112925430A (zh) * 2019-12-05 2021-06-08 北京芯海视界三维科技有限公司 实现悬浮触控的方法、3d显示设备和3d终端
TWI775300B (zh) * 2021-02-02 2022-08-21 誠屏科技股份有限公司 觸控顯示裝置

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110320969A1 (en) * 2010-06-28 2011-12-29 Pantech Co., Ltd. Apparatus for processing an interactive three-dimensional object
WO2012128399A1 (fr) * 2011-03-21 2012-09-27 Lg Electronics Inc. Dispositif d'affichage et procédé de commande associé
US20170054971A1 (en) * 2014-02-17 2017-02-23 Samsung Electronics Co., Ltd. Electronic device and operation method therefor
US20190147665A1 (en) * 2016-07-16 2019-05-16 Hewlett-Packard Development Company, L.P. Gesture based 3-dimensional object transformation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
G. LIPPMANN: "Épreuves réversibles. Photographies intégrales", C. R. ACAD. SCI., vol. 146, pages 446-451, XP009062092

Also Published As

Publication number Publication date
GB202218713D0 (en) 2023-01-25
GB2627426A (en) 2024-08-28

Similar Documents

Publication Publication Date Title
US11782513B2 (en) Mode switching for integrated gestural interaction and multi-user collaboration in immersive virtual reality environments
US20230269353A1 (en) Capturing and aligning panoramic image and depth data
RU2524834C2 (ru) Устройство для автостереоскопического рендеринга и отображения
US11734876B2 (en) Synthesizing an image from a virtual perspective using pixels from a physical imager array weighted based on depth error sensitivity
US11854147B2 (en) Augmented reality guidance that generates guidance markers
US11582409B2 (en) Visual-inertial tracking using rolling shutter cameras
US9442561B2 (en) Display direction control for directional display device
US9791934B2 (en) Priority control for directional display device
US11741679B2 (en) Augmented reality environment enhancement
US11587255B1 (en) Collaborative augmented reality eyewear with ego motion alignment
Stommel et al. Inpainting of missing values in the Kinect sensor's depth maps based on background estimates
WO2024127004A1 (fr) Procédé d'imagerie et dispositif d'imagerie
US20230007227A1 (en) Augmented reality eyewear with x-ray effect
Piérard et al. I-see-3d! an interactive and immersive system that dynamically adapts 2d projections to the location of a user's eyes
JP5765418B2 (ja) 立体視画像生成装置、立体視画像生成方法、立体視画像生成プログラム
JP5642561B2 (ja) 家屋異動判読支援装置、家屋異動判読支援方法及び家屋異動判読支援プログラム
Naheyan Extending the Range of Depth Cameras using Linear Perspective for Mobile Robot Applications
KR100926348B1 (ko) 무안경식 3d 온라인 쇼핑몰 구현을 위한 단말 장치 및 이에 의한 디스플레이 방법

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23838201

Country of ref document: EP

Kind code of ref document: A1