
EP4416697A1 - Method and apparatus for optical recognition and analysis in a movement environment - Google Patents

Method and apparatus for optical recognition and analysis in a movement environment

Info

Publication number
EP4416697A1
Authority
EP
European Patent Office
Prior art keywords
camera
positions
cameras
joints
joint
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP22838797.3A
Other languages
German (de)
English (en)
Inventor
Pascal Siekmann
Gerhard DICK
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Heero Sports GmbH
Original Assignee
Heero Sports GmbH
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Heero Sports GmbH filed Critical Heero Sports GmbH
Publication of EP4416697A1
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20: Movements or behaviour, e.g. gesture recognition
    • G06V40/23: Recognition of whole body movements, e.g. for sport training
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/10: Image acquisition
    • G06V10/12: Details of acquisition arrangements; Constructional details thereof
    • G06V10/14: Optical characteristics of the device performing the acquisition or on the illumination arrangements
    • G06V10/147: Details of sensors, e.g. sensor lenses
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • The present patent application relates in general to a method and a device for optical detection and analysis, in particular of bodies, body parts and/or joints of a human being, and of surfaces and objects, in 3-dimensional space in a movement environment, as described by the independent claims.
  • The digital technology that the smartphone has brought us is not only a trigger of the "activity crisis" but also a possible solution to it, and to the recording tasks in the work areas described. Digital technologies such as self-tracking with wearables, i.e. wearable IT elements or IT elements integrated into clothing, but also augmented and virtual reality applications, can play a major role here.
  • Devices have been proposed, for example, which have a display surface, a movement or training space assigned to the display surface, a sensor system, a detection system for detecting the position of objects and/or a player in at least a partial area of the movement or training space, and a computing unit.
  • The detection system can have a depth camera, which is set up to record a two-dimensional or three-dimensional image of the movement space, with each pixel of the image representing a distance value.
  • The distance value can be, for example, the distance of a point of the object and/or the player from the depth camera.
  • The depth camera can be set up to determine the distance by measuring the transit time of an electromagnetic pulse.
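For illustration only: a time-of-flight camera recovers the distance from the pulse transit time t as d = c·t/2, and a per-pixel distance image of this kind can be lifted to a 3D point cloud by standard pinhole back-projection. The sketch below assumes pinhole intrinsics (fx, fy, cx, cy); neither the code nor these parameter names come from the application itself.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (metres per pixel) into 3D camera-space
    points, given assumed pinhole intrinsics (fx, fy, cx, cy)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    x = (u - cx) * depth / fx   # lateral offset grows linearly with depth
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)         # (h, w, 3) point cloud

# Example: a flat wall 2 m away, seen by a 640x480 camera.
pts = depth_to_points(np.full((480, 640), 2.0), 525.0, 525.0, 320.0, 240.0)
print(pts.shape, pts[240, 320])   # centre pixel back-projects to (0, 0, 2)
```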
  • Laser scanners or a touch-sensitive floor have also been proposed for the detection system, in which case the laser scanner can be set up to determine the distance of a point on the object and/or the player from the laser scanner.
  • The touch-sensitive floor can bound the play space at its lower end and be arranged to determine the position of the object and/or the player through their contact with the touch-sensitive floor.
  • The fish-eye cameras, or cameras which are provided with a fish-eye lens, are used so that objects to be recognized can be viewed in full from as small a distance as possible. Normal camera lenses, in contrast, must be further away from the object to be recognized in order to view it in full.
  • The received camera image, however, is severely distorted by the fisheye lens.
  • An AI can thus see an entire room, which among other things also makes running analyses possible during the user's training and the like.
  • The cameras, and thus the AIs, see everything that happens to the side of them.
  • A "dead zone" therefore exists only behind the cameras, which means that no additional, expensive cameras or sensors are required to cover the blind spots.
  • Whether the AIs focus on recognizing people and objects and their orientation, or on recognizing body parts and joints, depends on whether a person is already being tracked as part of the process or whether a previously tracked person has left the tracking area or the training field. If a previously tracked person is within the area of the training field, the position and orientation of the person are generated by the AI with a focus on recognizing body parts and joints.
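A minimal sketch of this mode switch (the function and state names are illustrative assumptions, not the application's wording):

```python
def select_ai_focus(person_tracked: bool, person_in_field: bool) -> str:
    """Choose what the recognition AIs concentrate on, following the
    tracking state described above."""
    if person_tracked and person_in_field:
        return "body_parts_and_joints"    # refine pose of the tracked person
    return "persons_objects_orientation"  # (re)acquire people and objects
```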
  • The method is preferably also designed in such a way that the artificial intelligences check each other for plausibility in order to increase the detection frequency and accuracy. If only one camera with a fish-eye lens is provided, the 3-dimensional positions of the detected body part, joint and/or object positions are determined by 3D real-time estimation.
  • At least two cameras with a fish-eye lens can be provided, in which case the 3-dimensional positions of the recognized body part, joint and/or object positions are determined by triangulation, as sketched below.
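A minimal sketch of the two-camera case (the application names triangulation but no specific algorithm, so the "midpoint" method and all identifiers below are illustrative assumptions): each 2D detection of the same joint is back-projected to a viewing ray from its camera centre, and the 3D position is taken as the point halfway between the two rays at their closest approach.

```python
import numpy as np

def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint triangulation: o1, o2 are the 3D camera centres, d1, d2
    the unit direction vectors of the back-projected rays through the
    detected joint in each image."""
    b = o2 - o1
    c = d1 @ d2
    denom = 1.0 - c ** 2            # ~0 means the rays are parallel
    if abs(denom) < 1e-9:
        raise ValueError("rays are (nearly) parallel")
    s = (b @ d1 - (b @ d2) * c) / denom   # parameter along ray 1
    t = ((b @ d1) * c - b @ d2) / denom   # parameter along ray 2
    return 0.5 * ((o1 + s * d1) + (o2 + t * d2))

# Two cameras 28 cm apart on the x-axis, both seeing the same joint:
o1, o2 = np.zeros(3), np.array([0.28, 0.0, 0.0])
d1 = np.array([0.1, 0.0, 1.0]);  d1 /= np.linalg.norm(d1)
d2 = np.array([-0.04, 0.0, 1.0]); d2 /= np.linalg.norm(d2)
print(triangulate_midpoint(o1, d1, o2, d2))   # -> approx. [0.2, 0.0, 2.0]
```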
  • The 2 to N cameras are spaced 28 cm apart, although this spacing can be varied at will to increase or decrease accuracy.
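Why this spacing affects accuracy follows from a standard stereo-vision relation (not stated in the application itself): with focal length f in pixels, baseline b and disparity uncertainty Δd, the depth error grows quadratically with distance z,

```latex
\Delta z \approx \frac{z^{2}}{f\,b}\,\Delta d
```

so widening the camera spacing b reduces the 3D error at a given range, at the cost of a smaller shared field of view at close distances.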
  • The at least one camera with a fisheye lens preferably has a diagonal field of view (FOV) of 220° or more. Furthermore, it preferably has a field of view of up to 180° horizontally and 180° vertically.
  • FOV: diagonal field of view
  • The distance and position of the camera or cameras, specifically selected for the respective application, are preferably chosen so that the area in front of the camera or cameras is covered within a radius of up to 10 meters.
  • The AIs preferably include object recognition AIs, with the determination of 2D positions of the body, body parts, joints and/or surfaces and objects from camera images on the respective camera or cameras also including the following steps: a. Preparing the camera images for input to the object detection AI by downsizing the images through GPU (Graphics Processing Unit) acceleration; b. Cutting out the bodies, joints, body parts and/or objects, correcting the orientation and scaling the image section size using GPU acceleration; c. Recognition of body part and/or joint positions on the image sections of the body using further object recognition AIs that are trained to recognize body parts and joints; d. Recalculation of the recognized 2D body part and/or joint points to the original image size and orientation of the camera image; e.
  • The object recognition AI is preferably provided with data on the relationships between human joints and carries out a pose correction. Step d above, the recalculation to the original image coordinates, is sketched below.
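A minimal sketch of step d, mapping joint points detected on a resized crop back to the coordinates of the original camera image (the box/size parameters and function name are illustrative assumptions; orientation correction is omitted for brevity):

```python
import numpy as np

def keypoints_to_original(kps_crop, crop_box, crop_size):
    """Map 2D keypoints from a resized crop back to the full camera image.

    kps_crop:  (N, 2) keypoints in pixels of the resized crop
    crop_box:  (x0, y0, x1, y1) of the crop in the original image
    crop_size: (w, h) of the resized crop fed to the pose AI
    """
    x0, y0, x1, y1 = crop_box
    w, h = crop_size
    scale = np.array([(x1 - x0) / w, (y1 - y0) / h])  # undo the resize
    return kps_crop * scale + np.array([x0, y0])      # undo the crop offset

# A joint found at (64, 96) in a 128x192 crop taken from box (200, 100, 328, 292):
print(keypoints_to_original(np.array([[64.0, 96.0]]), (200, 100, 328, 292), (128, 192)))
# -> [[264. 196.]]
```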
  • A second data processing unit with graphics processors can be provided in order to provide games and game and sports programs and to enable training based on the 3D data, in particular to control the games or sports programs with the body and/or joints and to display feedback (via video and audio).
  • The method according to the invention can also be characterized in that the AIs forward images, data and results of the analysis to an Internet cloud.
  • Clones of the AIs can be trained to increase the accuracy of the entire system for the future, and the data in the cloud can also be used to make results available on other platforms.
  • HMI: Human Machine Interface
  • The user can also be provided with additional use cases, such as a mirror image of himself, by an additional mirror camera, i.e. a further camera that is not one of the cameras mentioned above and to whose images the AIs are not applied.
  • The user can see himself live and have the results displayed in real time on his mirror image/live video.
  • A training device for carrying out a method for optical recognition and analysis as described above has one or more cameras with a fisheye lens, which are arranged at a distance and position from the training area specifically selected for the application, for determining 2D positions of the body, body parts and/or joints, surfaces and objects through camera images; at least one data processing unit with devices for calculating neural networks and artificial intelligences in real time, for applying multiple AIs to captured camera images for recognition and/or analysis according to the method described; and at least one interface for one or more audio/video feedback units for outputting data or audio/video feedback about the analysis results by means of the audio/video feedback unit.
  • The device can have all the features that have already been described in relation to the method. All elements are preferably arranged in a single housing for easy handling by the user.
  • Fig. 1a/b shows the schematic comparison of the detection of a person in the training area by a fisheye lens 2 compared to a camera with a normal lens 4
  • Figs. 2a/b show a camera with and without a fisheye lens.
  • Fig. 3a shows a flow chart of the method
  • Fig. 3b shows the flow chart of a variant of the method according to Fig. 3a
  • Figures 1a and 1b show a schematic comparison of the detection of a person in the training area by a fisheye lens 2 compared to a camera with a normal lens 4.
  • The fish-eye cameras 2, or cameras which are provided with a fish-eye lens 2, are used so that objects to be recognized, here a person 6, can be viewed in full from as small a distance 8 as possible.
  • Normal camera lenses 4 must be at a greater distance 10 from the object 12 to be recognized in order to view it fully.
  • The fisheye camera has a significantly larger field of view 14 than the field of view 16 of the normal camera. Because of this, the fisheye camera 2 can see an entire room, whereas the normal camera 4 cannot capture what is happening to the side of it.
  • Figures 2a and 2b show in greater detail the schematic representation of the camera 2 with a fisheye lens 3 and the camera 4 with a non-fisheye lens 5.
  • Whereas conventional non-fisheye lenses 5 map an object plane perpendicular to the optical axis proportionally onto the image plane, fisheye lenses 3 image a hemisphere or more onto the image plane, with clear but not excessive distortions.
  • Straight lines that do not run through the center of the image appear curved; the image is strongly barrel-distorted. Area ratios or radial distances are usually reproduced more faithfully than with an ordinary, gnomonically projecting wide-angle lens 5.
  • The fisheye lens 3 has a very large angle of view (shown here as a broken line) of 180° or more, which cannot be achieved with conventional projection methods.
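In standard lens models (not quoted from the application), this difference is compact: a gnomonic, rectilinear lens maps the incidence angle θ to the image radius r = f·tan θ, which diverges as θ approaches 90°, whereas a typical equidistant fisheye maps r = f·θ, which stays finite beyond 90° and is what makes angles of view of 180° and more possible:

```latex
r_{\text{rectilinear}} = f\,\tan\theta, \qquad r_{\text{fisheye, equidistant}} = f\,\theta
```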
  • Figure 3a shows a flow chart of the method according to a preferred embodiment of the invention.
  • Two cameras with a fish-eye lens are provided at a specifically selected distance and in a position relative to the movement area, via which a 2D image is recorded in steps 20a, 20b.
  • The captured image is fed to an AI for the recognition of bodies, body parts and/or joints, surfaces and objects, with bodies/objects first being recognized in the camera images at 24a, 24b, i.e. their 2D image positions are determined.
  • In a next step 26a, 26b, the detected bodies and/or objects are cropped and then fed to a further AI, which determines the 2D positions of the bodies, body parts and/or joints, surfaces and objects in a further step 28a, 28b. Only when this step has been completed are the previously parallel process strands combined: the 3D positions of bodies, joints, surfaces and objects in space are determined by triangulation in step 30 on the basis of the data calculated in the previous steps, and the movement paths of the 3D body part, joint and/or object positions are analyzed in a next step 32. In a last step 33, the calculated data are then made available for exercise evaluation.
  • In the variant of Figure 3b, in step 22c the captured image is fed to an AI for feature matching of bodies and/or objects before the detected bodies and/or objects are cropped in the next step 26b and then fed, on the one hand, to a further AI for determining the 2D positions of the bodies, body parts and/or joints, surfaces and objects in step 28b and, on the other hand, in a step 28c, to another AI for renewed feature matching of bodies and/or objects.
  • The AI data from steps 28b and 28c are then used to correct the body part and/or joint positions, whereupon the parallel process strands are combined again: the 3D positions of bodies, joints, surfaces and objects in space are determined by triangulation in step 30 on the basis of the data calculated in the previous steps, the movement paths of the 3D body part, joint and/or object positions are analyzed in a next step 32, and in a last step 33 the calculated data are made available for exercise evaluation. A schematic code skeleton of this flow is sketched below.
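To make the flow of Figures 3a/3b concrete, here is a minimal runnable skeleton of the two-camera pipeline. Every stage is a stub and every name is an illustrative assumption: in the real device, detection and pose estimation are neural networks, and step 30 is a geometric triangulation as sketched earlier.

```python
import numpy as np

def detect_bodies(img):                        # steps 24a/24b: body/object boxes
    return [(100, 50, 300, 430)]               # one dummy (x0, y0, x1, y1) box

def crop(img, box):                            # steps 26a/26b: cut out detections
    x0, y0, x1, y1 = box
    return img[y0:y1, x0:x1]

def estimate_pose(crop_img):                   # steps 28a/28b: 17 dummy 2D joints
    h, w = crop_img.shape[:2]
    return np.random.default_rng(0).random((17, 2)) * [w, h]

def triangulate(kps_a, kps_b):                 # step 30: placeholder 3D lift
    return np.column_stack([kps_a, np.full(len(kps_a), 2.0)])

img_a = np.zeros((480, 640, 3), np.uint8)      # steps 20a/20b: camera frames
img_b = np.zeros((480, 640, 3), np.uint8)
kps_a = estimate_pose(crop(img_a, detect_bodies(img_a)[0]))
kps_b = estimate_pose(crop(img_b, detect_bodies(img_b)[0]))
joints_3d = triangulate(kps_a, kps_b)          # steps 32/33 would analyse these
print(joints_3d.shape)                         # (17, 3)
```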
  • Figure 4 shows the schematic representation of an embodiment of the device, with a housing 34, in which cameras 2 with fisheye lenses are arranged at the front at a distance specifically selected for the application; their fields of view are 220° diagonal, 180° horizontal and 180° vertical, and intersect as shown by the dashed lines.
  • A data processing unit (not shown) is provided in the housing 34, with devices for calculating neural networks and artificial intelligences in real time: for determining 2D positions of the body, body parts and/or joints, surfaces and objects from camera images captured by the cameras 2; for detecting and cropping bodies/objects in the camera images; for detecting body part/joint and/or object positions on the image crops; for determining the 3-dimensional positions of the detected body part/joint and/or object positions; and for analyzing the movement patterns of the detected 3-dimensional body part, joint and/or object positions. Data or audio/video feedback on the recognition/analysis results calculated from the camera images are output by means of an audio/video feedback unit (not shown), preferably a high-resolution screen with high-quality loudspeakers.
  • This audio/video feedback unit is controlled via a conventional connection 36, such as USB or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Psychiatry (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Vascular Medicine (AREA)
  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Testing Or Calibration Of Command Recording Devices (AREA)

Abstract

The invention relates to an apparatus and a corresponding method for the optical recognition and analysis, in particular, of bodies, body parts and/or joints of a human being, and of surfaces and objects, in a three-dimensional movement region in real time. The apparatus comprises one or more cameras having a fisheye lens, which are arranged at a distance and position relative to the movement region that are selected specifically for the use case, for determining 2D positions of the bodies, body parts and/or joints, surfaces and objects by means of camera images. At least one data processing unit is provided, which comprises devices for computing neural networks and artificial intelligences in real time, for applying a plurality of AIs to captured camera images; and the apparatus comprises at least one interface for one or more audio/video feedback units for outputting data or audio/video feedback concerning the analysis results by means of the audio/video feedback unit.
EP22838797.3A 2021-12-22 2022-12-16 Method and apparatus for optical recognition and analysis in a movement environment Pending EP4416697A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE102021006307.4A DE102021006307A1 (de) 2021-12-22 2021-12-22 Method and device for optical recognition and analysis in a movement environment
PCT/EP2022/086255 WO2023117723A1 (fr) 2021-12-22 2022-12-16 Method and apparatus for optical recognition and analysis in a movement environment

Publications (1)

Publication Number Publication Date
EP4416697A1 (fr) 2024-08-21

Family

ID=84829997

Family Applications (1)

Application Number Title Priority Date Filing Date
EP22838797.3A Pending EP4416697A1 (fr) 2021-12-22 2022-12-16 Procédé et appareil de reconnaissance et d'analyse optiques dans un environnement de mouvement

Country Status (4)

Country Link
EP (1) EP4416697A1 (fr)
CA (1) CA3239174A1 (fr)
DE (2) DE102021006307A1 (fr)
WO (1) WO2023117723A1 (fr)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110279368A1 (en) 2010-05-12 2011-11-17 Microsoft Corporation Inferring user intent to engage a motion capture system
GB2589843B (en) 2019-11-19 2022-06-15 Move Ai Ltd Real-time system for generating 4D spatio-temporal model of a real-world environment
DE102020100366A1 (de) 2020-01-09 2021-07-15 Technische Universität Chemnitz Verfahren zur 3D-Bewegungsanalyse und Sofortfeedback für Trainingsübungen

Also Published As

Publication number Publication date
DE202022002783U1 (de) 2023-05-11
WO2023117723A1 (fr) 2023-06-29
CA3239174A1 (fr) 2023-06-29
DE102021006307A1 (de) 2023-06-22

Similar Documents

Publication Publication Date Title
CN106251399B (zh) Real-scene three-dimensional reconstruction method and implementation device based on LSD-SLAM
CN112613609B (zh) Neural radiance field enhancement method based on joint pose optimization
DE112012001984B4 (de) Integrating video metadata into 3D models
DE69832119T2 (de) Method and apparatus for the visual detection of people for active public interfaces
DE69823001T2 (de) Method and device for reconstructing the three-dimensional motion of a human body from monocular image sequences
DE10326943A1 (de) Autonomous vehicle and associated method and device for motion determination, motion control and object recognition
DE102015206110A1 (de) System and method for producing computer control signals from breath attributes
EP3304496B1 (fr) Method and device for generating data for a two- or three-dimensional representation of at least one part of an object and for generating the two- or three-dimensional representation of the part of the object
CN110598590A (zh) Close-interaction human pose estimation method and device based on multi-view cameras
EP3347876B1 (fr) Device and method for generating a model of an object by means of superimposed image data in a virtual environment
DE202016008004U1 (de) Automatic linking of images using visual feature cross-references to related applications
DE102016123149A1 (de) Image-data-based reconstruction of three-dimensional surfaces
CN114049434A (zh) 3D modeling method and system based on a fully convolutional neural network
EP3403240B1 (fr) Device and method for superimposing at least part of an object on a virtual surface
DE112019006107T5 (de) Authoring device, authoring method and authoring program
EP4416697A1 (fr) Method and apparatus for optical recognition and analysis in a movement environment
DE10042387A1 (de) Method for transforming three-dimensional object points into two-dimensional image points for linear fan-beam sensor images
DE102015010264A1 (de) Method for creating a 3D representation and corresponding image recording device
DE112019002126T5 (de) Position estimation device, position estimation method and program therefor
WO2007048674A1 (fr) Camera tracking system and method
WO2022034196A1 (fr) Method for reconstructing a 3D model of a scene
EP3449463A1 (fr) Motion analysis system, and motion tracking system comprising same, for moved or moving objects that are thermally distinct from their surroundings
WO2022034197A1 (fr) Method for reconstructing a 3D model of a scene
WO2024099605A1 (fr) Method and device for tracking an object
DE102024103009A1 (de) Controllable dynamic appearance for neural 3D portraits

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20240517

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC ME MK MT NL NO PL PT RO RS SE SI SK SM TR