WO2008031369A1

WO2008031369A1 - System and method for determining the position and the orientation of a user

Info

Publication number: WO2008031369A1
Application number: PCT/DE2006/001631
Authority: WO
Inventors: Mehdi Hamadou; Andreas MÜLLER
Original assignee: Siemens Aktiengesellschaft
Priority date: 2006-09-15
Filing date: 2006-09-15
Publication date: 2008-03-20
Also published as: DE112006004131A5

Abstract

The invention relates to a system and a method for determining the position and the orientation of a user in relation to a real environment viewed by said user. In order to enable the determination of the pose of a user inside a spatially extended real environment as efficiently as possible, a method is proposed in which sections (10,11) of the real environment, which lie in the viewing field of the user, are acquired with a camera (20), the method having an initialization phase comprising the following procedural steps: selection of a partial model (0..8) suitable for determining an initial pose of the user from an overall model of the real environment, broken down into various partial models (0..8), the selection being performed in accordance with a section (10,11) of the real environment acquired by the camera (20) during the initialization phase, and determination of the initial pose of the user by comparison of the section (10,11) with the partial model (0..8). Said method also has a tracking phase, following the initialization phase, in which the pose of the user is determined continuously, starting from the initial pose by means of a tracking algorithm, the initialization phase being restarted as soon as the accuracy of the determination of the pose, achieved or achievable in the tracking phase, no longer satisfies a predefined quality criterion.

Description

System and method for determining the position and orientation of a user

The invention relates to a system and a method for determining the position and orientation of a user with respect to a real environment viewed by that user.

Such a system or method is used, for example, in augmented reality applications. Augmented reality is a form of human-technology interaction in which information is overlaid into a person's field of view, e.g. by means of data goggles, thereby extending the reality that person perceives. This extension of reality is also called augmentation. It is context-dependent, i.e. suited to and derived from the object being viewed. The object being viewed may be, for example, a component, a tool, a machine, an automation system or the open engine compartment of a car, to name but a few examples. When the user's field of view is augmented, safety, assembly or disassembly instructions, for example, are displayed to assist the user in his or her activity.

Automation devices can be used in the application domains of manufacturing industry, medicine and the consumer sector. In manufacturing, applications ranging from simple operator control and monitoring processes to complex service activities can be supported. In operations, examinations and treatments in the medical field, such methods and devices help a user to improve the quality of work. In the consumer sector, applications such as navigation of people, information provision, etc. can be realized.

To ensure that the displayed information is positioned accurately relative to the real objects in the user's field of view, tracking methods are used. With these methods, the position and orientation of the user are first determined. The position and orientation of the user relative to the real environment are commonly referred to collectively as the "pose".

To determine the pose, the user's field of view is continuously recorded with a camera. In so-called markerless tracking, correspondences between certain features in the recorded camera image and a model of the user's real environment are determined, and the pose of the user is derived from them. Such a method is called markerless because it requires no special markings in the real environment. The augmented reality system determines the pose exclusively from the image features within the detection range of the camera.

In order to perform markerless tracking, a so-called initialization must first be carried out. This provides the user's pose for the first time. The initialization of the augmented reality system is computationally intensive and requires interaction on the part of the user. For example, for initialization the user must explicitly direct his view, and thus the camera, at a defined object in such a way that the augmented reality system recognizes the real object on the basis of an augmentation of the object that is displayed to the user at a defined position in the camera image. The user thus has to bring the augmentation of the object and the real object into coincidence. From this, the system can determine an initial position and orientation in space, the so-called initial pose. However, the user does not have to bring the augmentation into exact coincidence: within a certain tolerance range, the augmented reality system can align the augmentation with the real object on its own. An alternative initialization method uses so-called keyframes, i.e. predefined views of the environment, which the augmented reality system must recognize again in the current camera image within a tolerance range. Depending on the initialization method used, more or less user interaction is required.

After initialization has succeeded in this way, the user can vary his viewing direction, and thus the detection range of the camera, as desired. The augmented reality system can then display augmentations at the exact position as long as the object used for initialization, or its model, is within the detection range of the camera. Tracking analyses the movements in the camera image and derives from them the position and orientation of the user relative to the real environment, i.e. the pose. This happens almost in real time and without any user interaction. Overall, pose determination is divided into an initialization phase and a subsequent tracking phase in which the pose is redetermined from image to image.

However, the workspaces in which a user wishes to use an augmented reality system can be much larger than the detection range of the camera, in particular if the user keeps a sensible distance from the objects to be augmented. As soon as the object used for initialization is no longer within the detection range of the camera, no pose determination can take place within the tracking phase described above. The augmented reality system loses its initialization and its pose relative to the object originally used for initialization. As a result, the tracking process aborts, and for the time being no augmentation can be displayed.

The object of the invention is to enable the pose of a user to be determined as efficiently as possible within a spatially extended real environment.

This object is achieved by a method for determining the pose of a user with respect to a real environment viewed by the user, in which sections of the real environment lying in the user's field of view are captured with a camera, the method having an initialization phase with the following steps:

selecting, from an overall model of the real environment that is subdivided into different partial models, a partial model suitable for determining an initial pose of the user, the selection being made as a function of a section of the real environment captured by the camera during the initialization phase, and

determining the initial pose of the user by comparing the section with the partial model,

and the method having a tracking phase following the initialization phase, in which the pose of the user is determined continuously, starting from the initial pose, by means of a tracking algorithm, the initialization phase being restarted as soon as the accuracy of the pose determination achieved or achievable in the tracking phase no longer meets a predetermined quality criterion.

Furthermore, the object is achieved by a system for determining the pose of a user with respect to a real environment viewed by the user, the system comprising:

a camera for capturing sections of the real environment that lie in the user's field of view,

a first memory area for an overall model of the real environment that is broken down into different submodels,

initialization means for selecting, from the overall model, a submodel suitable for determining an initial pose of the user as a function of a section of the real environment captured by the camera during an initialization phase, and for determining the initial pose by comparing the section with the submodel,

tracking means for determining the pose of the user during a tracking phase following the initialization phase, the tracking means being provided for continuously determining the pose, starting from the initial pose, by means of a tracking algorithm, and

monitoring means for restarting the initialization phase as soon as the accuracy of the pose determination achieved or achievable in the tracking phase no longer meets a predetermined quality criterion.

The determination of the user's pose is divided into two phases: the initialization phase and the tracking phase. The tracking phase can only be carried out if an initial pose of the user has been determined at least once beforehand, i.e. if at least one initialization phase has preceded it, during which the position and orientation of the user with respect to the real environment were determined for the first time. In order to determine the pose as quickly and with as little computational effort as possible, one generally tries to determine the pose for as long as possible with the aid of an efficient and fast tracking algorithm. If, however, the user changes his viewing direction very strongly after the initial pose has been determined, so that the detection range of the camera differs greatly from the one used during the initialization phase, the quality of the pose determination during tracking generally suffers. At the latest when the detection range lies completely outside the section used for the initial pose determination, a pose determination by means of the tracking algorithm is no longer possible.

According to the invention, a new initialization phase is started as soon as the achieved accuracy no longer fulfils the predetermined quality criterion, or as soon as the achievable accuracy no longer fulfils this criterion. The latter condition is preventive in character: a new initialization is carried out if, for example, fulfilment of the quality criterion can no longer be expected because of the change in pose that has taken place since the last initialization.

In the cases mentioned, the newly captured section must therefore be compared again with a model of the real environment, i.e. the initialization phase must be restarted.

One criterion for the accuracy of the pose determination is the tracking error during the tracking phase. Various methods are known for determining the tracking error. For example, it is continuously determined how many of the model features used in the initial pose determination are found in the current frame. If this number falls below a predetermined threshold value, in particular permanently, it is concluded that the tracking error is too large and that the required quality criterion is therefore no longer met.

Such a method is also called a robust method. By contrast, so-called non-robust methods return, for each feature, a probability with which it was found. The overall accuracy of the tracking is then obtained from the total probability over all features.
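The two kinds of quality check described above can be illustrated with a minimal sketch. The names `RobustQualityMonitor`, `min_features`, `patience` and `non_robust_quality` are hypothetical and do not come from the patent. Two assumptions are made here: "permanently below the threshold" is interpreted as "below the threshold for a fixed number of consecutive frames", and the per-feature probabilities of the non-robust variant are combined by taking their mean, since the text does not specify the aggregation.

```python
from typing import Sequence


class RobustQualityMonitor:
    """Robust check: count how many model features are found per frame.

    Assumption: the criterion is violated once the count has stayed below
    `min_features` for `patience` consecutive frames.
    """

    def __init__(self, min_features: int, patience: int) -> None:
        self.min_features = min_features
        self.patience = patience
        self._frames_below = 0

    def update(self, features_found: int) -> bool:
        """Feed one frame; return True while the quality criterion is met."""
        if features_found < self.min_features:
            self._frames_below += 1
        else:
            self._frames_below = 0  # a good frame resets the counter
        return self._frames_below < self.patience


def non_robust_quality(feature_probabilities: Sequence[float]) -> float:
    """Non-robust check: combine per-feature detection probabilities.

    Assumption: the mean is used as the overall score; an empty list
    (no feature found at all) yields 0.0.
    """
    if not feature_probabilities:
        return 0.0
    return sum(feature_probabilities) / len(feature_probabilities)


if __name__ == "__main__":
    monitor = RobustQualityMonitor(min_features=10, patience=3)
    for count in [12, 8, 7, 11, 9, 8, 7]:
        print(count, monitor.update(count))
    print(non_robust_quality([0.9, 0.8, 0.7]))
```

In this sketch a single good frame resets the counter, so short dropouts do not trigger a reinitialization. Whether that matches the intended meaning of "permanently" is a design choice left open by the text.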

The invention is based on the finding that the effort required for reinitialization can be significantly reduced if the overall model, which represents the real environment viewed by the user, is broken down into individual smaller submodels. The purpose of this decomposition is to avoid having to use the entire model for reinitialization and to use only a suitable, much smaller submodel instead. This has the advantage that the algorithms used for initialization have to be applied to a much smaller amount of data, so that the initialization can be performed much faster. This advantage is particularly noticeable in very large user environments, which have to be represented by correspondingly large environment models.

According to the invention, it is thus checked during the tracking phase whether the user's pose determined by the tracking algorithm still achieves, or can still achieve, the required accuracy. If this predetermined quality criterion is no longer met, the tracking phase is interrupted and a new initialization phase is started. In this case, the image section captured by the camera at the end of the tracking phase is used as the basis for determining a new submodel. With the aid of this new suitable submodel, a comparison with the section in the detection range of the camera is performed again and a corresponding initial pose is redetermined. This happens at least largely without user interaction and can run almost unnoticed in the background. Once the new initial pose has been determined, the very fast tracking algorithm can again be used for continuous pose determination.

Because the overall model is broken down according to the invention into individual partial models, each of which can be used independently of the others to determine the initial pose, the subject matter of the invention makes it possible for the first time to use an augmented reality system expediently even in very large environments. The targeted selection of individual partial models for determining the initial pose, as a function of the user's current viewing angle and position, makes it possible to keep under control the computational effort that the image processing algorithms need to determine the initial pose.

In an advantageous embodiment of the invention, when the initialization phase is restarted, a submodel is determined that is adjacent, within the overall model, to the submodel used in the preceding initialization phase. If it is determined during the tracking phase that the quality of the pose determination no longer meets the predefined quality criterion, or if such a deviation is to be expected on the basis of the changes in the user's field of view during the tracking phase, one of the submodels adjacent to the previous submodel is selected in order to redetermine the initial pose. It is thus not necessary to consider all submodels of the overall model, since it can be assumed that, after a change in the user's field of view and thus in the detection range of the camera, a submodel adjacent to the previous submodel is suitable for determining the initial pose.

In a further advantageous embodiment of the invention, in particular the determination of a suitable submodel among the adjacent submodels is simplified in that the submodel is determined by evaluating the last pose determined within the tracking phase and the position of the submodel used in the preceding initialization phase. The submodel used in the previous initialization phase is the starting point in the search for a new submodel. During the tracking phase, the change in the field of vision of the user or the detection range of the camera was tracked, so that with this information a new sub-model can be found to determine a new initial pose.
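As a minimal sketch of this neighbor search, assume (purely for illustration) that the submodels are laid out on a regular two-dimensional grid and that the last pose from the tracking phase has already been reduced to the point where the camera's viewing direction meets the modeled surface. The function `neighbour_cell` and its parameters are hypothetical and are not taken from the patent.

```python
import math
from typing import Tuple

Cell = Tuple[int, int]  # (column, row) index of a submodel in the assumed grid


def neighbour_cell(prev_cell: Cell, cell_size: float,
                   last_view_centre: Tuple[float, float]) -> Cell:
    """Pick the next submodel from the previous one and the last tracked pose.

    prev_cell        -- grid cell of the submodel used in the preceding
                        initialization phase (the starting point of the search)
    cell_size        -- edge length of one square grid cell
    last_view_centre -- (x, y) point on the modeled surface hit by the camera's
                        viewing direction according to the last tracked pose

    The result is restricted to prev_cell and its eight direct neighbours,
    reflecting the assumption that the detection range moves continuously.
    """
    x, y = last_view_centre
    col = math.floor(x / cell_size)
    row = math.floor(y / cell_size)
    # Clamp the step to at most one cell in each direction.
    d_col = max(-1, min(1, col - prev_cell[0]))
    d_row = max(-1, min(1, row - prev_cell[1]))
    return (prev_cell[0] + d_col, prev_cell[1] + d_row)


if __name__ == "__main__":
    print(neighbour_cell((2, 2), 1.0, (3.4, 2.5)))  # view moved to the right
    print(neighbour_cell((2, 2), 1.0, (2.5, 2.5)))  # view still in the old cell
```

Only nine candidates are ever considered, regardless of how many submodels the overall model contains, which is the saving this embodiment aims at.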

In order to prevent a tracking error from arising in the first place, a further advantageous embodiment of the invention can provide that the initialization phase is restarted as soon as the detection range of the camera has left the section captured during the initialization phase by a predetermined amount. As soon as this has happened, a new suitable partial model is determined, for example by evaluating the change in the detection range of the camera determined during the tracking phase, and a new initial pose is calculated. This takes place, in particular, without user interaction and can therefore go almost unnoticed by the user of the system.

A further advantageous embodiment of the invention is characterized in that the extent of the environment modeled by each submodel depends on the size of the detection range of the camera. This makes sense, since the initial pose is determined by comparing the content of the detection range of the camera with one or more elements of a suitable submodel. If the submodel were much larger than the detection range of the camera, a corresponding algorithm would have to examine elements of the submodel that do not appear in the detection range of the camera and are thus not available for comparison. Conversely, if the submodel were too small, elements of the real environment captured by the camera would be searched for unsuccessfully in the corresponding submodel.
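How the submodel size might follow from the detection range can be sketched with simple pinhole geometry. The sketch assumes an ideal camera looking perpendicularly at a planar environment; under that assumption the visible width at distance d with opening angle θ is 2·d·tan(θ/2). The functions `detection_footprint` and `decompose`, and the reduction of the overall model to a flat rectangle, are illustrative assumptions rather than the patent's method.

```python
import math
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def detection_footprint(distance: float, hfov_deg: float,
                        vfov_deg: float) -> Tuple[float, float]:
    """Width and height of the area seen by an ideal pinhole camera that
    looks perpendicularly at a plane from the given distance."""
    width = 2.0 * distance * math.tan(math.radians(hfov_deg) / 2.0)
    height = 2.0 * distance * math.tan(math.radians(vfov_deg) / 2.0)
    return width, height


def decompose(extent_w: float, extent_h: float,
              tile_w: float, tile_h: float) -> List[Rect]:
    """Split a planar model extent into tiles of roughly footprint size.

    Tiles in the last column or row are cut off at the model boundary.
    """
    cols = max(1, math.ceil(extent_w / tile_w))
    rows = max(1, math.ceil(extent_h / tile_h))
    tiles = []
    for r in range(rows):
        for c in range(cols):
            tiles.append((c * tile_w, r * tile_h,
                          min((c + 1) * tile_w, extent_w),
                          min((r + 1) * tile_h, extent_h)))
    return tiles


if __name__ == "__main__":
    w, h = detection_footprint(distance=2.0, hfov_deg=90.0, vfov_deg=90.0)
    print(round(w, 3), round(h, 3))
    print(len(decompose(10.0, 6.0, 4.0, 4.0)))
```

If the distance to the environment or the zoom factor (and with it the opening angle) changes, both functions can simply be called again with the new values to obtain a decomposition that matches the new detection range.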

Since the size of the detection range of the camera can change, for example because the user moves farther away from the real environment or because the zoom factor of the camera is changed, an embodiment of the invention is advantageous in which the overall model is divided anew into different submodels whenever the size of the detection range of the camera changes. The decomposition of the overall model into individual submodels would thus happen "on the fly". When determining the initial pose, this would ensure that a submodel of ideal size is always available for comparison with the detection range of the camera. Of course, a less performant embodiment of the invention is also possible in which the decomposition of the overall model into individual submodels is already carried out in a preceding engineering phase. In that case, it must as a rule be ensured that the detection range of the camera does not vary too much during pose determination. With the aforementioned "on the fly" decomposition of the overall model, such a nearly rigid definition of the detection range is not necessary.

A partial model suitable for the initial pose determination can be determined in a simple manner, in an advantageous embodiment of the invention, by backprojecting the captured section onto the real environment.

A suitable criterion for determining the suitable submodel is given in a further advantageous embodiment of the invention in that the submodel is selected such that it has the largest overlap area with the detected section of the real environment among the various submodels. The larger the overlapping area of the detected section with the partial model, the greater the probability that the comparison of the section with the partial model leads to a successful initial pose determination.
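The largest-overlap criterion can be sketched as follows. For illustration it is assumed that the captured section has already been projected onto the same plane as the submodels and that both are described by axis-aligned rectangles. The names `overlap_area` and `select_submodel` and the rectangle representation are hypothetical.

```python
from typing import Dict, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def overlap_area(a: Rect, b: Rect) -> float:
    """Area of the intersection of two axis-aligned rectangles (0 if disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)


def select_submodel(section: Rect, submodels: Dict[int, Rect]) -> Optional[int]:
    """Return the id of the submodel with the largest overlap with the
    captured section, or None if no submodel overlaps it at all."""
    best_id: Optional[int] = None
    best_area = 0.0
    for submodel_id, rect in submodels.items():
        area = overlap_area(section, rect)
        if area > best_area:
            best_id, best_area = submodel_id, area
    return best_id


if __name__ == "__main__":
    submodels = {0: (0, 0, 4, 4), 5: (4, 0, 8, 4), 3: (0, 4, 4, 8)}
    print(select_submodel((3, 1, 7, 5), submodels))
```

In the usage example the section overlaps submodel 0 by 3 area units, submodel 5 by 9 and submodel 3 by 1, so submodel 5 is selected.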

A suitable procedure for comparing the section with the selected submodel is given in an advantageous embodiment in that, to determine the initial pose, an object of the real environment located in the captured section is recognized with the aid of the selected submodel and an augmentation of the object is brought into coincidence with the object. For the very first initial pose determination in particular, user interaction may be necessary for this, in which the user aligns his field of view with respect to the real environment in such a way that the augmentation of the object coincides with the object itself. Smaller deviations in particular, however, can also be compensated for by the system itself: the augmentation and the real object are brought into coincidence computationally, and the information required for determining the initial pose is obtained from this process.

A particularly advantageous application of the method results in an embodiment in which the determined pose is used for positionally accurate augmentation of the user's field of view with information. Such information may be, for example, installation instructions for an automation technician that are displayed at the exact position, assistance for a surgeon that is shown in his field of view during an operation, or simply explanations for a visitor to an exhibition that are suited to and derived from the exhibits he is looking at.

A particularly user-friendly display of such augmentations is provided by an embodiment of the invention in which the information is overlaid into the user's field of view by means of data goggles. With such data goggles, for example, a so-called optical see-through method can be realized, in which the user perceives the objects of the real environment directly through the data goggles and the related information is displayed at a suitable location in the data goggles.

In the following, the invention is described and explained in more detail with reference to the exemplary embodiments illustrated in the figures, in which:

FIG. 1 shows a detection range of a camera of an embodiment of the system for determining the pose of a user,

FIG. 2 shows augmented information in the field of vision of the user,

FIG. 3 shows an initialization of an embodiment of the system for determining the pose of a user,

FIG. 4 shows a planar representation of a captured section of the real environment and its modeling,

FIG. 5 shows a perspective view of the captured section of the real environment and its modeling,

FIG. 6 shows possibilities of movement within the space modeled by neighboring partial models,

FIG. 7 shows a first section captured by a camera during a first initialization phase,

FIG. 8 shows a second section captured by the camera at the beginning of a second initialization phase,

FIG. 9 shows a flow chart of an embodiment of the method for determining the pose of a user, and

FIG. 10 shows an application example of an embodiment of the method within an augmented reality system.

FIG. 1 shows a detection area 30 of a camera 20 of an embodiment of the system for determining the pose of a user. The illustrated camera 20 is part of an augmented reality system with which information relating to objects of the real environment viewed by a user is displayed in his field of view in a positionally accurate and context-dependent manner. The system is designed such that the camera 20 always captures at least part of the user's field of view. For this purpose, the camera 20 is, for example, mounted on the user's head so that it automatically follows his gaze. To determine the position and orientation of the user, the so-called pose, objects of the real environment lying in the detection area 30 of the camera 20 are compared with a three-dimensional model of the real environment. If the objects recognized by the camera 20 are found again in the three-dimensional model, the pose of the user can be determined.

Which objects are located in the detection area 30 of the camera 20 depends on the one hand on the settings of the camera 20, such as the zoom factor, and on the other hand on the distance 70 between the camera 20 and the real environment being viewed. Particularly in very large working environments, it may happen that only a very small portion of the real environment lies in the detection area 30 of the camera 20 at any one time. When the system is used in a very large environment, an object that initially lies in the detection area 30 of the camera 20 and was used to determine an initial pose can very quickly end up outside the detection area 30, which would require reinitialization of the system.

FIG. 2 shows augmented information 60 in the user's field of view, which is captured by a camera 20. The augmented information 60 was previously derived as a function of an object detected in the detection area 30 of the camera 20 and is now overlaid at the exact position relative to that object in the user's field of view, for example via data glasses. If the user now changes his field of view and thus the detection area 30 of the camera 20, the augmented information 60 initially "sticks" to said object. This is precisely a desired property of the augmented reality system.

The four dashed rectangles arranged around the augmented information 60 indicate the maximum deflection of the detection area 30 of the camera 20, which is possible without augmentation loss. Provided that the detection range corresponds to the tracking range, the augmented information 60 can no longer be displayed as soon as the detection range 30 moves further away from the starting position shown in FIG. 2 than indicated by these dashed rectangles. Such a tracking loss and thus augmentation loss occur frequently, especially in large environments.

FIG. 3 shows an initialization of an embodiment of the system for determining the pose of a user. In the detection area 30 of the camera 20 there is a real object 40, which is represented in a three-dimensional environment model and is used to initialize the system, i.e. to determine the position and orientation of the user. The real object 40 is found again in the model of the real environment. A corresponding augmentation 50 is then projected into the real environment. For the very first pose determination, the user now has to try to bring the augmentation 50 into coincidence with the real object 40. This is done by changing his pose, and thus the pose of the camera 20, with respect to the real environment. Once the augmentation 50 and the real object 40 have been brought into coincidence, the pose of the user can be uniquely determined. Subsequently, a continuous pose determination is carried out by a tracking algorithm, with which changes in the position of the real object 40 in the detection area 30 of the camera 20 are tracked and the pose is determined continuously with relatively little computational effort. However, this is only possible as long as the real object 40 is located at least partially in the detection area 30 of the camera 20. As soon as the user's field of view changes so much that the real object 40 completely leaves the detection area 30, the system must be reinitialized using another object in the detection area of the camera 20.

The first initialization of the system described above can also be performed at least partially without user interaction. In the case of smaller deviations, the system can also automatically bring the augmentation 50 into coincidence with the real object 40 "mathematically" and determine the initial pose from this.

FIG. 4 shows a planar representation of a captured section of the real environment and its modeling. To model the real environment, an overall model is provided, which is broken down into individual submodels 0..8. Illustrated here by way of example are a first submodel 0, whose modeled reality lies in the detection range of a camera 20, and the directly adjacent submodels 1..8. As a rule, the overall model of the real environment is subdivided into substantially more submodels, not all of which are shown here or in FIGS. 5, 6, 7 and 8 for reasons of clarity.

The size of the submodels 0..8 largely corresponds to the detection range of the camera 20. If, therefore, during an initialization phase the section of the real environment located in the detection range is to be compared with its representation in the model, the three-dimensional model to be processed by the corresponding algorithms is significantly smaller than the overall model of the entire real environment of the user's workspace. As shown in FIG. 4, the first partial model 0 is located in the current detection range of the camera 20. When the camera 20 is moved, the detection range may leave the area of the real environment modeled by the first partial model 0, so that one of the adjacent partial models 1..8 must be used for a renewed initialization.

FIG. 5 shows a perspective view of the detected area of the real environment and its modeling. Again, it can be seen that movements of the camera 20 always lead to a continuous displacement of the detection area within the neighborhood area with respect to the last camera position.

Finally, FIG. 6 shows various possibilities of movement within the space modeled by the adjacent partial models 0..8.

FIG. 7 shows a first section 10 captured by a camera 20 during a first initialization phase. The first section 10 of the real environment lying in the detection range of the camera 20 is at least partially modeled by a first submodel 0, which is part of an overall model of the real environment. In order to determine the position and orientation of the user for the first time, the first section 10 is compared with the first submodel 0. For example, the initial pose is determined according to the method already described above. Once the user's position and orientation relative to the real environment have been established for the first time, the pose can be determined continuously, with relatively little computation time and almost in real time, by means of a tracking algorithm that tracks movements of real objects within the images recorded with the camera 20. However, this is possible only as long as objects used for the initialization are located at least partially in the detection range of the camera 20. If the detection range of the camera 20 moves beyond the said area of the real environment, the system must be reinitialized.

FIG. 8 shows a second section captured by the camera 20 at the beginning of a second initialization phase, which follows the tracking phase described above. In the illustrated case, the camera has been moved so far out of the position that it held during the first initialization phase that a tracking error exceeds a previously determined maximum value. Therefore, it is first checked which submodel 0..8 of the real environment is best suited to redetermine the initial pose within a new initialization phase. In the illustrated case, it has been found that the now captured second section 11 of the real environment forms, with a second submodel 5 that is arranged directly adjacent to the first submodel 0, the largest overlap area of all submodels 0 to 8 of the overall model. Therefore, this second submodel 5 is selected for reinitialization. After the initial pose has been redetermined, further pose changes can be tracked using the tracking algorithm.

The redetermination of the initial pose can be performed almost unnoticed by the user and without its interaction.

This is possible because the first submodel 0 used in the previous initialization is known, and also since the changes made to the pose by the trackin algorithm are tracked. Thus, enough information is available to the system to redetermine the pose without selecting user interaction after selecting the appropriate second submodel 5. 9 shows a flow chart of an embodiment of the method for assuring the pose of a user. In a first method step 80, the initialization phase is first performed for the initial pose determination. For the first time, a suitable submodel from the overall model is selected. By comparing this selected sub-model with the area of the real environment captured by the camera in this step, the position and orientation of the user is determined for the first time. After the initial pose has been determined in this way, in a second method step 81, the tracking phase is started, in which a tracking error is continuously determined. As long as this tracking error is less than a predetermined maximum value, the tracking phase will continue. However, if the tracking error is greater than the previously determined maximum value, it is checked in a third method step 82 whether the partial model used in the first method step 80 has sufficient coverage with the current detection range of the camera. If this is the case, an automatic reinitialization is carried out with the partial model already used previously. However, if the coverage quality is not sufficient, a new partial model is determined in a fourth method step 83, which is loaded into a fast memory in a fifth method step 84. 
Finally, in a sixth method step 85, the system is automatically reinitialized, i.e. the initial pose is determined anew. This is done with the newly determined submodel if the coverage quality of the old submodel was previously found to be insufficient, or with the submodel already in use if the coverage quality was found to be sufficient in the third method step 82. After the system has been automatically reinitialized, the pose determination is continued with the tracking algorithm as described for the second method step 81.

FIG. 10 shows an application example of an embodiment of the method within an augmented reality system. The augmented reality system has a camera 20, which captures a spatially extended pipeline system in the real environment of a user. The user is, for example, an installer who is to carry out repair work on the illustrated pipeline system. To assist the user, information is to be displayed in his field of view in a context-dependent manner. For this purpose, the user wears data glasses, which allow this information to be displayed in the correct position. While working on the pipeline system, the user is to be shown augmentations in succession at three separate locations 90, 91, 92. Initially, at time t1, he is at a position at which the camera 20 mounted on his head captures a first section 10 of the pipeline system. Since the detection range of the camera 20 corresponds approximately to the field of view of the user, this field of view can be equated approximately with the first section 10. After a first initialization of the augmented reality system, the augmentation can be superimposed in the correct position at the location 90 provided for it. If the user then moves in such a way that his field of view covers the second section 11, the detection range of the camera 20 leaves the first section 10 and thus the elements of the submodel used for the initialization.
The pose determined by tracking then deteriorates, which triggers a check of the current coverage between the detection range of the camera and the submodel used for the initialization. The check shows that a submodel adjacent to the previously used submodel should be used for reinitialization. The corresponding submodel is automatically loaded into memory and used for reinitialization. From this point on, accurate tracking is possible again. The submodel used for the initialization in the first section 10 is accordingly deleted from the memory. An analogous procedure takes place when, at time t3, the detection range of the camera has also left the second section 11 to such an extent that the tracking algorithm delivers only insufficient results. Accordingly, in a third section 12 captured by the camera, the system is reinitialized on the basis of a third submodel.
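Purely by way of illustration, the control flow of method steps 80 to 85 could be sketched as follows. The sketch only records which submodel is held in memory and when a reinitialization is triggered. The actual pose computation is omitted. The class `PoseLoop`, the threshold values, the one-dimensional corridor in `SEGMENTS` and the helper functions are assumptions made for this example and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical 1D environment: three submodels covering adjacent intervals.
SEGMENTS = {"s0": (0.0, 1.0), "s1": (1.0, 2.0), "s2": (2.0, 3.0)}


def coverage(name: str, camera_pos: float, width: float = 1.0) -> float:
    """Fraction of the camera footprint (assumed width 1.0) inside a submodel."""
    lo, hi = SEGMENTS[name]
    overlap = min(hi, camera_pos + width / 2) - max(lo, camera_pos - width / 2)
    return max(overlap, 0.0) / width


def select(camera_pos: float) -> str:
    """Pick the submodel that covers the camera footprint best."""
    return max(SEGMENTS, key=lambda name: coverage(name, camera_pos))


@dataclass
class PoseLoop:
    """Sketch of the flow of steps 80 to 85; thresholds are assumed values."""
    select_submodel: Callable[[float], str]
    coverage_of: Callable[[str, float], float]
    max_tracking_error: float = 0.5
    min_coverage: float = 0.3
    loaded: Optional[str] = None          # submodel currently in fast memory
    log: list = field(default_factory=list)

    def initialize(self, camera_pos: float) -> None:
        # Step 80: select a submodel and determine the initial pose.
        self.loaded = self.select_submodel(camera_pos)
        self.log.append(f"init:{self.loaded}")

    def step(self, camera_pos: float, tracking_error: float) -> None:
        # Step 81: keep tracking while the error stays below the maximum.
        if tracking_error < self.max_tracking_error:
            self.log.append("track")
            return
        # Step 82: does the submodel in use still cover the detection range?
        if self.coverage_of(self.loaded, camera_pos) < self.min_coverage:
            # Steps 83 and 84: determine a new submodel and load it,
            # which replaces the old one in memory.
            self.loaded = self.select_submodel(camera_pos)
        # Step 85: automatic reinitialization with the loaded submodel.
        self.log.append(f"reinit:{self.loaded}")


def demo() -> list:
    """Walk the camera along the corridor with made-up tracking errors."""
    loop = PoseLoop(select, coverage)
    loop.initialize(0.5)
    loop.step(0.6, 0.1)   # small error: tracking continues
    loop.step(0.9, 0.7)   # large error, old submodel still covers enough
    loop.step(1.5, 0.9)   # large error, old submodel no longer covers
    return loop.log
```

Calling `demo()` returns the sequence of actions taken. Separating the coverage check from the selection mirrors the description, in which a new submodel is loaded only when the old one no longer sufficiently covers the detection range.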

Claims


1. A method for determining the pose of a user with respect to a real environment viewed by him, in which sections (10, 11) of the real environment that lie in the field of view of the user are captured with a camera (20), the method having an initialization phase with the method steps:

Selecting a submodel (0..8) suitable for determining an initial pose of the user from an overall model of the real environment decomposed into different submodels (0..8), wherein the selection is made as a function of a section (10, 11) of the real environment captured by the camera (20) during the initialization phase, and

Determining the initial pose of the user by comparing the section (10, 11) with the submodel (0..8), wherein the method has a tracking phase following the initialization phase, in which the pose of the user is determined continuously, starting from the initial pose, by means of a tracking algorithm, and wherein the initialization phase is restarted as soon as the accuracy of the pose determination achieved or achievable in the tracking phase no longer meets a predetermined quality criterion.

2. The method according to claim 1, wherein, upon restarting the initialization phase, a submodel (5) is determined that is adjacent in the overall model to a submodel (0) used in the preceding initialization phase.

3. The method according to claim 1 or 2, wherein the submodel (0..8) is determined by evaluating the pose last determined within the tracking phase and the position of the submodel (0..8) used in the preceding initialization phase.

4. The method according to any one of the preceding claims, wherein the initialization phase is restarted as soon as the detection range of the camera (20) has left the section (10, 11) detected during the initialization phase by a predetermined amount.

5. The method according to any one of the preceding claims, wherein the extent of the environment modeled by the submodels (0..8) depends on the size of the detection range (30) of the camera (20).

6. The method according to any one of the preceding claims, wherein the overall model is decomposed anew into different submodels (0..8) when the size of the detection range (30) of the camera (20) changes.

7. The method according to any one of the preceding claims, wherein the submodel (0..8) is determined by back-projecting the captured section (10, 11) onto the real environment.

8. The method according to any one of the preceding claims, wherein the submodel (0..8) is selected such that, among the different submodels (0..8), it has the largest overlap area with the captured section (10, 11) of the real environment.

9. The method according to any one of the preceding claims, wherein, to determine the initial pose, an object (40) of the real environment located in the captured section is recognized with the aid of the selected submodel (0..8) and an augmentation (50) of the object is brought into coincidence with the object (40).

10. The method according to any one of the preceding claims, wherein, in the tracking phase, changes in the position of elements of the real environment within images recorded with the camera (20) are evaluated to determine the pose.

11. The method according to any one of the preceding claims, wherein the determined pose is used for positionally accurate augmentation of the field of view of the user with information (60).

12. The method according to claim 11, wherein the information (60) is inserted into the field of view of the user by means of data glasses.

13. A system for determining the pose of a user with respect to a real environment viewed by him, the system comprising:

a camera (20) for capturing sections (10, 11) of the real environment that lie in the field of view of the user,

a memory for an overall model of the real environment decomposed into different submodels (0..8),

Initialization means for selecting a partial model (0..8) suitable for determining an initial pose of the user from the overall model as a function of a section of the real environment detected by the camera (20) during an initialization phase and for determining the initial pose by comparing the section with the partial model (0..8),

Tracking means for determining the pose of the user during a tracking phase following the initialization phase, wherein the tracking means are provided for continuously determining the pose, starting from the initial pose, by means of a tracking algorithm, and

Monitoring means for restarting the initialization phase as soon as the accuracy of the pose determination achieved or achievable in the tracking phase no longer meets a predetermined quality criterion.