WO2018134897A1 - Position and posture detection device, AR display device, position and posture detection method, and AR display method - Google Patents
- Publication number
- WO2018134897A1 (application PCT/JP2017/001426)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- display
- unit
- orientation
- content
- map information
- Prior art date
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01B—MEASURING LENGTH, THICKNESS OR SIMILAR LINEAR DIMENSIONS; MEASURING ANGLES; MEASURING AREAS; MEASURING IRREGULARITIES OF SURFACES OR CONTOURS
- G01B11/00—Measuring arrangements characterised by the use of optical techniques
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/26—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
Definitions
- the present invention relates to a position / orientation detection technique and an AR (Augmented Reality) display technique.
- the present invention relates to a position / orientation detection technique and an AR display technique of an apparatus including an imaging unit.
- Patent Document 1 (Japanese Patent Laid-Open No. 2004-133867) discloses a navigation device in which, when a user designates a three-dimensional object with a cursor on a three-dimensional map image, the image portion of the actual building corresponding to that object is extracted from the image captured by the camera at that time and registered as texture image data of the three-dimensional object; thereafter, a rendering processing unit texture-maps the registered texture image data as the surface texture and draws it on the three-dimensional map image (summary excerpt).
- Patent Document 2 (Japanese Patent Laid-Open No. 2004-151867) discloses a configuration comprising an image feature storage unit that stores an image feature Fa of a recognition target, an image feature detection unit that detects an image feature Fb from a preview image, a posture estimation unit that estimates an initial posture of the recognition target based on the matching result of the image features Fa and Fb, a tracking point selection unit that selects tracking points Fe from the image feature Fa based on the initial posture, a template generation unit that generates a template image of the recognition target based on the estimation result of the initial posture, a matching unit that matches the template image against the preview image with respect to the tracking points Fe, and a posture tracking unit that tracks the posture of the recognition target in the preview image based on the tracking points Fe that were successfully matched.
- Patent documents: JP 2009-276266 A; Japanese Unexamined Patent Publication No. 2016-066187.
- AR display is a technique for superimposing information such as images and data related to the real scene (actual scene) viewed by a user onto that scene.
- For this, it is necessary to specify the position of the user and the direction (line-of-sight direction) with high accuracy.
- In Patent Document 1, the user must designate the object on which related information is to be superimposed in the actual scene, which takes time and effort. In addition, a three-dimensional map is displayed by pasting images of the scenery around the vehicle onto a three-dimensional model as texture images, so a three-dimensional map and a three-dimensional model corresponding to the scene around the vehicle are essential. Since image processing is performed using a three-dimensional map and a three-dimensional model corresponding to the real space, the amount of information to be processed increases.
- Patent Document 2 tracks the change in posture of the recognition target using an image. Although the posture change of the object (the recognition target) can be detected, the position on the user side cannot be detected.
- the present invention has been made in view of the above circumstances, and an object of the present invention is to provide a position detection technique for detecting the position and orientation on the user side with high accuracy from a small amount of information with a simple configuration. Problems, configurations, and effects other than those described above will be clarified by the following description of embodiments.
- In order to solve the above problems, the present invention provides a position and orientation detection apparatus comprising: a photographing unit that photographs a predetermined photographing range including two or more objects; an object detection unit that identifies the pixel position of each of the objects in a photographed image captured by the photographing unit; a direction calculation unit that calculates, for each object, an object direction, which is the direction of the object with respect to the photographing unit, using the pixel position, the map information of the object, and the focal length of the photographing unit; and a position/orientation calculation unit that calculates the position and orientation of the photographing unit using the object direction of each object and the map information.
- The object detection unit extracts, from two-dimensional map information storing information on the positions and shapes of a plurality of objects in a predetermined area, the two-dimensional map information corresponding to the photographing range, and specifies the pixel positions using that two-dimensional map information.
- The present invention also provides an AR display device that displays content on a display having transparency and reflectivity in association with an object in the scene behind the display, the AR display device comprising: the position and orientation detection device described above; a display content generation unit that generates the content to be displayed on the display; a superimposing unit that determines the display position of the generated content on the display using the position and orientation of the photographing unit determined by the position/orientation detection device and the pixel position of the object specified by the object detection unit; and a display unit that displays the generated content at the display position determined by the superimposing unit on the display.
- According to the present invention, the position and orientation on the user side can be detected with high accuracy, from a small amount of information, and with a simple configuration.
- FIG. 1 is a functional block diagram of the position and orientation detection apparatus of the first embodiment.
- FIG. 2(a) is a block diagram of the photographing unit of the first embodiment, and FIG. 2(b) is a hardware block diagram of the position and orientation detection apparatus of the first embodiment.
- FIG. 3(a) is a block diagram of the map server system of the first embodiment, and FIG. 3(b) is a block diagram of the content server system of the second embodiment.
- FIGS. 4(a) to 4(f) are explanatory diagrams illustrating the position and orientation detection method of the first embodiment, and FIG. 5 is a flowchart of the position and orientation detection processing of the first embodiment.
- FIGS. 6(a) and 6(b) are explanatory diagrams illustrating pattern matching.
- FIGS. 7(a) to 7(f) are explanatory diagrams illustrating how to determine the representative point in the first embodiment.
- FIGS. 8(a) and 8(b) are explanatory diagrams illustrating a modification of the position and orientation calculation method of the first embodiment.
- FIGS. 9(a) to 9(c) are explanatory diagrams illustrating another modification of the position and orientation calculation method of the first embodiment.
- FIG. 10 is a functional block diagram of the AR display device of the second embodiment.
- (a) and (b) are explanatory diagrams illustrating the display unit of the second embodiment and the display position of related content; a flowchart of the AR display processing of the second embodiment follows.
- (a) and (b) are explanatory diagrams illustrating a display example of the second embodiment.
- (a) and (b) are explanatory diagrams illustrating a display example of the second embodiment.
- (a) and (b) are explanatory diagrams illustrating a display example of the second embodiment.
- (a) and (b) are explanatory diagrams illustrating a display example of the second embodiment.
- (a) and (b) are explanatory diagrams illustrating a display example of the second embodiment.
- the first embodiment is a position / orientation detection apparatus including an imaging unit.
- The position/orientation detection apparatus according to the present embodiment uses the image acquired by the imaging unit to detect the position and orientation of the apparatus itself, including the imaging unit.
- FIG. 1 is a functional block diagram of a position / orientation detection apparatus 100 according to the present embodiment.
- The position/orientation detection apparatus 100 according to the present embodiment includes a control unit 101, a photographing unit 102, a rough detection sensor 103, an evaluation object extraction unit 104, an object detection unit 105, a direction calculation unit 106, a position/orientation calculation unit 107, a map management unit 108, and a gateway 110.
- The position/orientation detection apparatus 100 of the present embodiment also includes, as units that temporarily hold information, a captured image holding unit 121, a positioning information holding unit 122, a two-dimensional map information holding unit 123, and a map information holding unit 124.
- the control unit 101 monitors the operation status of each part of the position / orientation detection apparatus 100 and controls the entire position / orientation detection apparatus 100.
- It is composed of a circuit or the like; alternatively, its functions may be realized by a CPU executing a program stored in advance.
- the photographing unit 102 photographs a scene in the real space (actual scene) and acquires a photographed image.
- the captured image is stored in the captured image holding unit 121.
- A shooting range including at least two objects is photographed.
- the photographing unit 102 includes a lens 131 that is an image forming optical system, and an image sensor 132 that converts the formed image into an electric signal.
- The imaging device 132 is a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor, a CCD (Charge-Coupled Device) image sensor, or the like.
- Reference numeral 133 denotes an optical axis of the lens.
- the photographed image holding unit 121 may hold a plurality of photographed images as necessary. This is to extract temporal changes and to select a captured image with a good shooting state.
- data obtained by performing various types of image processing on the acquired data may be stored as a captured image instead of the data itself acquired by the imaging unit 102.
- the image processing performed here is, for example, removal of lens distortion, adjustment of color and brightness, and the like.
- the coarse detection sensor 103 detects the position and orientation in the real space of the position / orientation detection apparatus 100 including the imaging unit 102 with coarse accuracy, and stores the detected position and orientation in the positioning information holding unit 122 as positioning information.
- For position detection, GPS (Global Positioning System) is used; for example, the coarse detection sensor 103 is a GPS receiver.
- an electronic compass is used to detect the posture.
- The electronic compass is composed of a combination of two or more magnetic sensors. If the magnetic sensors support three axes, direction detection in three-dimensional space is possible. If the measurement plane is limited to the horizontal plane, a biaxial magnetic sensor may be used, allowing a lower-cost electronic compass.
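- As a minimal illustration of how a heading can be derived from a two-axis magnetic sensor (a simplified sketch that ignores tilt compensation and magnetic declination; the function and variable names are hypothetical, not taken from the patent):

```python
import math

def compass_heading(mx: float, my: float) -> float:
    """Heading in degrees clockwise from magnetic north, given the
    horizontal magnetometer components mx (north) and my (east)."""
    return math.degrees(math.atan2(my, mx)) % 360.0

# Example: a field vector pointing roughly east
print(compass_heading(0.0, 25.0))  # -> 90.0
```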
- the positioning information held by the positioning information holding unit 122 is not limited to the positioning information obtained by the coarse detection sensor 103.
- it may be positioning information obtained by a position / orientation calculation unit 107, which will be described later, or both.
- the positioning information is used for extraction of two-dimensional map information and map information described later, calculation of position and orientation, and the like.
- The map management unit 108 acquires two-dimensional map information and map information over a range that is necessary and sufficient for the processing of each unit of the position/orientation detection apparatus 100, including the visual field range of the photographing unit 102, and stores them in the two-dimensional map information holding unit 123 and the map information holding unit 124, respectively.
- The acquisition is performed, for example, from a server or storage device that holds such information, over a network or the like via the gateway 110.
- the acquisition range is determined based on the positioning information.
- the 2D map information held in the 2D map information holding unit 123 is object information of each object included in a predetermined area.
- the object information includes the position (position in the map), shape (appearance), feature point, and the like of each object in the area.
- 2D map information is created from images taken with a camera, for example.
- At the time of shooting, position information of the shooting range (the map absolute position) is simultaneously acquired as information for specifying the predetermined area.
- the photographed image is analyzed, and the object and its feature point are extracted.
- the pixel position in the image of the extracted object is specified and set as the map position.
- the appearance shape of the object is acquired by using, for example, Google Street View.
- The original shooting range is treated as one two-dimensional map, and for each two-dimensional map, the map absolute position of the shooting range is associated with the position, shape, and feature points of each object in the area to form the two-dimensional map information. Since each piece of two-dimensional map information has a map absolute position, the image can be deformed in accordance with the optical characteristics of the photographing unit 102.
- an object registered in the two-dimensional map information is referred to as a registered object.
- the two-dimensional map information may further include attribute information for each registered object.
- the attribute information includes, for example, the type of registered object.
- the type of registered object is, for example, a building, a road, a signboard, or the like.
- The two-dimensional map information may also be acquired by analyzing an image obtained by removing distortion from the photographed image based on information about the camera and adjusting its color and brightness.
- the coordinates or addresses of each object in the real space are registered as map information.
- As the map information, for example, Google Maps provided by Google Inc. can be used.
- As the coordinates, for example, latitude and longitude are used.
- Height information may also be included, in which case three-dimensional position measurement is possible.
- a local coordinate system based on any actual location may be used.
- a data structure in which the position measurement accuracy is increased by specializing in a limited area may be used.
- the evaluation object extraction unit 104 extracts object candidates to be processed (hereinafter referred to as evaluation object candidates) from each registered object registered in the two-dimensional map information. For example, all registered objects in the two-dimensional map information may be set as evaluation object candidates.
- When object attribute information is stored, registered objects whose attribute information matches a predetermined condition may be extracted as evaluation object candidates; for example, only building objects are extracted.
- For the extraction, the 2D map information held in the 2D map information holding unit 123 is used.
- The minimum necessary group of two-dimensional maps is stored in the two-dimensional map information holding unit 123.
- the evaluation object extraction unit 104 may further narrow down 2D map information for extracting evaluation object candidates using the positioning information. For example, using the positioning information, the visual field range of the photographing unit 102 in the real space is calculated. Then, only the two-dimensional map information that matches the visual field range is scanned to extract evaluation object candidates.
- the object detection unit 105 identifies the position (pixel position) of each extracted evaluation object candidate in the captured image.
- An evaluation object candidate whose position is specified in the captured image is set as an evaluation object.
- the positions of at least two evaluation objects are specified.
- the position in the captured image is specified by pattern matching.
- a template image used for pattern matching is created using shape information of two-dimensional map information.
- Using the specified pixel position, the object detection unit 105 calculates the horizontal distance of each evaluation object from the origin of the photographed image.
- the direction calculation unit 106 calculates the direction (object direction) of each evaluation object.
- As the object direction, for example, the angle from the direction of the optical axis 133 of the lens 131 of the photographing unit 102 is obtained.
- the angle is calculated using the pixel position calculated by the object detection unit 105, the horizontal distance, the map information of the evaluation object, and the focal length of the lens 131 of the photographing unit 102.
- the direction calculation unit 106 may correct the angle error due to distortion of the lens 131 and calculate the direction.
- the angle error due to distortion is calculated from the relationship between the angle of view of the lens 131 and the image height.
- the relationship between the angle of view and the image height necessary for the calculation is acquired in advance.
- the position / orientation calculation unit 107 calculates the position and orientation of the photographing unit 102 using the object direction of each evaluation object calculated by the direction calculation unit 106 and the map information.
- the position to be calculated is a coordinate in the same coordinate system as the map information.
- the calculated posture is the direction of the optical axis 133.
- The posture can be represented by, for example, an azimuth angle and an elevation angle, or by pitch, yaw, and roll angles.
- the gateway 110 is a communication interface.
- the position / orientation detection apparatus 100 transmits / receives data to / from, for example, a server connected to a network via the gateway 110.
- For example, two-dimensional map information and map information are downloaded from a server connected to the Internet, or, as will be described later, generated two-dimensional map information is uploaded to the server.
- FIG. 3A shows an example of the configuration of the server (map server) system 620 from which the two-dimensional map information is acquired.
- The map server system 620 includes a map server 621 that controls its operation, a map information storage unit 622 that stores map information, a 2D map information storage unit 623 that stores 2D map information, and a communication I/F 625.
- the map server 621 receives a request from the map management unit 108, and transmits map information and two-dimensional map information in the requested range from each storage unit to the request source.
- each object 624 included in the two-dimensional map information may be held independently.
- each object 624 is held in association with the two-dimensional map information including the object 624.
- The map server 621 may also have a function of extracting the feature points of the objects 624, analyzing the two-dimensional map information 210 transmitted from the position/orientation detection apparatus 100, extracting objects from it, and registering the two-dimensional map information 210 as necessary.
- The map server system 620 that manages the map information and the two-dimensional map information is not limited to a single system; such information may be divided and managed across a plurality of server systems on the network.
- the position / orientation detection apparatus 100 of the present embodiment includes a CPU 141, a memory 142, a storage device 143, an input / output interface (I / F) 144, and a communication I / F 145.
- Each of the above functions is realized by the CPU 141 loading a program stored in advance in the storage device 143 into the memory 142 and executing it. All or some of the functions may instead be realized by hardware such as a circuit, for example an ASIC (Application Specific Integrated Circuit) or an FPGA (Field-Programmable Gate Array).
- various data used for the processing of each function and various data generated during the processing are stored in the memory 142 or the storage device 143.
- the captured image holding unit 121, the positioning information holding unit 122, the two-dimensional map information holding unit 123, and the map information holding unit 124 are constructed in, for example, the memory 142 provided in the position and orientation detection device 100.
- Each holding unit may be realized by a separately provided memory 142, or some or all of them may be integrated into a single memory 142; they may also be divided according to the required capacity and speed. Note that the two-dimensional map information is used particularly in the processing for extracting objects, so it is desirable to construct the two-dimensional map information holding unit 123 that holds this information in a memory area that can be accessed at relatively high speed.
- FIG. 4 (a) to 4 (f) are diagrams for explaining the position detection processing of the present embodiment, and FIG. 5 is a processing flow.
- initial processing, shooting processing, object detection processing, direction calculation processing, and position / posture calculation processing are performed in this order.
- In the initial processing, positioning information is acquired by the rough detection sensor 103 as the approximate position and orientation of the photographing unit 102, and the two-dimensional map information and map information used for the processing are acquired.
- the rough detection sensor 103 acquires the rough position and the rough posture of the position / orientation detection apparatus 100 as positioning information (step S1101).
- the acquired positioning information is stored in the positioning information holding unit 122.
- the map management unit 108 acquires the two-dimensional map information 210 and the map information 230 necessary for processing, and registers them in the two-dimensional map information holding unit 123 and the map information holding unit 124, respectively (step S1102).
- Specifically, the map management unit 108 calculates the coordinates of the visual field range of the imaging unit 102 using the positioning information. Then, the two-dimensional map information whose map absolute position corresponds to the calculated coordinates is acquired, together with the map information corresponding to those coordinates.
- the photographing unit 102 photographs an actual scene including two or more objects, and obtains a photographed image 220 (step S1103).
- object detection processing is performed.
- the object is detected by specifying the pixel position of the evaluation object in the captured image 220.
- the evaluation object extraction unit 104 accesses the two-dimensional map information holding unit 123 and acquires the two-dimensional map information 210. Then, as shown in FIG. 4A, registered objects in the acquired two-dimensional map information 210 are extracted (step S1104) and set as evaluation object candidates 211, 212, and 213.
- FIG. 4A illustrates a case where three evaluation object candidates 211, 212, and 213 are extracted.
- the object detection unit 105 generates a template image for each extracted evaluation object candidate 211, 212, 213 (step S1106). Then, the captured image 220 is scanned with the generated template image, and an evaluation object is specified in the captured image 220 (step S1107). The object detection unit 105 identifies the evaluation object in the captured image 220 for each of the extracted evaluation object candidates 211, 212, and 213, and repeats until at least two evaluation objects are detected (step S1105).
- FIG. 4B shows an example in which two evaluation objects 221 and 222 are detected in the captured image 220.
- the evaluation object 221 corresponds to the evaluation object candidate 211
- the evaluation object 222 corresponds to the evaluation object candidate 212, and is detected.
- no evaluation object corresponding to the evaluation object candidate 213 has been detected.
- the object detection unit 105 calculates the horizontal distances PdA and PdB of the evaluation objects 221 and 222 from the origin of the captured image 220, respectively (step S1108).
- the horizontal distances PdA and PdB are the number of pixels on the image sensor 132.
- the origin of the captured image 220 indicated by a black dot in FIG. 4B is the point of the image sensor 132 that coincides with the direction in which the image capturing unit 102 faces, that is, the center of the optical axis 133 of the lens 131.
- the horizontal distance is a horizontal distance between the representative point in each evaluation object 221 and 222 and the origin of the captured image.
- the representative point is, for example, a rectangular center point that constitutes the evaluation objects 221 and 222.
- the midpoint of the shape associated with each of the evaluation objects 221 and 222 may be used as the representative point. Details of how to determine the representative points will be described later.
- the object detection unit 105 refers to the map information 230 of FIG. 4C and acquires the map information of the evaluation objects 221 and 222 (step S1109).
- Specifically, the object detection unit 105 uses the map absolute position of the two-dimensional map information 210 on which the evaluation objects 221 and 222 are based and the map positions of the evaluation object candidates 211 and 212 to associate them with the objects (real objects) 231 and 232 in the map information 230, and acquires the map information ((XA, YA), (XB, YB)) of the associated real objects 231 and 232.
- In FIG. 4C, a two-dimensional map is illustrated for convenience, but a three-dimensional map may be used.
- Next, the direction calculation unit 106 performs the direction calculation process; that is, it calculates the object direction of each of the evaluation objects 221 and 222 (step S1110). As shown in FIGS. 4D and 4E, the direction calculation unit 106 calculates, as the object directions, the angles θA and θB with respect to the optical axis direction of the lens 131 of the photographing unit 102.
- FIG. 4D is a diagram for explaining a method of estimating the existence direction of the photographing unit 102 using the horizontal distance PdA in the photographed image 220 of the evaluation object 221 corresponding to the real object 231.
- FIG. 4E is a diagram for explaining a method of estimating the object direction using the horizontal distance PdB of the evaluation object 222 corresponding to the real object 232.
- The image of the real object 231 is formed on the image sensor 132 through the lens 131. Therefore, the angle θA formed by the real object 231 and the optical axis 133 of the lens 131 of the photographing unit 102 is calculated geometrically and optically by the following expression (1) using the position PdA on the image sensor.
- f is the focal length of the lens 131.
- An angle θB formed by the real object 232 and the optical axis 133 of the lens 131 of the photographing unit 102 shown in FIG. 4E is similarly calculated by the following equation (2).
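- The bodies of expressions (1) and (2) are not reproduced in this text. Under the usual pinhole-camera geometry implied by the surrounding description, a consistent form would be the following, where p is a pixel pitch (introduced here as an assumption) that converts the pixel counts PdA and PdB into physical distances on the image sensor 132:

```latex
\theta_A = \arctan\!\left(\frac{P_{dA}\, p}{f}\right), \qquad
\theta_B = \arctan\!\left(\frac{P_{dB}\, p}{f}\right)
```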
- If the lens 131 has distortion, errors are added to the calculated angles θA and θB. For this reason, it is desirable to use a lens 131 with little distortion. In addition, it is desirable to acquire the relationship between the angle of view of the lens 131 and the image height in advance and to correct the angle error caused by the distortion of the lens 131.
- the position / orientation calculation unit 107 performs position / orientation calculation processing.
- That is, the map information A (XA, YA) and B (XB, YB) of the matched evaluation objects 221 and 222 and the object directions θA and θB are referred to, and the existence range of the photographing unit 102 in the real space is obtained for each.
- Then, the position and orientation of the photographing unit 102 are specified from the plural existence ranges (step S1111).
- the position to be calculated is map information of the photographing unit 102.
- the calculated posture is the direction of the optical axis 133 of the lens 131.
- The optical axis 133 of the lens 131 forms an angle θA with the real object 231 and forms an angle θB with the real object 232.
- The locus of positions that satisfy these two conditions is the locus of points at which the circumferential (inscribed) angle is constant, and it becomes a circle 241 passing through the real object 231 and the real object 232.
- the radius R of the circle 241 is calculated by the following equation (3) using the positions A (XA, YA) and B (XB, YB) of the real object 231 and the real object 232 in the real space.
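- Expression (3) is likewise not reproduced here. By the inscribed-angle theorem, a form consistent with the surrounding description (writing the angle subtended at the camera between the two object directions as θA + θB, i.e., assuming the objects lie on opposite sides of the optical axis) would be:

```latex
R = \frac{\sqrt{(X_A - X_B)^2 + (Y_A - Y_B)^2}}{2\,\sin(\theta_A + \theta_B)}
```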
- the locus where the imaging unit 102 exists can be specified as an arc AB indicated by a solid line of a circle 241.
- On the remaining portion of the circle 241, the left-right relationship between θA and θB is reversed. For this reason, it does not hold as a locus on which the photographing unit 102 exists.
- the position / orientation calculation unit 107 specifies the direction of the optical axis 133 of the lens 131 of the photographing unit 102 using the direction detected by the rough detection sensor 103.
- the position of the imaging unit 102 existing on the arc AB is uniquely determined.
- the detected position of the imaging unit 102 is the principal point of the lens 131 provided in the imaging unit 102.
- the detection position is not limited to the principal point. Any position in the photographing unit 102 may be used.
- the position and orientation of the photographing unit 102 can be detected.
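- A minimal numerical sketch of this step (a simplified two-dimensional illustration, not the patent's own implementation; all names are hypothetical): given the map positions A and B, the signed angles θA and θB of each object relative to the optical axis, and the approximate optical-axis heading from the coarse detection sensor, the camera position can be found as the intersection of the two back-projected sight lines.

```python
import math

def locate_camera(A, B, theta_a, theta_b, heading):
    """Estimate the camera position from two landmarks.

    A, B      : (x, y) map positions of the real objects.
    theta_a/b : signed angles [rad] of each object measured from the
                optical axis (e.g. positive = left of the axis).
    heading   : optical-axis direction [rad], counter-clockwise from
                the map x-axis, taken from the coarse detection sensor.
    """
    # Direction pointing from each object back toward the camera
    # (opposite of the camera->object bearing, which is heading + theta).
    dAx, dAy = -math.cos(heading + theta_a), -math.sin(heading + theta_a)
    dBx, dBy = -math.cos(heading + theta_b), -math.sin(heading + theta_b)
    # The camera lies on the line through A with direction (dAx, dAy)
    # and on the line through B with direction (dBx, dBy).
    # Solve A + t*dA = B + s*dB for the intersection point.
    det = dAx * (-dBy) - dAy * (-dBx)
    if abs(det) < 1e-9:
        raise ValueError("sight lines are nearly parallel")
    rx, ry = B[0] - A[0], B[1] - A[1]
    t = (rx * (-dBy) - ry * (-dBx)) / det
    return (A[0] + t * dAx, A[1] + t * dAy)

# Example with made-up values: camera at the origin facing along +x.
pos = locate_camera(A=(8.66, 5.0), B=(18.79, -6.84),
                    theta_a=math.radians(30), theta_b=math.radians(-20),
                    heading=0.0)
print(pos)  # approximately (0.0, 0.0)
```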
- the position / orientation detection apparatus 100 repeats the above processing at predetermined time intervals, and always detects the latest position and orientation of the imaging unit 102.
- the captured image acquisition process in step S1103 may be the first.
- the position / orientation detection apparatus 100 starts the above process when the photographing unit 102 acquires a photographed image.
- Note that the evaluation object candidates 211 and 212 may each be registered in different pieces of two-dimensional map information 210.
- FIGS. 6 (a) to 6 (c) show a template image 310
- FIG. 6B shows a captured image 220.
- the pattern matching process is a process for determining whether or not there is the same image as the template image in a certain image. If there is the same image, the pixel position of the image can be specified.
- the object detection unit 105 generates a plurality of template images 311 to 316 having different inclinations and sizes as shown in FIG.
- a case where six types of template images 311 to 316 are generated is illustrated. If there is no need to distinguish, the template image 311 is used as a representative.
- the object detection unit 105 scans the captured image 220 in the direction indicated by the arrow in FIG. 6B for each generated template image 311 and evaluates the degree of similarity.
- As a method for comparing the similarity of images, there is a method of taking the difference between them and evaluating its histogram.
- 0 is obtained as a difference if they completely match, and a value far from 0 is obtained as the degree of matching decreases.
- the evaluation result of the similarity between the template image 311 and the applied area in the captured image 220 is recorded. This is repeated for each of the template images 311 to 316. The entire area of the captured image 220 is evaluated with respect to all prepared template images 311, and an area with the smallest difference is determined as a matching area.
- a region 225 in the captured image 220 is a region where the similarity evaluation with the template image 311 is maximized. Therefore, the object detection unit 105 determines that an evaluation object exists in the captured image 220, and specifies the region 225 as the region (pixel position) of the evaluation object.
- the object detection unit 105 can simultaneously acquire the pixel position of the evaluation object from the evaluation result. Using this, the horizontal distance PdA can be determined.
- When the orientation of the evaluation object is specified in advance and the template image to be used is thereby determined, only that single template image needs to be extracted and used. Further, when the approximate area in which the evaluation object exists in the captured image 220 is known, the evaluation may be performed only for that area.
- In this way, the matching region, that is, the pixel position of the evaluation object in the captured image, can be easily specified.
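- The pattern matching step can be sketched with a standard template-matching routine such as OpenCV's matchTemplate (a generic illustration, not the patent's own code; it scans the captured image with each template using a normalized squared-difference score, where 0 means a perfect match):

```python
import cv2
import numpy as np

def best_match(captured: np.ndarray, templates: list, threshold: float = 0.1):
    """Return ((x, y), template_index) of the best-matching template,
    or None if no template scores below the squared-difference threshold."""
    best = None
    for idx, tpl in enumerate(templates):
        result = cv2.matchTemplate(captured, tpl, cv2.TM_SQDIFF_NORMED)
        min_val, _, min_loc, _ = cv2.minMaxLoc(result)
        if min_val < threshold and (best is None or min_val < best[0]):
            best = (min_val, min_loc, idx)
    return None if best is None else (best[1], best[2])
```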
- When the conditions under which the two images were obtained differ, the images cannot be exactly the same.
- For this reason, countermeasures for image information such as brightness and color, and countermeasures for the distortion of the object that depends on the shooting direction, may be applied to the captured image.
- Distortion of the object due to the photographing direction means, for example, the deformation by which a plane that looks rectangular when the observer faces it directly in three-dimensional space appears almost trapezoidal when viewed from an oblique direction.
- As a countermeasure for brightness and color, for example, the two-dimensional map information 210 and the captured image 220 are converted to grayscale, and their brightness is equalized based on their respective histograms.
- As a countermeasure for distortion of the object, for example, as shown in FIG. 6A, the object is virtually deformed three-dimensionally when the template image 311 is generated from it. In other words, in order to detect an object even when it is deformed as described above, a shape conversion that turns a rectangle into a trapezoid is applied to the object to obtain the template images 311 to 316.
- a template image is generated by performing shape conversion corresponding to each of the assumed shooting directions, and template matching is performed. Thereby, it can respond to arbitrary imaging directions.
- a template image is generated by applying deformation such as rotation and enlargement / reduction to the object.
- the pattern matching process described above is a general technique and has an advantage that it can be processed at high speed.
- The representative point is determined within the template image used when specifying the evaluation objects 221 and 222.
- For example, the center of the outer shape of the object is set as the representative point 331.
- Alternatively, as shown in FIG. 7D, a rectangle indicated by a broken line may be defined from the outermost shape, and the center of that rectangle set as the representative point 331.
- A part of the object may also be enclosed by a rectangle, and the center of that rectangle used as the representative point 331.
- Among these, the method of defining a rectangle from the outermost shape and setting its center as the representative point 331 is simple and most desirable.
- The outermost shape need not be used; any corner of the object may be used instead.
- The representative point 331 is used when calculating the horizontal distance of the evaluation object from the origin in the position and orientation detection process. For this reason, it is desirable that the representative point 331 be set so that it can be accurately associated with the map information.
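- As an illustration of the simplest variant (the center of the bounding rectangle of the matched object region; a hedged sketch with hypothetical names):

```python
import numpy as np

def representative_point(mask: np.ndarray):
    """Center (x, y) of the bounding rectangle of the non-zero pixels
    in a binary object mask, in pixel coordinates."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        raise ValueError("mask contains no object pixels")
    return ((xs.min() + xs.max()) / 2.0, (ys.min() + ys.max()) / 2.0)
```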
- As described above, the position/orientation detection apparatus 100 of the present embodiment includes the photographing unit 102 that photographs a predetermined photographing range including two or more objects, the object detection unit 105 that identifies the pixel position of each object in the photographed image captured by the photographing unit 102, the direction calculation unit 106 that calculates, for each object, the object direction (the direction of the object with respect to the photographing unit 102) using the pixel position, the map information of the object, and the focal length of the photographing unit, and the position/orientation calculation unit 107 that calculates the position and orientation of the photographing unit using the object direction of each object and the map information.
- The object detection unit 105 extracts, from two-dimensional map information in which information on the positions and shapes of a plurality of objects in a predetermined area is stored, the two-dimensional map information corresponding to the shooting range, and specifies the pixel positions using that two-dimensional map information.
- In this way, the position and orientation of the photographing unit 102 are calculated using an image actually taken by the photographing unit 102. At this time, two objects are extracted from the captured image and the calculation is performed using them. For this reason, image processing using a three-dimensional map or a three-dimensional model corresponding to the real space is unnecessary. Moreover, the method does not depend on the accuracy of an external GPS or the like. Therefore, according to the present embodiment, the position and orientation of the photographing unit 102 can be detected with high accuracy with a simple configuration and a small amount of information processing. As a result, the position and orientation of various devices whose relative position and relative direction with respect to the imaging unit 102 are known can also be estimated with high accuracy.
- the position and orientation of the photographing unit 102 included in the computer can be obtained by a pattern matching process that the computer is good at and simple geometric calculation. That is, the position / orientation detection apparatus 100 that detects its own position and orientation can be realized with a small amount of information.
- the position and orientation detection by the above method is repeated at predetermined time intervals. Accordingly, the position and orientation of the position / orientation detection apparatus 100 including the imaging unit 102 can be identified with high accuracy and constantly with a small amount of information.
- the position / orientation calculation unit 107 calculates the position and orientation of the photographing unit 102 using the two evaluation objects 221 and 222.
- the calculation of the position and orientation by the position and orientation calculation unit 107 is not limited to this method. For example, three evaluation objects may be used.
- In this case, the map information ((XA, YA), (XB, YB), (XC, YC)) of the real objects 231, 232, and 233 corresponding to the three evaluation objects, and the angles (θA, θB, θC) of those objects with respect to the optical axis 133 of the lens 131, are acquired.
- From the information on the real object 231 and the real object 232, the circle 241 on which the photographing unit 102 exists is determined.
- Similarly, a circle 242 on which the photographing unit 102 exists is determined from the information on the real object 232 and the real object 233.
- The intersection of the circles 241 and 242 is the position of the photographing unit 102, and the direction that satisfies the angles with respect to the optical axis 133 of the lens 131 is the direction of the photographing unit 102, that is, its posture.
- In this way, the position and orientation of the photographing unit 102 can be obtained from the detection results of the evaluation objects alone.
- To improve accuracy, it is desirable that the circle 241 and the circle 242 be as far apart as possible. Therefore, in the example of FIG. 8A, it is desirable to determine the position of the photographing unit 102 using the circle 241 and the circle 242, rather than the circle 241 and the circle passing through the real object 231 and the real object 233.
- Note that the number of evaluation objects used when specifying the position and orientation of the imaging unit 102 is not limited to three; three or more may be used.
- N is an integer of 1 or more.
- the position and orientation of the photographing unit 102 can be specified if three or more objects can be detected.
- the position and orientation detection accuracy is further improved.
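- When three or more evaluation objects are available, the position and heading can also be estimated jointly by minimizing the discrepancy between predicted and measured object directions. The following is a generic least-squares sketch of that idea (not the patent's own method; it assumes SciPy is available, and all names are hypothetical):

```python
import math
from scipy.optimize import least_squares

def resect(landmarks, thetas, x0):
    """Estimate (x, y, heading) from N >= 3 landmark bearings.

    landmarks : list of (X, Y) map positions of the real objects.
    thetas    : measured angles [rad] of each object from the optical axis.
    x0        : initial guess (x, y, heading), e.g. from the coarse sensor.
    """
    def residuals(p):
        x, y, heading = p
        res = []
        for (X, Y), theta in zip(landmarks, thetas):
            predicted = math.atan2(Y - y, X - x) - heading
            # wrap the angular error into (-pi, pi]
            res.append((predicted - theta + math.pi) % (2 * math.pi) - math.pi)
        return res

    return least_squares(residuals, x0).x  # [x, y, heading]
```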
- the position / orientation calculation unit 107 may calculate the position and orientation of the photographing unit 102 using two evaluation objects and information about the traffic infrastructure around the photographing unit 102, for example. This method will be described with reference to FIG. Here, the case where the information on the road 234 is used as the information on the traffic infrastructure will be described as an example. In addition, the photographing unit 102 is assumed to be mounted on a moving body traveling on the road 234.
- The shape of traffic infrastructure such as the road 234 is defined by standards.
- For this reason, the road 234 on which the moving body carrying the imaging unit 102 is traveling can be recognized from the captured image, and the traveling direction (the optical-axis direction of the imaging unit 102) can be obtained.
- the obtained information on the optical axis direction is used in place of the result of the rough detection sensor 103 of the above embodiment, and the position and orientation of the photographing unit 102 are determined.
- route information such as bridges and intersections and position information such as signs and traffic lights may be used.
- Alternatively, the position and orientation of the photographing unit 102 may be detected using the two-dimensional map information 210 with the road 234 itself as an evaluation object, which reduces the number of other evaluation objects required.
- In either case, the position and orientation can be specified without using the orientation information of the coarse detection sensor 103, so accuracy can be improved.
- FIG. 9A to FIG. 9C are diagrams for explaining this modification. Here, two or more evaluation objects are used.
- the appearance of the real object 231 corresponding to the evaluation object 221 has a rectangular shape 251 as shown in FIG.
- the external shape obtained from the captured image 220 is a trapezoid 252 as shown in FIG.
- the positional relationship between the real object 231 and the photographing unit 102 in the real space can be specified from the deformation amount.
- When pattern-matching the evaluation object in the captured image, the object detection unit 105 generates template images by applying deformation parameters such as scaling, deformation, rotation, and distortion to the shape 251 viewed from the front.
- the evaluation object 221 is pattern-matched with the generated template image.
- Then, the normal (indicated by an arrow in the figure) of the front shape 261 of the real object 231 corresponding to the evaluation object 221 is obtained using the deformation parameters of the template image used for pattern matching.
- That is, the direction of the normal of the planar front shape 261 can be determined from the amount of deformation of the shape in the captured image 220, and using this, the direction of the photographing unit 102 (the direction of the optical axis 133) can be obtained.
- the direction of the building surface in the real space may be obtained from the map information 230 or the like.
- Since the position and orientation detection method according to this modification does not directly use the values of the coarse detection sensor 103, such as the GPS or the electronic compass, the position and orientation can be obtained with high accuracy without being affected by the accuracy of the coarse detection sensor 103.
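- One standard way to recover the facade orientation from the observed deformation of a known rectangular front face is a plane-based pose estimate, for example with OpenCV's solvePnP. The sketch below is a generic alternative to the deformation-parameter method described above, not the patent's implementation; the corner coordinates and camera intrinsics are made-up values:

```python
import cv2
import numpy as np

# Known front shape 251: a W x H rectangle in its own plane (z = 0), in metres.
W, H = 20.0, 10.0
object_points = np.array([[0, 0, 0], [W, 0, 0], [W, H, 0], [0, H, 0]], dtype=np.float64)

# Matched corners of the trapezoid 252 in the captured image (pixels).
image_points = np.array([[410, 300], [620, 330], [615, 470], [405, 450]], dtype=np.float64)

# Assumed intrinsics of the photographing unit 102 (focal length and principal point, in pixels).
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, None)
R, _ = cv2.Rodrigues(rvec)

# Normal of the building facade expressed in the camera frame; comparing it
# with the camera's optical axis gives the viewing direction of the
# photographing unit relative to the building face.
facade_normal_in_camera = R[:, 2]
print(facade_normal_in_camera)
```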
- each position and orientation calculation method described above is preferably selected according to the required accuracy.
- In the above, the case where the position and orientation of the photographing unit 102 are calculated using the horizontal distance of each evaluation object has been described.
- the vertical distance may also be measured and the height direction of the object may be estimated.
- In the above, the map management unit 108 uses the positioning information acquired by the rough detection sensor 103 to extract the two-dimensional map information and map information necessary for processing and stores them in the holding units.
- However, the method is not limited to this.
- For example, the position and orientation calculated by the position/orientation calculation unit 107 may be used to extract the necessary two-dimensional map information and map information.
- In that case, the map management unit 108 can extract necessary and sufficient two-dimensional map information with higher accuracy, and the processing accuracy is improved.
- In addition, the position and orientation can be calculated even in places where a GPS signal cannot be received.
- the position / orientation detection apparatus 100 may further include a two-dimensional map generation unit 109.
- the two-dimensional map generation unit 109 analyzes the captured image 220 acquired by the imaging unit 102 using the position and orientation of the imaging unit 102, and generates two-dimensional map information 210. At this time, the pixel position of the object in the captured image 220 and the appearance of the object are used. That is, the pixel position of each object detected by the object detection unit 105 is set as a map position, and the shape used for pattern matching is set as an object shape.
- the two-dimensional map information 210 may be used as it is.
- the two-dimensional map generation unit 109 associates the position of the object in the real space with the appearance of the photographed object, and generates two-dimensional map information.
- Thereby, an unknown object that is not registered in the two-dimensional map, for example a newly constructed building, can be newly registered in the two-dimensional map information by associating its appearance with its position information.
- Unknown objects include, for example, buildings that are not registered in the two-dimensional map information 210 or the map information 230, and buildings whose appearance has changed due to renovation.
- When a plurality of pieces of two-dimensional map information 210 including the same object are acquired, learning may be performed using the information stored in the server via the network and the two-dimensional map information 210 stored in the two-dimensional map information holding unit 123 to calculate the feature points of the object.
- The result is then used for object pattern matching. With a method that matches on feature points, the object can be detected even in cases where an obstruction exists between the object and the imaging unit 102.
- the position of the object may be specified using a plurality of captured images 220 having different acquisition times. Using the position of the target object in the captured image 220 and the position and orientation of the imaging unit 102, it is possible to specify an area where an object in real space may exist.
- the specified area is a straight line.
- If the position/orientation detection apparatus 100 moves and a similar process is performed after a predetermined time, another straight line is obtained as the object existence region.
- the intersection of the two straight lines obtained by the two processes is the position where the object actually exists (position in real space, latitude, longitude, coordinates, etc.).
- the position of the object candidate in the real space may be calculated from the change in the position and orientation of the photographing unit 102 and the change in the horizontal distance of the object candidate.
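- This triangulation is the same two-line intersection used earlier for locating the camera, with the roles swapped: each capture gives a ray from the now-known camera position toward the object (bearing = camera heading plus the object angle), and the object sits where two such rays cross. A simplified two-dimensional sketch with hypothetical names:

```python
import math

def triangulate_object(cam1, bearing1, cam2, bearing2):
    """Intersect two sight rays, each given by a known camera position
    (x, y) and the absolute bearing [rad] toward the object."""
    d1 = (math.cos(bearing1), math.sin(bearing1))
    d2 = (math.cos(bearing2), math.sin(bearing2))
    det = d1[0] * (-d2[1]) - d1[1] * (-d2[0])
    if abs(det) < 1e-9:
        raise ValueError("rays are nearly parallel; wait for more motion")
    rx, ry = cam2[0] - cam1[0], cam2[1] - cam1[1]
    t = (rx * (-d2[1]) - ry * (-d2[0])) / det
    return (cam1[0] + t * d1[0], cam1[1] + t * d1[1])
```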
- 2D map information 210 can be updated in real time by additionally registering the obtained object information in the existing 2D map information 210.
- the appearance can be updated if the object is already registered in the two-dimensional map and the position is known. If the object moves, it can be updated to the latest position information.
- the position / orientation detection apparatus 100 obtains the position of the object in the real space using information used for detecting the position / orientation. Therefore, there is no need to calculate again to generate the two-dimensional map information. For this reason, it is possible to keep the two-dimensional map information up-to-date while reducing the calculation cost.
- A 3D map data model may also be used.
- This is map data that represents the actual state of each building by its position and height, so the shape of the building in the real world is known.
- an image taken from the side surface may be pasted on map data having a three-dimensional shape to obtain two-dimensional map information.
- Google Street View can be used for images taken from the side.
- an image taken from the side may be pasted on 3D map data, and a virtual camera may be placed in the map data model for CG rendering.
- the present embodiment is an AR display device 500 including the position and orientation detection device 100 of the first embodiment.
- In the present embodiment, a case where the position/orientation detection apparatus 100 is mounted on an automobile and AR display is performed on the windshield of the automobile will be described as an example.
- FIG. 10 is a functional block diagram of the AR display device 500 of the present embodiment.
- The AR display device 500 of the present embodiment includes the position/orientation detection device 100, a display content generation unit 501, a display content selection unit 502, a superimposing unit 503, a display unit 504, an instruction receiving unit 506, an extraction unit 507, and a content holding unit 511.
- the case where the control unit 101 and the gateway 110 are shared with the position / orientation detection apparatus 100 will be described as an example.
- the content holding unit 511 is a memory that temporarily holds content including AR content, and is configured by a memory that can be accessed at high speed.
- the normal content and the AR content are collectively referred to as content unless it is particularly necessary to distinguish them.
- the position / orientation detection apparatus 100 basically has the same configuration as the position / orientation detection apparatus 100 of the first embodiment. However, in the position / orientation detection apparatus 100 of the present embodiment, the object detection unit 105 may be configured to detect and hold the pixel positions of all the evaluation objects extracted by the evaluation object extraction unit 104. This information is used when the superimposing unit 503, which will be described later, determines the position where the content is superimposed. In addition, the control unit 101 of the position / orientation detection apparatus 100 controls the operation of each unit of the entire AR display device 500.
- the gateway 110 functions as a communication interface for the AR display device 500.
- the instruction receiving unit 506 receives an operation input from the user (driver) 530.
- selection of content or a condition of content to be displayed is received.
- the reception is performed via an existing operation device such as an operation button, a touch panel, a keyboard, or a mouse.
- the extraction unit 507 acquires content that may be used in the processing of the AR display device 500 from the content stored in the server or other storage device.
- the acquired content is stored in the content holding unit 511. Since the information processed by the AR display device 500 is mainly peripheral information, the processing cost is reduced by acquiring and processing information limited to information that may be processed.
- The content to be acquired may be determined in consideration of information such as the traveling direction and speed of the vehicle on which the AR display device 500 is mounted. For example, when the speed of the moving body is v and the time required to download the information and store it in the holding unit is T, information within at least a circle of radius Tv centered on the photographing unit 102 is acquired and stored in the content holding unit 511. Further, when the travel route of the vehicle is specified, information around the route may be extracted and stored in the content holding unit 511.
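- A minimal sketch of that prefetch-radius rule (hypothetical names and units; a real implementation would likely add a safety margin, shown here as a parameter):

```python
def prefetch_radius(speed_mps: float, download_time_s: float, margin: float = 1.2) -> float:
    """Radius [m] of the area whose content should be prefetched so that
    it is stored in the content holding unit before the vehicle reaches it."""
    return speed_mps * download_time_s * margin

# Example: 20 m/s (72 km/h) vehicle, 15 s to download and store the content.
print(prefetch_radius(20.0, 15.0))  # -> 360.0
```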
- FIG. 3B is a diagram for explaining the content server system 610.
- advertisements, contents, and AR contents are held and provided as information.
- the content server system 610 includes a content server 611, a content storage unit 612, an AR content storage unit 613, an advertisement storage unit 614, and a communication I / F 615.
- the advertisement is the text, image, video, etc. of the advertisement provided by the advertiser.
- the content is information according to the service content. For example, a moving image for entertainment, a game, etc. are mentioned.
- the AR content is content that is assumed to be AR-superposed, and includes meta information such as a display position and a posture in the real space in addition to the content itself.
- the meta information may be defined in advance or may be dynamically updated according to an instruction from the AR display device 500.
- the content server 611 outputs information held in each holding unit to the outside via the communication I / F 615 in response to an instruction received via the communication I / F 615. Further, the information received via the communication I / F 615 is held in the corresponding holding unit.
- the content server 611 may integrate the above-described information and provide the AR display device 500 through a network. As an example of integrating each information, for example, an advertisement is inserted into entertainment content.
- the AR content may be selected and extracted by a method similar to the method by which the position / orientation detection apparatus 100 extracts the two-dimensional map information and the map information.
- the content that the user desires to view may be instructed via the instruction receiving unit 506 and extracted according to the instruction.
- The display content selection unit 502 selects the content to be displayed from the content held in the content holding unit 511 according to the content display conditions. The display conditions may be determined in advance or may be designated by the user via the instruction receiving unit 506.
- the instruction receiving unit 506 may be a motion recognition device that recognizes the user's motion.
- For example, a recognition device can be used, such as gesture recognition that detects the user's movement with a camera, voice recognition using a microphone, or gaze recognition that detects the line of sight.
- A voice recognition device can handle a variety of operations because relatively detailed instructions can be given. With a gaze recognition device, the operation is hard for others nearby to perceive, and consideration can be given to the surrounding environment.
- The operator's voice may be registered in advance so that the user who made the voice input can be identified.
- The contents accepted by voice recognition may then be limited depending on the identified user.
- the display unit 504 displays the content to be displayed on the windshield according to the instruction of the superimposing unit 503.
- the display unit 504 of the present embodiment includes a projector 521 and a display (projection area) 522 as shown in FIG.
- the display 522 is realized by combining optical components having transparency and reflectivity, and is disposed on the windshield.
- the scene in real space behind the display 522 is seen through the display 522.
- the video (image) generated by the projector 521 is reflected by the display 522.
- the user 530 views an image in which a scene in real space that has passed through the display 522 and an image reflected by the display 522 are superimposed.
- if the display 522 covers the entire area of the windshield, it can cover the forward field of view of the user 530, and content can be displayed over a wide range of the actual scene spreading ahead.
- such a configuration of the display unit 504 is known as a HUD (Head-Up Display).
- the display content generation unit 501 generates display content to be displayed on the windshield as a display destination from the selected content.
- the content is generated in a display mode suitable for superimposed display on a scene that is seen by the user's eyes through the windshield. For example, the size, color, brightness, etc. are determined.
- the display mode is determined in accordance with a user instruction or in accordance with a predetermined rule.
- the superimposing unit 503 determines the display position on the display 522 of the display content generated by the display content generating unit 501. First, the superimposing unit 503 specifies the display position (arrangement position) of the object on the display 522. Based on the display position of the object, the display position of the related content of the object is determined. The display position of the object is calculated using the position and orientation of the image capturing unit 102 detected by the position / orientation detection apparatus 100 and the pixel position of each object on the captured image 220 captured by the image capturing unit 102.
- the superimposing unit 503 of the present embodiment holds in advance the geometric relationship between a reference location of the automobile (in-vehicle reference position) and the position and orientation of the photographing unit 102, and the geometric relationship between the in-vehicle reference position and the average visual field range 531 of the user (driver) 530.
- using these relationships, the position on the display 522 corresponding to the pixel position of the corresponding evaluation object in the captured image is calculated.
- the display position of the related content may be set at the intersection of the line-of-sight direction in which the user 530 looks at the evaluation object 223 and the display 522.
- the photographing unit 102 photographs the evaluation object 223. From the photographing position of the evaluation object 223 in the photographed image, the direction in which the evaluation object 223 exists with respect to the photographing unit 102 is known.
- the relative position of the evaluation object 223 with respect to the photographing unit 102 is obtained.
- a line of sight 551 for viewing the evaluation object 223 from the eye position of the user 530 is obtained.
- a point 552 where the line of sight 551 intersects the display 522 may be a display position of related content.
- the related content may also be displayed offset by a designated amount with respect to the evaluation object 223.
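- A simple geometric sketch of this step, under the assumption that the display 522 can be approximated by a flat plane and that the eye position, object position, and display plane are all expressed in a common in-vehicle coordinate frame (the helper below is illustrative, not the patent's implementation):

```python
import numpy as np

def display_intersection(eye_pos, object_pos, plane_point, plane_normal):
    """Point 552 where the line of sight 551 (eye -> evaluation object 223)
    crosses the display 522, modeled here as an infinite plane.

    All arguments are 3-vectors in a common in-vehicle coordinate frame.
    Returns None if the sight line is parallel to the display plane.
    """
    eye = np.asarray(eye_pos, dtype=float)
    obj = np.asarray(object_pos, dtype=float)
    p0 = np.asarray(plane_point, dtype=float)
    n = np.asarray(plane_normal, dtype=float)

    direction = obj - eye                      # line of sight 551
    denom = np.dot(n, direction)
    if abs(denom) < 1e-9:                      # sight line parallel to the display
        return None
    t = np.dot(n, p0 - eye) / denom
    if t <= 0:                                 # display lies behind the eye
        return None
    return eye + t * direction                 # display position for the content
```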
- like the position and orientation detection device 100, the AR display device 500 is realized with a CPU 141, a memory 142, a storage device 143, an input / output interface (I/F) 144, and a communication I/F 145.
- each function is realized by the CPU 141 loading a program stored in advance in the storage device 143 into the memory 142 and executing it. All or some of the functions may instead be realized by hardware or circuits such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field-Programmable Gate Array).
- various data used for the processing of each function and various data generated during the processing are stored in the memory 142 or the storage device 143.
- the content holding unit 511 is constructed in the memory 142 or the like, for example.
- FIG. 12 is a processing flow of the AR display processing of the present embodiment.
- the AR display process may be performed in synchronization with the position / orientation detection process performed by the position / orientation detection apparatus 100, or may be configured to be performed independently.
- the instruction receiving unit 506 receives in advance the conditions (display conditions) of the content to be displayed.
- the position / orientation detection apparatus 100 detects the position and orientation of the photographing unit 102 (step S2101). Note that the position / orientation detection apparatus 100 determines the position and orientation of the imaging unit 102 by the same method as in the first embodiment. This process is executed at predetermined time intervals.
- the display content selection unit 502 selects the content to be displayed from the contents held in the content holding unit 511 (step S2102). Here, it is determined whether or not to display each content according to the display condition. Only content that matches the display conditions is displayed. The selection may be performed for each object to be superimposed, for example.
- the display content generation unit 501, the superimposition unit 503, and the display unit 504 repeat the following processing for all selected display contents (step S2103).
- the display content generation unit 501 determines the display mode of the selected content (step S2104).
- the superimposing unit 503 determines the display position of the selected display content (step S2105). At this time, the superimposing unit 503 uses the position and orientation of the imaging unit 102 detected by the position / orientation detection apparatus 100 in step S2101 and the position of each object in the captured image 220.
- the display unit 504 displays the content in the display mode determined by the display content generation unit 501 at the position calculated by the superimposing unit 503 (step S2106). The above processing is repeated for all contents.
- the superimposing unit 503 determines the display position using the latest position and orientation of the photographing unit 102 at that time and the position information of each object in the captured image 220.
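- The loop of FIG. 12 could be sketched as follows; the unit objects and their method names are hypothetical stand-ins for the units described above, intended only to show the order of steps S2101 to S2106:

```python
def ar_display_cycle(pose_detector, selector, generator, superimposer, display, display_conditions):
    """One cycle of the AR display processing of FIG. 12 (steps S2101-S2106).

    pose_detector, selector, generator, superimposer, and display stand for the
    position / orientation detection apparatus 100, the display content selection
    unit 502, the display content generation unit 501, the superimposing unit 503,
    and the display unit 504; their interfaces here are assumptions.
    """
    # S2101: detect the position and orientation of the photographing unit 102
    camera_pose, object_pixel_positions = pose_detector.detect()

    # S2102: keep only the content that matches the display conditions
    selected = selector.select(display_conditions)

    # S2103-S2106: repeat for all selected display contents
    for content in selected:
        mode = generator.decide_display_mode(content)                        # S2104
        pos = superimposer.decide_display_position(content, camera_pose,
                                                   object_pixel_positions)   # S2105
        display.show(content, mode, pos)                                     # S2106
```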
- the AR display device 500 is an AR display device that displays content on the display 522, which has transparency and reflectivity, in association with objects in the scene behind the display 522.
- it includes the position and orientation detection apparatus 100 according to the first embodiment, the display content generation unit 501 that generates the content to be displayed on the display 522, the superimposing unit 503 that determines the display position of the generated content using the position and orientation of the photographing unit 102 determined by the position and orientation detection apparatus 100, and the display unit 504 that displays the generated content at the determined display position.
- the position of the object is specified in the captured image acquired by the imaging unit 102, and the display position of the related content is determined using the specified position. Therefore, if only the amount of deviation between the photographing range 541 by the photographing unit 102 and the user's viewpoint is corrected, the display position of the object on the display can be accurately specified.
- content can be displayed without using the relative position between the object and the AR display device 500.
- the position and orientation of the photographing unit 102 are acquired by the position and orientation detection apparatus 100 of the first embodiment. That is, the value of the coarse detection sensor 103 such as GPS or an electronic compass is not directly used. As a result, high accuracy can be obtained regardless of the accuracy of the coarse detection sensor 103. Therefore, it is possible to provide the AR display device 500 that can superimpose the AR with high accuracy.
- in the above description the display 522 has transparency, but the display 522 may instead be opaque.
- in that case, the content may be synthesized with the captured image 220 captured by the photographing unit 102 and the composite image may be displayed.
- for example, the brightness of the captured image 220 may be reduced and the content superimposed on the resulting dark image.
- the case where the AR display device 500 is mounted on an automobile has been described as an example.
- the usage form of the AR display device 500 is not limited to this.
- it may instead take a form that the user wears.
- a small AR display device 500 can be provided.
- An example of such a configuration is HMD (Head Mounted Display).
- the position / orientation detection apparatus 100 is also mounted on the HMD.
- a position and orientation detection apparatus 100 for detecting the position of the HMD may be provided separately from the HMD.
- in that case, the position / orientation detection apparatus 100 photographs the HMD. Then, treating the HMD as an object in the method of generating two-dimensional map information, the position (latitude and longitude, or coordinates) of the HMD in real space is calculated, and the display position of the content is determined using this position information. Thereby, the content can be displayed at a desired position with higher accuracy.
- since the AR display device 500 can detect its absolute position and orientation even when mounted on a moving body, the content can be displayed at a desired position with high accuracy even though the scene viewed by the user constantly changes.
- the AR display device 500 can calculate the position and orientation of the photographing unit 102 with high accuracy and can superimpose AR with high accuracy. If high-precision AR superimposition becomes possible, the expression accuracy of AR content will increase and the expressive power will be enriched.
- the AR display device 500 does not have to include the photographing unit 102 in itself.
- image data captured by another image capturing device (for example, a navigation device or a drive recorder in the case of an in-vehicle device) may be used instead.
- the AR display device 500 of the present embodiment may further include an eye tracking device 508 as shown in FIG.
- the eye tracking device 508 is a device that tracks the user's line of sight.
- the eye tracking device 508 is used to estimate the user's field of view and display the content in consideration of the direction of the user's line of sight. For example, the content is displayed on the display 522 in an area specified by the user's line-of-sight direction and field of view.
- the superimposing unit 503 of the present embodiment calculates the position for displaying the content from the relative position between the photographing unit 102 and the display 522 and the user's line-of-sight direction and field of view calculated by the eye tracking device 508.
- the user's line-of-sight direction calculated by the eye tracking device 508 is used to correct the position for displaying the content.
- an object in the direction in which the user is facing in real space can be identified from the line-of-sight direction of the user detected by the eye tracking device 508 and the position and orientation of the photographing unit 102.
- This object is an object that the user is watching. By using this, for example, it is possible to select and display content related to the object being watched by the user.
- the output of the eye tracking device 508 and the detection result of the object detection unit 105 are input to the display content selection unit 502.
- the display content selection unit 502 uses the user's line-of-sight direction and the pixel position of each object to identify the object that the user is watching, and selects content related to that object.
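- One hedged sketch of this selection: assuming the gaze direction has already been mapped into the coordinates of the captured image 220, the gazed object can be taken as the detected object nearest to the gaze point (the threshold and helper names are assumptions):

```python
import math

def find_gazed_object(gaze_pixel, object_pixel_positions, max_dist_px=50.0):
    """Pick the detected object whose pixel position is closest to the point the
    user is looking at, both expressed in captured-image coordinates.

    gaze_pixel:             (u, v) gaze point mapped into the captured image 220
    object_pixel_positions: dict {object_id: (u, v)} from the object detection unit 105
    max_dist_px:            tolerance (an assumed threshold, not given in the text)
    """
    best_id, best_dist = None, max_dist_px
    for obj_id, (u, v) in object_pixel_positions.items():
        d = math.hypot(u - gaze_pixel[0], v - gaze_pixel[1])
        if d < best_dist:
            best_id, best_dist = obj_id, d
    return best_id  # None if nothing is within the tolerance
```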
- FIG. 14 (a) shows an example of the display position.
- a building 711, a sign 712, and a guide plate 713 are illustrated as an example of an object (real object) that exists in real space. These real objects have known positions and appearances.
- the superimposing unit 503 determines the display position so that the related contents 811, 812, and 813 are displayed on the display 522 based on the pixel position of the evaluation object corresponding to each real object.
- FIG. 14A shows an example of superimposing display on a real object.
- the display content generation unit 501 and the superimposition unit 503 may classify the display content according to the meta information, and determine the display mode and the display position according to the classification result. For example, the display mode and the display position may be determined based on the update frequency, the required timing, the importance level, and the like. A display example in this case is shown in FIG.
- for example, the display mode and the display position may be determined so that content is displayed in a distant display area 731, such as the sky, or in a near-side display area 732 where the bonnet is visible.
- alternatively, the display mode and the display position are determined so that content is displayed in a road display area 733, which is a display area along the road.
- the display mode and the display position are determined so as to be displayed in the side display area 734 positioned on the side with respect to the user's traveling direction.
- the display mode and the display position may also be determined so that content is displayed in an aerial display area 735 in which no traffic signals or signs are visible.
- the display mode and the display position may be determined so that the content is displayed in the display area (front display area) 736 in front of the windshield.
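- A possible sketch of such a classification rule; the meta-information keys and the mapping below are illustrative assumptions, since the patent only names update frequency, required timing, and importance as examples of criteria:

```python
# Assumed meta-information keys; the mapping rules are illustrative only.
AREA_FAR, AREA_NEAR, AREA_ROAD, AREA_SIDE, AREA_AERIAL, AREA_FRONT = (
    "distant_731", "near_732", "road_733", "side_734", "aerial_735", "front_736")

def choose_display_area(meta: dict) -> str:
    """Map content meta information to one of the display areas 731-736."""
    if meta.get("importance") == "high":
        return AREA_FRONT                       # front display area 736
    if meta.get("needs_tracking"):
        return AREA_SIDE                        # side display area 734
    if meta.get("category") == "road_info":
        return AREA_ROAD                        # road display area 733
    if meta.get("update_rate", "low") == "low":
        return AREA_FAR                         # static content -> small change region
    return AREA_AERIAL                          # default: aerial display area 735
```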
- the far display area 731 and the near display area 732 are hardly affected by movement, and the scene there hardly changes. That is, they are regions (small change regions) in which the appearance of the real space changes little as the automobile moves. In these regions, the processing by the superimposing unit 503 for calculating the display position according to the real space can be reduced. Therefore, if static content is displayed in a small change region, the AR superimposition processing can be reduced efficiently without giving the user a sense of incongruity. Thereby, the processing load of the AR display device 500 can be reduced.
- the side display area 734 remains in view even when an object flows past and away from the front as the vehicle moves. Therefore, by displaying information that needs to be tracked in the side display area 734, the user can keep track of it even after the automobile has moved. Also, by displaying important information in the aerial display area 735, high readability can be obtained without impairing the visibility of the real-space scene. In addition, by displaying content in the front display area 736, the user can view it without greatly moving the line of sight away from the front.
- AR overlay display with high visibility and readability for the user is possible. Moreover, this AR superimposed display can be realized without any instruction from the user.
- the classification of display content is not limited to that based on meta information. For example, it may be performed according to a predetermined time for maintaining the display of the content.
- the operation mode of the moving body may be determined, and the display mode and / or display position may be determined according to the determination result. For example, content is displayed in the front display area 736 only in the automatic operation mode.
- the display content generation unit 501 and / or the superimposition unit 503 are configured to receive a signal indicating whether or not the automatic operation mode is set, for example, from the ECU or the like of the moving body.
- the automatic operation mode of the moving object is a mode in which the user does not need to actively operate and the moving object automatically operates. At this time, the user does not need to pay attention to the actual surrounding scene. However, during actual driving, it may be necessary for the user to actively drive depending on the surrounding traffic environment, time, place, etc., and there are cases where attention must be paid to the actual surrounding scene.
- if content were displayed somewhere other than straight ahead, the user's line of sight would turn away from the front. With this configuration, the content is displayed in front of the user, so it can be viewed without removing the line of sight from the front. Even when the content is viewed during automatic driving, the user's line of sight remains forward. For this reason, even when the automatic driving mode is canceled and attention must be returned to the actual scene, attention can be drawn forward smoothly.
- the display mode and display position of the content displayed on the display 522 may also be changed depending on the level of attention required of the user (alert level).
- the display content generation unit 501 and/or the superimposition unit 503 receive signals such as the user's consciousness level, fatigue level, and driving state from the moving body or from sensors attached to the user. They may further receive information such as the surrounding traffic conditions grasped by the navigation system mounted in the automobile or by the vehicle itself.
- the display content generation unit 501 and / or the superimposition unit 503 combine these to determine the alert level.
- when little attention is required, the displayed content is not restricted. As the alert level increases and more attention is required, the display mode and/or the display position are determined so that the displayed content is simplified.
- the amount of information in the content decreases as more attention becomes necessary, but its readability increases. Since the need for gazing at the content decreases, the user can pay attention to other things while still obtaining information from the content.
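- As an illustrative sketch (the weighting and thresholds are assumptions; the patent only states that the signals are combined into an alert level and that the displayed content is simplified accordingly):

```python
def determine_alert_level(consciousness, fatigue, traffic_density):
    """Combine driver-state and traffic signals (each normalized to [0, 1]) into a
    single alert level 0-2. The weighting is an assumption."""
    score = (1.0 - consciousness) + fatigue + traffic_density
    if score > 2.0:
        return 2      # high alert: strongest simplification
    if score > 1.0:
        return 1
    return 0          # low alert: no restriction on the displayed content

def simplify_content(content: dict, alert_level: int) -> dict:
    """Reduce the amount of displayed information as the required attention increases."""
    if alert_level == 0:
        return content                                          # full content
    if alert_level == 1:
        return {"title": content["title"], "icon": content.get("icon")}
    return {"icon": content.get("icon")}                        # icon only
```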
- the retreat area 737 is an area that does not interfere with driving, for example, a peripheral part of the windshield away from the area directly in front of the driver.
- the retreat area 737 may be set smaller than the original display area.
- the display in the retreat area 737 is performed, for example, when the automatic operation mode is switched to the normal operation mode.
- the image may be continuously changed when retracted or reduced.
- the display content generation unit 501 and/or the superimposition unit 503 may be configured to detect in advance, using a navigation system or the like, a situation such as switching to the normal operation mode, and to give advance notice of the retraction.
- the advance notice is performed, for example, by displaying a warning text in the warning area 743.
- instead of displaying warning text, a change in color tone, a countdown display, or an audio warning may be used.
- the content display may be completely canceled when the alert level reaches a predetermined level or when a predetermined condition is satisfied.
- the case where the predetermined condition is satisfied is, for example, a case where the automatic operation mode is canceled.
- in that case, the display content generation unit 501 and/or the superimposition unit 503 increase the transparency of the displayed content, so that it appears to the user to fade out and only the actual scene in front remains visible. In addition, before the transparency is raised, an advance notice may be displayed under the control of the display content generation unit 501 and/or the superimposition unit 503.
- the display restriction may be canceled as shown in FIG.
- the case where the predetermined condition in this case is satisfied is, for example, a case where the automatic operation mode is set.
- in the automatic operation mode, for example, the display content generation unit 501 and/or the superimposition unit 503 cancel the display restriction, enlarge the front display area 736, and determine the display mode and the display position so that content is displayed there.
- meta information may be used so that entertainment content that requires close viewing, such as documents or video, is displayed on a large screen.
- the display area change control may be performed by detecting the driving location, traffic conditions, and user conditions.
- the user's driving skill level may be added to the criteria for this control.
- the content display area and the display method can be changed according to the environmental condition, the driving condition, and the like.
- since the AR display device 500 of the present modification can switch the content display in this way, both attention to the road ahead and readability can be achieved.
- FIG. 16A is a diagram for explaining a method of associating a real object with content.
- FIG. 16A illustrates, for example, a case where the image is displayed in the display area 741 facing the user.
- a ribbon-like pull-out effect 751, in which the content appears to be drawn out from the real object 714, is displayed.
- the superimposing unit 503 first obtains the coordinates of the display position of the real object 714 on the display 522 from the direction of the user's line of sight and the relative position of the AR display device 500 and the real object 714. Then, a ribbon-like image is generated so as to connect the display position of the real object 714 and the display coordinates of the display area 741.
- the pull-out effect 751 may instead be a string-like line.
- the pull-out effect 751 may be translucent. When it is translucent, the association can be made clear without obscuring the actual scene.
- the string may be tied around the real object 714 and the display area 741 so as not to disturb the field of view.
- if the content to be displayed is limited to information that the user is interested in, a decrease in the readability of other AR content can be suppressed.
- the user's line-of-sight direction detected by the eye tracking device 508 is used.
- the degree of interest of the user is specified based on the degree of coincidence between the viewing direction of the user and the display position of the real object.
- the degree of interest of the user may be determined by an active selection operation by the user. In this case, the intention of the user can be reflected.
- this method may be used to accumulate user selection results and use them for other processing.
- a reference marker 761 may be used as another method for associating a real object with content.
- the display content generation unit 501 and / or the superimposition unit 503 superimpose the reference marker 761 on the real object 714. Then, the content related to the real object 714 is displayed in an arbitrary information display area 742. At this time, the reference marker 761 is also displayed in the information display area 742.
- the configuration in which AR content related to an object is displayed using the above-described pull-out effect 751 is particularly useful when the object itself carries information.
- suppose, for example, that the real object 715 is a signboard: the signboard itself carries information, and the associated content provides further information.
- in this case, the display content generation unit 501 and/or the superimposition unit 503 determine the display mode and the display position so that the content related to the real object 715 is displayed in an information display area 742 set at an arbitrary position, and display the pull-out effect 751 between the real object 715 and the information display area 742. The user can thereby obtain more information than by viewing the signboard alone.
- in FIG. 17A, an image obtained by photographing the guide plate 713 is displayed in the display area 744 as content related to the guide plate 713.
- the display area 744 may be at an arbitrary position, but the size is larger than that of the guide plate 713. This makes it easier for the user to grasp the information on the guide plate 713.
- a reference marker 761 may be displayed on both.
- the object detection unit 105 detects the guide plate 713 from the captured image captured by the imaging unit 102. Then, the display content selection unit 502 selects an image of the guide board 713 as the display content. In addition, the display content generation unit 501 and / or the superposition unit 503 determine a display mode and a display position so as to realize the display.
- the AR content to be displayed may be hierarchized. As shown in FIG. 17B, a group of related contents 821 to 825 are displayed in the same display area 745. Here, a case where five contents are displayed is illustrated. However, the number of contents displayed in one display area 745 is arbitrary.
- a set of contents to be displayed is selected and generated by the display content selection unit 502.
- the display content selection unit 502 generates a set of contents using, for example, meta information.
- Each content includes a company name, an icon indicating the company, a product description, a price display, a reference URL, and the like.
- the display content selection unit 502, the display content generation unit 501, and the superimposition unit 503 determine the selection timing, the display mode, and the display position so that these contents are displayed according to the user's selection and timing.
- the content display is often updated in real time. For this reason, if the amount of information to be displayed at a time is large, the user's understanding may be hindered. In order to avoid this, the contents to be presented to the user at once are simplified and sequentially displayed. In such a case, this hierarchical set of contents is used. Thereby, the information included in the content can be provided with good readability.
- the content included in the content set may be tailored as a story, and the story may be advanced in time series.
- content with mascots may be displayed.
- the content with a mascot includes a content 831 and a mascot 841.
- a popular mascot 841 is added and displayed.
- the eye tracking device 508 determines whether or not the user's line of sight has remained in the display area 746 of the content 831 for a predetermined period.
- if so, the display content generation unit 501 and the superimposition unit 503 display the mascot 841 in the display area 746. In this way, the mascot 841 is displayed only when the user reads the content 831, which effectively attracts the user's interest to the content 831. As a result, the viewing rate of the content 831 increases, and high-value-added content display can be realized with good readability for the user.
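- A minimal sketch of this gaze-dwell trigger, assuming the eye tracking device 508 delivers a gaze point in display coordinates and that the dwell threshold is a configurable value (both assumptions):

```python
import time

class GazeDwellTrigger:
    """Show the mascot 841 only after the user's gaze has stayed inside the
    display area 746 of the content 831 for a predetermined period."""

    def __init__(self, dwell_seconds=2.0):
        self.dwell_seconds = dwell_seconds      # assumed threshold
        self._enter_time = None

    def update(self, gaze_point, area_rect):
        """gaze_point: (x, y) on the display; area_rect: (x0, y0, x1, y1).
        Returns True while the mascot should be shown."""
        x, y = gaze_point
        x0, y0, x1, y1 = area_rect
        inside = x0 <= x <= x1 and y0 <= y <= y1
        now = time.monotonic()
        if not inside:
            self._enter_time = None             # gaze left the area: reset
            return False
        if self._enter_time is None:
            self._enter_time = now
        return (now - self._enter_time) >= self.dwell_seconds
```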
- the display content may be transmitted to the outside and stored.
- the transmission destination may be an information processing terminal 910 as shown in FIG. It may also be another storage device connected to the network. Transmission is performed via the gateway 110. If the information is transmitted to and stored in another storage device, the storage capacity of the information processing terminal 910 is not strained. With this configuration, the displayed information can be reused; the displayed content can be utilized later, which increases convenience for the user.
- the display content selection unit 502 transmits the designated content from among the selected content.
- the designation may be received via the above-mentioned motion recognition device.
- an identifier for designating the content is given by, for example, the display content selection unit 502 or the display content generation unit 501.
- the display content selection unit 502 or the display content generation unit 501 adds an identifier by using meta information, sequentially assigning predetermined characters and numbers, or the like.
- the content may be one that decorates an object, and the decoration may be something with high entertainment value.
- Fig. 10 shows a display example in this case.
- contents 816 and 817 are respectively displayed at positions corresponding to a real building 716 and a car 717, and these are decorated.
- this can liven up the atmosphere of the drive.
- the display content selection unit 502 selects the contents 816 and 817 to be displayed.
- in the above description, the AR display device 500 is mounted on a moving body such as an automobile, but it is not limited to this. It may be mounted on an HMD and used by a pedestrian. Alternatively, a normal screen or the like may be used as the display 522, with content superimposed on other video, for indoor use.
- the present invention is not limited to the above-described embodiments, and includes various modifications.
- the above-described embodiments have been described in detail for easy understanding of the present invention, and are not necessarily limited to those having all the configurations described.
- a part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment.
- SYMBOLS 100 Position and orientation detection apparatus, 101: Control part, 102: Imaging
Landscapes
- Engineering & Computer Science (AREA)
- Radar, Positioning & Navigation (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Remote Sensing (AREA)
- Computer Graphics (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Computer Hardware Design (AREA)
- Automation & Control Theory (AREA)
- Studio Devices (AREA)
- Processing Or Creating Images (AREA)
Abstract
The purpose of the present invention is to detect with a simple configuration the position and posture of a user with high accuracy from a small amount of information. Provided is a position and posture detection device comprising: an image-capture unit 102 for image-capturing a predetermined image-capture range including two or more objects; an object detection unit 105 for specifying a pixel position of each of the objects in the image that is image-captured by the image-capture unit 102; an orientation calculation unit 106 for calculating an orientation of each of the objects with respect to the image-capture unit 102 by using the pixel position, map information on each of the objects, and a focal distance of the image-capture unit; and a position and posture calculation unit 107 for calculating the position and posture of the image-capture unit 102 by using the orientation of each of the objects and the map information. The object detection unit 105 extracts two-dimensional map information corresponding to the image-capture range from two-dimensional map information, in which information on positions and shapes of a plurality of objects within a predetermined region are stored, and specifies a pixel position by using the two-dimensional map information.
Description
The present invention relates to a position / orientation detection technique and an AR (Augmented Reality) display technique. In particular, the present invention relates to a position / orientation detection technique and an AR display technique of an apparatus including an imaging unit.
There is Patent Document 1 as background art in this technical field. Japanese Patent Laid-Open No. 2004-133867 describes that “when a user designates a three-dimensional object with a cursor on a three-dimensional map image, an actual building corresponding to the three-dimensional object is captured from the image taken by the camera at that time. The extracted image portion is extracted as texture image data and registered as texture image data of the three-dimensional object.After that, the rendering processing unit texture-maps the registered texture image data as the surface texture. A navigation device is disclosed that is drawn on a three-dimensional map image (summary excerpt).
Moreover, there is Patent Document 2 as another background art. Japanese Patent Laid-Open No. 2004-151867 discloses that “the image feature that stores the image feature Fa to be recognized, the image feature detection unit that detects the image feature Fb from the preview image to be recognized, and the recognition based on the matching result of the image features Fa and Fb”. A posture estimation unit that estimates an initial posture of a target, a tracking point selection unit that selects a tracking point Fe from an image feature Fa based on the initial posture, and a template that generates a template image of a recognition target based on the estimation result of the initial posture A generating unit; a matching unit that matches the template image and the preview image with respect to the tracking point Fe; and a posture tracking unit that tracks the posture of the recognition target in the preview image based on the tracking point Fe that has been successfully matched. (Summary Extract) "is disclosed.
AR display is a technique for superimposing and displaying information such as images and data related to a real scene (real scene) viewed by a user. In order to superimpose and display related information in a real scene without deviation, it is necessary to specify the position on the user side and the direction (line-of-sight direction) with high accuracy.
In the technology disclosed in Patent Document 1, the user designates the object in the actual scene on which related information is to be superimposed, which takes time and effort. In addition, a three-dimensional map is displayed by pasting an image of the scene around the vehicle onto a three-dimensional model as a texture, so a three-dimensional map and a three-dimensional model corresponding to the scene around the vehicle are essential. Since image processing handles a three-dimensional map and a three-dimensional model corresponding to the real space, the amount of information to be processed is large.
The technique disclosed in Patent Document 2 tracks the posture change of a recognition target using images. Although the posture change of the recognition target, that is, the object, can be detected, the position on the user side cannot be detected.
The present invention has been made in view of the above circumstances, and an object of the present invention is to provide a position detection technique for detecting the position and orientation on the user side with high accuracy from a small amount of information with a simple configuration. Problems, configurations, and effects other than those described above will be clarified by the following description of embodiments.
The present invention provides a position and orientation detection apparatus comprising: a photographing unit that photographs a predetermined photographing range including two or more objects; an object detection unit that identifies the pixel position of each object in the image photographed by the photographing unit; a direction calculation unit that calculates an object direction, that is, the direction of each object with respect to the photographing unit, using the pixel position, map information of each object, and the focal length of the photographing unit; and a position and orientation calculation unit that calculates the position and orientation of the photographing unit using the object direction of each object and the map information, wherein the object detection unit extracts, from two-dimensional map information storing the positions and shapes of a plurality of objects within a predetermined area, the two-dimensional map information corresponding to the photographing range, and identifies the pixel position using the extracted two-dimensional map information.
There is also provided an AR display device that displays content on a display having transparency and reflectivity in association with an object in the scene behind the display, the AR display device comprising: the above position and orientation detection device; a display content generation unit that generates the content to be displayed on the display; a superimposing unit that determines the display position of the generated content on the display using the position and orientation of the photographing unit determined by the position and orientation detection device and the pixel position of the object identified by the object detection unit; and a display unit that displays the generated content at the display position on the display determined by the superimposing unit.
According to the present invention, the position and orientation on the user side can be detected with high accuracy from a small amount of information with a simple configuration.
Hereinafter, embodiments of the present invention will be described with reference to the drawings. Hereinafter, in this specification, those having the same function are denoted by the same reference numerals unless otherwise specified, and repeated description is omitted. The present invention is not limited to the embodiment described here.
<< First Embodiment >>
The first embodiment is a position / orientation detection apparatus including an imaging unit. The position / orientation detection apparatus according to the present embodiment uses the image acquired by the imaging unit to detect the position and orientation of itself, including the imaging unit.
First, the functional configuration of the position / orientation detection apparatus 100 of this embodiment will be described. FIG. 1 is a functional block diagram of the position / orientation detection apparatus 100 of the present embodiment. As shown in the figure, the position / orientation detection apparatus 100 according to the present embodiment includes a control unit 101, a photographing unit 102, a rough detection sensor 103, an evaluation object extraction unit 104, an object detection unit 105, a direction calculation unit 106, a position / orientation calculation unit 107, a map management unit 108, and a gateway 110.
The position / orientation detection apparatus 100 of the present embodiment further includes a captured image holding unit 121 that temporarily holds information, a positioning information holding unit 122, a two-dimensional map information holding unit 123, and a map information holding unit 124.
The control unit 101 monitors the operation status of each part of the position / orientation detection apparatus 100 and controls the entire position / orientation detection apparatus 100. It is composed of, for example, a circuit or the like, or it may be realized by a CPU executing a program held in advance, as will be described later.
The photographing unit 102 photographs a scene in the real space (actual scene) and acquires a photographed image. The captured image is stored in the captured image holding unit 121. In the present embodiment, a shooting range including at least two or more objects is shot.
The configuration of the photographing unit 102 is shown in FIG. 2(a). As shown in the figure, the photographing unit 102 includes a lens 131, which is an image forming optical system, and an image sensor 132 that converts the formed image into an electric signal. The image sensor 132 is a CMOS (complementary metal-oxide-semiconductor) image sensor, a CCD (Charge-Coupled Device) image sensor, or the like. Reference numeral 133 denotes the optical axis of the lens.
The photographed image holding unit 121 may hold a plurality of photographed images as necessary. This is to extract temporal changes and to select a captured image with a good shooting state. In order to provide versatility, data obtained by performing various types of image processing on the acquired data may be stored as a captured image instead of the data itself acquired by the imaging unit 102. The image processing performed here is, for example, removal of lens distortion, adjustment of color and brightness, and the like.
The coarse detection sensor 103 detects the position and orientation in the real space of the position / orientation detection apparatus 100 including the imaging unit 102 with coarse accuracy, and stores the detected position and orientation in the positioning information holding unit 122 as positioning information.
For detecting the position, for example, GPS (Global Positioning System) is used. When using GPS, the coarse detection sensor 103 is a GPS receiver.
For example, an electronic compass is used to detect the posture. The electronic compass is composed of a combination of two or more magnetic sensors. A magnetic sensor supporting three axes enables direction detection in three-dimensional space. If the measurement plane is limited to the horizontal plane, a biaxial magnetic sensor may be used to reduce the cost of the electronic compass.
Note that the positioning information held by the positioning information holding unit 122 is not limited to the positioning information obtained by the coarse detection sensor 103. For example, it may be positioning information obtained by a position / orientation calculation unit 107, which will be described later, or both. The positioning information is used for extraction of two-dimensional map information and map information described later, calculation of position and orientation, and the like.
The map management unit 108 acquires two-dimensional map information and map information in a range necessary and sufficient for processing by each unit of the position / orientation detection apparatus 100, including the visual field range of the photographing unit 102, and stores them in the two-dimensional map information holding unit 123 and the map information holding unit 124, respectively.
The acquisition is performed via the gateway 110 via a network or the like from a server or storage device that holds such information, for example. The acquisition range is determined based on the positioning information.
The 2D map information held in the 2D map information holding unit 123 is object information of each object included in a predetermined area. The object information includes the position (position in the map), shape (appearance), feature point, and the like of each object in the area.
2D map information is created from images taken with a camera, for example. At the time of shooting, position information (map absolute position) of the shooting range is simultaneously acquired as information for specifying a predetermined area. Then, the photographed image is analyzed, and the object and its feature point are extracted. Then, the pixel position in the image of the extracted object is specified and set as the map position. The appearance shape of the object is acquired by using, for example, Google Street View.
The original shooting range is treated as one two-dimensional map, and for each two-dimensional map, the map absolute position of the shooting range is associated with the in-map position, shape, and feature points of each object in that area to form the two-dimensional map information. Since each piece of two-dimensional map information has a map absolute position, the image can be deformed in accordance with the optical characteristics of the photographing unit 102.
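A minimal data-structure sketch of this two-dimensional map information; the class and field names are illustrative, not taken from the patent:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class RegisteredObject:
    """One registered object in a two-dimensional map."""
    object_id: str
    map_position: Tuple[int, int]        # pixel position within the source image
    shape: object                        # appearance template, e.g. an image patch
    feature_points: List[Tuple[int, int]] = field(default_factory=list)
    attributes: dict = field(default_factory=dict)   # e.g. {"type": "building"}

@dataclass
class TwoDimensionalMap:
    """Two-dimensional map information for one shooting range."""
    map_absolute_position: Tuple[float, float]       # e.g. latitude, longitude
    objects: List[RegisteredObject] = field(default_factory=list)
```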
Hereinafter, an object registered in the two-dimensional map information is referred to as a registered object. Note that the two-dimensional map information may further include attribute information for each registered object. The attribute information includes, for example, the type of registered object. The type of registered object is, for example, a building, a road, a signboard, or the like.
Note that the two-dimensional map information may be acquired by analyzing an image obtained by removing distortion based on information related to the camera and modulating the color and brightness of the photographed image.
In the map information held in the map information holding unit 124, the coordinates or address of each object in real space are registered. As the map information, for example, Google Maps can be used. To reduce the amount of data, it is desirable to use a list that associates each object with its coordinates. For the coordinates, for example, latitude and longitude are used, and height information may also be included; in that case, three-dimensional position measurement becomes possible. A local coordinate system based on an arbitrary existing location may also be used; in that case, the data may be structured to increase position measurement accuracy within a limited area.
The evaluation object extraction unit 104 extracts object candidates to be processed (hereinafter referred to as evaluation object candidates) from each registered object registered in the two-dimensional map information. For example, all registered objects in the two-dimensional map information may be set as evaluation object candidates. When object attribute information is stored, a registered object whose attribute information matches a predetermined condition may be extracted as an evaluation object candidate. For example, only building objects are extracted.
As the two-dimensional map information, the information held in the two-dimensional map information holding unit 123 is used. The two-dimensional map information holding unit 123 holds only the minimum necessary group of two-dimensional maps. However, at the time of extraction, the evaluation object extraction unit 104 may use the positioning information to further narrow down the two-dimensional map information from which evaluation object candidates are extracted. For example, the visual field range of the photographing unit 102 in real space is calculated using the positioning information, and only the two-dimensional map information that matches that visual field range is scanned to extract evaluation object candidates.
The object detection unit 105 identifies the position (pixel position) of each extracted evaluation object candidate in the captured image. An evaluation object candidate whose position is specified in the captured image is set as an evaluation object. Here, the positions of at least two evaluation objects are specified. The position in the captured image is specified by pattern matching. A template image used for pattern matching is created using shape information of two-dimensional map information.
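The patent does not prescribe a particular matching algorithm; as one hedged example, normalized cross-correlation template matching (here via OpenCV) could locate an evaluation object candidate in the captured image:

```python
import cv2

def locate_evaluation_object(captured_gray, template_gray, threshold=0.7):
    """Find the pixel position of one evaluation object candidate in the captured
    image by template matching; the template is rendered from the shape
    information in the two-dimensional map.

    Returns (x, y) of the best match, or None if the score is below the threshold
    (the threshold value is an assumption).
    """
    result = cv2.matchTemplate(captured_gray, template_gray, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    if max_val < threshold:
        return None
    h, w = template_gray.shape[:2]
    return (max_loc[0] + w // 2, max_loc[1] + h // 2)   # center of the matched region
```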
Also, the object detection unit 105 calculates the distance in the horizontal direction (horizontal distance) from the photographed image origin of each evaluation object using the specified pixel position.
The direction calculation unit 106 calculates the direction (object direction) of each evaluation object. As the object direction, for example, an angle from the direction of the optical axis 133 of the lens 131 of the photographing unit 102 is obtained. The angle is calculated using the pixel position calculated by the object detection unit 105, the horizontal distance, the map information of the evaluation object, and the focal length of the lens 131 of the photographing unit 102.
At this time, the direction calculation unit 106 may correct the angle error due to distortion of the lens 131 and calculate the direction. The angle error due to distortion is calculated from the relationship between the angle of view of the lens 131 and the image height. The relationship between the angle of view and the image height necessary for the calculation is acquired in advance.
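Under a standard pinhole-camera approximation (ignoring the lens distortion that the direction calculation unit 106 may additionally correct), the object direction could be computed as sketched below; the parameter names are illustrative:

```python
import math

def object_direction_deg(pixel_x, image_center_x, pixel_pitch_mm, focal_length_mm):
    """Horizontal angle of an evaluation object measured from the optical axis 133.

    pixel_x:         horizontal pixel position of the evaluation object
    image_center_x:  pixel position where the optical axis meets the image sensor
    pixel_pitch_mm:  physical size of one pixel on the image sensor 132
    focal_length_mm: focal length of the lens 131
    """
    offset_mm = (pixel_x - image_center_x) * pixel_pitch_mm   # horizontal distance on the sensor
    return math.degrees(math.atan2(offset_mm, focal_length_mm))
```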
The position / orientation calculation unit 107 calculates the position and orientation of the photographing unit 102 using the object direction of each evaluation object calculated by the direction calculation unit 106 and the map information. The calculated position is expressed as coordinates in the same coordinate system as the map information, and the calculated posture is the direction of the optical axis 133. The posture can be represented by an azimuth and an elevation angle, or by pitch, yaw, and roll angles. Positioning information may also be used for calculating the position and posture.
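One possible numerical formulation of this step, not prescribed by the patent: treat it as a small bearing-only resection and refine the pose starting from the coarse measurement. A weak prior toward the initial pose keeps the problem well-posed when only two evaluation objects are visible:

```python
import numpy as np
from scipy.optimize import least_squares

def estimate_pose_2d(object_xy, measured_angles_rad, initial_pose, prior_weight=1e-3):
    """Estimate the 2D position (x, y) and heading theta of the photographing unit 102
    from the object directions (angles measured from the optical axis 133) and the
    map coordinates of the evaluation objects.

    object_xy:           (N, 2) array of map coordinates of N >= 2 evaluation objects
    measured_angles_rad: (N,) array of object directions relative to the optical axis
    initial_pose:        (x0, y0, theta0), e.g. from the coarse detection sensor 103
    prior_weight:        weight of a weak prior toward the coarse measurement (assumption)
    """
    object_xy = np.asarray(object_xy, dtype=float)
    measured = np.asarray(measured_angles_rad, dtype=float)
    x0 = np.asarray(initial_pose, dtype=float)

    def residuals(pose):
        x, y, theta = pose
        bearings = np.arctan2(object_xy[:, 1] - y, object_xy[:, 0] - x)  # absolute bearings
        diff = (bearings - theta) - measured
        ang_err = np.arctan2(np.sin(diff), np.cos(diff))                 # wrap to (-pi, pi]
        return np.concatenate([ang_err, prior_weight * (pose - x0)])

    return least_squares(residuals, x0=x0).x   # refined (x, y, theta)
```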
The gateway 110 is a communication interface. The position / orientation detection apparatus 100 transmits / receives data to / from, for example, a server connected to a network via the gateway 110. In the present embodiment, for example, two-dimensional map information and map information are downloaded from a server connected to the Internet, or the generated two-dimensional map information is uploaded to the server as will be described later.
Here, FIG. 3(a) shows an example of the configuration of the server (map server) system 620 from which the two-dimensional map information is acquired. As shown in the figure, the map server system 620 includes a map server 621 that controls operations, a map information storage unit 622 that stores map information, a two-dimensional map information storage unit 623 that stores two-dimensional map information, and a communication I/F 625.
The map server 621 receives a request from the map management unit 108, and transmits map information and two-dimensional map information in the requested range from each storage unit to the request source.
Note that, for example, information on the actual position, posture, and shape of each object 624 included in the two-dimensional map information may be held independently. In this case, each object 624 is held in association with the two-dimensional map information including the object 624.
The map server 621 may, as necessary, also have functions for extracting feature points of the objects 624, analyzing the two-dimensional map information 210 transmitted from the position / orientation detection apparatus 100 and extracting objects from it, and registering the two-dimensional map information 210.
なお、地図情報および二次元マップ情報を管理するマップサーバシステム620は、1つのシステムに限定されない。これらの情報は、ネットワーク上の複数のサーバシステムに分割して管理されていてもよい。
Note that the map server system 620 that manages the map information and the two-dimensional map information is not limited to one system. Such information may be divided and managed in a plurality of server systems on the network.
As shown in FIG. 2(b), the position / orientation detection apparatus 100 of the present embodiment is realized by an information processing apparatus including a CPU 141, a memory 142, a storage device 143, an input / output interface (I/F) 144, and a communication I/F 145. For example, each function is realized by the CPU 141 loading a program held in advance in the storage device 143 into the memory 142 and executing it. All or part of the functions may instead be realized by hardware or circuits such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field-Programmable Gate Array).
Various data used for the processing of each function and various data generated during the processing are stored in the memory 142 or the storage device 143. The captured image holding unit 121, the positioning information holding unit 122, the two-dimensional map information holding unit 123, and the map information holding unit 124 are constructed, for example, in the memory 142 of the position / orientation detection apparatus 100.
Each holding unit may be realized by a separately provided memory 142, or some or all of them may be integrated into a single memory 142, partitioned according to the required capacity and speed. The two-dimensional map information is used in particular in the processing for extracting objects, so the two-dimensional map information holding unit 123 that holds it is preferably constructed in a memory area that can be accessed at relatively high speed.
Next, the details and flow of the position / orientation detection processing for the photographing unit 102 performed by each unit of the position / orientation detection apparatus 100 will be described. FIGS. 4(a) to 4(f) are diagrams for explaining the position detection processing of the present embodiment, and FIG. 5 shows the processing flow.
In this embodiment, initial processing, photographing processing, object detection processing, direction calculation processing, and position / orientation calculation processing are performed in this order.
In the initial processing, the rough detection sensor 103 acquires positioning information as the approximate position and orientation of the photographing unit 102, and the two-dimensional map information and map information used for the processing are acquired.
Specifically, the rough detection sensor 103 first acquires the rough position and rough orientation of the position / orientation detection apparatus 100 as positioning information (step S1101). The acquired positioning information is stored in the positioning information holding unit 122.
Next, the map management unit 108 acquires the two-dimensional map information 210 and the map information 230 necessary for the processing and registers them in the two-dimensional map information holding unit 123 and the map information holding unit 124, respectively (step S1102). Here, the map management unit 108 calculates the coordinates of the field-of-view range of the photographing unit 102 using the positioning information, acquires the two-dimensional map information whose map absolute position corresponds to the calculated coordinates, and also acquires the map information corresponding to the calculated coordinates.
Next, in the photographing processing, the photographing unit 102 photographs an actual scene including two or more objects and acquires a captured image 220 (step S1103).
Object detection processing is then performed. Here, objects are detected by specifying the pixel positions of the evaluation objects in the captured image 220.
Specifically, the evaluation object extraction unit 104 accesses the two-dimensional map information holding unit 123 and acquires the two-dimensional map information 210. Then, as shown in FIG. 4(a), it extracts the registered objects in the acquired two-dimensional map information 210 (step S1104) and sets them as evaluation object candidates 211, 212, and 213.
At this time, all registered objects may be extracted as evaluation object candidates, or evaluation object candidates may be selected and extracted from all registered objects according to a predetermined condition. FIG. 4(a) illustrates a case where three evaluation object candidates 211, 212, and 213 are extracted.
Next, the object detection unit 105 generates template images for each of the extracted evaluation object candidates 211, 212, and 213 (step S1106). It then scans the captured image 220 with the generated template images and identifies evaluation objects in the captured image 220 (step S1107). The object detection unit 105 performs this identification for each of the extracted evaluation object candidates 211, 212, and 213, and repeats it until at least two evaluation objects are detected (step S1105).
FIG. 4(b) shows an example in which two evaluation objects 221 and 222 are detected in the captured image 220. Here, it is assumed that the evaluation object 221 is detected as corresponding to the evaluation object candidate 211 and the evaluation object 222 as corresponding to the evaluation object candidate 212, and that no evaluation object corresponding to the evaluation object candidate 213 is detected. In the present embodiment it is sufficient to detect two evaluation objects, so the detection processing ends once two evaluation objects have been detected.
When the object detection unit 105 detects the evaluation objects 221 and 222 in the captured image 220, it calculates the horizontal distances PdA and PdB of the evaluation objects 221 and 222 from the origin of the captured image 220, respectively (step S1108).
The horizontal distances PdA and PdB are expressed as numbers of pixels on the image sensor 132. The origin of the captured image 220, indicated by the black dot in FIG. 4(b), is the point of the image sensor 132 that coincides with the direction in which the photographing unit 102 faces, that is, with the center of the optical axis 133 of the lens 131. The horizontal distance is the horizontal distance between a representative point in each of the evaluation objects 221 and 222 and the origin of the captured image. The representative point is, for example, the center of the rectangle constituting the evaluation object 221 or 222; the midpoint of the shape associated with each evaluation object may also be used. Details such as how the representative point is determined are described later.
Then, the object detection unit 105 refers to the map information 230 of FIG. 4(c) and acquires the map information of each of the evaluation objects 221 and 222 (step S1109). Here, the object detection unit 105 associates the objects (real objects) 231 and 232 in the map information 230 with the evaluation objects, using the map absolute position of the two-dimensional map information 210 from which the evaluation objects 221 and 222 originate and the in-map positions of the evaluation object candidates 211 and 212. It then acquires the map information ((XA, YA), (XB, YB)) of the associated real objects 231 and 232.
Note that FIG. 4(c) is drawn as a two-dimensional map for convenience, but a three-dimensional map may be used.
Next, the direction calculation unit 106 performs the direction calculation processing. That is, the direction calculation unit 106 calculates the object direction of each of the evaluation objects 221 and 222 (step S1110). As shown in FIGS. 4(d) and 4(e), the direction calculation unit 106 calculates, as the object directions, the angles θA and θB with respect to the optical-axis direction of the lens 131 of the photographing unit 102.
FIG. 4(d) is a diagram for explaining a method of estimating the direction of the photographing unit 102 using the horizontal distance PdA, in the captured image 220, of the evaluation object 221 corresponding to the real object 231. FIG. 4(e) is a diagram for explaining a method of estimating the object direction using the horizontal distance PdB of the evaluation object 222 corresponding to the real object 232.
The image of the real object 231 is formed on the image sensor 132 through the lens 131. Therefore, the angle θA formed by the direction of the real object 231 and the optical axis 133 of the lens 131 of the photographing unit 102 is calculated geometrically from the position PdA on the image sensor by the following expression (1), where f is the focal length of the lens 131.
Similarly, the angle θB formed by the direction of the real object 232 and the optical axis 133 of the lens 131 of the photographing unit 102, shown in FIG. 4(e), is calculated by the following expression (2).
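Expressions (1) and (2) are not reproduced in this text. Under an ideal distortion-free pinhole model, however, the geometry described above reduces to an arctangent of the sensor offset over the focal length. The sketch below illustrates that relation, assuming the pixel offset is converted to a physical distance on the sensor via the pixel pitch (the pitch and focal-length values in the example call are hypothetical).

```python
import math

def object_direction_rad(pd_pixels: float, pixel_pitch_mm: float, focal_length_mm: float) -> float:
    """Angle between the lens optical axis 133 and the direction of an object,
    from the horizontal offset of its representative point on the image sensor.
    Assumes an ideal (distortion-free) pinhole model: pd_pixels is the offset in
    pixels from the image origin (optical-axis center), pixel_pitch_mm the sensor
    pixel pitch, and focal_length_mm the focal length f of the lens 131."""
    offset_mm = pd_pixels * pixel_pitch_mm
    return math.atan2(offset_mm, focal_length_mm)

# Example (hypothetical pitch 0.004 mm, focal length 4.0 mm):
# theta_a = object_direction_rad(PdA, 0.004, 4.0)
# theta_b = object_direction_rad(PdB, 0.004, 4.0)
```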
Note that if the lens 131 has distortion, errors are introduced into the calculated angles θA and θB. For this reason, it is desirable to use a lens 131 with little distortion, and to acquire the relationship between the angle of view and the image height of the lens 131 in advance so that the angular error caused by the distortion can be corrected.
Finally, the position / orientation calculation unit 107 performs the position / orientation calculation processing. Here, it refers to the map information A (XA, YA) and B (XB, YB) of the matched evaluation objects 221 and 222 and to the object directions θA and θB, determines for each the range in real space in which the photographing unit 102 can exist, and then specifies the position and orientation of the photographing unit 102 in real space from these ranges (step S1111). The calculated position is the map-information position of the photographing unit 102, and the calculated orientation is the direction of the optical axis 133 of the lens 131.
This calculation method will be described with reference to FIG. 4(f). The optical axis 133 of the lens 131 forms the angle θA with the direction of the real object 231 and the angle θB with the direction of the real object 232. The locus of positions satisfying these two conditions is a locus of points with a constant circumferential (inscribed) angle, which is the circle 241 passing through the real object 231 and the real object 232.
The radius R of the circle 241 is calculated by the following expression (3) using the real-space positions A (XA, YA) and B (XB, YB) of the real object 231 and the real object 232.
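Expression (3) is not reproduced in this text. By the inscribed-angle theorem, the radius follows from the chord length AB and the angle subtended at the camera. The sketch below is a minimal illustration, assuming the two objects lie on opposite sides of the optical axis so that the subtended angle is θA + θB.

```python
import math

def circle_radius(ax, ay, bx, by, theta_a, theta_b):
    """Radius R of the circle 241 of camera positions that see A and B under a
    fixed angle. Assumes A and B lie on opposite sides of the optical axis, so
    the angle subtended at the camera is theta_a + theta_b (in radians)."""
    chord_ab = math.hypot(bx - ax, by - ay)     # distance between A and B
    subtended = theta_a + theta_b
    return chord_ab / (2.0 * math.sin(subtended))
```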
From the relationship of θA and θB to the optical axis 133, the locus on which the photographing unit 102 can exist is narrowed down to the arc AB drawn as a solid line on the circle 241. On the arc AB drawn as a broken line, the left-right relationship of θA and θB is reversed, so it cannot be a locus on which the photographing unit 102 exists.
Next, the position / orientation calculation unit 107 specifies the direction of the optical axis 133 of the lens 131 of the photographing unit 102 using the direction detected by the rough detection sensor 103. Once the direction of the optical axis 133 is specified, the position of the photographing unit 102 on the arc AB is uniquely determined.
The detected position of the photographing unit 102 is taken to be the principal point of the lens 131 of the photographing unit 102. Of course, the detected position is not limited to the principal point; any position within the photographing unit 102 may be used.
As described above, the position and orientation of the photographing unit 102 can be detected. The position / orientation detection apparatus 100 of the present embodiment repeats the above processing at predetermined time intervals and thus always detects the latest position and orientation of the photographing unit 102. In the above processing flow, the captured-image acquisition of step S1103 may come first; in that case, the position / orientation detection apparatus 100 starts the above processing when the photographing unit 102 acquires a captured image.
In the above processing flow, the case where the evaluation object candidates 211 and 212 are registered in a single piece of two-dimensional map information 210 has been described as an example. However, the evaluation object candidates 211 and 212 may be registered in different pieces of two-dimensional map information 210.
Next, the pattern matching processing using template images, performed when identifying the evaluation objects 221 and 222, will be described with reference to FIGS. 6(a) to 6(c). FIG. 6(a) shows the template images 310, and FIG. 6(b) shows the captured image 220.
Pattern matching is processing that determines whether an image identical to a template image exists within a given image; if such an image exists, its pixel position can be specified.
The shape of a given surface of an object changes depending on the direction from which the photographing unit 102 photographs it. Therefore, as shown in FIG. 6(a), the object detection unit 105 generates a plurality of template images 311 to 316 with different inclinations and sizes. Here, a case where six template images 311 to 316 are generated is illustrated; when there is no need to distinguish them, they are represented by the template image 311.
Next, for each generated template image 311, the object detection unit 105 scans the captured image 220 in the direction indicated by the arrows in FIG. 6(b) and evaluates the degree of similarity.
One example of a method for comparing image similarity is to take the difference between the images and evaluate its histogram. With this method, a difference of 0 is obtained for a perfect match, and values further from 0 are obtained as the degree of match decreases. While changing the region of the captured image 220 to which the template image 311 is applied, the similarity between the template image 311 and the applied region is evaluated and recorded. This is repeated for each of the template images 311 to 316. The entire captured image 220 is evaluated against all prepared template images 311, and the region with the smallest difference is determined to be the matching region.
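A minimal OpenCV sketch of this scan-and-compare step is shown below. It uses a normalized squared-difference score in place of the histogram-of-differences evaluation described above, which is an assumption about the exact metric; the overall flow (scan every template, keep the region with the smallest difference) is the same.

```python
import cv2
import numpy as np

def best_match(captured_gray: np.ndarray, templates: list[np.ndarray]):
    """Scan the captured image 220 with each template image and return the index
    of the best template, the top-left pixel position of the matched region, and
    its score (smaller squared difference = better match)."""
    best = (None, None, float("inf"))
    for idx, templ in enumerate(templates):
        scores = cv2.matchTemplate(captured_gray, templ, cv2.TM_SQDIFF_NORMED)
        min_val, _max_val, min_loc, _max_loc = cv2.minMaxLoc(scores)
        if min_val < best[2]:
            best = (idx, min_loc, min_val)
    return best  # (template index, (x, y) of matched region, score)
```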
In FIG. 6(b), the region 225 in the captured image 220 is the region whose similarity to the template image 311 is highest. The object detection unit 105 therefore determines that an evaluation object exists in the captured image 220 and identifies the region 225 as the region (pixel position) of the evaluation object.
When this method is used, the object detection unit 105 can obtain the pixel position of the evaluation object at the same time from the evaluation result, and the horizontal distance PdA can be determined from it.
Note that when the orientation of the evaluation object is specified in advance and the template image to be used is known, it is sufficient to extract that single template image and perform the matching only with it. Likewise, when the approximate region of the captured image 220 in which the evaluation object exists is known, the matching may be performed only for that region.
If the two-dimensional map information 210 from which the template image 311 is created and the captured image 220 were exactly the same image, the matching region, that is, the pixel position of the evaluation object in the captured image, could be identified easily. In practice, however, the two are captured under different conditions, such as lighting and shadows, and cannot be exactly the same image.
Therefore, prior to template matching, countermeasures concerning image information such as brightness and color, and countermeasures concerning the distortion of objects caused by the photographing direction, may be applied to the captured image. The distortion of an object caused by the photographing direction means, for example, the deformation by which an object that has a rectangular plane when the observer faces it directly appears roughly trapezoidal when viewed obliquely in three-dimensional space.
As a countermeasure concerning image information, for example, the two-dimensional map information 210 and the captured image 220 are converted to grayscale and their brightness is equalized based on their respective histograms.
As a countermeasure for object distortion, for example, as shown in FIG. 6(a), the object is virtually deformed three-dimensionally when the template image 311 is generated from it. That is, to be able to detect an object even when it is deformed as described above, the object is subjected to a shape conversion that turns a rectangle into a trapezoid, yielding the template images 311 to 316.
Before template matching, the photographing direction of the object in the captured image 220 cannot be specified. Therefore, template images are generated by applying the shape conversion corresponding to every assumed photographing direction, and template matching is performed with all of them, so that any photographing direction can be handled.
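The sketch below illustrates one way to realize the rectangle-to-trapezoid shape conversion used to prepare such view-dependent templates; the warp geometry and the example shrink ratios are hypothetical choices, not values specified in this description.

```python
import cv2
import numpy as np

def trapezoid_template(front_view: np.ndarray, shrink_ratio: float) -> np.ndarray:
    """Warp a front-view object image into the roughly trapezoidal shape it would
    have when photographed obliquely. shrink_ratio (0..1) controls how much the
    right vertical edge is shortened relative to the left one."""
    h, w = front_view.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    inset = h * (1.0 - shrink_ratio) / 2.0
    dst = np.float32([[0, 0], [w, inset], [w, h - inset], [0, h]])
    M = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(front_view, M, (w, h))

# Templates for several assumed photographing directions, e.g.:
# templates = [trapezoid_template(obj_img, r) for r in (1.0, 0.8, 0.6)]
```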
As another countermeasure for differences in photographing conditions, template images are generated by applying transformations such as rotation and scaling to the object. This removes the influence of differences in object size due to photographing distance and of rotation about the optical axis of the camera used for photographing, and thereby improves matching performance.
The pattern matching processing described above is a general technique and has the advantage that it can be performed at high speed.
Next, how the representative points of the evaluation objects 221 and 222 are determined when calculating the horizontal distances PdA and PdB will be described. In this embodiment, the representative point is determined within the template image used when identifying the evaluation objects 221 and 222.
FIGS. 7(a) to 7(f) show template images 321 to 325. The white circle in each template image indicates the representative point 331.
As shown in FIG. 7(a), in the case of the template image 321 in which the object faces the front, the center of the outline is set as the representative point 331. In the case of template images in which the object is inclined, as shown in FIGS. 7(b) and 7(c), a rectangle indicated by the broken line is defined from the outermost shape and its center is set as the representative point 331.
In the case of the template image 324 of an object with a partial protrusion, a rectangle indicated by the broken line may be defined from the outermost shape and its center set as the representative point 331, as shown in FIG. 7(d); alternatively, as shown in FIG. 7(e), a rectangle may be defined over part of the object and its center set as the representative point 331. In the case of the template image 325 of a triangular object, a rectangle enclosing the outermost shape is defined and its center is set as the representative point 331, as shown in FIG. 7(f).
As described above, the method of defining a rectangle from the outermost shape and using its center as the representative point 331 is simple and most desirable. Of course, as long as pattern matching is possible, the outermost shape need not be used, as shown in FIG. 7(e); for example, an arbitrary corner of the object may be used.
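A minimal sketch of the bounding-rectangle approach, assuming the template's object region is available as a binary (white-on-black) mask:

```python
import cv2
import numpy as np

def representative_point(template_mask: np.ndarray) -> tuple[float, float]:
    """Center of the axis-aligned rectangle that bounds the object's outermost
    contour in a template image (binary mask assumed, object in white)."""
    contours, _hierarchy = cv2.findContours(template_mask, cv2.RETR_EXTERNAL,
                                            cv2.CHAIN_APPROX_SIMPLE)
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    return (x + w / 2.0, y + h / 2.0)
```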
As described above, the representative point 331 is used in the position / orientation detection processing when calculating the horizontal distance of the evaluation object from the origin. For this reason, the representative point 331 is preferably configured so that it can be accurately associated with the map information.
As described above, the position / orientation detection apparatus 100 of the present embodiment includes the photographing unit 102 that photographs a predetermined photographing range including two or more objects; the object detection unit 105 that specifies the pixel position of each of the objects in the captured image photographed by the photographing unit 102; the direction calculation unit 106 that calculates, for each object, the object direction, which is the direction of that object with respect to the photographing unit 102, using the pixel position, the map information of each object, and the focal length of the photographing unit; and the position / orientation calculation unit 107 that calculates the position and orientation of the photographing unit using the object direction of each object and the map information. The object detection unit 105 extracts, from two-dimensional map information in which the positions and shapes of a plurality of objects within a predetermined area are stored, the two-dimensional map information corresponding to the photographing range, and specifies the pixel positions using that two-dimensional map information.
Thus, according to the present embodiment, the position and orientation of the photographing unit 102 are calculated using an image actually photographed by the photographing unit 102. Two objects are extracted from the captured image and the calculation is performed using them. Image processing using a three-dimensional map or three-dimensional model corresponding to the real space is therefore unnecessary, and the method does not depend on the accuracy of external GPS or the like. Accordingly, the position and orientation of the photographing unit 102 can be detected with high accuracy with a simple configuration and a small amount of information processing, and the position and orientation of various devices whose relative position and direction with respect to the photographing unit 102 are known can likewise be estimated with high accuracy.
According to this embodiment, the position and orientation of the photographing unit 102 can be obtained by pattern matching processing, at which computers excel, and simple geometric calculation. That is, the position / orientation detection apparatus 100, which detects its own position and orientation, can be realized with a small amount of information.
Furthermore, according to this embodiment, the position / orientation detection by the above method is repeated at predetermined time intervals. The position and orientation of the position / orientation detection apparatus 100 including the photographing unit 102 can therefore be identified at all times, stably, with a small amount of information and with high accuracy.
In the above embodiment, the position / orientation calculation unit 107 calculates the position and orientation of the photographing unit 102 using the two evaluation objects 221 and 222. However, the calculation of the position and orientation by the position / orientation calculation unit 107 is not limited to this method; for example, three evaluation objects may be used.
The procedure for detecting the position and orientation of the photographing unit 102 using three evaluation objects will be described with reference to FIG. 8(a).
By the same procedure as above, the map information ((XA, YA), (XB, YB), (XC, YC)) of the real objects 231, 232, and 233 corresponding to the three evaluation objects and their angles (θA, θB, θC) with respect to the optical axis 133 of the lens 131 are acquired.
From the information on the real object 231 and the real object 232, the circle 241 on which the photographing unit 102 exists is determined. Similarly, from the information on the real object 232 and the real object 233, the circle 242 on which the photographing unit 102 exists is determined. The intersection of the circles 241 and 242 is the position of the photographing unit 102, and the direction that satisfies the angles with respect to the optical axis 133 of the lens 131 is the orientation, that is, the posture, of the photographing unit 102.
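A minimal sketch of intersecting the two circles, assuming their centers and radii have already been determined. Note that both circles pass through the real object 232, so one of the returned intersection points coincides with that object's position; the other candidate corresponds to the photographing unit 102.

```python
import math

def circle_intersections(c1, r1, c2, r2):
    """Intersection points of two circles given centers (x, y) and radii.
    Returns a list of 0, 1 or 2 candidate points."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []                      # no intersection (or identical centers)
    a = (r1**2 - r2**2 + d**2) / (2 * d)
    h = math.sqrt(max(r1**2 - a**2, 0.0))
    mx = x1 + a * (x2 - x1) / d        # foot of the perpendicular on the center line
    my = y1 + a * (y2 - y1) / d
    p1 = (mx + h * (y2 - y1) / d, my - h * (x2 - x1) / d)
    p2 = (mx - h * (y2 - y1) / d, my + h * (x2 - x1) / d)
    return [p1] if h == 0 else [p1, p2]
```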
If three evaluation objects are used in this way, the position and orientation of the photographing unit 102 can be obtained from the evaluation-object detection results alone.
Note that if three evaluation objects lying on the same circle are selected, the position of the photographing unit 102 cannot be specified by the above method. In that case, three evaluation objects that do not lie on the same circle are selected.
To obtain the position of the photographing unit 102 with high accuracy, it is desirable that the circles 241 and 242 be as far apart as possible. For this reason, in the example of FIG. 8(a), for instance, it is desirable to determine the position of the photographing unit 102 using the circles 241 and 242 rather than the circle 241 and the circle passing through the real objects 231 and 233.
The number of evaluation objects used to specify the position and orientation of the photographing unit 102 is not limited to three, and more than three may be used.
Let N (N being an integer of 1 or more) be the number of evaluation objects used. In this case, up to N(N-1)/2 candidate arcs on which the photographing unit 102 may exist are obtained. Since all of these loci pass through the position of the photographing unit 102, its position can be obtained by finding the intersection of the arcs. Because the intersection of two arcs lies on the chord of the circles forming them, this is equivalent to finding the intersection of the straight lines corresponding to the chords. In other words, if there are two or more chords, it suffices to solve the linear equations for the intersection of straight lines instead of the quadratic equation for the intersection of a circle and a straight line, which reduces the amount of calculation.
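One way to realize this linear solve is to use the radical axes of the candidate circles (the lines carrying their common chords) and solve the resulting overdetermined system by least squares. The sketch below is a minimal illustration under that assumption, given that the circles' centers and radii are already known.

```python
import numpy as np

def camera_position_from_circles(circles):
    """Least-squares estimate of the point shared by a set of circles.
    circles: list of ((cx, cy), r). Each pair of circles contributes one linear
    equation (their radical axis, i.e. the line through their common chord);
    the returned point is the one that best satisfies all of them."""
    rows, rhs = [], []
    for i in range(len(circles)):
        (x1, y1), r1 = circles[i]
        for j in range(i + 1, len(circles)):
            (x2, y2), r2 = circles[j]
            rows.append([2.0 * (x2 - x1), 2.0 * (y2 - y1)])
            rhs.append((x2**2 + y2**2 - r2**2) - (x1**2 + y1**2 - r1**2))
    solution, _res, _rank, _sv = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return tuple(solution)  # estimated (x, y) of the photographing unit
```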
If there are three or more straight lines, two or more intersection points are obtained. If there is a point at which they all coincide, that point is the position of the photographing unit 102. In practice they may not all coincide because of matching accuracy and the errors introduced when rounding to pixel coordinates. In that case, obvious noise points are excluded from the obtained candidate points for the position of the photographing unit 102, and the position is determined by applying statistical processing to the remaining candidates, such as taking the most frequent point or the average point. Selecting the most frequent point requires no new coordinate calculation; selecting the average position is a simple calculation and fast, and a simple average minimizes the scatter with respect to the true solution. It is desirable to choose according to the required accuracy and processing speed.
Thus, with this configuration, the position and orientation of the photographing unit 102 can be specified as long as three or more objects can be detected. In this case, the position and orientation can be specified without using the positioning information from the rough detection sensor 103, which further improves the detection accuracy of the position and orientation.
The position / orientation calculation unit 107 may also calculate the position and orientation of the photographing unit 102 using two evaluation objects together with information on, for example, the traffic infrastructure around the photographing unit 102. This method will be described with reference to FIG. 8(b), taking as an example the case where information on a road 234 is used as the traffic-infrastructure information, and assuming that the photographing unit 102 is mounted on a moving body traveling on the road 234.
The shapes of traffic infrastructure such as the road 234 are defined by standards. By using the elements that constitute the road 234 as object information, the orientation can be detected with high accuracy. That is, using existing methods, the photographing unit 102 can recognize the road 234 on which the moving body carrying it is traveling and obtain the traveling direction (the optical-axis direction of the photographing unit 102). The obtained optical-axis direction is used in place of the result of the rough detection sensor 103 of the above embodiment to determine the position and orientation of the photographing unit 102.
Although the case where the road 234 is used as the traffic-infrastructure information has been given as an example, the information is not limited to this; for example, route information such as bridges and intersections, or position information of signs, traffic lights, and the like may be used.
The road 234 itself may also be used as an evaluation object, and the position and orientation of the photographing unit 102 may be detected using the two-dimensional map information 210; this reduces the number of evaluation objects needed.
According to this method, the position and orientation can be specified without using the orientation information of the rough detection sensor 103, so the accuracy can be improved.
Next, as another modification, a position / orientation calculation method that uses the deformation of an object will be described. FIGS. 9(a) to 9(c) are diagrams for explaining this modification. Here, two or more evaluation objects are used.
For example, assume that the appearance of the real object 231 corresponding to the evaluation object 221, when viewed from the front, has the rectangular shape 251 shown in FIG. 9(c), and that the appearance obtained in the captured image 220 is the trapezoid 252 shown in FIG. 9(b). In this case, the positional relationship in real space between the real object 231 and the photographing unit 102 can be specified from the amount of deformation, as shown in FIG. 9(a).
When pattern matching the evaluation object in the captured image, the object detection unit 105 generates template images from the front-view shape 251 by applying deformation parameters such as scaling, deformation, rotation, and distortion.
The evaluation object 221 is pattern-matched with the generated template images. Using the deformation parameters of the template image used in the matching, the normal (indicated by the arrow in the figure) of the front surface 261 of the real object 231 corresponding to the evaluation object 221 is obtained.
In this way, if the shape 251 of the evaluation object 221 as seen from the front in real space is known, the direction of the normal of the planar front surface 261 can be obtained from the amount of deformation of the shape in the captured image 220, and from this the orientation of the photographing unit 102 (the direction of the optical axis 133) can be obtained.
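One concrete way to recover the front-surface normal from the observed deformation, different in detail from the template-parameter approach described above, is a planar pose estimate from the detected corner positions. The sketch below assumes the real-space width and height of the front face (for example from the map information), the four detected corners, and the camera intrinsic matrix K are all available; these inputs are assumptions, not items specified in this description.

```python
import cv2
import numpy as np

def front_face_normal(face_w_m: float, face_h_m: float,
                      image_corners: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Estimate the normal of the object's planar front face in the camera frame
    from its deformed outline in the captured image.
    face_w_m, face_h_m: real-space width/height of the front face;
    image_corners: 4x2 array of detected corners (TL, TR, BR, BL);
    K: 3x3 camera intrinsic matrix."""
    object_points = np.float32([[0, 0, 0], [face_w_m, 0, 0],
                                [face_w_m, face_h_m, 0], [0, face_h_m, 0]])
    _ok, rvec, _tvec = cv2.solvePnP(object_points, np.float32(image_corners), K, None)
    R, _jac = cv2.Rodrigues(rvec)
    # The face lies in the z = 0 plane of the object frame, so its normal is
    # [0, 0, 1] there; rotate it into the camera frame.
    return R @ np.float32([0, 0, 1])
```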
When the evaluation object 221 is a building, the orientation of the building's surface in real space may be obtained from the map information 230 or the like.
According to the method of this modification, the position and orientation can be obtained even when the number of extracted objects is two. As described above, this position / orientation detection method does not directly use the values of the rough detection sensor 103, such as GPS or an electronic compass, so the position and orientation can be obtained with high accuracy regardless of the accuracy of the rough detection sensor 103.
Each of the position / orientation calculation methods described above is preferably selected according to the required accuracy.
In the above embodiment and the modifications, the case where the position and orientation of the photographing unit 102 are calculated using the horizontal distance of each evaluation object has been described. In addition, the vertical distance may also be measured so that the height direction of the object is estimated as well.
Prior to the evaluation-object extraction processing, the map management unit 108 uses the positioning information acquired by the rough detection sensor 103 to extract the two-dimensional map information and map information necessary for the processing and stores them in the respective holding units. However, the processing is not limited to this.
For example, once the position / orientation calculation unit 107 has calculated the position and orientation of the photographing unit 102, this information may be used to extract the necessary two-dimensional map information and map information. The map management unit 108 can then extract the necessary and sufficient two-dimensional map information with higher accuracy, which improves the processing accuracy. Moreover, since position information does not have to be acquired from the rough detection sensor 103 at all times, the position and orientation can be calculated even in places where GPS signals cannot be received.
The position / orientation detection apparatus 100 of the present embodiment may further include a two-dimensional map generation unit 109.
The two-dimensional map generation unit 109 analyzes the captured image 220 acquired by the photographing unit 102, using the position and orientation of the photographing unit 102, and generates two-dimensional map information 210. In doing so it uses the pixel positions of the objects in the captured image 220 and the appearance of the objects: the pixel position of each object detected by the object detection unit 105 becomes the in-map position, and the shape used for pattern matching becomes the object shape.
Since the captured image 220 is two-dimensional information, it may be used as the two-dimensional map information 210 as it is. The two-dimensional map generation unit 109 associates the position of each object in real space with the photographed appearance of the object and generates the two-dimensional map information.
By including the two-dimensional map generation unit 109, unknown objects that are not yet registered in the two-dimensional map, for example newly constructed buildings, can be newly registered as two-dimensional map information by associating their appearance with their position information.
When registering objects that are not yet known and are contained in the captured image 220 into the two-dimensional map information 210, techniques such as artificial intelligence or deep learning may be used in addition to pattern matching. Objects that are not yet known include, for example, buildings not registered in the two-dimensional map information 210 or the map information 230, and buildings whose appearance has changed through renovation.
In this case, for example, a plurality of pieces of two-dimensional map information 210 containing the same object are acquired. Learning is then performed using information accumulated on a server over the network and the two-dimensional map information 210 accumulated in the two-dimensional map information holding unit 123, and the feature points of the object are calculated. The result is used for the pattern matching of the object. With a method that matches on feature points, the matching success rate can be increased even when an occluding object exists between the object and the photographing unit 102.
The position of an object may also be specified using a plurality of captured images 220 acquired at different times. Using the position of the target object in the captured image 220 and the position and orientation of the photographing unit 102, the region in real space in which the object may exist can be specified; this region is a straight line. When the position / orientation detection apparatus 100 moves and the same processing is performed after a predetermined time, another straight line is obtained as the region in which the object exists. The intersection of the two straight lines obtained by the two rounds of processing is the position at which the object actually exists (its position in real space, latitude, longitude, coordinates, and so on). Alternatively, the position of the object candidate in real space may be calculated from the change in the position and orientation of the photographing unit 102 and the change in the horizontal distance of the object candidate.
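A minimal sketch of intersecting the two bearing lines, assuming the camera position and the absolute bearing toward the object (optical-axis azimuth plus the object angle θ) are known at both shooting times:

```python
import numpy as np

def intersect_bearings(p1, heading1_rad, p2, heading2_rad):
    """Intersect two bearing lines in the map plane.
    p1, p2: camera positions (x, y) at the two shooting times; heading*_rad:
    absolute direction toward the object at each time. Returns the object's
    position, or None if the two bearings are (nearly) parallel."""
    d1 = np.array([np.cos(heading1_rad), np.sin(heading1_rad)])
    d2 = np.array([np.cos(heading2_rad), np.sin(heading2_rad)])
    A = np.column_stack([d1, -d2])          # p1 + t*d1 = p2 + s*d2
    if abs(np.linalg.det(A)) < 1e-9:
        return None
    t, _s = np.linalg.solve(A, np.array(p2) - np.array(p1))
    return tuple(np.array(p1) + t * d1)
```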
By additionally registering the obtained object information in the existing two-dimensional map information 210, the two-dimensional map information 210 can be updated in real time.
If an object is already registered in the two-dimensional map and its position is known, its appearance can be updated; if the object is one that moves, its position information can be updated to the latest value.
The two-dimensional map generation unit 109 obtains the position of an object in real space using the information that the position / orientation detection apparatus 100 used to detect the position and orientation. No additional calculation is therefore needed to generate the two-dimensional map information, and the two-dimensional map information can be kept up to date at low computational cost.
A three-dimensional map data model may be used in generating the two-dimensional map information. In the case of map data that represents buildings by their positions and heights, the shapes of the buildings in the real world are known. To associate appearance data with the side faces of a building, images photographed from the side may be pasted onto the three-dimensional map data to obtain two-dimensional map information; for example, Google Street View imagery can be used as the images photographed from the side.
When generating the two-dimensional map information, images photographed from the side may also be pasted onto the three-dimensional map data and a virtual camera placed within the map-data model for CG rendering. With such a method, two-dimensional map information can be generated efficiently by making use of existing map data.
<<Second Embodiment>>
Next, a second embodiment to which the present invention is applied will be described. The present embodiment is an AR display device 500 including the position / orientation detection apparatus 100 of the first embodiment. In the following, the case where the position / orientation detection apparatus 100 is mounted on an automobile and AR display is performed on the automobile's windshield will be described as an example.
FIG. 10 is a functional block diagram of the AR display device 500 of the present embodiment. As shown in the figure, the AR display device 500 of the present embodiment includes the position / orientation detection apparatus 100, a display content generation unit 501, a display content selection unit 502, a superimposing unit 503, a display unit 504, an instruction receiving unit 506, and an extraction unit 507. Here, the case where the control unit 101 and the gateway 110 are shared with the position / orientation detection apparatus 100 is described as an example.
The AR display device 500 also includes a content holding unit 511. The content holding unit 511 is a memory that temporarily holds content including AR content, and is configured as a memory that can be accessed at high speed. In the following, normal content and AR content are collectively referred to as content unless they need to be distinguished.
The position / orientation detection apparatus 100 basically has the same configuration as the position / orientation detection apparatus 100 of the first embodiment. In the present embodiment, however, the object detection unit 105 may be configured to detect and hold the pixel positions of all the evaluation objects extracted by the evaluation object extraction unit 104; this information is used when the superimposing unit 503, described later, determines the position at which to superimpose content. The control unit 101 of the position / orientation detection apparatus 100 controls the operation of each unit of the entire AR display device 500, and the gateway 110 functions as the communication interface of the AR display device 500.
The instruction receiving unit 506 receives operation inputs from the user (driver) 530. In the present embodiment it receives, for example, the selection of content and the conditions for the content to be displayed. Inputs are received via existing operation devices such as operation buttons, a touch panel, a keyboard, or a mouse; a device that allows voice input, instructions by blinking, and the like may also be provided.
The extraction unit 507 acquires, from the content stored on servers or other storage devices, content that may be used in the processing of the AR display device 500. The acquired content is stored in the content holding unit 511. Since the information processed by the AR display device 500 is mainly information about the surroundings, the processing cost is reduced by acquiring and processing only information that may actually be processed.
The content to be acquired may also be determined in consideration of information such as the traveling direction and speed of the automobile in which the AR display device 500 is mounted. For example, if v is the speed of the moving body and T is the time needed to download the information and store it in the holding unit, information within at least a circle of radius Tv centered on the photographing unit 102 is acquired and stored in the content holding unit 511. When the travel route of the automobile is specified, information around the route may be extracted and stored in the content holding unit 511.
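A minimal sketch of this prefetch rule, assuming each content item carries a registered map position; the dictionary field name is a hypothetical structure, not one defined in this description.

```python
import math

def prefetch_contents(contents, camera_xy, speed_mps, download_time_s):
    """Select contents to cache in the content holding unit 511: everything whose
    registered position lies within radius T*v of the photographing unit, where v
    is the vehicle speed and T the time needed to download and store the data.
    `contents` is a list of dicts with an 'xy' field (hypothetical structure)."""
    radius = speed_mps * download_time_s
    cx, cy = camera_xy
    return [c for c in contents
            if math.hypot(c["xy"][0] - cx, c["xy"][1] - cy) <= radius]
```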
Next, an example of the configuration of the server (content server) from which the content held in the content holding unit 511 is acquired will be described.
FIG. 3(b) is a diagram for explaining the content server system 610. In the present embodiment, for example, advertisements, content, and AR content are held and provided as information.
The content server system 610 includes a content server 611, a content storage unit 612, an AR content storage unit 613, an advertisement storage unit 614, and a communication I/F 615.
An advertisement is advertisement text, an image, a video, or the like provided by an advertiser. Content is information corresponding to the service, for example a video for entertainment or a game. AR content is content intended to be superimposed in AR and includes, in addition to the content itself, meta information such as a display position and orientation in real space. The meta information may be defined in advance or may be updated dynamically in response to an instruction from the AR display device 500.
In response to an instruction received via the communication I/F 615, the content server 611 outputs the information held in each storage unit to the outside via the communication I/F 615. Information received via the communication I/F 615 is stored in the corresponding storage unit. When outputting, the content server 611 may integrate the above pieces of information and provide them to the AR display device 500 over the network, for example by inserting an advertisement into entertainment content.
The AR content may be selected and extracted by a method similar to the method by which the position and orientation detection device 100 extracts the two-dimensional map information and the map information. Content that the user wishes to view, on the other hand, may be designated via the instruction receiving unit 506 and extracted in accordance with that instruction.
The display content selection unit 502 selects the content to be displayed from the content held in the content holding unit 511 according to the content display conditions. The display conditions may be determined in advance or may be designated by the user via the instruction receiving unit 506.
In this case, the instruction receiving unit 506 may be a motion recognition device that recognizes the user's actions.
As the motion recognition device, for example, gesture recognition that detects the user's movement with a camera, voice recognition using a microphone, or gaze recognition that detects the line of sight can be used. In particular, while the user is driving, hand movements are restricted by the driving task, so a motion recognition device that can accept instructions without hand movement, such as gaze or voice, is desirable. A voice recognition device can accept relatively detailed instructions and therefore supports a wide variety of operations. A gaze recognition device is hard for others nearby to perceive even while it is being operated, so it is considerate of the surrounding environment.
In the case of voice recognition, the operator's voice may be registered in advance so that the user who gave the voice input can be identified, and the contents accepted by voice recognition may be restricted depending on the user.
This allows the user to actively select the content to be displayed, so the content the user desires can be appropriately selected and displayed.
The display unit 504 displays the content to be displayed on the windshield in accordance with instructions from the superimposing unit 503.
As shown in FIG. 11(a), the display unit 504 of the present embodiment includes a projector 521 and a display (projection area) 522. The display 522 is realized by combining optical components having both transparency and reflectivity, and is arranged on the windshield. The real-space scene behind the display 522 passes through the display 522, while the video (image) generated by the projector 521 is reflected by the display 522. The user 530 therefore sees an image in which the real-space scene transmitted through the display 522 and the video reflected by the display 522 are superimposed.
If the display 522 covers the entire windshield, it covers the field of view of the user 530 looking forward, and content can be superimposed over a wide range of the actual scene spreading out ahead.
If the display 522 is a component separate from the windshield, a dedicated design optimized for the AR display device 500 becomes possible, which simplifies the design. One example of such a configuration is a HUD (Head-Up Display).
The display content generation unit 501 generates, from the selected content, the display content to be shown on the windshield, which is the display destination. In the present embodiment, the content is rendered in a display mode suitable for superimposition on the scene seen by the user's eyes through the windshield; for example, its size, color, and brightness are determined. The display mode is determined according to the user's instructions or according to predetermined rules.
The superimposing unit 503 determines the display position on the display 522 of the display content generated by the display content generation unit 501. The superimposing unit 503 first identifies the display position (placement position) of the object on the display 522, and then determines the display position of the content related to that object based on the object's display position. The display position of the object is calculated using the position and orientation of the imaging unit 102 detected by the position and orientation detection device 100 and the pixel position of each object in the captured image 220 captured by the imaging unit 102.
The superimposing unit 503 of the present embodiment holds in advance the geometric relationship between a location used as the position reference of the automobile (in-vehicle reference position) and the position and orientation of the imaging unit 102, and the geometric relationship between the in-vehicle reference position and the average visual field range 531 of the user (driver) 530.
For example, when related content is to be superimposed on an object existing in real space (a real object), the position on the display 522 corresponding to the pixel position of the corresponding evaluation object in the captured image is calculated using the relationships above.
The display position of the related content may be set at the intersection of the display 522 and the line-of-sight direction in which the user 530 views the evaluation object 223. First, the imaging unit 102 captures the evaluation object 223. From the position of the evaluation object 223 in the captured image, the direction in which the evaluation object 223 lies with respect to the imaging unit 102 is known.
As shown in FIG. 11(a), when the distance between the user 530 and the evaluation object 223 is relatively short, the position of the evaluation object 223 relative to the imaging unit 102 is obtained. By subtracting the known relative position of the eyes of the user 530 with respect to the imaging unit 102, the line of sight 551 from the user's eye position toward the evaluation object 223 is obtained. The point 552 at which the line of sight 551 intersects the display 522 may be used as the display position of the related content.
As shown in FIG. 11(b), when the distance between the user 530 and the evaluation object 223 is sufficiently large compared with the distance between the user 530 and the imaging unit 102, the straight line 553 connecting the imaging unit 102 and the evaluation object 223 and the line of sight 551 from the user's eye position toward the evaluation object 223 are approximately parallel. In this case, the point 552 at which a straight line passing through the user 530 and parallel to the straight line 553 intersects the display 522 may be used as the display position of the related content.
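The intersection described for FIG. 11(a) can be written as a standard ray-plane test. The sketch below is not taken from the embodiment; it assumes that the eye position, the object position, and the display (windshield) plane are all expressed in a common vehicle-fixed 3D frame, and all function and variable names are illustrative.

```python
import numpy as np

def display_point(eye_pos, object_pos, plane_point, plane_normal):
    """Intersection of the eye-to-object sight line with the display plane.

    eye_pos     : 3D eye position of the driver in the vehicle frame
    object_pos  : 3D position of the evaluation object (camera-relative position
                  minus the known eye-to-camera offset, as described above)
    plane_point : any point on the display (windshield) plane
    plane_normal: unit normal of the display plane
    """
    eye_pos, object_pos = np.asarray(eye_pos, float), np.asarray(object_pos, float)
    plane_point, plane_normal = np.asarray(plane_point, float), np.asarray(plane_normal, float)
    direction = object_pos - eye_pos                     # line of sight 551
    denom = direction.dot(plane_normal)
    if abs(denom) < 1e-9:
        return None                                      # sight line parallel to the display
    t = (plane_point - eye_pos).dot(plane_normal) / denom
    if t <= 0:
        return None                                      # object is not in front of the display
    return eye_pos + t * direction                       # intersection point 552
```

For the far case of FIG. 11(b), the same function can be reused by casting the camera-to-object direction from the eye position, since the two lines are approximately parallel.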
If the display position of the related content is specified in this way, the related content appears to the user 530 to be displayed superimposed on the evaluation object 223. Naturally, the related content may instead be displayed offset from the evaluation object 223 by a specified amount.
Details of the determination of the display mode by the display content generation unit 501 and of the display position by the superimposing unit 503 will be described later.
Like the position and orientation detection device 100, the AR display device 500 is realized by an information processing device including a CPU 141, a memory 142, a storage device 143, an input/output interface (I/F) 144, and a communication I/F 145. For example, it is realized by the CPU 141 loading a program held in advance in the storage device 143 into the memory 142 and executing it. All or some of the functions may instead be realized by hardware or circuits such as an ASIC (Application Specific Integrated Circuit) or FPGA (Field-Programmable Gate Array).
Various data used in the processing of each function and various data generated during processing are stored in the memory 142 or the storage device 143. The content holding unit 511 is constructed, for example, in the memory 142.
Next, the flow of the AR display processing performed by the AR display device 500 of the present embodiment will be described. FIG. 12 is the processing flow of the AR display processing of the present embodiment. The AR display processing may be performed in synchronization with the position and orientation detection processing of the position and orientation detection device 100, or may be configured to be performed independently. In the following, the synchronized case is described as an example, that is, the case where the AR display processing is executed each time the position and orientation detection device 100 detects the position and orientation of the imaging unit 102. The instruction receiving unit 506 receives the conditions for the content to be displayed (display conditions) in advance.
First, the position and orientation detection device 100 detects the position and orientation of the imaging unit 102 (step S2101). The position and orientation detection device 100 determines the position and orientation of the imaging unit 102 by the same method as in the first embodiment. This processing is executed at predetermined time intervals.
The display content selection unit 502 selects the content to be displayed from the content held in the content holding unit 511 (step S2102). Here, whether or not to display each piece of content is determined according to the display conditions, and only content that matches the display conditions is selected for display. The selection may be performed, for example, for each object on which content is to be superimposed.
Then, the display content generation unit 501, the superimposing unit 503, and the display unit 504 repeat the following processing for all selected display content (step S2103).
The display content generation unit 501 determines the display mode of the selected content (step S2104).
The superimposing unit 503 determines the display position of the selected display content (step S2105). At this time, the superimposing unit 503 uses the position and orientation of the imaging unit 102 detected by the position and orientation detection device 100 in step S2101 and the position of each object in the captured image 220.
The display unit 504 displays the content at the position calculated by the superimposing unit 503, in the display mode determined by the display content generation unit 501 (step S2106). The above processing is repeated for all content.
When the AR display processing from step S2102 onward is performed independently of the position and orientation detection processing, the superimposing unit 503 determines the display position using the latest position and orientation of the imaging unit 102 and the latest position information of each object in the captured image 220 available at that time.
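Steps S2101 to S2106 can be summarized in code form. The following sketch assumes hypothetical detector, overlay, and display interfaces and an illustrative display-condition check; it is an outline of the loop described above, not the embodiment's implementation.

```python
from dataclasses import dataclass

@dataclass
class Content:
    object_id: str      # evaluation object this content is attached to
    payload: str        # text, image reference, etc.
    meta: dict          # display conditions, classification, ...

def matches(content: Content, display_conditions: dict) -> bool:
    """Hypothetical display-condition check (step S2102)."""
    return all(content.meta.get(k) == v for k, v in display_conditions.items())

def ar_display_cycle(detector, content_pool, conditions, style_rules, overlay, display):
    # S2101: detect the camera position/orientation and the object pixel positions
    camera_pose, object_pixels = detector.detect()
    # S2102: keep only content matching the display conditions
    selected = [c for c in content_pool if matches(c, conditions)]
    # S2103-S2106: per-content loop
    for content in selected:
        appearance = style_rules.decide(content)                        # S2104: size, colour, brightness
        position = overlay.display_position(camera_pose,                # S2105: pixel position -> display 522
                                             object_pixels.get(content.object_id))
        if position is not None:
            display.draw(content, appearance, position)                 # S2106
```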
As described above, the AR display device 500 of the present embodiment is an AR display device that displays content on a display 522 having transparency and reflectivity, in association with objects in the scene behind the display 522. It includes the position and orientation detection device 100 of the first embodiment; a display content generation unit 501 that generates the content to be displayed on the display 522; a superimposing unit 503 that determines the display position of the generated content on the display 522 using the position and orientation of the imaging unit 102 determined by the position and orientation detection device 100 and the pixel position of the object identified by the object detection unit 105; and a display unit 504 that displays the generated content at the display position on the display 522 determined by the superimposing unit 503.
Thus, according to the present embodiment, the position of the object is identified in the captured image acquired by the imaging unit 102, and the display position of the related content is determined from it. Therefore, by correcting only the offset between the imaging range 541 of the imaging unit 102 and the user's viewpoint, the display position of the object on the display can be identified accurately.
That is, according to the present embodiment, content can be displayed without using the relative position between the object and the AR display device 500. The present embodiment therefore realizes highly accurate AR superimposed display that does not include an error between the object's actual position and the AR display device 500, while keeping the calculation cost low.
Furthermore, the position and orientation of the imaging unit 102 are obtained by the position and orientation detection device 100 of the first embodiment; the values of the coarse detection sensor 103, such as a GPS or an electronic compass, are not used directly. As a result, they can be obtained with high accuracy regardless of the accuracy of the coarse detection sensor 103, and an AR display device 500 capable of highly accurate AR superimposition can be provided.
In the above embodiment, the case where the display 522 is transparent has been described as an example. However, the display 522 may be opaque. In this case, a video in which the content is composited onto the captured image 220 captured by the imaging unit 102 may be displayed. The brightness of the captured image 220 may, for example, be reduced at this time, so that the real scene is overwritten with a darkened video. This makes it possible, for example, to present images darker than the actual scene.
In the above embodiment, the case where the AR display device 500 is mounted on an automobile has been described as an example, but the usage form of the AR display device 500 is not limited to this. For example, it may be worn and used by the user, in which case a compact AR display device 500 can be provided. One example of such a configuration is an HMD (Head-Mounted Display). In this case, the position and orientation detection device 100 is also mounted on the HMD.
Furthermore, a position and orientation detection device 100 for detecting the position of the HMD may be provided separately from the HMD. This position and orientation detection device 100 captures an image of the HMD and, treating the HMD as an object, calculates the position (latitude/longitude, coordinates) of the HMD in real space using the same technique as that used to generate the two-dimensional map information. The display position of the content is then determined using this position information, so that the content can be displayed at the desired position with even higher accuracy.
As described above, the AR display device 500 of this modification can detect its absolute position and orientation even when mounted on a moving body, so content can be displayed at the desired position with high accuracy even when the scene viewed by the user is constantly changing.
The AR display device 500 can calculate the position and orientation of the imaging unit 102 with high accuracy and perform AR superimposition with high accuracy. When highly accurate AR superimposition becomes possible, the precision with which AR content can be expressed increases and its expressive power is enriched.
The AR display device 500 does not need to include the imaging unit 102 itself. For example, image data captured by another imaging device (in the case of in-vehicle equipment, for example, a navigation device or a drive recorder) may be used as the captured image.
The AR display device 500 of the present embodiment may further include an eye tracking device 508, as shown in FIG. 13.
The eye tracking device 508 is a device that tracks the user's line of sight. In this modification, the eye tracking device 508 is used to estimate the user's field of view, and the content is displayed in consideration of the direction of the user's gaze. For example, the content is displayed in the region of the display 522 identified by the user's gaze direction and field of view.
The superimposing unit 503 of the present embodiment calculates the position at which to display the content from the relative position between the imaging unit 102 and the display 522 and from the user's gaze direction and field of view calculated by the eye tracking device 508. That is, the gaze direction calculated by the eye tracking device 508 is used to correct the content display position. This makes it possible to superimpose the content so that it appears to exist at the intended position in real space, and to superimpose the content on the real-space scene with high accuracy.
From the user's gaze direction detected by the eye tracking device 508 and the position and orientation of the imaging unit 102, the object in the direction the user is facing in real space can be identified. This object is the object the user is gazing at. Using this, for example, content related to the object the user is gazing at can be selected and displayed.
In this case, the output of the eye tracking device 508 and the detection result of the object detection unit 105 are input to the display content selection unit 502. Using the user's gaze direction and the pixel position of each object, the display content selection unit 502 identifies the object the user is gazing at and selects content related to that object.
With this configuration, content related to an object the user is interested in can be displayed in association with that object without any instruction from the instruction receiving unit 506. The eye tracking device 508 thus extends the ways in which the user can provide input.
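One way to realize this gaze-based selection is to compare the gaze point, once mapped into the captured-image frame, with the pixel positions reported by the object detection unit 105. A minimal sketch follows; the distance threshold and the mapping of the gaze point into image coordinates are assumptions.

```python
import math

def gazed_object(gaze_px, object_pixels, max_dist_px=60.0):
    """Return the id of the detected object closest to the gaze point in image coordinates.

    gaze_px       : (u, v) gaze point mapped into the captured-image frame
    object_pixels : dict mapping object id -> (u, v) pixel position from the object detection unit
    max_dist_px   : reject matches farther than this threshold (illustrative tuning value)
    """
    best_id, best_dist = None, max_dist_px
    for obj_id, (u, v) in object_pixels.items():
        dist = math.hypot(u - gaze_px[0], v - gaze_px[1])
        if dist < best_dist:
            best_id, best_dist = obj_id, dist
    return best_id
```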
Display examples based on the display mode and display position determined by the display content generation unit 501 and the superimposing unit 503 are described below.
FIG. 14(a) shows an example of display positions. Here, a building 711, a sign 712, and a guide board 713 are illustrated as examples of objects existing in real space (real objects). The actual positions and appearances of these real objects are known.
In the above embodiment, the superimposing unit 503 determines the display positions so that the related content 811, 812, and 813 is displayed based on the pixel positions, on the display 522, of the evaluation objects corresponding to the respective real objects. FIG. 14(a) shows an example in which the content is superimposed on the real objects.
The display content generation unit 501 and the superimposing unit 503 may classify the display content according to its meta information and determine the display mode and display position according to the classification result, for example according to update frequency, the timing at which the content becomes necessary, or importance (a sketch of one such mapping is given after the list of regions below). A display example for this case is shown in FIG. 14(b).
For example, when the meta information indicates that the content is static content that does not depend on the user's position, the display mode and display position are determined so that it is shown in a distant display region 731, such as the sky, or in a display region in which the bonnet is visible (near display region) 732.
When the content is judged to be information about objects the vehicle will encounter or about the direction of travel, the display mode and display position are determined so that it is shown, for example, in the road display region 733, a display region along the road.
When the content is judged to be information that needs to be followed, the display mode and display position are determined so that it is shown, for example, in the side display region 734 located to the side of the user's direction of travel.
When the content is judged to be important, the display mode and display position are determined so that it is shown in the aerial display region 735, where no traffic signals or signs are visible.
The display mode and display position may also be determined so that content is displayed in the display region directly in front of the windshield (front display region) 736.
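A possible encoding of this meta-information-driven region assignment is sketched below. The classification keys are assumptions used only for illustration, while the region numerals follow FIG. 14(b).

```python
# Display regions of FIG. 14(b), identified here by their reference numerals.
DISTANT_731, NEAR_732, ROAD_733, SIDE_734, AERIAL_735, FRONT_736 = 731, 732, 733, 734, 735, 736

def choose_region(meta: dict) -> int:
    """Map content meta information to one of the display regions."""
    if meta.get("static", False):             # position-independent content
        return DISTANT_731                    # (or NEAR_732, depending on layout)
    if meta.get("route_related", False):      # upcoming objects / direction of travel
        return ROAD_733
    if meta.get("needs_following", False):    # information the user must keep track of
        return SIDE_734
    if meta.get("important", False):          # high-priority content
        return AERIAL_735
    return FRONT_736
```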
Because the automobile is moving, the scene changes constantly. However, the distant region 731 and the near region 732 are hardly affected by the movement or by the scene and change little. That is, the distant region 731 and the near region 732 are regions in which the apparent change of the real space accompanying the movement of the automobile is small (small-change regions). Displaying static content that does not depend on the user's position in a small-change region eliminates the need to correct the content image in response to changes in the scene, reducing the processing load.
In particular, for content that is updated infrequently in response to the user's movement, the processing performed by the superimposing unit 503 to calculate the display position in accordance with the real space can be reduced. Displaying static content in the small-change regions therefore keeps the calculation cost down.
Even when multiple pieces of AR content are displayed, preferentially displaying AR content with a low update frequency in the small-change regions efficiently reduces the AR superimposition processing without giving the user a sense of incongruity, thereby reducing the processing load of the AR display device 500.
Displaying information that will be needed later in the road display region 733 makes it easier for the user to grasp it intuitively. The side display region 734 remains in place even when an object has flowed past and left the forward view as the vehicle moves; displaying information that needs to be followed in the side display region 734 therefore allows the user to keep following it even as the automobile moves. Displaying important information in the aerial display region 735 provides high readability without impairing the visibility of the real-space scene. Displaying content in the front display region 736 allows the user to view it without moving the line of sight far from the front.
As described above, automatically determining the display mode and display position from the meta information enables AR superimposed display with high visibility and readability for the user, and this can be achieved without any instruction from the user.
The classification of display content is not limited to classification by meta information. For example, it may be performed according to a predetermined length of time for which the display of the content is to be maintained.
The driving mode of the moving body may also be determined, and the display mode and/or display position determined according to the result. For example, content is displayed in the front display region 736 only in the automated driving mode. In this case, the display content generation unit 501 and/or the superimposing unit 503 are configured to receive a signal indicating whether the automated driving mode is active, for example from the ECU of the moving body.
The automated driving mode of the moving body is a mode in which the moving body drives itself without the user having to drive actively. In this mode the user does not need to pay attention to the actual surrounding scene. During actual driving, however, the surrounding traffic environment, the time, the location, and so on may require the user to drive actively and to pay attention to the actual surroundings.
For example, in the automated driving mode, when content is viewed on a hand-held information terminal, the line of sight turns toward the hand and away from the road ahead. With this configuration, the content is displayed directly in front of the user, so the content can be viewed without taking the line of sight off the road. Even while viewing content during automated driving, the user's line of sight remains forward. Therefore, even if the automated driving mode is released and the user's attention must be drawn to the actual scene, attention can be directed forward smoothly.
As another form, the display mode and display position of the content shown on the display 522 may be changed according to the level of attention required of the user.
In this case, the display content generation unit 501 and/or the superimposing unit 503 receive signals such as the user's level of alertness, degree of fatigue, and driving state from sensors attached to the moving body or to the user. They may further receive information such as the surrounding traffic conditions known to the automobile or to the navigation system mounted on it. The display content generation unit 501 and/or the superimposing unit 503 combine these to determine the attention level.
For example, when no attention is required, as in the automated driving mode, the displayed content is not restricted. As the attention level rises and attention becomes necessary, the display mode and/or display position are determined so that the displayed content is simplified. With this configuration, as attention becomes more necessary, the amount of information in the content decreases but its readability increases. The need to gaze at the content therefore decreases, and the user can obtain information from the content while still paying attention elsewhere.
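As one possible reading of this behavior, the sketch below reduces the fields of a content record as the attention level rises; the level scale and the field names are assumptions, not part of the embodiment.

```python
def simplify_for_attention(content, attention_level):
    """Reduce the displayed information as the required attention level rises.

    attention_level: 0 = no attention required (e.g. automated driving),
                     1 = moderate, 2 = high, 3 = display suppressed.
    The field names ("title", "body") are illustrative only.
    """
    if attention_level >= 3:
        return None                                   # stop displaying entirely
    if attention_level == 2:
        return {"title": content["title"]}            # headline only, large and readable
    if attention_level == 1:
        return {"title": content["title"], "body": content["body"]}
    return content                                    # unrestricted display
```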
For example, as shown in FIG. 15(a), when the attention level reaches or exceeds a predetermined level, the content is displayed in a retreat region 737. The retreat region 737 is a region that does not interfere with driving, for example a part of the windshield outside the area directly ahead. The retreat region 737 may be set smaller than the original display region.
Display in the retreat region 737 is performed, for example, when the mode switches from the automated driving mode to the normal driving mode.
To prevent the user from being confused by a sudden change in the image being viewed, the retreat or reduction may be performed as a continuous transition. Furthermore, the display content generation unit 501 and/or the superimposing unit 503 may be configured to detect in advance, via the navigation system or the like, a situation in which the mode will switch to the normal driving mode, and to announce the upcoming retreat.
The announcement is made, for example, by displaying a warning message in the warning region 743. Instead of a warning message, a change in color tone, a countdown display, or an audio warning may be used.
The display of content may be stopped entirely when the attention level reaches or exceeds a predetermined level, or when a predetermined condition is satisfied, for example when the automated driving mode is released.
In that case, the display content generation unit 501 and/or the superimposing unit 503 increase the transparency of the displayed content, so that to the user the displayed content appears to fade out and only the actual scene ahead remains visible. Before increasing the transparency, an advance notice may be displayed under the control of the display content generation unit 501 and/or the superimposing unit 503, as described above.
Conversely, when the attention level falls below the predetermined level, or when a predetermined condition is satisfied, for example when the automated driving mode is engaged, the display restriction may be released as shown in FIG. 15(b).
The display content generation unit 501 and/or the superimposing unit 503 release the display restriction, for example while in the automated driving mode, and determine the display mode and display position so that the front display region 736 is enlarged and the content is displayed there. At this time, the meta information may be used to display entertainment content that requires sustained attention, such as documents or videos, on a large screen.
The control that changes the display regions may be performed by detecting the driving location, the traffic conditions, and the user's state. The user's level of driving skill may also be added to the criteria.
In this way, in this modification of the present embodiment, the display region and display method of the content can be changed according to environmental conditions, driving conditions, and the like. Since the AR display device 500 of this modification can switch the content display, it can achieve both forward attention and readability.
A display example using the user's gaze direction when the eye tracking device 508 is provided is also described. FIG. 16(a) is a diagram for explaining a method of associating a real object with content.
Content related to a real object 714 located at an angle to the imaging unit 102 is displayed in an arbitrary display region. FIG. 16(a) illustrates, for example, the case where it is displayed in a display region 741 directly facing the user. Here, to associate the real object 714 with the display region 741, for example, a ribbon-like leader effect 751 is displayed, as if the content were being drawn out of the real object 714.
The superimposing unit 503 first obtains the coordinates of the display position of the real object 714 on the display 522 from the direction of the user's gaze and the relative positions of the AR display device 500 and the real object 714. It then generates a ribbon-like image that connects the display position of the real object 714 with the display coordinates of the display region 741.
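The leader effect 751 only needs to connect two display-coordinate points. A minimal sketch that generates such a connector as a slightly curved polyline is shown below; the quadratic Bezier curve, the sag amount, and the sample count are chosen purely for illustration, since the text above only requires that the two positions be visually connected.

```python
def leader_polyline(object_xy, panel_xy, sag_px=40.0, samples=16):
    """Points of a simple curved leader line (effect 751) from the object's
    on-display position to the information panel, both given as 2D display coordinates."""
    (x0, y0), (x1, y1) = object_xy, panel_xy
    # Control point offset downward so the ribbon sags slightly instead of being a straight segment.
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0 + sag_px
    points = []
    for i in range(samples + 1):
        t = i / samples
        x = (1 - t) ** 2 * x0 + 2 * (1 - t) * t * cx + t ** 2 * x1
        y = (1 - t) ** 2 * y0 + 2 * (1 - t) * t * cy + t ** 2 * y1
        points.append((x, y))
    return points
```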
The leader effect 751 may instead be a string-like line; the thinner the string, the less it obstructs the real-space scene. The leader effect 751 may also be translucent, in which case the association can be made clear without interfering with the actual scene. The string may connect the real object 714 and the display region 741 by a detour so as not to obstruct the field of view.
Such a display method may be applied depending on the content to be displayed. For example, applying it only after judging that the content to be displayed is information the user is interested in suppresses any loss of readability for other AR content.
For this judgment, for example, the user's gaze direction detected by the eye tracking device 508 is used: the degree of the user's interest is estimated from the degree of agreement between the gaze direction and the display position of the real object. The degree of interest may also be determined from an active selection operation by the user, in which case the user's intention can be reflected. This method may also be used to accumulate the user's selection results for use in other processing.
As another method of associating a real object with content, a reference marker 761 may be used. The display content generation unit 501 and/or the superimposing unit 503 superimpose the reference marker 761 on the real object 714, and the content related to the real object 714 is displayed in an arbitrary information display region 742. The reference marker 761 is also displayed in the information display region 742 at the same time.
With the above configuration, content can be displayed with good readability.
The configuration in which AR content related to an object is displayed using the leader effect 751 described above is particularly useful when the object itself carries information, for example when the real object 715 is a signboard, as shown in FIG. 16(b).
A signboard contains information in itself, and the content carries further information in addition to it. The display content generation unit 501 and/or the superimposing unit 503 determine the display mode and display position so that the content related to the real object 715 is displayed in an information display region 742 set at an arbitrary position. At this time, the display content generation unit 501 and/or the superimposing unit 503 display the leader effect 751 between the real object 715 and the information display region 742.
With this configuration, the user can obtain more information than by merely looking at the signboard.
For example, an image obtained by photographing an object may be used as the AR content to be displayed. As shown in FIG. 17(a), an image of the guide board 713 is displayed in a display region 744 as content related to the guide board 713. The display region 744 may be at an arbitrary position, but its size is made larger than the guide board 713, which makes it easier for the user to grasp the information on the guide board 713. To associate the guide board 713 with the display region 744, a reference marker 761 may be displayed on both.
In this case, the object detection unit 105 detects the guide board 713 in the image captured by the imaging unit 102, the display content selection unit 502 selects the image of the guide board 713 as the display content, and the display content generation unit 501 and/or the superimposing unit 503 determine the display mode and display position so as to realize the above display.
The AR content to be displayed may also be layered. As shown in FIG. 17(b), a group of related content items 821 to 825 is displayed in a single display region 745. Here, the case where five content items are displayed is illustrated, but the number of content items displayed in one display region 745 is arbitrary.
The set of content items to be displayed is selected and generated by the display content selection unit 502, which generates the set using, for example, the meta information.
For example, suppose the content items 821 to 825 are advertisements for a certain product. Each item consists of a company name, an icon representing the company, a product description, a price, a reference URL, and so on. Rather than displaying all of these at once, the display content selection unit 502, the display content generation unit 501, and the superimposing unit 503 determine the selection timing, display mode, and display position so that they are displayed in accordance with the user's selections and at appropriate times.
The content display is often updated in real time, so a large amount of information displayed at once can hinder the user's understanding. To avoid this, the content presented to the user at any one time is simplified and shown in sequence. The layered set of content items is used in such cases, allowing the information contained in the content to be presented with good readability.
Furthermore, the content items in a content set may be arranged as a story that advances in chronological order, or given entertainment value, for example by branching the story according to the user's route. These behaviors are realized, for example, by the selections of the display content selection unit 502 based on the meta information.
Content with a mascot may also be displayed. Mascot-attached content consists of content 831 and a mascot 841. For example, when the user's attention is to be drawn to advertisement content 831, a popular mascot 841 is added to it and displayed.
For example, the eye tracking device 508 determines whether the user's gaze has remained in the display region 746 of the content 831 for a predetermined period. If it has, the display content generation unit 501 and the superimposing unit 503 display the mascot 841 in the display region 746. This enables an operation in which the mascot 841 is displayed only when the user has read the content 831, which effectively attracts the user's interest to the content 831.
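The dwell-based trigger described here can be sketched as a small state machine over gaze samples; the two-second threshold and the rectangular region representation are assumptions for illustration.

```python
import time

class DwellTrigger:
    """Fire once when the gaze stays inside a display region for `dwell_s` seconds.
    The region is an axis-aligned rectangle (x, y, w, h) in display coordinates."""

    def __init__(self, region, dwell_s=2.0):
        self.region = region
        self.dwell_s = dwell_s
        self.entered_at = None
        self.fired = False

    def update(self, gaze_xy, now=None):
        """Feed one gaze sample; returns True the moment the dwell threshold is reached."""
        now = time.monotonic() if now is None else now
        x, y, w, h = self.region
        inside = x <= gaze_xy[0] <= x + w and y <= gaze_xy[1] <= y + h
        if not inside:
            self.entered_at = None
            return False
        if self.entered_at is None:
            self.entered_at = now
        if not self.fired and now - self.entered_at >= self.dwell_s:
            self.fired = True          # e.g. show the mascot 841 in region 746
            return True
        return False
```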
For example, if the mascot 841 is a game character and can be used in the game after being displayed, the viewing rate of the content 831 increases. In this way, content display that is readable for the user and carries high added value can be realized.
In addition to being shown on the display 522, the display content may be transmitted to and stored outside the device. The destination may be, for example, an information processing terminal 910 as shown in FIG. 18(a), or another storage device connected to the network; transmission is performed via the gateway 110. Sending information to another storage device for accumulation avoids consuming the capacity of the information processing terminal 910. This configuration makes the displayed information reusable: the displayed content can be used later, which increases convenience for the user.
It may also be possible to choose whether or not to accumulate the information. In this case, an instruction is received, for example, via the instruction receiving unit 506, and the display content selection unit 502, for example, transmits the designated content from among the selected content.
The selection instruction may be received via the motion recognition device described above. For example, when it is received by voice, a number, a simple character, or the like may be added to each displayed content item as an identifier so that the item is easy to designate. The identifier is assigned, for example, by the display content selection unit 502 or the display content generation unit 501, which adds it using the meta information, by assigning predetermined characters or numbers in order, or by similar means.
The content may also be something that decorates an object, and the decoration may have a strong entertainment character.
A display example for this case is shown in FIG. 10. In this example, content 816 and 817 is displayed at positions corresponding to a real building 716 and a car 717, respectively, decorating them.
For example, if the content 816 and 817 is provided in a design with a sense of unity matched to the destination, the mood can be built up already during the drive. To create a consistent world view, the navigation guidance voice can be switched or voice output suppressed at this time. Replacing the actual scene with the content 816 and 817 in this way enhances the entertainment value. The content 816 and 817 to be displayed is selected by the display content selection unit 502.
The present embodiment has been described taking as an example the case where the AR display device 500 is mounted on a moving body such as an automobile, but it is not limited to this. For example, it may be mounted on an HMD and used by a pedestrian.
Furthermore, an ordinary screen or the like may be used as the display 522, superimposed on other video, and used indoors.
The present invention is not limited to the embodiments described above and includes various modifications. For example, the above embodiments are described in detail in order to explain the present invention clearly, and the invention is not necessarily limited to configurations having all of the described elements. Part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment. Further, other configurations can be added to, deleted from, or substituted for part of the configuration of each embodiment.
DESCRIPTION OF SYMBOLS
100: position and orientation detection device, 101: control unit, 102: imaging unit, 103: coarse detection sensor, 104: evaluation object extraction unit, 105: object detection unit, 106: direction calculation unit, 107: position and orientation calculation unit, 108: map management unit, 109: two-dimensional map generation unit, 110: gateway, 121: captured image holding unit, 122: positioning information holding unit, 123: two-dimensional map information holding unit, 124: map information holding unit, 131: lens, 132: image sensor, 133: optical axis, 141: CPU, 142: memory, 143: storage device, 145: communication I/F,
210: two-dimensional map information, 211: evaluation object candidate, 212: evaluation object candidate, 213: evaluation object candidate, 220: captured image, 221: evaluation object, 222: evaluation object, 223: evaluation object, 225: area, 230: map information, 231: real object, 232: real object, 233: real object, 234: road, 241: circle, 242: circle, 251: shape, 252: trapezoid, 261: front shape,
310: template image, 311: template image, 312: template image, 313: template image, 314: template image, 315: template image, 316: template image, 321: template image, 322: template image, 323: template image, 324: template image, 325: template image, 331: representative point,
500: AR display device, 501: display content generation unit, 502: display content selection unit, 503: superimposition unit, 504: display unit, 506: instruction reception unit, 507: extraction unit, 508: eye tracking device, 511: content holding unit, 521: projector, 522: display, 530: user, 531: field-of-view range, 541: imaging range, 551: line of sight, 552: intersection, 553: straight line,
610: content server system, 611: content server, 612: content storage unit, 613: AR content storage unit, 614: advertisement storage unit, 615: communication I/F, 620: system, 620: map server system, 621: map server, 622: map information storage unit, 623: two-dimensional map information storage unit, 624: object, 625: communication I/F,
711: building, 712: sign, 713: guide board, 714: real object, 715: real object, 716: building, 717: car, 731: distant area, 732: near area, 733: road display area, 734: side display area, 735: aerial display area, 736: front display area, 737: retraction area, 741: display area, 742: information display area, 743: warning area, 744: display area, 745: same display area, 745: display area, 746: display area, 751: pull-out effect, 761: reference marker,
811: related content, 812: related content, 813: related content, 816: content, 817: content, 821: content, 822: content, 823: content, 824: content, 825: content, 831: content, 841: mascot, 910: information processing terminal
Claims (16)
- A position and orientation detection device comprising: an imaging unit that captures a predetermined imaging range including two or more objects; an object detection unit that identifies a pixel position of each of the objects in a captured image captured by the imaging unit; a direction calculation unit that calculates an object direction, which is the direction of each of the objects with respect to the imaging unit, using the pixel position, map information of each of the objects, and a focal length of the imaging unit; and a position and orientation calculation unit that calculates a position and orientation of the imaging unit using the object direction of each object and the map information, wherein the object detection unit extracts two-dimensional map information corresponding to the imaging range from two-dimensional map information in which positions and shapes of a plurality of objects within a predetermined area are stored, and identifies the pixel positions using the extracted two-dimensional map information.
- The position and orientation detection device according to claim 1, wherein the position and orientation calculation unit calculates the position and orientation using positioning information indicating an approximate position of the imaging unit.
- The position and orientation detection device according to claim 1, wherein the position and orientation calculation unit calculates the position and orientation using the object directions of three of the objects.
- The position and orientation detection device according to claim 1, wherein the position and orientation calculation unit calculates the position and orientation using the object directions of two of the objects and the map information of a predetermined element within the imaging range.
- The position and orientation detection device according to claim 1, wherein the position and orientation calculation unit calculates the position and orientation using the object directions of two of the objects and an amount of deformation of the shapes of those objects in the captured image.
- The position and orientation detection device according to claim 1, further comprising a coarse detection sensor that detects an approximate position and orientation of the imaging unit, wherein the object detection unit acquires the two-dimensional map information using the approximate position and orientation detected by the coarse detection sensor.
- The position and orientation detection device according to claim 1, further comprising a two-dimensional map generation unit that generates the two-dimensional map information using a captured image captured by the imaging unit, wherein the two-dimensional map generation unit calculates a real-space position of each object in the captured image using the pixel position of the object in the captured image identified by the object detection unit and the position and orientation of the imaging unit calculated by the position and orientation calculation unit, and generates the two-dimensional map information by associating the appearance of each object with its real-space position.
- An AR display device that displays content on a display having transparency and reflectivity, in association with an object in the scene behind the display, the AR display device comprising: the position and orientation detection device according to claim 1; a display content generation unit that generates the content to be displayed on the display; a superimposition unit that determines a display position of the generated content on the display using the position and orientation of the imaging unit determined by the position and orientation detection device and the pixel position of the object identified by the object detection unit; and a display unit that displays the generated content at the display position on the display determined by the superimposition unit.
- An AR display device that displays content on a display having transparency and reflectivity, in association with an object in the scene behind the display, the AR display device comprising: an object detection unit that identifies a pixel position of each of the objects in a captured image obtained by capturing a predetermined imaging range including two or more of the objects; a direction calculation unit that calculates an object direction, which is the direction of each of the objects with respect to the imaging device that acquired the captured image, using the pixel position, map information of each of the objects, and a focal length of the imaging device; a position and orientation calculation unit that calculates a position and orientation of the imaging device using the object direction of each object and the map information; a display content generation unit that generates the content to be displayed on the display; a superimposition unit that determines a display position of the generated content on the display using the position and orientation of the imaging device calculated by the position and orientation calculation unit and the pixel position of the object identified by the object detection unit; and a display unit that displays the generated content at the display position on the display determined by the superimposition unit, wherein the object detection unit extracts two-dimensional map information corresponding to the imaging range from two-dimensional map information in which positions and shapes of a plurality of objects within a predetermined area are stored, and identifies the pixel positions using the extracted two-dimensional map information.
- The AR display device according to claim 8 or 9, further comprising a display content selection unit that selects the content to be displayed on the display in accordance with a predetermined display condition.
- The AR display device according to claim 8 or 9, further comprising an eye tracking device that outputs a line-of-sight direction and a field of view of a user, wherein the superimposition unit determines the display position so that the content is displayed in an area on the display specified by the line-of-sight direction and the field of view.
- The AR display device according to claim 8 or 9, wherein the superimposition unit determines the display position of the content in accordance with meta information held in advance by the content.
- The AR display device according to claim 8 or 9, wherein the AR display device is mounted on a moving body.
- The AR display device according to claim 13, wherein the superimposition unit determines the display position on the display in accordance with a driving mode of the moving body.
- A position and orientation detection method comprising: capturing, with an imaging unit, an imaging range that is a predetermined range including two or more objects to obtain a captured image; identifying a pixel position of each of the objects in the captured image; calculating an object direction, which is the direction of each of the objects with respect to the imaging unit, using the pixel position, map information of each of the objects, and a focal length of the imaging unit; and calculating a position and orientation of the imaging unit using the object direction of each object and the map information, wherein the pixel positions are identified by extracting two-dimensional map information corresponding to the imaging range from two-dimensional map information in which positions and shapes of a plurality of objects within a predetermined area are stored, and using the extracted two-dimensional map information.
- An AR display method for displaying content on a display having transparency and reflectivity, in association with an object in the scene behind the display, the method comprising: generating the content to be displayed on the display; calculating, using a captured image that includes the object and is captured by an imaging unit, a pixel position of the object in the captured image and a position and orientation of the imaging unit, and determining a display position of the content on the display using the pixel position and the position and orientation of the imaging unit; and displaying the content at the determined display position on the display.
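To make the geometry recited in claims 1 and 15 concrete, the sketch below is a minimal, hypothetical illustration (Python, not taken from the publication) of one way an object direction could be derived from its pixel position and the focal length of an undistorted pinhole camera, and of how bearings to objects with known map coordinates could then constrain the camera position and yaw by nonlinear least squares; the planar formulation, the intrinsics, and the solver choice are all assumptions, not necessarily the method described here.

```python
import numpy as np
from scipy.optimize import least_squares


def bearing_from_pixel(u: np.ndarray, cx: float, fx: float) -> np.ndarray:
    """Horizontal angle (rad) of an object relative to the optical axis, from its
    pixel column u and the intrinsics (principal point cx, focal length fx, both
    in pixels). Counterclockwise positive: objects left of the image centre get
    positive angles. Assumes an undistorted pinhole camera."""
    return np.arctan2(cx - u, fx)


def resect_pose(map_xy: np.ndarray, bearings: np.ndarray, guess: np.ndarray) -> np.ndarray:
    """Estimate the camera state (x, y, yaw) from bearings to known map points.

    map_xy   : (N, 2) map coordinates of the detected objects (N >= 3, or N == 2
               when combined with extra constraints such as coarse positioning).
    bearings : (N,) measured angles of the objects w.r.t. the optical axis.
    guess    : (3,) initial (x, y, yaw), e.g. from a coarse detection sensor.
    """
    def residuals(p):
        x, y, yaw = p
        predicted = np.arctan2(map_xy[:, 1] - y, map_xy[:, 0] - x) - yaw
        diff = predicted - bearings
        return np.arctan2(np.sin(diff), np.cos(diff))   # wrap to [-pi, pi]

    return least_squares(residuals, guess).x


# Hypothetical check: synthesize bearings from a known pose and recover it.
landmarks = np.array([[10.0, 50.0], [40.0, 60.0], [30.0, 20.0]])   # map positions
true_pose = np.array([25.0, 5.0, np.deg2rad(80.0)])                # x, y, yaw
bearings = np.arctan2(landmarks[:, 1] - true_pose[1],
                      landmarks[:, 0] - true_pose[0]) - true_pose[2]
estimate = resect_pose(landmarks, bearings, guess=np.array([20.0, 0.0, 1.0]))
# estimate is approximately true_pose; in practice the bearings would come from
# bearing_from_pixel(u, cx, fx) for each detected object's pixel column.
```

In this planar sketch, three bearings fully determine (x, y, yaw), which loosely mirrors the three-object case of claim 3; with only two bearings, an additional constraint such as coarse positioning information (claim 2) or a further known element in the imaging range (claim 4) would be needed.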
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2017/001426 WO2018134897A1 (en) | 2017-01-17 | 2017-01-17 | Position and posture detection device, ar display device, position and posture detection method, and ar display method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2017/001426 WO2018134897A1 (en) | 2017-01-17 | 2017-01-17 | Position and posture detection device, ar display device, position and posture detection method, and ar display method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018134897A1 true WO2018134897A1 (en) | 2018-07-26 |
Family
ID=62908970
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2017/001426 WO2018134897A1 (en) | 2017-01-17 | 2017-01-17 | Position and posture detection device, ar display device, position and posture detection method, and ar display method |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2018134897A1 (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110069136A (en) * | 2019-04-29 | 2019-07-30 | 努比亚技术有限公司 | A kind of wearing state recognition methods, equipment and computer readable storage medium |
CN110399039A (en) * | 2019-07-03 | 2019-11-01 | 武汉子序科技股份有限公司 | A kind of actual situation scene fusion method based on eye-tracking |
CN111665943A (en) * | 2020-06-08 | 2020-09-15 | 浙江商汤科技开发有限公司 | Pose information display method and device |
CN112711982A (en) * | 2020-12-04 | 2021-04-27 | 科大讯飞股份有限公司 | Visual detection method, equipment, system and storage device |
TWI731624B (en) * | 2020-03-18 | 2021-06-21 | 宏碁股份有限公司 | Method for estimating position of electronic device, electronic device and computer device |
WO2022161140A1 (en) * | 2021-01-27 | 2022-08-04 | 上海商汤智能科技有限公司 | Target detection method and apparatus, and computer device and storage medium |
US20230249618A1 (en) * | 2017-09-22 | 2023-08-10 | Maxell, Ltd. | Display system and display method |
JP7578165B2 (en) | 2018-08-29 | 2024-11-06 | トヨタ自動車株式会社 | Vehicle display device |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006287435A (en) * | 2005-03-31 | 2006-10-19 | Pioneer Electronic Corp | Information processing apparatus, system thereof, method thereof, program thereof, and recording medium with the program recorded thereon |
JP2007121528A (en) * | 2005-10-26 | 2007-05-17 | Fujifilm Corp | System and method for renewing map creation |
JP2008287379A (en) * | 2007-05-16 | 2008-11-27 | Hitachi Ltd | Road sign data input system |
JP2010066042A (en) * | 2008-09-09 | 2010-03-25 | Toshiba Corp | Image irradiating system and image irradiating method |
JP2011053163A (en) * | 2009-09-04 | 2011-03-17 | Clarion Co Ltd | Navigation device and vehicle control device |
JP2011169808A (en) * | 2010-02-19 | 2011-09-01 | Equos Research Co Ltd | Driving assist system |
JP2012035745A (en) * | 2010-08-06 | 2012-02-23 | Toshiba Corp | Display device, image data generating device, and image data generating program |
JP2014009993A (en) * | 2012-06-28 | 2014-01-20 | Navitime Japan Co Ltd | Information processing system, information processing device, server, terminal device, information processing method, and program |
JP2015217798A (en) * | 2014-05-16 | 2015-12-07 | 三菱電機株式会社 | On-vehicle information display control device |
JP2016070716A (en) * | 2014-09-29 | 2016-05-09 | 三菱電機株式会社 | Information display control system and information display control method |
JP2016090557A (en) * | 2014-10-31 | 2016-05-23 | 英喜 菅沼 | Positioning system for movable body |
- 2017-01-17 WO PCT/JP2017/001426 patent/WO2018134897A1/en active Application Filing
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006287435A (en) * | 2005-03-31 | 2006-10-19 | Pioneer Electronic Corp | Information processing apparatus, system thereof, method thereof, program thereof, and recording medium with the program recorded thereon |
JP2007121528A (en) * | 2005-10-26 | 2007-05-17 | Fujifilm Corp | System and method for renewing map creation |
JP2008287379A (en) * | 2007-05-16 | 2008-11-27 | Hitachi Ltd | Road sign data input system |
JP2010066042A (en) * | 2008-09-09 | 2010-03-25 | Toshiba Corp | Image irradiating system and image irradiating method |
JP2011053163A (en) * | 2009-09-04 | 2011-03-17 | Clarion Co Ltd | Navigation device and vehicle control device |
JP2011169808A (en) * | 2010-02-19 | 2011-09-01 | Equos Research Co Ltd | Driving assist system |
JP2012035745A (en) * | 2010-08-06 | 2012-02-23 | Toshiba Corp | Display device, image data generating device, and image data generating program |
JP2014009993A (en) * | 2012-06-28 | 2014-01-20 | Navitime Japan Co Ltd | Information processing system, information processing device, server, terminal device, information processing method, and program |
JP2015217798A (en) * | 2014-05-16 | 2015-12-07 | 三菱電機株式会社 | On-vehicle information display control device |
JP2016070716A (en) * | 2014-09-29 | 2016-05-09 | 三菱電機株式会社 | Information display control system and information display control method |
JP2016090557A (en) * | 2014-10-31 | 2016-05-23 | 英喜 菅沼 | Positioning system for movable body |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230249618A1 (en) * | 2017-09-22 | 2023-08-10 | Maxell, Ltd. | Display system and display method |
US12198399B2 (en) * | 2017-09-22 | 2025-01-14 | Maxell, Ltd. | Display system and display method |
JP7578165B2 (en) | 2018-08-29 | 2024-11-06 | トヨタ自動車株式会社 | Vehicle display device |
CN110069136A (en) * | 2019-04-29 | 2019-07-30 | 努比亚技术有限公司 | A kind of wearing state recognition methods, equipment and computer readable storage medium |
CN110069136B (en) * | 2019-04-29 | 2022-10-11 | 中食安泓(广东)健康产业有限公司 | Wearing state identification method and equipment and computer readable storage medium |
CN110399039A (en) * | 2019-07-03 | 2019-11-01 | 武汉子序科技股份有限公司 | A kind of actual situation scene fusion method based on eye-tracking |
TWI731624B (en) * | 2020-03-18 | 2021-06-21 | 宏碁股份有限公司 | Method for estimating position of electronic device, electronic device and computer device |
CN111665943A (en) * | 2020-06-08 | 2020-09-15 | 浙江商汤科技开发有限公司 | Pose information display method and device |
CN111665943B (en) * | 2020-06-08 | 2023-09-19 | 浙江商汤科技开发有限公司 | Pose information display method and device |
CN112711982A (en) * | 2020-12-04 | 2021-04-27 | 科大讯飞股份有限公司 | Visual detection method, equipment, system and storage device |
WO2022161140A1 (en) * | 2021-01-27 | 2022-08-04 | 上海商汤智能科技有限公司 | Target detection method and apparatus, and computer device and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018134897A1 (en) | Position and posture detection device, ar display device, position and posture detection method, and ar display method | |
US11373357B2 (en) | Adjusting depth of augmented reality content on a heads up display | |
US10029700B2 (en) | Infotainment system with head-up display for symbol projection | |
US8773534B2 (en) | Image processing apparatus, medium recording image processing program, and image processing method | |
EP2208021B1 (en) | Method of and arrangement for mapping range sensor data on image sensor data | |
US8395490B2 (en) | Blind spot display apparatus | |
JP5443134B2 (en) | Method and apparatus for marking the position of a real-world object on a see-through display | |
JP6176541B2 (en) | Information display device, information display method, and program | |
US20120224060A1 (en) | Reducing Driver Distraction Using a Heads-Up Display | |
US20140285523A1 (en) | Method for Integrating Virtual Object into Vehicle Displays | |
CN108460734A (en) | The system and method that vehicle driver's supplementary module carries out image presentation | |
KR101573576B1 (en) | Image processing method of around view monitoring system | |
EP3942794A1 (en) | Depth-guided video inpainting for autonomous driving | |
JP2007080060A (en) | Object specification device | |
US12198238B2 (en) | Method and arrangement for producing a surroundings map of a vehicle, textured with image information, and vehicle comprising such an arrangement | |
JPWO2016031229A1 (en) | Road map creation system, data processing device and in-vehicle device | |
JP2004265396A (en) | Image forming system and image forming method | |
JP5086824B2 (en) | TRACKING DEVICE AND TRACKING METHOD | |
CN115176457A (en) | Image processing apparatus, image processing method, program, and image presentation system | |
JP2009077022A (en) | Driving support system and vehicle | |
CN113011212B (en) | Image recognition method and device and vehicle | |
CN111241946B (en) | Method and system for increasing FOV (field of view) based on single DLP (digital light processing) optical machine | |
CN119105717A (en) | Content display method, electronic equipment and medium | |
CN111243102B (en) | Method and system for improving and increasing FOV (field of view) based on diffusion film transformation | |
US20240426623A1 (en) | Vehicle camera system for view creation of viewing locations |
Legal Events
Date | Code | Title | Description
---|---|---|---
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17892855; Country of ref document: EP; Kind code of ref document: A1
 | NENP | Non-entry into the national phase | Ref country code: DE
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 17892855; Country of ref document: EP; Kind code of ref document: A1
 | NENP | Non-entry into the national phase | Ref country code: JP