CN110738143B - Positioning method and device, equipment and storage medium - Google Patents
- Publication number
- CN110738143B (application CN201910921542.5A)
- Authority
- CN
- China
- Prior art keywords
- target
- image
- point
- determining
- points
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/02—Services making use of location information
- H04W4/023—Services making use of location information using mutual or relative location information between multiple location based services [LBS] targets or of distance thresholds
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/30—Services specially adapted for particular environments, situations or purposes
- H04W4/33—Services specially adapted for particular environments, situations or purposes for indoor environments, e.g. buildings
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W64/00—Locating users or terminals or network equipment for network management purposes, e.g. mobility management
- H04W64/006—Locating users or terminals or network equipment for network management purposes, e.g. mobility management with additional information processing, e.g. for direction or speed determination
Abstract
The embodiment of the application discloses a positioning method, a positioning device, positioning equipment and a storage medium, wherein the positioning method comprises the following steps: determining a target identifier for representing a current area of the image acquisition equipment according to network characteristics of a network where the image acquisition equipment is located and a fingerprint map; acquiring a target local map corresponding to the target identifier from a plurality of local maps included in the point cloud map, wherein the local map comprises attribute information of a plurality of sampling points, and world coordinates in the attribute information of the sampling points are world coordinates of the sampling points in a sample image; acquiring an image to be processed acquired by the image acquisition equipment; and positioning the image acquisition equipment according to the to-be-processed image and the attribute information of a plurality of sampling points in the target local map.
Description
Technical Field
Embodiments of the present application relate to electronic technology, and relate to, but are not limited to, positioning methods and apparatuses, devices, and storage media.
Background
At present, indoor positioning is performed by means of wireless fidelity (Wireless Fidelity, WIFI) and pedestrian dead reckoning (Pedestrian Dead Reckoning, PDR). Even when the WIFI fingerprint and PDR indoor positioning technologies are combined, the accuracy of indoor positioning based on WIFI fingerprints is only about 2 meters, so there is still room for improvement in positioning accuracy.
Disclosure of Invention
The embodiment of the application provides a positioning method, a positioning device, positioning equipment and a storage medium. The technical scheme of the embodiment of the application is realized as follows:
in a first aspect, an embodiment of the present application provides a positioning method, where the method includes: determining a target identifier for representing a current area of the image acquisition equipment according to network characteristics of a network where the image acquisition equipment is located and a fingerprint map; acquiring a target local map corresponding to the target identifier from a plurality of local maps included in the point cloud map, wherein the local map comprises attribute information of a plurality of sampling points, and world coordinates in the attribute information of the sampling points are world coordinates of the sampling points in a sample image; acquiring an image to be processed acquired by the image acquisition equipment; and positioning the image acquisition equipment according to the to-be-processed image and the attribute information of the plurality of sampling points of the target local map.
In a second aspect, embodiments of the present application provide a positioning device, the device including: the first positioning module is configured to determine a target identifier for representing the current area of the image acquisition equipment according to the network characteristics of the network where the image acquisition equipment is located and the fingerprint map; the map acquisition module is configured to acquire a target local map corresponding to the target identifier from a plurality of local maps included in the point cloud map, wherein the local map includes attribute information of a plurality of sampling points, and world coordinates in the attribute information of the sampling points are world coordinates of the sampling points in the sample image; the image acquisition module is configured to acquire an image to be processed acquired by the image acquisition equipment; and the second positioning module is configured to position the image acquisition equipment according to the to-be-processed image and attribute information of a plurality of sampling points in the target local map.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory and a processor, where the memory stores a computer program executable on the processor, and where the processor implements steps in the positioning method described above when the program is executed.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the positioning method described above.
In the embodiment of the application, a positioning method is provided, firstly, coarse positioning is carried out by utilizing network characteristics and a fingerprint map, and a target identifier of a current area where image acquisition equipment is located is determined; then, the target local map (namely, a partial map in the point cloud map) corresponding to the target mark and the image to be processed acquired by the image acquisition equipment are utilized for fine positioning; therefore, compared with a method for positioning according to a fingerprint map, the positioning accuracy is higher; compared with a method for positioning according to the point cloud map only, the electronic device for implementing the positioning method only needs to load the target local map, so that the time cost and the memory cost of the electronic device can be saved.
Drawings
FIG. 1A is a schematic diagram of an implementation flow of a positioning method according to an embodiment of the present application;
FIG. 1B is a schematic diagram of a network environment in which an image capturing device according to an embodiment of the present application is located;
FIG. 2 is a schematic view of a floor structure according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a correspondence relationship between a local map and an identifier of an area according to an embodiment of the present application;
FIG. 4 is a schematic diagram of determining camera coordinates of a plurality of first target sampling points according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a feature point matching pair according to an embodiment of the present application;
FIG. 6A is a schematic diagram of a positioning device according to an embodiment of the present disclosure;
FIG. 6B is a schematic view of a positioning device according to another embodiment of the present disclosure;
fig. 7 is a schematic diagram of a hardware entity of an electronic device according to an embodiment of the present application.
Detailed Description
To make the purposes, technical solutions and advantages of the embodiments of the present application more apparent, the specific technical solutions of the present application are described in further detail below with reference to the accompanying drawings of the embodiments. The following examples illustrate the present application but are not intended to limit its scope.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict.
It should be noted that the terms "first", "second" and "third" in the embodiments of the present application are merely used to distinguish different objects and do not represent a specific ordering of those objects. It is to be understood that "first", "second" and "third" may be interchanged in a specific order or sequence, where allowed, so that the embodiments of the present application described herein can be implemented in an order other than that illustrated or described herein.
The embodiment of the application provides a positioning method, which can be applied to an electronic device, where the electronic device can be a device with information processing capability such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a robot, an unmanned aerial vehicle, and the like. The functions of the positioning method can be implemented by a processor in the electronic device invoking program code, and the program code can of course be stored in a computer storage medium; that is, the electronic device comprises at least a processor and a storage medium.
Fig. 1A is a schematic flow chart of an implementation of a positioning method according to an embodiment of the present application, as shown in fig. 1A, the method at least includes the following steps S101 to S104:
step S101, determining a target identifier for representing the current area of the image acquisition equipment according to the network characteristics of the network where the image acquisition equipment is located and the fingerprint map.
In implementation, the network features acquired by the electronic device include: wireless signal characteristics, sent by at least one network node, that can be received at the position of the image acquisition device. To save the cost of network deployment, network nodes are typically deployed as infrastructure for public network communication. For example, the network node is: a wireless access point (Wireless Access Point, AP) for realizing WIFI communication, a base station (Base Station, BS) for realizing mobile communication, a ZigBee node for realizing short-range communication, and the like. In addition, the parameters used to characterize the network features may be varied. For example, the network characteristic may be the signal strength of the wireless node (referred to as received signal strength (Received Signal Strength, RSS)) or a distribution of RSS, etc.
For example, as shown in fig. 1B, if the electronic device or the image acquisition module is capable of receiving the wireless signals sent by AP1, AP2, base station BS1 and base station BS2, the network characteristics acquired by the electronic device are, for example, ρ = [ρ_AP1 ρ_AP2 ρ_BS1 ρ_BS2], where ρ_AP1 is the average of a plurality of signal strengths measured when the signal sent by the WIFI node AP1 is received at the position of the image acquisition device, and ρ_BS1 is the average of a plurality of signal strengths measured when the signal transmitted by the base station BS1 is received at the position of the image acquisition device.
In the embodiments of the present application, the image capturing apparatus may be various. For example, the image capture device is a monocular camera or a multi-view camera (e.g., a binocular camera). It should be noted that the electronic device may include an image capturing device, that is, the image capturing device is installed in the electronic device, for example, the electronic device is a smart phone having a camera. Of course, in other embodiments, the electronic device may not include the image capturing device, and the image capturing device may send the acquired network characteristics of the network where the image capturing device is located to the electronic device.
Step S102, obtaining a target local map corresponding to the target identifier from a plurality of local maps included in the point cloud map.
It should be noted that the fingerprint map and the point cloud map are maps within the same geographic range, and only the content is different. The fingerprint map stores network characteristics and corresponding identification information at each specific position point (namely a subarea as described below) in a preset geographic range; and the point cloud map comprises a plurality of local maps, each local map comprises attribute information of a plurality of sampling points, and world coordinates in the attribute information of the sampling points are world coordinates of the sampling points in the sample image. In addition, each local map is also marked with corresponding identification information. In this way, after determining the target identifier in step S101, the electronic device can quickly obtain the target local map corresponding to the target identifier from the point cloud map through the labeling information on the local map.
Step S103, obtaining an image to be processed acquired by the image acquisition equipment.
In general, the image to be processed is a two-dimensional image, for example, a red-green-blue (Red, Green, Blue, RGB) image.
And step S104, positioning the image acquisition equipment according to the to-be-processed image and the attribute information of a plurality of sampling points of the target local map.
It can be appreciated that, since the information contained in an image is richer, better positioning accuracy can be obtained through steps S103 and S104 than through step S101, in which the position of the image acquisition device is determined using only the network characteristics and the fingerprint map.
In the embodiment of the application, firstly, coarse positioning is carried out on the image acquisition equipment by utilizing network characteristics and a fingerprint map, namely, the target identification of the current area of the image acquisition equipment is determined; then, a target local map corresponding to the target mark is obtained from the point cloud map, and the image acquisition equipment is positioned only based on the target local map (instead of the global map, namely the point cloud map) and the image to be processed; therefore, the positioning accuracy can be improved, and the time cost and the memory cost of the electronic equipment when the positioning method is implemented can be saved.
Before implementing any positioning method in the embodiments of the present application, a fingerprint map and a point cloud map need to be pre-constructed, where the two map construction methods at least include the following steps S111 to S114:
step S111, obtaining network characteristics of at least one sub-area in at least one physical area and an identification of each sub-area.
It should be noted that the identifiers of all the sub-areas in the same physical area may be the same or different. When they are the same, the identifier is used to uniquely identify the corresponding physical area; when they are different, the identifier is used to uniquely identify the corresponding sub-region, e.g., the identifier is a number, the acquisition time of the network feature, or the geographic coordinates of the corresponding sub-region, etc.
Before step S111 is performed, the geographic area corresponding to the fingerprint map is generally divided into a plurality of physical areas according to building features in that geographic area, each physical area is divided into a plurality of sub-areas, and corresponding network features are collected at each sub-area. Taking the floor shown in fig. 2 as the geographic range corresponding to the fingerprint map as an example, the floor has 8 rooms 201 to 208 and is provided with wireless nodes AP1 and AP2; each room can be used as a physical area, and a plurality of sub-areas (i.e. sampling points) can be set in each room. Then, signal acquisition is performed at each sub-region to obtain the network characteristics of each sub-region. For example, at each sub-area, data are collected over a period of time (the acquisition time is approximately 5 to 15 minutes, about once per second) and the average signal strength from the various APs is determined from the acquired data. Thus, the network characteristics at each sub-region can be obtained, namely the two-dimensional vector ρ = [ρ_AP1 ρ_AP2], where ρ_AP1 and ρ_AP2 are the average signal strengths from AP1 and AP2, respectively. Each sub-area corresponds to a two-dimensional vector (namely a fingerprint) and an identifier, thereby constructing a WIFI fingerprint map. In a more general scenario with N APs, the fingerprint ρ is an N-dimensional vector.
And step S112, associating the network characteristics of each sub-region with the corresponding identifiers to obtain the fingerprint map.
It will be appreciated that in a fingerprint map, the network characteristics of each sub-region are in a one-to-one correspondence with the identity.
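To make steps S111 and S112 concrete, the sketch below shows one way the per-sub-area RSS averaging and the identifier-to-fingerprint association could look in code. It is only an illustration under assumptions not stated in the patent: the dictionary layout, the -100 dBm floor for unheard APs, and the toy scan values are all invented for the example.

```python
import numpy as np

def build_fingerprint(scans, ap_ids, missing=-100.0):
    """Average RSS per AP over all scans collected at one sub-area.

    scans: list of {ap_id: rss_dBm} dicts (roughly one scan per second over the
    5-15 minute collection window); APs not heard in a scan fall back to `missing`.
    """
    return np.array([np.mean([s.get(ap, missing) for s in scans]) for ap in ap_ids])

# Toy scans for two sub-areas of room 201 (values invented for illustration).
ap_ids = ["AP1", "AP2"]
scans_a = [{"AP1": -41.0, "AP2": -67.0}, {"AP1": -43.0, "AP2": -65.0}]
scans_b = [{"AP1": -60.0, "AP2": -48.0}, {"AP1": -62.0}]

# Fingerprint map: sub-area identifier -> N-dimensional fingerprint vector rho.
fingerprint_map = {
    "sub_area_1": build_fingerprint(scans_a, ap_ids),
    "sub_area_2": build_fingerprint(scans_b, ap_ids),
}
```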
Step S113, obtaining a local map corresponding to each physical area; wherein each local map comprises attribute information of sampling points in a corresponding physical area.
Still taking fig. 2 as an example, each room has a local map corresponding to it, which is helpful for the electronic device to accurately mark the identifier of each sub-area in each room on the corresponding local map, and also facilitates the construction of the local map, because a local map can be built after each sample image of each room is acquired.
It will be appreciated that, for the construction of the local map of each physical area, the electronic device may obtain sampling points on object surfaces in the physical area by means of image acquisition, and construct the local map based on the world coordinates of these sampling points; because the spacing between these points is greater than a spacing threshold, they are referred to as sampling points. It can be seen that a sampling point is actually a feature point in the image in which it is located, and the attribute information of the sampling point is information specific to the sampling point. The attribute information of the sampling point includes at least one of: image features of the sampling point and world coordinates of the sampling point. In one example, the attribute information of the sampling point includes the image features of the sampling point and the world coordinates of the sampling point. In another example, the attribute information of the sampling point includes the world coordinates of the sampling point and does not include the image features of the sampling point. It is understood that the world coordinates of the sampling point refer to the coordinates of the sampling point in the world coordinate system, and the coordinates may be two-dimensional or three-dimensional.
And step S114, marking the identification of each sub-region in the corresponding physical region on each local map to obtain the point cloud map.
For example, the point cloud map includes a local map 1 and a local map 2; the local map 1 is a map corresponding to a physical area 1, the physical area 1 comprises a subarea 1 and a subarea 2, the mark of the subarea 1 is a mark 1, and the mark of the subarea 2 is a mark 2; the local map 2 is a map corresponding to the physical area 2, the physical area 2 comprises a subarea 3 and a subarea 4, the mark of the subarea 3 is a mark 3, and the mark of the subarea 4 is a mark 4. As shown in fig. 3, the marks 1 to 4 are marked on the corresponding local map, so that the association between the local map and the sub-region in the fingerprint map is established. When the electronic equipment is positioned later, after the network characteristics of the current network of the image acquisition equipment and the fingerprint map are utilized to obtain the target identification representing the current area of the image acquisition equipment, the target local map corresponding to the target identification can be obtained according to the corresponding relation between the local map and the identification of the sub-area in the fingerprint map.
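One possible in-memory layout for a point cloud map made of labeled local maps is sketched below; the class names, field names and container choices are assumptions for illustration rather than anything the patent specifies.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set
import numpy as np

@dataclass
class SamplingPoint:
    world_coords: np.ndarray                  # world coordinates of the sampling point
    descriptor: Optional[np.ndarray] = None   # image feature; may be omitted to save storage

@dataclass
class LocalMap:
    sub_area_ids: Set[str] = field(default_factory=set)       # identifiers labeled on this local map
    points: List[SamplingPoint] = field(default_factory=list)

# Point cloud map: one local map per physical area (e.g. per room in fig. 2).
point_cloud_map: Dict[str, LocalMap] = {}

def target_local_map(target_id: str) -> Optional[LocalMap]:
    """Return the local map labeled with the target identifier, if any."""
    for local_map in point_cloud_map.values():
        if target_id in local_map.sub_area_ids:
            return local_map
    return None
```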
Here, the order of establishing the point cloud map and the fingerprint map is not limited: step S113 and step S114 may be performed first, followed by step S111 and step S112; alternatively, the point cloud map and the fingerprint map may be established simultaneously.
The embodiment of the application further provides a method for constructing the fingerprint map and the point cloud map, which at least comprises the following steps S201 to S209:
step S201, acquiring network characteristics of at least one sub-area in at least one physical area and an identification of each sub-area.
Step S202, associating the network characteristics of each sub-region with the corresponding identifiers to obtain the fingerprint map.
Step S203, determining world coordinates of at least one sampling point according to camera coordinates and image characteristics of the sampling points in the plurality of sample images.
When the method is implemented, the image acquisition device can be used to acquire the sample images at a preset frame rate. For example, red-green-blue (Red, Green, Blue, RGB) images are acquired at a fixed frame rate using a monocular camera. Alternatively, the plurality of sample images may be acquired from a sample image library acquired in advance.
It will be appreciated that, at the initial stage of local map construction, only the image features and camera coordinates of the sampling points can be obtained, and the world coordinates of the sampling points are not yet known. In implementation, a plurality of sample images can be processed by a three-dimensional reconstruction method to obtain the world coordinates of the sampling points. For example, the world coordinates of each sampling point are obtained by initializing the plurality of sample images with a structure-from-motion (Structure from Motion, SFM) method. In one example, the sample image is a two-dimensional image.
Step S204, determining a first data set according to the world coordinates and the image features of each sampling point. That is, the first dataset includes world coordinates and corresponding image features for each of the sampling points.
Step S205, determining a corresponding second data set according to the camera coordinates and image characteristics of sampling points in the obtained kth other sample image; wherein the plurality of sample images and each of the other sample images are different images acquired at an ith physical area, and k and i are integers greater than 0.
Similarly, an image acquisition device (e.g., a robot having an image acquisition function) may be used herein to acquire sample images in real time at a preset frame rate in the ith physical area. Alternatively, the other sample images may be obtained from a pre-established sample image library. Each second data set includes camera coordinates and image features of sampling points in the corresponding other sample images.
Step S206, determining world coordinates of sampling points in the other corresponding sample images according to the first data set and each of the second data sets.
In fact, the world coordinates of the sampling points in the plurality of sample images are determined by step S203, the time complexity of which is relatively high. Therefore, after the world coordinates of the sampling points in the plurality of sample images are obtained, the world coordinates of the sampling points in the other sample images are determined through step S206. Thus, the time cost of map construction can be greatly reduced.
In implementation, the world coordinates of the sampling points in the other sample images may be determined by steps S506 to S507, or steps S606 to S610, similar to those provided in the following embodiments. Alternatively, the world coordinates of the sampling points in the other sample images are determined by analogy to step S706 and step S708.
Step S207, constructing a local map corresponding to the ith physical area at least according to the determined world coordinates of each sampling point.
When the method is implemented, the local map can be constructed according to the determined world coordinates and image characteristics of each sampling point, namely, the attribute information of each sampling point in the local map comprises the world coordinates and the image characteristics; or, according to the determined world coordinates of each sampling point, constructing the local map, that is, the attribute information of each sampling point in the local map comprises world coordinates and does not comprise image features; in this way, the storage space for storing the local map can be saved.
Step S208, obtaining a local map corresponding to each physical area; wherein each local map comprises attribute information of sampling points in a corresponding physical area.
Step S209, marking the identifier of each sub-region in the corresponding physical region on each local map, so as to obtain the point cloud map.
In the embodiment of the application, when a map is constructed, the world coordinates of part of the sampling points are first obtained from a plurality of sample images; then the world coordinates of the sampling points in the other sample images are determined according to those world coordinates and the acquired attribute information (i.e., the second data sets) of the sampling points in the other sample images. In this way, the world coordinates of the sampling points in the other sample images can be obtained quickly, so the time cost of map construction is reduced.
In other embodiments, for step S203, the determining the world coordinates of at least one sampling point according to the camera coordinates and the image features of the sampling points in the plurality of sample images may be implemented at least by the following steps S211 to S213:
step S211, selecting a first target image and a second target image satisfying a second condition from the plurality of sample images according to the camera coordinates and the image characteristics of each sampling point in the plurality of sample images.
When the method is implemented, the selected first target image and the second target image are two sample images with relatively large parallax; in this way, the accuracy of determining the world coordinates of the sampling points in the first target image or the second target image can be improved, and further higher positioning accuracy can be obtained later. The first target image and the second target image are determined, for example, by steps S221 to S224 of the following embodiments.
Step S212, determining a fourth rotation relationship and a fourth translation relationship between the first target image and the second target image.
When implemented, the first target image and the second target image may be processed using a four-point method in a random sample consensus (Random Sample Consensus, RANSAC) algorithm, and a homography matrix may be calculated, thereby obtaining the fourth rotation relationship and the fourth translation relationship.
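As one concrete (but not mandated) realization of this step, OpenCV's RANSAC homography estimation and homography decomposition could be used; the intrinsic matrix K and the matched point arrays are assumed to be available.

```python
import cv2
import numpy as np

def fourth_rotation_translation(pts1: np.ndarray, pts2: np.ndarray, K: np.ndarray):
    """Estimate the fourth rotation/translation between the first and second target images.

    pts1, pts2: Nx2 matched pixel coordinates; K: 3x3 camera intrinsic matrix (assumed known).
    """
    # Four-point RANSAC homography between the matched feature points.
    H, inlier_mask = cv2.findHomography(pts1, pts2, cv2.RANSAC, 3.0)
    # Decompose H into candidate rotations/translations (up to scale); a real
    # system would keep the physically plausible solution.
    num_solutions, rotations, translations, normals = cv2.decomposeHomographyMat(H, K)
    return rotations, translations, inlier_mask
```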
Step S213, determining world coordinates of the sampling point in the first target image according to the fourth rotation relationship, the fourth translation relationship and the camera coordinates of the sampling point in the first target image.
It will be appreciated that the sampling points in the first target sample image are substantially co-located with the sampling points of the matching second target sample image; therefore, it is sufficient here to determine only the world coordinates of the sampling points in either one of the two target sample images.
In other embodiments, for step S211, the selecting, according to the camera coordinates and the image features of each of the first sampling points in the plurality of sample images, the first target image and the second target image that satisfy the second condition from the plurality of sample images may be at least implemented by the following steps S221 to S224:
Step S221, performing pairwise matching on the plurality of sample images according to the image features of each sampling point in the plurality of sample images, to obtain a first matching pair set of each pair of sample images.
The matching is that each sample image is matched with other sample images in the plurality of sample images. For example, the plurality of sample images includes sample images 1 to 6, sample image 1 and sample images 2 to 6 are respectively matched, and sample image 2 is respectively matched with sample images 1, 3 to 6. The obtained first matching pair set comprises matching relations between sampling points in the two images, namely a plurality of sampling point matching pairs.
Step S222, eliminating sampling point matching pairs in the first matching pair set, which do not meet a third condition, to obtain a second matching pair set.
In implementation, the fundamental matrix can be calculated using the RANSAC eight-point method, and the matching pairs that do not fit the fundamental matrix are removed. In this way, some sampling point matching pairs with poor robustness can be eliminated, which improves the robustness of the algorithm.
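A short sketch of this rejection step, assuming OpenCV is used for the RANSAC eight-point fundamental-matrix estimate (the thresholds are illustrative):

```python
import cv2
import numpy as np

def reject_bad_matches(pts1: np.ndarray, pts2: np.ndarray):
    """Keep only the sampling point matching pairs consistent with the fundamental
    matrix estimated by the RANSAC eight-point scheme (the second matching pair set)."""
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)
    keep = mask.ravel().astype(bool)
    return pts1[keep], pts2[keep]
```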
And S223, selecting a target matching pair set with the matching pair number meeting the second condition from each second matching pair set.
In general, when the number of matching pairs is too large, this indicates that the parallax between the two images is relatively small; but when the number of matching pairs is too small, the fourth rotation relationship and the fourth translation relationship between the two images cannot be determined. In implementation, the second condition may therefore be set such that the number of matching pairs is greater than a first value and less than a second value.
Step S224, determining two sample images corresponding to the target matching pair set as a first target image and a second target image.
Based on the pre-constructed fingerprint map and the point cloud map, several embodiments of the positioning method are provided below.
The embodiment of the application further provides a positioning method, which at least includes the following steps S301 to S305:
step S301, matching the network characteristics of the network where the image acquisition device is located with the network characteristics of each sub-region in the fingerprint map, so as to obtain target network characteristics.
When the method is implemented, the similarity (such as Euclidean distance, hamming distance or cosine similarity) between the network characteristics of the network where the image acquisition equipment is located and the network characteristics of each subarea can be determined, and then the network characteristics with the similarity meeting the condition in the fingerprint map are determined as the target network characteristics. For example, determining network characteristics with similarity smaller than a preset threshold as the target network characteristics; alternatively, the network feature with the smallest similarity is determined as the target network feature.
Step S302, obtaining a target identifier associated with the target network feature from the fingerprint map.
As can be seen from the foregoing embodiments, an association relationship between the network feature of each sub-region and the corresponding identifier is established in the fingerprint map; thus, the target identification associated with the target network feature may be obtained from the fingerprint map.
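Putting steps S301 and S302 together, a minimal sketch of the coarse lookup might look as follows (Euclidean distance and the nearest-fingerprint rule are the assumed choices here):

```python
import numpy as np

def coarse_locate(query_rho: np.ndarray, fingerprint_map: dict) -> str:
    """Match the observed network feature against every sub-area fingerprint and
    return the target identifier associated with the best (smallest-distance) match."""
    best_id, best_dist = None, float("inf")
    for sub_area_id, rho in fingerprint_map.items():
        dist = np.linalg.norm(query_rho - rho)   # similarity measured as Euclidean distance
        if dist < best_dist:
            best_id, best_dist = sub_area_id, dist
    return best_id

# Example with a 2-D fingerprint observed from AP1 and AP2 (values invented).
target_id = coarse_locate(
    np.array([-42.0, -66.0]),
    {"sub_area_1": np.array([-42.0, -66.0]), "sub_area_2": np.array([-61.0, -47.0])},
)
```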
Step S303, obtaining a target local map corresponding to the target identifier from a plurality of local maps included in the point cloud map.
As can be seen from the foregoing embodiments, each local map of the point cloud map is marked with an identifier of each sub-region in the corresponding physical region, and therefore, the target local map corresponding to the target identifier can be acquired from the point cloud map.
Step S304, the image to be processed acquired by the image acquisition equipment is acquired.
And step S305, positioning the image acquisition equipment according to the to-be-processed image and the attribute information of a plurality of sampling points of the target local map.
In the embodiment of the application, firstly, coarse positioning is carried out on the image acquisition equipment by utilizing network characteristics and a fingerprint map, namely, the target identification of the current area of the image acquisition equipment is determined; and then, acquiring a target local map corresponding to the target identifier from the point cloud map, and positioning the image acquisition device only based on the target local map (instead of the global map, namely the point cloud map) and the image to be processed, so that the time cost and the memory cost of the electronic device when the positioning method is implemented can be saved.
The embodiment of the application further provides a positioning method, which at least includes the following steps S401 to S406:
step S401, determining a target identifier for representing the current area of the image acquisition equipment according to the network characteristics of the network where the image acquisition equipment is located and the fingerprint map.
Step S402, obtaining a target local map corresponding to the target identifier from a plurality of local maps included in the point cloud map.
Step S403, acquiring an image to be processed acquired by the image acquisition device.
Step S404, determining attribute information of feature points in the image to be processed.
It can be understood that the feature points are pixel points with certain features in the image to be processed. In implementation, the corner points detected in the image to be processed are usually taken as the feature points. The attribute information of a feature point is information specific to that feature point and includes at least one of: the image features of the feature point and the camera coordinates of the feature point. In one example, the attribute information of the feature point includes the image features of the feature point and the camera coordinates of the feature point. In implementation, a feature descriptor of the feature point may be acquired and used as the image feature of the feature point. As will be appreciated, the camera coordinates of the feature point refer to the coordinates of the feature point in the camera coordinate system. The camera coordinates may be two-dimensional or three-dimensional.
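As an assumed concrete choice of detector and descriptor (the patent itself only requires corner points and feature descriptors), ORB features could be extracted as in the sketch below.

```python
import cv2

def extract_feature_points(image_to_process):
    """Detect corner-like feature points in the image to be processed and compute
    a descriptor for each, used as the feature point's image feature."""
    orb = cv2.ORB_create(nfeatures=2000)
    keypoints, descriptors = orb.detectAndCompute(image_to_process, None)
    pixel_coords = [kp.pt for kp in keypoints]   # image coordinates of the feature points
    return pixel_coords, descriptors
```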
Step S405, acquiring attribute information of a plurality of sampling points in the target local map.
Sampling points on object surfaces are obtained in the physical space by means of image acquisition, and a point cloud map is built based on the world coordinates of these sampling points. In the embodiment of the present application, the point cloud map includes a plurality of local maps, and each local map is a point cloud composed of sampling points; because the spacing between these points is greater than a spacing threshold, they are referred to as sampling points.
It will be appreciated that the sampling points are actually feature points in the image in which they are located, and the attribute information of the sampling points is information specific to the sampling points. The attribute information of the sampling point includes at least one of: image characteristics of the sampling points and world coordinates of the sampling points. In one example, the attribute information of the sampling points includes image features and world coordinates. In another example, the attribute information of the sampling points includes world coordinates and does not include image features. It is understood that the world coordinates of the sampling points refer to the coordinates of the sampling points in the world coordinate system. The world coordinates may be two-dimensional coordinates or three-dimensional coordinates.
Step S406, matching the attribute information of the plurality of feature points with the attribute information of the plurality of sampling points to obtain the position information of the image acquisition device, so as to implement positioning of the image acquisition device.
In the embodiment of the application, after coarse positioning is performed to obtain the target local map, the position information of the image acquisition device can be determined more accurately according to the attribute information of the feature points in the image to be processed and the attribute information of the plurality of sampling points in the target local map. This improves the positioning accuracy of the image acquisition device; moreover, the positioning method does not depend on any fixed object or on the person to be positioned appearing in the image to be processed, so better robustness can be obtained.
It should be noted that the corresponding positioning methods differ depending on whether each local map in the point cloud map includes the image features of the sampling points or not. In the following case, namely: each local map includes the image features and world coordinates of the sampling points, and accordingly the attribute information of the feature points includes image features and camera coordinates, the corresponding positioning method may include the following several embodiments.
The embodiment of the application further provides a positioning method, which at least includes the following steps S501 to S507:
step S501, determining a target identifier for representing the current area of the image acquisition equipment according to the network characteristics of the network where the image acquisition equipment is located and the fingerprint map.
Step S502, obtaining a target local map corresponding to the target mark from the point cloud map.
Step S503, acquiring an image to be processed acquired by the image acquisition device.
Step S504, determining attribute information of feature points in the image to be processed.
Step S505, acquiring attribute information of a plurality of sampling points in the target local map.
Step S506, matching the image features of the jth feature point with the image features of the plurality of sampling points to obtain a first target sampling point matched with the jth feature point, where j is an integer greater than 0.
It is to be understood that matching refers to feature points and sampling points corresponding to the same position point in the actual physical space. That is, the feature point and the first target sampling point that matches the feature point represent the same position point in the actual physical space. In implementation, a sampling point in the target local map that is the same as or similar to the image feature of the feature point is typically determined as the first target sampling point. For example, the first target sampling point is determined by step S606 and step S607 in the following embodiment.
And step S507, determining the position information of the image acquisition equipment according to the camera coordinates of the feature points and the world coordinates of the corresponding first target sampling points, so as to realize positioning of the image acquisition equipment.
Here, the target local map includes the attribute information of the plurality of sampling points, including the image features and world coordinates of the sampling points. It can be understood that, if the world coordinates of a plurality of first target sampling points and the camera coordinates of the feature points matching those first target sampling points are known, the rotation and translation of the image acquisition device when capturing the feature points can be determined relative to the rotation and translation under which the matched first target sampling points were originally captured, i.e., relative to the world coordinate system; on this basis, the position information of the image acquisition device can be determined. For example, the world coordinates and orientation of the image acquisition device (i.e., the pose of the image acquisition device) are determined through steps S608 to S611 in the following embodiments.
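Steps S608 to S611 below derive the pose analytically from three correspondences. As a hedged alternative illustration (not the patent's own derivation), the same 2D-3D registration is often solved with a RANSAC PnP routine; the intrinsic matrix K and the distortion coefficients are assumed known.

```python
import cv2
import numpy as np

def locate_device(world_points, pixel_points, K, dist_coeffs=None):
    """Estimate the rotation/translation of the world frame relative to the camera
    from matched sampling-point world coordinates and feature-point pixel coordinates."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(world_points, dtype=np.float64),
        np.asarray(pixel_points, dtype=np.float64),
        K, dist_coeffs)
    R, _ = cv2.Rodrigues(rvec)            # R maps world coordinates into camera coordinates
    device_world_coords = -R.T @ tvec     # world coordinates of the image acquisition device
    return R, tvec, device_world_coords
```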
According to the positioning method provided by the embodiment of the application, the first target sampling point matched with the characteristic point can be more accurately determined from the plurality of sampling points according to the image characteristics of the characteristic point and the image characteristics of the plurality of sampling points, so that the positioning accuracy is improved.
The embodiment of the application further provides a positioning method, which at least includes the following steps S601 to S611:
Step S601, determining a target identifier for representing the current area of the image acquisition equipment according to the network characteristics of the network where the image acquisition equipment is located and the fingerprint map.
Step S602, obtaining a target local map corresponding to the target identifier from a plurality of local maps included in the point cloud map.
Step S603, acquiring an image to be processed acquired by the image acquisition device.
Step S604, determining attribute information of feature points in the image to be processed.
Step S605 obtains attribute information of a plurality of sampling points in the target local map.
Step S606, determining the similarity between the image feature of the jth feature point and the image feature of each sampling point, to obtain a similarity set, where j is an integer greater than 0.
The similarity refers to the degree of similarity between the image features of the sampling points and the image features of the feature points. In practice, the similarity may be determined by a variety of methods. For example, euclidean distances between the image features of the sampling points and the image features of the feature points are determined, and the euclidean distances are determined as the similarities. In other embodiments, a hamming distance or cosine similarity between the image feature of the sampling point and the image feature of the feature point may also be determined, and the hamming distance or cosine similarity may be determined as the similarity. The type of parameters characterizing the similarity is not limited here. The parameter type may be the euclidean distance, hamming distance, cosine similarity, or the like.
In step S607, the sampling point whose similarity satisfies the first condition in the similarity set is determined as the first target sampling point matched with the jth feature point.
In implementation, a sampling point whose similarity in the similarity set is smaller than or equal to a preset threshold is determined as the first target sampling point. For example, a sampling point whose Euclidean distance is less than or equal to a Euclidean distance threshold is determined as the first target sampling point. Alternatively, the sampling point with the smallest similarity in the similarity set is determined as the first target sampling point.
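A small sketch of this first-condition matching rule (brute-force nearest neighbour over float descriptors with a distance threshold; the threshold value and array shapes are illustrative assumptions):

```python
import numpy as np

def first_target_sampling_point(feature_desc, sampling_descs, max_distance=0.7):
    """Return the index of the sampling point whose descriptor is most similar to the
    feature point's descriptor, provided the similarity satisfies the first condition."""
    distances = np.linalg.norm(sampling_descs - feature_desc, axis=1)   # similarity set
    best = int(np.argmin(distances))
    return best if distances[best] <= max_distance else None
```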
In step S608, the camera coordinates of each of the first target sampling points are determined according to the camera coordinates of the feature points and the world coordinates of the first target sampling points of the feature points.
When implementing step S608, the electronic device may accurately determine the camera coordinates of the first target sampling points matched with the 3 feature points according to the camera coordinates of at least 3 feature points and the world coordinates of the first target sampling points respectively matched with the 3 feature points.
For example, as shown in fig. 4, the point O is the origin of the camera coordinate system, that is, the optical center of the image acquisition device; the plurality of first target sampling points are the 3 uppercase sampling points A, B and C shown in fig. 4; and in the image to be processed 40, the feature point matching sampling point A is the lowercase feature point a, the feature point matching sampling point B is the lowercase feature point b, and the feature point matching sampling point C is the lowercase feature point c.
The following formula (1) can be listed according to the cosine law:
In formula (1), <a, b> refers to ∠aOb, <a, c> refers to ∠aOc, and <b, c> refers to ∠bOc.
Eliminating terms from the above, dividing by OC², and introducing substitution variables, the following formulas (2) and (3) can be derived:
by taking the above formula (1) into the formulae (2) and (3), the following formula (4) can be obtained:
in equation (4), w, v, cos < a, c >, cos < b, c >, cos < a, b > are all known quantities, so the unknown quantity is only two of x and y, so the values of x and y can be found by two equations in equation (4) above, and then the values of OA, OB and OC can be found according to three equations of equation (5) below:
finally, solving A, B, C the camera coordinates of the 3 sampling points, and obtaining according to a vector formula (6):
In formula (6), the three direction vectors are, respectively, the direction from point O to point a, the direction from point O to point b, and the direction from point O to point c.
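The images carrying formulas (1) to (6) are missing from this copy of the text. Under the standard P3P derivation that the surrounding description appears to follow, and with substitutions chosen here only for illustration (x = OA/OC, y = OB/OC, v = BC²/AB², w = AC²/AB², and unit viewing directions toward a, b, c), the relations would read approximately:

```latex
% Formula (1): cosine law in the three triangles through the optical centre O
\begin{aligned}
OA^2 + OB^2 - 2\,OA\cdot OB\cos\langle a,b\rangle &= AB^2\\
OA^2 + OC^2 - 2\,OA\cdot OC\cos\langle a,c\rangle &= AC^2\\
OB^2 + OC^2 - 2\,OB\cdot OC\cos\langle b,c\rangle &= BC^2
\end{aligned}
% Dividing by OC^2 with x = OA/OC, y = OB/OC gives formulas (2)-(3); eliminating the
% common ratio AB^2/OC^2 yields the two equations of formula (4):
\begin{aligned}
(1-v)\,y^2 - v\,x^2 - 2y\cos\langle b,c\rangle + 2vxy\cos\langle a,b\rangle + 1 &= 0\\
(1-w)\,x^2 - w\,y^2 - 2x\cos\langle a,c\rangle + 2wxy\cos\langle a,b\rangle + 1 &= 0
\end{aligned}
% Formula (5): recovering the three distances from x and y
OC = \frac{AB}{\sqrt{x^2 + y^2 - 2xy\cos\langle a,b\rangle}},\qquad OA = x\cdot OC,\qquad OB = y\cdot OC
% Formula (6): camera coordinates of A, B, C from the viewing directions
A = OA\cdot\vec{u}_a,\qquad B = OB\cdot\vec{u}_b,\qquad C = OC\cdot\vec{u}_c
```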
Step S609, determining a first rotation relationship and a first translation relationship of the camera coordinate system relative to the world coordinate system according to the world coordinate of each first target sampling point and the camera coordinate of each first target sampling point;
it will be appreciated that if the world coordinates and camera coordinates of the plurality of first target sampling points are known, a first rotational relationship and a first translational relationship of the camera coordinate system relative to the world coordinate system may be determined.
Step S610, determining world coordinates of the image capturing device according to the first translational relationship and the first rotational relationship.
Step S611, determining an orientation of the image capturing device in the point cloud map according to the first rotation relationship.
It can be understood that the world coordinates of the image capturing device and the orientation of the image capturing device in the point cloud map are the position information of the image capturing device.
In the positioning method provided by the embodiment of the application, after coarse positioning is performed by using the network characteristics and the fingerprint map to obtain the target local map, a first rotation relationship and a first translation relationship of a camera coordinate system relative to the world coordinate system are determined according to the world coordinate of each first target sampling point and the camera coordinate of each first target sampling point. In this way, not only the world coordinates of the image acquisition device can be determined according to the first translation relationship, but also the orientation of the image acquisition device in the point cloud map can be determined according to the first rotation relationship, so that the positioning method can be suitable for more application scenes. For example, the robot is instructed to perform the next action according to the current orientation of the robot.
In the following cases, namely: each local map in the point cloud map includes world coordinates of the sampling point and does not include image features of the sampling point, and accordingly, attribute information of the feature point includes camera coordinates and does not include image features of the feature point, and a corresponding positioning method may include the following several embodiments.
The embodiment of the application further provides a positioning method, which at least includes the following steps S701 to S708:
step S701, determining a target identifier for representing the current area of the image acquisition equipment according to the network characteristics of the network where the image acquisition equipment is located and the fingerprint map.
Step S702, obtaining a target local map corresponding to the target identifier from a plurality of local maps included in the point cloud map.
Step S703, acquiring an image to be processed acquired by the image acquisition device.
Step S704, determining attribute information of feature points in the image to be processed.
Step S705, obtaining attribute information of a plurality of sampling points in the target local map.
Step S706, matching the camera coordinates of the feature points with the world coordinates of the sampling points according to an iteration strategy, so as to obtain a target rotation relationship and a target translation relationship of the camera coordinate system relative to the world coordinate system.
Here, each of the partial maps in the point cloud map includes world coordinates of the sampling point, but does not include image features of the sampling point. It will be appreciated that image features typically occupy a relatively large amount of storage space when storing a point cloud map. For example, an image feature is a feature descriptor, typically having 256 bytes per sample point, which requires the electronics to allocate at least 256 bytes of memory space per sample point to store the feature descriptor. When the method is realized, each local map does not comprise the image characteristics of the sampling points, so that the data volume of the point cloud map can be greatly reduced, and the storage space of the point cloud map in the electronic equipment is saved.
When each local map does not include the image features of the sampling points, that is, on the premise that only the camera coordinates of the plurality of feature points and the world coordinates of the plurality of sampling points are known, the target rotation relationship and target translation relationship of the camera coordinate system relative to the world coordinate system are searched for through an iteration strategy, and the positioning of the image acquisition device can then be achieved.
For the search of the target rotation relationship and the target translation relationship, for example, the sampling point nearest to (i.e., most matching) each feature point is iteratively searched for by the following embodiments of step S721 to step S725, or step S731 to step S738, thereby obtaining the target rotation relationship and the target translation relationship.
Step S707 determines an orientation of the image capturing device in the point cloud map according to the target rotation relationship.
Step S708, determining world coordinates of the image capturing device according to the target translation relationship and the target rotation relationship.
In the positioning method provided by the embodiment of the application, the image features of the feature points do not need to be extracted, the image features of the feature points do not need to be matched with the image features of a plurality of sampling points in the target local map, and the camera coordinates of each feature point are matched with the world coordinates of a plurality of sampling points only through an iteration strategy, so that the accurate positioning of the image acquisition equipment can be realized. Therefore, the image characteristics of each sampling point do not need to be stored in the point cloud map, and the storage space of the point cloud map is greatly saved.
In other embodiments, for step S706, matching the camera coordinates of the plurality of feature points with the world coordinates of the plurality of sampling points according to the iteration strategy to obtain the target rotation relationship and the target translation relationship of the camera coordinate system with respect to the world coordinate system may be at least implemented by the following steps S721 to S725:
Step S721, selecting an initial target sampling point matched with each feature point from the plurality of sampling points.
In implementation, an initial rotation relationship and an initial translation relationship of the camera coordinate system relative to the world coordinate system may be set, and the feature points are then matched with the sampling points according to the camera coordinates of the feature points, the initial rotation relationship and the initial translation relationship, so that an initial target sampling point matched with each feature point is selected from the sampling points. In one example, the initial target sampling point may be selected through steps S731 to S733 in the following embodiments.
In fact, step S721 only selects a sampling point that possibly matches the feature point; the selected initial target sampling point may not be the point that actually matches the feature point. It is therefore necessary to further determine, through the following steps S722 to S725, whether the initial target sampling point is the point that actually matches the feature point.
Step S722, determining a second rotation relationship and a second translation relationship of the camera coordinate system relative to the world coordinate system according to the camera coordinates of each feature point and the world coordinates of the corresponding initial target sampling point.
In implementation, an error function may be constructed according to the camera coordinates of each feature point and the world coordinates of the corresponding initial target sampling point; the currently optimal second rotation relationship and second translation relationship are then solved by the least squares method. For example, the set of camera coordinates of n feature points is denoted P = {p_1, p_2, ..., p_i, ..., p_n}, where p_i is the camera coordinate of the i-th feature point, and the set of world coordinates of the initial target sampling points matched with the n feature points is denoted Q = {q_1, q_2, ..., q_i, ..., q_n}, where q_i is the world coordinate of the i-th initial target sampling point. The following formula (7) can then be listed:

E(R, T) = \frac{1}{n} \sum_{i=1}^{n} \left\| q_i - (R p_i + T) \right\|^2    (7)

where E(R, T) is the error function, and R and T are the second rotation relationship and the second translation relationship to be solved, respectively. The optimal solution of R and T in formula (7) can then be found by the least squares method.
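For illustration only (this sketch is not part of the original disclosure), the least-squares solution of formula (7) for two matched 3D point sets can be obtained in closed form via SVD (the Kabsch approach); the function name and the use of NumPy are assumptions made for this sketch.

```python
import numpy as np

def solve_rigid_transform(P, Q):
    """Minimize E(R, T) = (1/n) * sum ||q_i - (R p_i + T)||^2 in closed form.

    P: (n, 3) camera coordinates of the feature points.
    Q: (n, 3) world coordinates of the matched initial target sampling points.
    Returns the second rotation relationship R (3x3) and translation T (3,).
    """
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (P - p_mean).T @ (Q - q_mean)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against a reflection solution
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    T = q_mean - R @ p_mean
    return R, T
```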
Step S723, determining a matching error according to the second rotation relationship, the second translation relationship, the camera coordinates of each feature point and the world coordinates of the corresponding initial target sampling point.
Here, the matching error refers to an overall matching error, that is, the matching error of all feature points. After the optimal solution, that is, the second rotation relationship and the second translation relationship, is obtained, the camera coordinates of each feature point may be converted into corresponding second world coordinates. If the initial target sampling point selected for a feature point in step S721 and that feature point represent the same position point or two similar position points in the actual physical space, the second world coordinates of the feature point should be the same as or similar to the world coordinates of the corresponding initial target sampling point. Otherwise, if the two do not represent the same position point or two similar position points, the second world coordinates of the feature point differ from the world coordinates of the corresponding initial target sampling point. Based on this, the matching error may be determined through the following steps S735 and S736, and it is then determined, based on the matching error and a preset threshold, whether the initial target sampling point is the point that actually matches the feature point, thereby determining the target rotation relationship and the target translation relationship.
Step S724, if the matching error is greater than the preset threshold, returning to step S721, re-selecting the initial target sampling point, and re-determining the matching error.
It can be appreciated that if the matching error is greater than the preset threshold, it indicates that the currently selected initial target sampling point is not the sampling point that matches the feature point, that is, the two do not represent the same position point or similar position points in the physical space. In this case, it is necessary to return to step S721 to reselect an initial target sampling point and, based on the reselected initial target sampling point, re-execute steps S722 to S723 to re-determine the matching error, until the re-determined matching error is smaller than the preset threshold. The initial target sampling point selected in the current iteration is then considered to be the point that actually matches the feature point, and the second rotation relationship and the second translation relationship obtained in the current iteration may be determined as the target rotation relationship and the target translation relationship, respectively.
Conversely, in other embodiments, if the matching error is less than or equal to the preset threshold, determining an orientation (i.e., a pose) of the image capturing device in the point cloud map according to the second rotation relationship obtained by the current iteration, and determining coordinates (i.e., world coordinates) of the image capturing device in the point cloud map according to the second translation relationship obtained by the current iteration.
Step S725, determining the second rotation relationship and the second translation relationship when the re-determined matching error is less than or equal to the preset threshold as the target rotation relationship and the target translation relationship, respectively.
In other embodiments, for step S706, the matching, according to an iteration strategy, the camera coordinates of each feature point with the world coordinates of the plurality of sampling points to obtain the target rotation relationship and the target translation relationship of the camera coordinate system relative to the world coordinate system may be further implemented at least by the following steps S731 to S738:
step S731, obtaining a third rotation relationship and a third translation relationship of the camera coordinate system with respect to the world coordinate system.
In implementation, the third rotational relationship and the third translational relationship may each be set to an initial value.
Step S732, determining a first world coordinate of the jth feature point according to the third rotation relationship, the third translation relationship, and the camera coordinates of the jth feature point, where j is an integer greater than 0.
Step S733, matching the first world coordinate of each feature point with the world coordinates of the plurality of sampling points to obtain a corresponding initial target sampling point.
In implementation, a distance between the first world coordinate of the feature point and the world coordinate of each of the sampling points may be determined, and then a sampling point closest to the feature point may be determined as the initial target sampling point, or a sampling point having a distance less than or equal to a distance threshold may be determined as the initial target sampling point. In implementation, a euclidean distance between the first world coordinate of the feature point and the world coordinate of the sampling point may be determined, where the euclidean distance is taken as a distance between the feature point and the sampling point.
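As an illustrative sketch (not from the original disclosure), the nearest-sampling-point search described above can be implemented with a KD-tree; the use of SciPy's cKDTree and the optional distance threshold are assumptions for this sketch.

```python
import numpy as np
from scipy.spatial import cKDTree

def select_initial_targets(first_world_coords, sample_world_coords, dist_threshold=None):
    """For each transformed feature point, find the nearest sampling point.

    first_world_coords:  (m, 3) first world coordinates of the feature points.
    sample_world_coords: (k, 3) world coordinates of the sampling points.
    Returns nearest-neighbour indices (-1 where no match passes the threshold) and distances.
    """
    tree = cKDTree(sample_world_coords)
    dists, idx = tree.query(first_world_coords)   # Euclidean nearest neighbour
    if dist_threshold is not None:
        idx = np.where(dists <= dist_threshold, idx, -1)
    return idx, dists
```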
Step S734, determining a second rotation relationship and a second translation relationship of the camera coordinate system with respect to the world coordinate system according to the camera coordinates of each feature point and the world coordinates of the corresponding initial target sampling point;
in step S735, a second world coordinate of the jth feature point is determined according to the second rotation relationship, the second translation relationship, and the camera coordinate of the jth feature point, where j is an integer greater than 0.
In step S736, the matching error is determined according to the second world coordinates of each of the feature points and the world coordinates of the corresponding initial target sampling point.
In implementation, a distance (for example, a Euclidean distance) between the second world coordinates of each feature point and the world coordinates of the corresponding initial target sampling point may be determined; the matching error is then determined based on each of the distances.
Here, the average distance between the plurality of feature points and their matched initial target sampling points may be determined as the matching error. For example, the set of second world coordinates of the n feature points is denoted P' = {p'_1, p'_2, ..., p'_i, ..., p'_n}, where p'_i is the second world coordinate of the i-th feature point, and the set of world coordinates of the initial target sampling points matched with the n feature points is denoted Q = {q_1, q_2, ..., q_i, ..., q_n}, where q_i is the world coordinate of the i-th initial target sampling point. The matching error d can then be obtained by the following formula (8):

d = \frac{1}{n} \sum_{i=1}^{n} \left\| p'_i - q_i \right\|_2    (8)

where \| p'_i - q_i \|_2 denotes the Euclidean distance between a feature point and its matched initial target sampling point.
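A minimal sketch of formula (8), assuming the second world coordinates and the matched sampling-point coordinates are available as NumPy arrays (the array names are illustrative):

```python
import numpy as np

def matching_error(P2, Q):
    """Formula (8): mean Euclidean distance between the second world coordinates
    of the feature points (P2, shape (n, 3)) and the world coordinates of their
    matched initial target sampling points (Q, shape (n, 3))."""
    return float(np.linalg.norm(P2 - Q, axis=1).mean())
```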
And step S737, if the matching error is greater than a preset threshold, taking the second translational relationship as the third translational relationship, taking the second rotational relationship as the third rotational relationship, returning to step S732, re-selecting the initial target sampling point, and re-determining the matching error until the re-determined matching error is less than the preset threshold.
It may be understood that if the matching error is greater than the preset threshold, it indicates that the obtained third rotation relationship and third translation relationship are not accurate; in other words, the obtained initial target sampling points are not the points that actually match the feature points. In this case, the second translation relationship may be taken as the third translation relationship and the second rotation relationship as the third rotation relationship, and steps S732 to S736 are re-executed until the matching error is less than the threshold, after which step S738 is performed.
And S738, determining the second rotation relationship and the second translation relationship when the re-determined matching error is smaller than or equal to the preset threshold value as the target rotation relationship and the target translation relationship respectively.
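The following sketch, provided for illustration only, chains steps S731 to S738 together in the manner described above; the identity initialization of the third rotation/translation relationship, the use of SciPy's cKDTree, the iteration cap, and the threshold value are all assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def iterative_pose_match(feat_cam, sample_world, threshold=0.05, max_iters=50):
    """Estimate the target rotation/translation of the camera frame w.r.t. the world
    frame by iteratively matching the camera coordinates of the feature points
    (feat_cam, (m, 3)) against the world coordinates of the sampling points
    (sample_world, (k, 3))."""
    R, T = np.eye(3), np.zeros(3)   # step S731: third rotation/translation (initial values)
    tree = cKDTree(sample_world)
    for _ in range(max_iters):
        # Steps S732/S733: first world coordinates, then nearest sampling points.
        first_world = feat_cam @ R.T + T
        _, idx = tree.query(first_world)
        Q = sample_world[idx]
        # Step S734: closed-form least-squares fit (second rotation/translation).
        p_mean, q_mean = feat_cam.mean(axis=0), Q.mean(axis=0)
        H = (feat_cam - p_mean).T @ (Q - q_mean)
        U, _, Vt = np.linalg.svd(H)
        R2 = Vt.T @ U.T
        if np.linalg.det(R2) < 0:
            Vt[-1, :] *= -1
            R2 = Vt.T @ U.T
        T2 = q_mean - R2 @ p_mean
        # Steps S735/S736: matching error between second world coordinates and targets.
        second_world = feat_cam @ R2.T + T2
        err = np.linalg.norm(second_world - Q, axis=1).mean()
        R, T = R2, T2               # step S737: feed back as the new third relationship
        if err <= threshold:        # step S738: accept as target rotation/translation
            break
    return R, T
```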
Schemes for indoor positioning based on wireless devices such as WIFI and Bluetooth have to some extent become common in daily life. The related art provides an indoor rapid comprehensive positioning method based on WIFI position fingerprint data, which discloses a fusion positioning method based on WIFI and PDR and includes the following steps: positioning is carried out through WIFI and PDR respectively, and if no WIFI positioning data are received within a time tolerance T, the PDR positioning data are regarded as the positioning result; if the error between the WIFI-based positioning result and the PDR-based positioning result at the same moment exceeds a threshold, sensor data from the initial position to the current moment are acquired, and the PDR-based positioning result, namely the position of the mobile terminal (an example of the electronic device), is obtained; if the error does not exceed the threshold, the PDR-based positioning result at that moment is reset, the WIFI-based positioning result is taken as the initial position of the PDR positioning, the PDR-based positioning result, namely the position of the mobile terminal, is recalculated according to the sensor data from that moment to the current moment, and the above steps are repeated for continuous positioning. The key technical point of this scheme is to combine WIFI fingerprint positioning with PDR positioning, using the global consistency of WIFI fingerprint positioning to correct the accumulated error of PDR positioning, while using the stable output of PDR positioning to make up for the insufficient accuracy and stability of WIFI fingerprint positioning.
In the related art, although the two indoor positioning technologies of WIFI fingerprinting and PDR are combined, the accuracy of indoor positioning based on WIFI fingerprints is currently about 2 meters, so there is still room to improve positioning accuracy. In addition, the related art above can only output coordinates as the positioning result and cannot provide attitude information, so the output positioning result offers limited degrees of freedom for application scenarios.
Based on this, an exemplary application of the embodiments of the present application in one practical application scenario will be described below.
The embodiment of the application realizes an indoor positioning method combining a wireless indoor positioning technology with a sparse point cloud, which can help position a user in real time and with high precision. The wireless indoor positioning technology, such as the WIFI fingerprint positioning technology, helps perform coarse positioning to obtain the approximate position of the user, and fine positioning is then performed through a sparse point cloud map (an example of the point cloud map) to obtain the accurate position and pose of the user. For a large-scale indoor scene, the scheme combines a WIFI fingerprint map (an example of the fingerprint map) with a sparse point cloud map, and has high positioning precision and strong robustness; the sparse point cloud map comprises a plurality of local maps, each local map comprising attribute information of a plurality of sparse points (i.e., sampling points), such as the world coordinates and image features of the sparse points. The present embodiment includes two main parts: map construction, and fusion positioning.
The sparse point cloud map is actually a set of attribute information including a plurality of sampling points in a physical space, and the intervals between the sampling points are large.
In the embodiment of the application, the map construction part mainly comprises a WIFI fingerprint map construction, a sparse point cloud map construction and a sparse point cloud map labeling. The sparse point cloud map is constructed by acquiring RGB image information through a monocular camera and extracting corresponding image features. The specific technical steps for constructing the map at least comprise the following steps S11 to S17:
Step S11, collecting a WIFI fingerprint map and storing it locally;
step S12, RGB image acquisition is carried out by utilizing a monocular camera;
step S13, extracting attribute information (such as image features and camera coordinates of sparse points in an image) in the image in real time in the acquisition process;
step S14, after a certain number of RGB images are acquired, initializing the relative rotation and translation of the images by using an SFM method;
step S15, after initialization is completed, calculating the three-dimensional world coordinates (an example of the world coordinates) of the sparse points in subsequent images through a PnP (Perspective-n-Point) algorithm to obtain a local map;
step S16, after collecting a local map of a complete scene, marking the local map, wherein the marking content is an area ID (namely the identification) and is associated with a WIFI fingerprint map;
In step S17, each local map and its corresponding image features are stored; for example, such information is serialized and stored locally as an offline map.
Here, the acquisition of the WIFI fingerprint map in step S11 is explained as follows. The process of acquiring the WIFI fingerprint map is called the offline stage of the WIFI fingerprint indoor positioning method. In the offline stage, in order to collect the fingerprints of the various locations (i.e., sub-areas) and build a database, multiple measurements are required in the designated area, and the collected data are also referred to as the training set. The correspondence between location and fingerprint is typically established in the offline stage.
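As an illustrative sketch (not part of the original disclosure), the offline stage could produce a fingerprint map that associates each sub-area identifier with the average of several RSS measurements; the dictionary layout, AP ordering, and example values below are assumptions.

```python
import numpy as np

def build_fingerprint_map(measurements):
    """measurements: {area_id: list of RSS vectors (one value per known AP)}.
    Returns {area_id: mean RSS vector}, i.e. one fingerprint per sub-area."""
    return {area_id: np.mean(np.asarray(rss_list, dtype=float), axis=0)
            for area_id, rss_list in measurements.items()}

# Example: two sub-areas, three APs, two scans each (values are illustrative).
fingerprints = build_fingerprint_map({
    "area_01": [[-48, -67, -80], [-50, -66, -82]],
    "area_02": [[-72, -55, -60], [-70, -57, -61]],
})
```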
The image feature extraction from the RGB images in step S13 is explained as follows. The feature extraction process is in effect a process of interpreting and labeling the RGB images. In one example, FAST corner points are extracted from each RGB image, and the number of extracted corner points is generally fixed at 150 for image tracking; ORB descriptors are then computed at these corner points for descriptor matching of the sparse points. Here, 150 is an empirical, preset value: too few corner points may result in a high tracking failure rate, while too many may reduce algorithm efficiency.
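For illustration only, the corner and descriptor extraction described above could be performed with OpenCV's ORB implementation (which detects FAST corners internally); the 150-corner setting follows the empirical value mentioned above, and everything else in this sketch is an assumption.

```python
import cv2

def extract_orb_features(image_path, n_corners=150):
    """Detect FAST-based ORB corners and compute ORB descriptors for one RGB frame."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    orb = cv2.ORB_create(nfeatures=n_corners)   # ORB uses FAST corners internally
    keypoints, descriptors = orb.detectAndCompute(img, None)
    return keypoints, descriptors
```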
The initialization using the SFM method in step S14 is explained as follows. After a certain number of images are acquired, the relative rotation and translation of the images are initialized using the SFM method, and the three-dimensional world coordinates of the sparse points are obtained. The SFM algorithm includes at least the following steps S131 to S139:
step S131, carrying out pairwise matching on a certain number of images, and establishing a matching relationship between sparse points of the images by using a Euclidean distance judging method;
step S132, rejecting mismatched pairs, where the rejection method computes a fundamental matrix using the RANSAC eight-point method, and matching pairs that do not satisfy the fundamental matrix are rejected;
step S133, after the matching relationship is established, generating a tracking list, where the tracking list refers to the set of image names in which the same physical point (corresponding point) is observed;
step S134, invalid matching in the tracking list is eliminated;
step S135, searching for an initialization image pair, the aim being to find the image pair with the largest camera baseline. A homography matrix is calculated using the RANSAC four-point method; matching points that satisfy the homography matrix are called inliers, and matching points that do not are called outliers. The image pair with the smallest inlier ratio is selected;
Step S136, searching for the relative rotation and translation of the initialized image pairs, wherein the method is to calculate an essential matrix through a RANSAC eight-point method, and obtain the relative rotation and translation between the image pairs through SVD decomposition of the essential matrix;
step S137, obtaining the three-dimensional world coordinates of the sparse points in the initialization image pair through triangulation calculation (a sketch of steps S136 and S137 is given after this list);
step S138, repeatedly executing steps S136 and S137 on other images to obtain the relative rotation and translation of all the images and the three-dimensional world coordinates of the sparse points;
step S139, optimizing the rotations and translations between the obtained images and the three-dimensional world coordinates of the sparse points by bundle adjustment. This is a non-linear optimization procedure aimed at reducing the error in the SFM results.
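As an illustrative sketch of the core of steps S136 and S137 (not part of the original disclosure), the relative rotation and translation can be recovered from an essential matrix estimated with RANSAC, after which the sparse points are triangulated with OpenCV; the intrinsic matrix K, the already-matched pixel coordinates, and the RANSAC threshold are assumptions.

```python
import cv2
import numpy as np

def initialize_pair(pts1, pts2, K):
    """pts1, pts2: (n, 2) float arrays of matched pixel coordinates in the two
    images of the initialization pair; K: 3x3 camera intrinsic matrix.
    Returns relative rotation R, translation t, and (n, 3) triangulated points."""
    # Step S136: essential matrix with RANSAC, then recover R, t (SVD-based inside OpenCV).
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    # Step S137: triangulate with the first camera at the map origin.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
    pts3d = (pts4d[:3] / pts4d[3]).T
    return R, t, pts3d
```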
Based on steps S11 to S17, a WIFI fingerprint map and a sparse point cloud map (comprising a plurality of local maps) may be constructed, where the sparse point cloud map stores the sparse point cloud together with its image feature information (including three-dimensional coordinates and descriptor information) and labeling information locally in a binary format. In the fusion positioning process, the two maps are loaded and used respectively.
In the embodiment of the application, the fusion positioning part mainly comprises coarse positioning by utilizing WIFI fingerprints and vision-based fine positioning. The coarse positioning process determines the approximate location of the user and also determines the local map to be loaded; the precise positioning is to collect the current RGB image through a monocular camera, load a target local map selected by coarse positioning, find a matching pair between the current characteristic point and a sparse point in the target local map by utilizing descriptor matching, and finally solve the precise pose of the current camera in the map through a PnP algorithm so as to achieve the positioning purpose, wherein the specific technical steps at least comprise the following steps of S21 to S26:
Step S21, coarse positioning is carried out through WIFI fingerprint positioning, and the approximate position of the user is obtained;
s22, selecting a local scene by using a rough positioning result, and loading a corresponding target local map;
s23, RGB image acquisition is carried out by using a monocular camera;
step S24, extracting attribute information in the current frame image in real time in the acquisition process;
step S25, matching pairs between the current characteristic points and sparse points in the target local map are found through descriptor matching;
step S26, after enough matching pairs are found, solving the accurate pose of the current camera in the map coordinate system through a PnP algorithm.
The coarse positioning through WIFI fingerprint positioning in step S21, which obtains the approximate position of the user, is explained as follows. The WIFI fingerprint positioning phase is generally referred to as the online phase. The mobile device measures the RSS from each AP, obtaining a measured RSS vector r = [r_1, r_2, ..., r_n] (i.e., the network characteristics of the network in which the image acquisition device is currently located). When determining the location of the mobile device, it is necessary to find the fingerprint ρ in the fingerprint library that best matches r. Once the best match is found, the location of the mobile device is estimated as the location corresponding to this best-matching fingerprint. Matching can be performed by the Euclidean distance or other methods; since the mesh granularity of WIFI fingerprint positioning in the embodiment of the application is relatively coarse, matching by the Euclidean distance is well suited. After the position of the user is determined through WIFI fingerprint positioning, the target local map to be loaded can be found through the mapping relationship between the position and the area ID.
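A minimal sketch of this online matching step, assuming the fingerprint library is a dictionary mapping each area ID to its stored RSS vector (this data layout is an assumption carried over from the offline sketch above):

```python
import numpy as np

def coarse_locate(r, fingerprints):
    """r: measured RSS vector; fingerprints: {area_id: stored RSS vector}.
    Returns the area ID (target identifier) whose fingerprint is closest to r."""
    r = np.asarray(r, dtype=float)
    return min(fingerprints,
               key=lambda area_id: np.linalg.norm(r - np.asarray(fingerprints[area_id], dtype=float)))
```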
Wherein, reference may be made to step S13 for extracting feature information in the current frame image in real time in step S24.
Wherein, for the matching pair between the current feature point and the sparse point in the target local map found by descriptor matching in step S25, the algorithm at least includes the following steps S251 to S254:
Step S251, for the N-th (N initially 0) feature point F_1N extracted from the current image, set the Euclidean distance minimum d_min = d_TH and set the matching point to be empty;

Step S252, for the M-th (M initially 0) feature point F_2M in the sparse point cloud, calculate the Euclidean distance d_NM between the descriptors of F_1N and F_2M;

Step S253, compare the Euclidean distance d_NM with the current minimum d_min; if d_NM < d_min, then set d_min = d_NM and take F_2M as the current matching point of F_1N. Then set M = M + 1; if the sparse points in the sparse point cloud have not all been traversed, jump back to step S252; otherwise set N = N + 1 and jump to step S251. If the feature points of the current image have all been traversed, jump to step S254;
And step S254, the matching pairs between the feature points of the current image and the sparse points in the target local map are collated and output as the result of the algorithm, and the algorithm ends.
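For illustration only, steps S251 to S254 could be realized by the brute-force loop below; the threshold d_TH and the Euclidean comparison follow the description above (in practice binary ORB descriptors are often compared with a Hamming distance instead, e.g. via cv2.BFMatcher with NORM_HAMMING), and everything else in this sketch is an assumption.

```python
import numpy as np

def match_descriptors(curr_desc, map_desc, d_th=64.0):
    """curr_desc: (N, D) descriptors of the current image's feature points.
    map_desc:  (M, D) descriptors of the sparse points in the target local map.
    Returns a list of (feature_index, sparse_point_index) matching pairs."""
    matches = []
    for n, f1 in enumerate(curr_desc):                     # step S251
        d_min, best = d_th, None
        for m, f2 in enumerate(map_desc):                  # steps S252/S253
            d = np.linalg.norm(f1.astype(float) - f2.astype(float))
            if d < d_min:
                d_min, best = d, m
        if best is not None:
            matches.append((n, best))
    return matches                                         # step S254
```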
For the solving of the accurate pose of the current camera in the map coordinate system by PnP algorithm in step S26, there is a preferred example shown in fig. 5:
First, it is judged whether the matching pair sequence formed in step S25 (in this example, the matching pair sequence is {F_0, F_1, F_2, ...}) contains more elements than a threshold TH_2; if so, step S26 is performed, and otherwise the algorithm ends. In this preferred example, based on the matching pair sequence, the solvePnP function in OpenCV is called to solve the pose of the current camera in the map coordinate system. The principle of the PnP algorithm is as follows:
the inputs to the PnP algorithm are three-dimensional (three dimensional, 3D) points (i.e., three-dimensional world coordinates of sparse points in the map coordinate system) and 2D points resulting from the projection of these 3D points in the current image (i.e., camera coordinates of feature points in the current frame), and the output of the algorithm is a pose transformation of the current frame relative to the origin of the map coordinate system (i.e., pose of the current frame in the map coordinate system).
The PnP algorithm does not directly calculate the camera pose matrix according to the matching pair sequence, but calculates the 3D coordinates of the corresponding 2D point under the current coordinate system, and then calculates the camera pose according to the 3D coordinates under the map coordinate system and the 3D coordinates under the current coordinate system.
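As an illustrative sketch (not part of the original disclosure), step S26 can be realized with OpenCV's solvePnP; the intrinsic matrix, distortion coefficients, and the iterative flag are assumptions, and cv2.solvePnPRansac could be used instead for robustness to outlier matches.

```python
import cv2
import numpy as np

def solve_camera_pose(map_points_3d, image_points_2d, K, dist_coeffs=None):
    """map_points_3d: (n, 3) world coordinates of the matched sparse points.
    image_points_2d: (n, 2) pixel coordinates of the matched feature points.
    Returns the rotation matrix and translation vector mapping map coordinates
    into the camera frame (invert to get the camera pose in the map frame)."""
    if dist_coeffs is None:
        dist_coeffs = np.zeros(5)
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(map_points_3d, dtype=np.float64),
        np.asarray(image_points_2d, dtype=np.float64),
        K, dist_coeffs, flags=cv2.SOLVEPNP_ITERATIVE)
    R, _ = cv2.Rodrigues(rvec)        # rotation vector -> rotation matrix
    return ok, R, tvec
```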
Based on the steps S21 to S26, coarse positioning can be performed through WIFI fingerprint positioning, the current user position is roughly determined, then fine positioning is performed in a predefined sparse point cloud map (i.e., a target local map) by using visual information, and the accurate position and posture of the user in the current environment are determined. The positioning scheme combines the traditional wireless positioning technology with a high-precision high-robustness visual algorithm, and has the advantages of higher positioning result precision, high degree of freedom and strong robustness.
In the embodiment of the application, the wireless indoor positioning method and the visual positioning method are combined, so that the positioning result can provide both position and orientation (pose) at the same time, and the degrees of freedom of positioning are increased relative to other indoor positioning methods.
In the embodiment of the application, through a cascading mode, coarse positioning is performed through WIFI fingerprints, then fine positioning is performed by combining a high-precision high-robustness visual matching algorithm, positioning precision is high, and centimeter level can be achieved theoretically.
In the embodiment of the application, the wireless indoor positioning method is used for coarse positioning firstly, and after a coarse positioning result is obtained, the sparse point cloud map is selectively loaded, so that compared with pure visual positioning, the overhead of memory resources is reduced, and the method can be used for large-scale indoor scenes.
The positioning method provided by the embodiment of the application combines a wireless indoor positioning method with a visual positioning method. In map construction, not only is a WIFI fingerprint map collected, but the three-dimensional coordinates and descriptor information of the feature points in the visual images are also collected using camera motion and stored as a visual map in the form of a sparse point cloud, and the two maps are associated through the area ID. In positioning, the WIFI fingerprint positioning technology is first used for coarse positioning; after the approximate position of the user is obtained, a local map in the sparse point cloud map is selectively loaded, descriptor matching is then used to find matching pairs for the current feature points in the target local map, and the current position and pose are accurately calculated through the PnP algorithm. Combining the two yields a high-precision, highly robust indoor positioning method that can be used in large-scale indoor scenes and meets productization requirements.
The sparse point cloud map comprises the three-dimensional coordinates and descriptor information of the sparse points. The descriptor information is used for matching with the feature points in the current image during visual positioning. In one example, the descriptor information is an ORB descriptor, and the descriptor information of each sparse point occupies 256 bytes of space. For an offline map stored in sparse point cloud form, 256 bytes are allocated per sparse point as storage space for the feature descriptor, which accounts for a considerable proportion of the final offline map size. In order to reduce the size of the offline map, the embodiment of the application serializes and stores only the three-dimensional coordinate information of the sparse points. For the fused positioning part, the embodiment of the present application provides an adjusted positioning scheme, which includes at least the following steps S31 to S36:
step S31, coarse positioning is carried out through WIFI fingerprint positioning, and the approximate position of the user is obtained;
step S32, selecting a local scene by using the rough positioning result, and loading a specific sparse point cloud map (namely a target local map);
s33, carrying out RGB image acquisition by using a monocular camera;
step S34, extracting attribute information (namely camera coordinates of feature points) in the current frame image in real time in the acquisition process;
Step S35, calculating three-dimensional camera coordinates of feature points in the current image to form local point clouds;
step S36, matching the local point cloud with the map sparse point cloud through an iterative closest point (Iterative Closest Point, ICP) algorithm, and solving the accurate pose of the current camera in the map coordinate system.
Wherein, for matching the local point cloud and the map sparse point cloud by ICP algorithm in step S36, the following explanation is given here.
The ICP algorithm is essentially an optimal registration method based on the least squares method. The algorithm repeatedly selects pairs of corresponding points and calculates the optimal rigid body transformation until the convergence accuracy requirement for correct registration is met. The basic principle of the ICP algorithm is as follows: nearest point pairs (p_i, q_i) are found between the target point cloud P and the source point cloud Q to be matched according to certain constraint conditions; the optimal rotation R and translation T are then calculated such that the error function is minimized, where the error function E(R, T) is formulated as:

E(R, T) = \frac{1}{n} \sum_{i=1}^{n} \left\| q_i - (R p_i + T) \right\|^2

where n is the number of matched point pairs, p_i is a point in the target point cloud P, q_i is the nearest point in the source point cloud Q corresponding to p_i, R is the rotation matrix, and T is the translation vector.
Based on steps S31 to S36, the positioning purpose can be achieved in the predefined sparse point cloud map through the combined wireless positioning and visual positioning method, and the position and posture of the device itself in the map coordinate system can be obtained. Moreover, the predefined sparse point cloud map does not need to store additional feature point descriptor information, so the size of the offline map is compressed.
Based on the foregoing embodiments, the embodiments of the present application provide a positioning device, where the positioning device includes each module included, and each unit included in each module may be implemented by a processor in an electronic device; of course, the method can also be realized by a specific logic circuit; in an implementation, the processor may be a Central Processing Unit (CPU), a Microprocessor (MPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), or the like.
Fig. 6A is a schematic diagram of a composition structure of a positioning device according to an embodiment of the present application. As shown in fig. 6A, the device 600 includes a first positioning module 601, a map obtaining module 602, an image obtaining module 603, and a second positioning module 604; wherein,
the first positioning module 601 is configured to determine a target identifier for representing a current area of the image acquisition device according to network characteristics of a network where the image acquisition device is located and a fingerprint map;
a map obtaining module 602, configured to obtain a target local map corresponding to the target identifier from a plurality of local maps included in the point cloud map, where the local map includes attribute information of a plurality of sampling points, and world coordinates in the attribute information of the sampling points are world coordinates of the sampling points in the sample image;
An image acquisition module 603 configured to acquire an image to be processed acquired by the image acquisition device;
and a second positioning module 604, configured to position the image acquisition device according to the to-be-processed image and attribute information of a plurality of sampling points in the target local map.
In other embodiments, as shown in fig. 6B, the apparatus 600 further comprises: a fingerprint map construction module 605; the fingerprint map construction module 605 is configured to: acquiring network characteristics of at least one sub-area in at least one physical area and an identification of each sub-area; and associating the network characteristics of each sub-region with the corresponding identifiers to obtain the fingerprint map.
In other embodiments, as shown in fig. 6B, the apparatus 600 further includes a point cloud map construction module 606 configured to: obtaining a local map corresponding to each physical area, wherein each local map comprises attribute information of sampling points in the corresponding physical area; and marking each local map with the identification of each sub-region in the corresponding physical region, to obtain the point cloud map.
In other embodiments, the point cloud map construction module 606 includes: the world coordinate determination submodule is configured to: determining world coordinates of at least one sampling point according to camera coordinates and image characteristics of the sampling points in the plurality of sample images; a dataset determination submodule configured to: determining a first data set according to the world coordinates and the image characteristics of each sampling point; determining a corresponding second data set according to camera coordinates and image characteristics of sampling points in the obtained kth other sample image; wherein the plurality of sample images and each of the other sample images are different images acquired in an ith physical area, and k and i are integers greater than 0; the world coordinate determination submodule is further configured to: determining world coordinates of sampling points in the other sample images according to the first data set and each second data set; a local map construction sub-module configured to: and constructing a local map corresponding to the ith physical area at least according to the determined world coordinates of each sampling point.
In other embodiments, the world coordinate determination sub-module includes: an image pair selecting unit configured to: selecting a first target image and a second target image meeting a second condition from the plurality of sample images according to the camera coordinates and the image characteristics of each sampling point in the plurality of sample images; a transformation relation determining unit configured to: determining a fourth rotational relationship and a fourth translational relationship between the first target image and the second target image; a world coordinate determination unit configured to: and determining world coordinates of the sampling points in the first target image according to the fourth rotation relation, the fourth translation relation and camera coordinates of the sampling points in the first target image.
In other embodiments, the image pair picking unit is configured to: according to the image characteristics of each sampling point in the plurality of sample images, carrying out pairwise matching on the plurality of sample images to obtain a first matching pair set of each pair of sample images; removing sampling point matching pairs which do not meet a third condition in the first matching pair set to obtain a second matching pair set; selecting a target matching pair set with the number of matching pairs meeting the second condition from each second matching pair set; and determining two sample images corresponding to the target matching pair set as a first target image and a second target image.
In other embodiments, the first positioning module 601 includes: a feature matching sub-module configured to: matching the network characteristics of the network where the image acquisition equipment is located with the network characteristics of each subarea in the fingerprint map to obtain target network characteristics; the first acquisition submodule is configured to: and acquiring a target identifier associated with the target network characteristic from the fingerprint map.
In other embodiments, the second positioning module 604 includes: a second acquisition sub-module configured to: determining attribute information of feature points in the image to be processed; acquiring attribute information of a plurality of sampling points in the target local map; a fine positioning sub-module configured to: and matching the attribute information of the plurality of feature points with the attribute information of the plurality of sampling points to obtain the position information of the image acquisition equipment, so as to realize positioning of the image acquisition equipment.
In other embodiments, the attribute information of the feature points includes image features and camera coordinates; the attribute information of the sampling points comprises image characteristics and world coordinates; the fine positioning sub-module includes: the sampling point matching unit is configured to match the image features of the jth feature point with the image features of the sampling points to obtain a first target sampling point matched with the jth feature point, wherein j is an integer greater than 0; and the positioning unit is configured to determine the position information of the image acquisition equipment according to the camera coordinates of the feature points and the world coordinates of the corresponding first target sampling points.
In other embodiments, the sampling point matching unit is configured to: determining the similarity between the image features of the jth feature point and the image features of each sampling point to obtain a similarity set; and determining the sampling points with the similarity meeting the first condition in the similarity set as first target sampling points matched with the j-th feature points.
In other embodiments, the location information includes world coordinates of the image capture device and an orientation of the image capture device in the point cloud map; the positioning unit is configured to: determining the camera coordinates of each first target sampling point according to the camera coordinates of a plurality of the characteristic points and the world coordinates of the first target sampling points of the characteristic points; determining a first rotation relation and a first translation relation of a camera coordinate system relative to a world coordinate system according to the world coordinate of each first target sampling point and the camera coordinate of each first target sampling point; determining world coordinates of the image acquisition device according to the first translation relationship and the first rotation relationship; and determining the orientation of the image acquisition equipment in the point cloud map according to the first rotation relation.
In other embodiments, the attribute information of the feature points includes camera coordinates, and the attribute information of the sample points includes world coordinates; the position information comprises world coordinates of the image acquisition equipment and the orientation of the image acquisition equipment in the point cloud map; the second positioning module comprises: an iteration unit configured to: according to an iteration strategy, matching camera coordinates of the feature points with world coordinates of the sampling points to obtain a target rotation relationship and a target translation relationship of a camera coordinate system relative to the world coordinate system; the positioning unit is further configured to: determining the orientation of the image acquisition equipment in the point cloud map according to the target rotation relation; and determining world coordinates of the image acquisition equipment according to the target translation relationship and the target rotation relationship.
In other embodiments, the iteration unit includes: selecting a subunit configured to: selecting an initial target sampling point matched with each characteristic point from the plurality of sampling points; a determining subunit configured to: determining a second rotation relation and a second translation relation of the camera coordinate system relative to the world coordinate system according to the camera coordinates of each feature point and the world coordinates of the corresponding initial target sampling point; determining a matching error according to the second rotation relation, the second translation relation, the camera coordinates of each feature point and the world coordinates of the corresponding initial target sampling point; if the matching error is larger than a preset threshold value, re-selecting an initial target sampling point and re-determining the matching error; and respectively determining a second rotation relation and a second translation relation when the re-determined matching error is smaller than or equal to the preset threshold value as the target rotation relation and the target translation relation.
In other embodiments, the selecting subunit is configured to: acquiring a third rotation relationship and a third translation relationship of the camera coordinate system relative to the world coordinate system; determining a first world coordinate of a jth feature point according to the third rotation relation, the third translation relation and the camera coordinate of the jth feature point, wherein j is an integer greater than 0; and matching the first world coordinates of each characteristic point with the world coordinates of the plurality of sampling points to obtain a corresponding initial target sampling point.
In other embodiments, the determining subunit is configured to: determining a second world coordinate of a jth feature point according to the second rotation relation, the second translation relation and the camera coordinate of the jth feature point, wherein j is an integer greater than 0; and determining the matching error according to the second world coordinates of each characteristic point and the world coordinates of the corresponding initial target sampling point.
In other embodiments, the determining subunit is configured to: determining the distance between the world coordinates of the second world coordinate of each feature point and the world coordinates of the corresponding initial target sampling point; and determining the matching error according to each distance.
In other embodiments, the selecting subunit is configured to: and if the matching error is larger than a preset threshold, taking the second translation relation as the third translation relation, taking the second rotation relation as the third rotation relation, and re-selecting an initial target sampling point.
The description of the apparatus embodiments above is similar to that of the method embodiments above, with similar advantageous effects as the method embodiments. For technical details not disclosed in the device embodiments of the present application, please refer to the description of the method embodiments of the present application for understanding.
It should be noted that, in the embodiment of the present application, if the positioning method is implemented in the form of a software functional module, and is sold or used as a separate product, the positioning method may also be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application may be embodied essentially or in a part contributing to the related art in the form of a software product stored in a storage medium, including several instructions for causing an electronic device (which may be a mobile phone, a tablet computer, a notebook computer, a desktop computer, a robot, an unmanned aerial vehicle, etc.) to perform all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read Only Memory (ROM), a magnetic disk, an optical disk, or other various media capable of storing program codes. Thus, embodiments of the present application are not limited to any specific combination of hardware and software.
Correspondingly, an electronic device is provided in the embodiment of the present application. Fig. 7 is a schematic diagram of a hardware entity of the electronic device in the embodiment of the present application. As shown in fig. 7, the hardware entity of the electronic device 700 includes a memory 701 and a processor 702, the memory 701 storing a computer program executable on the processor 702, and the processor 702 implementing the steps of the positioning method provided in the above-described embodiments when executing the program.
The memory 701 is configured to store instructions and applications executable by the processor 702, and may also cache data (e.g., image data, audio data, voice communication data, and video communication data) to be processed or processed by the respective modules in the processor 702 and the electronic device 700, which may be implemented by a FLASH memory (FLASH) or a random access memory (Random Access Memory, RAM).
Accordingly, embodiments of the present application provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the positioning method provided in the above embodiments.
It should be noted here that: the description of the storage medium and apparatus embodiments above is similar to that of the method embodiments described above, with similar benefits as the method embodiments. For technical details not disclosed in the embodiments of the storage medium and the apparatus of the present application, please refer to the description of the method embodiments of the present application for understanding.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application. The foregoing embodiment numbers of the present application are merely for describing, and do not represent advantages or disadvantages of the embodiments.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above described device embodiments are only illustrative, e.g. the division of the units is only one logical function division, and there may be other divisions in practice, such as: multiple units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. In addition, the various components shown or discussed may be coupled or directly coupled or communicatively coupled to each other via some interface, whether indirectly coupled or communicatively coupled to devices or units, whether electrically, mechanically, or otherwise.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; can be located in one place or distributed to a plurality of network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated in one unit; the integrated units may be implemented in hardware or in hardware plus software functional units.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the above method embodiments may be implemented by hardware related to program instructions, and the foregoing program may be stored in a computer readable storage medium, where the program, when executed, performs steps including the above method embodiments; and the aforementioned storage medium includes: a mobile storage device, a Read Only Memory (ROM), a magnetic disk or an optical disk, or the like, which can store program codes.
Alternatively, the integrated units described above may be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product. Based on such understanding, the technical solutions of the embodiments of the present application may be embodied essentially or in a part contributing to the related art in the form of a software product stored in a storage medium, including several instructions for causing an electronic device (which may be a mobile phone, a tablet computer, a notebook computer, a desktop computer, a robot, an unmanned aerial vehicle, etc.) to perform all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a removable storage device, a ROM, a magnetic disk, or an optical disk.
The methods disclosed in the several method embodiments provided in the present application may be arbitrarily combined without collision to obtain a new method embodiment.
The features disclosed in the several product embodiments provided in the present application may be combined arbitrarily without conflict to obtain new product embodiments.
The features disclosed in the several method or apparatus embodiments provided in the present application may be arbitrarily combined without conflict to obtain new method embodiments or apparatus embodiments.
The foregoing is merely an embodiment of the present application, but the protection scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered in the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (19)
1. A method of positioning, the method comprising:
determining a target identifier for representing a current area of the image acquisition equipment according to network characteristics of a network where the image acquisition equipment is located and a fingerprint map;
acquiring a target local map corresponding to the target identifier from a plurality of local maps included in the point cloud map, wherein the local map comprises attribute information of a plurality of sampling points, and world coordinates in the attribute information of the sampling points are world coordinates of the sampling points in a sample image;
Acquiring an image to be processed acquired by the image acquisition equipment;
positioning the image acquisition equipment according to the to-be-processed image and attribute information of a plurality of sampling points in the target local map;
the construction process of each local map comprises the following steps:
determining world coordinates of at least one sampling point according to camera coordinates and image characteristics of the sampling points in the plurality of sample images;
determining a first data set according to the world coordinates and the image characteristics of each sampling point;
determining a corresponding second data set according to camera coordinates and image characteristics of sampling points in the obtained kth other sample image; wherein the plurality of sample images and each of the other sample images are different images acquired in an ith physical area, and k and i are integers greater than 0;
determining world coordinates of sampling points in the other sample images according to the first data set and each second data set;
and constructing a local map corresponding to the ith physical area at least according to the determined world coordinates of each sampling point.
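The construction steps recited above register each additional (k-th other) sample image against the data set built from the initial sample images. The Python sketch below illustrates one possible reading, assuming the camera coordinates of the sampling points are 3D points in the camera frame (for example from a depth sensor): the image features of the second data set are matched against the first data set, a rigid camera-to-world transform is estimated from the matched pairs, and the new image's sampling points are then expressed in world coordinates. All function names are hypothetical, and the Kabsch alignment is only a stand-in for whatever solver an implementation actually uses.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares R, t such that R @ src[i] + t ~= dst[i] (Kabsch/SVD)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, dst_c - R @ src_c

def register_other_sample_image(first_descs, first_world, other_descs, other_cam):
    """first_descs (N, D), first_world (N, 3): the first data set (image
    features and world coordinates of already-reconstructed sampling points).
    other_descs (M, D), other_cam (M, 3): the second data set of the k-th
    other sample image (image features and camera-frame coordinates).
    Returns the world coordinates of the sampling points in that image."""
    # Match every sampling point of the other image to the nearest feature
    # in the first data set (brute-force L2 distance, a stand-in rule).
    d2 = ((other_descs[:, None, :] - first_descs[None, :, :]) ** 2).sum(axis=-1)
    nearest = d2.argmin(axis=1)

    # Camera-to-world transform of the other image from the 3D-3D pairs,
    # then lift all of its camera-frame points into the world frame.
    R, t = rigid_transform(other_cam, first_world[nearest])
    return other_cam @ R.T + t
```

Repeating this registration for every other sample image and collecting the returned world coordinates yields the point set from which the local map of the i-th physical area can be built.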
2. The method according to claim 1, wherein the method further comprises:
acquiring network characteristics of at least one sub-area in at least one physical area and an identification of each sub-area;
and associating the network characteristics of each sub-area with the corresponding identification to obtain the fingerprint map.
3. The method according to claim 2, wherein the method further comprises:
obtaining a local map corresponding to each physical area; wherein each local map comprises attribute information of sampling points in the corresponding physical area;
and marking the identification of each sub-area in the corresponding physical area on each local map to obtain the point cloud map.
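As an illustration of the two construction claims above, the sketch below builds the fingerprint map by associating each surveyed sub-area identification with its reference network features, and builds the point cloud map by labelling each local map with the identifications of the sub-areas inside its physical area. The dictionary layout and the names (build_fingerprint_map, build_point_cloud_map) are assumptions for illustration only.

```python
from typing import Dict, Iterable, List, Tuple

def build_fingerprint_map(
    surveys: Iterable[Tuple[str, Dict[str, float]]],
) -> Dict[str, Dict[str, float]]:
    """surveys: (sub_area_id, network_features) pairs collected offline,
    where network_features maps e.g. a Wi-Fi access point to its RSSI.
    Associates each sub-area identification with its reference fingerprint."""
    return {sub_area_id: dict(features) for sub_area_id, features in surveys}

def build_point_cloud_map(
    local_maps: Dict[str, list],              # physical_area_id -> local map
    area_to_sub_areas: Dict[str, List[str]],  # physical_area_id -> sub-area ids
) -> Dict[str, list]:
    """Labels every sub-area identification with the local map of the
    physical area that contains it, so that a target identifier obtained
    from the fingerprint map directly selects a target local map."""
    point_cloud_map = {}
    for area_id, local_map in local_maps.items():
        for sub_area_id in area_to_sub_areas[area_id]:
            point_cloud_map[sub_area_id] = local_map
    return point_cloud_map
```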
4. The method of claim 1, wherein determining world coordinates of at least one sample point from camera coordinates and image features of the sample point in the plurality of sample images comprises:
selecting a first target image and a second target image meeting a second condition from the plurality of sample images according to the camera coordinates and the image characteristics of each sampling point in the plurality of sample images;
determining a fourth rotational relationship and a fourth translational relationship between the first target image and the second target image;
and determining world coordinates of the sampling points in the first target image according to the fourth rotation relation, the fourth translation relation and camera coordinates of the sampling points in the first target image.
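Claim 4 determines a relative rotation and translation between the two target images and then derives world coordinates of the sampling points. One conventional realisation, assuming the camera coordinates of the sampling points are 2D keypoint locations and the camera intrinsics K are known, is essential-matrix recovery followed by triangulation, with the first target image's camera frame taken as the world frame. The OpenCV-based sketch below illustrates that reading only; the patent does not prescribe a specific solver, and the recovered translation is defined only up to scale.

```python
import cv2
import numpy as np

def two_view_reconstruction(pts1, pts2, K):
    """pts1, pts2: (N, 2) float32 arrays of matched keypoints in the first
    and second target image; K: (3, 3) camera intrinsic matrix.
    Returns the relative rotation R, translation t (up to scale) and the
    3D points expressed in the frame of the first target image."""
    E, inliers = cv2.findEssentialMat(
        pts1, pts2, K, method=cv2.RANSAC, prob=0.999, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)

    # Projection matrices: the first camera sits at the origin of the
    # (assumed) world frame, the second at [R | t].
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
    world_pts = (pts4d[:3] / pts4d[3]).T      # de-homogenise to (N, 3)
    return R, t, world_pts
```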
5. The method of claim 4, wherein selecting the first target image and the second target image from the plurality of sample images that satisfy the second condition based on the camera coordinates and the image characteristics of each of the sampling points in the plurality of sample images comprises:
according to the image characteristics of each sampling point in the plurality of sample images, carrying out pairwise matching on the plurality of sample images to obtain a first matching pair set of each pair of sample images;
removing sampling point matching pairs which do not meet a third condition in the first matching pair set to obtain a second matching pair set;
selecting a target matching pair set with the number of matching pairs meeting the second condition from each second matching pair set;
and determining two sample images corresponding to the target matching pair set as a first target image and a second target image.
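For the selection step in claim 5, a common realisation is to match every pair of sample images by descriptor, filter each first matching pair set, and keep the pair whose filtered set is largest. In the sketch below, ORB descriptors, a Lowe ratio test and "largest match count" stand in for the patent's unspecified image features, third condition and second condition respectively.

```python
import itertools
import cv2

def select_target_images(images):
    """images: list of grayscale images (numpy arrays).
    Returns the indices of the first and second target image together with
    their retained (second) matching pair set."""
    orb = cv2.ORB_create()
    features = [orb.detectAndCompute(img, None) for img in images]
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)

    best_pair, best_matches = None, []
    for i, j in itertools.combinations(range(len(images)), 2):
        des_i, des_j = features[i][1], features[j][1]
        if des_i is None or des_j is None:
            continue
        # First matching pair set: two nearest neighbours per descriptor.
        knn = matcher.knnMatch(des_i, des_j, k=2)
        # Second matching pair set: matches passing the ratio test
        # (a stand-in for the patent's third condition).
        good = [m for m, n in (p for p in knn if len(p) == 2)
                if m.distance < 0.75 * n.distance]
        # Assumed second condition: keep the pair with the most matches.
        if len(good) > len(best_matches):
            best_pair, best_matches = (i, j), good
    return best_pair, best_matches
```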
6. The method according to claim 2, wherein determining the target identifier for characterizing the current area of the image capturing device according to the network characteristics of the network in which the image capturing device is located and the fingerprint map comprises:
matching the network characteristics of the network where the image acquisition equipment is located with the network characteristics of each sub-area in the fingerprint map to obtain target network characteristics;
and acquiring a target identifier associated with the target network characteristic from the fingerprint map.
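A minimal sketch of the coarse matching step in claim 6 follows, assuming the network characteristics are Wi-Fi received signal strengths keyed by access point and that "matching" means choosing the nearest stored fingerprint under a Euclidean distance; both assumptions go beyond what the claim itself specifies.

```python
from typing import Dict

def find_target_identifier(
    observed: Dict[str, float],
    fingerprint_map: Dict[str, Dict[str, float]],
    missing_rssi: float = -100.0,
) -> str:
    """observed: access point -> RSSI measured where the device is located.
    Returns the target identifier whose stored fingerprint is closest to
    the observation over the union of access points."""
    best_id, best_dist = None, float("inf")
    for area_id, reference in fingerprint_map.items():
        aps = set(observed) | set(reference)
        dist = sum(
            (observed.get(ap, missing_rssi) - reference.get(ap, missing_rssi)) ** 2
            for ap in aps
        ) ** 0.5
        if dist < best_dist:
            best_id, best_dist = area_id, dist
    return best_id
```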
7. The method according to claim 1, wherein positioning the image acquisition equipment according to the image to be processed and the attribute information of a plurality of sampling points in the target local map includes:
determining attribute information of feature points in the image to be processed;
acquiring attribute information of a plurality of sampling points in the target local map;
and matching the attribute information of the plurality of feature points with the attribute information of the plurality of sampling points to obtain the position information of the image acquisition equipment, so as to realize positioning of the image acquisition equipment.
8. The method of claim 7, wherein the attribute information of the feature points includes image features and camera coordinates; the attribute information of the sampling points comprises image characteristics and world coordinates;
the matching the attribute information of the feature points with the attribute information of the sampling points to obtain the position information of the image acquisition device includes:
matching the image features of the jth feature point with the image features of the plurality of sampling points to obtain a first target sampling point matched with the jth feature point, wherein j is an integer greater than 0;
and determining the position information of the image acquisition equipment according to the camera coordinates of the feature points and the world coordinates of the first target sampling points of the feature points.
9. The method of claim 8, wherein matching the image features of the jth feature point with the image features of the plurality of sample points results in a first target sample point that matches the jth feature point, comprising:
determining the similarity between the image features of the jth feature point and the image features of each sampling point to obtain a similarity set;
and determining the sampling points with the similarity meeting the first condition in the similarity set as first target sampling points matched with the j-th feature points.
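For the per-feature matching of claims 8 and 9, the sketch below computes a similarity set between the j-th feature point's descriptor and the descriptors of all sampling points, and keeps the best sampling point only if it passes a threshold. Cosine similarity and the fixed threshold are assumptions standing in for the patent's unspecified similarity measure and first condition.

```python
import numpy as np

def match_feature_point(feature_desc, sample_descs, min_similarity=0.8):
    """feature_desc (D,): descriptor of the j-th feature point.
    sample_descs (N, D): descriptors of the sampling points in the local map.
    Returns the index of the first target sampling point, or None if no
    sampling point satisfies the (assumed) similarity threshold."""
    a = feature_desc / (np.linalg.norm(feature_desc) + 1e-12)
    b = sample_descs / (np.linalg.norm(sample_descs, axis=1, keepdims=True) + 1e-12)
    similarities = b @ a                       # the similarity set
    best = int(np.argmax(similarities))
    if similarities[best] < min_similarity:    # the assumed first condition
        return None
    return best
```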
10. The method of claim 8, wherein the location information includes world coordinates of the image capture device and an orientation of the image capture device in the point cloud map;
the determining the position information of the image acquisition device according to the camera coordinates of the feature points and the world coordinates of the first target sampling points of the feature points comprises the following steps:
determining the camera coordinates of each first target sampling point according to the camera coordinates of a plurality of the feature points and the world coordinates of the first target sampling points of the feature points;
determining a first rotation relation and a first translation relation of a camera coordinate system relative to a world coordinate system according to the world coordinate of each first target sampling point and the camera coordinate of each first target sampling point;
determining world coordinates of the image acquisition device according to the first translation relationship and the first rotation relationship;
and determining the orientation of the image acquisition equipment in the point cloud map according to the first rotation relation.
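Once a rotation and translation relating the camera and world coordinate systems are known (for instance from a rigid 3D-3D alignment of the matched points, as sketched after claim 16), the device's world position and orientation in claim 10 follow directly. The sketch below assumes the convention x_cam = R @ x_world + t; the patent does not fix a convention, so the formulas would change accordingly.

```python
import numpy as np

def device_pose_in_world(R, t):
    """R (3, 3), t (3,): first rotation and translation relations, assumed
    to satisfy x_cam = R @ x_world + t (an assumed convention).
    Returns the world coordinates of the image acquisition device and its
    orientation (optical axis direction) in the point cloud map."""
    position = -R.T @ t                             # camera centre in world frame
    orientation = R.T @ np.array([0.0, 0.0, 1.0])   # camera +Z axis in world frame
    return position, orientation
```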
11. The method of claim 7, wherein the attribute information of the feature points includes camera coordinates and the attribute information of the sample points includes world coordinates; the position information comprises world coordinates of the image acquisition equipment and the orientation of the image acquisition equipment in the point cloud map;
the matching the attribute information of the feature points with the attribute information of the sampling points to obtain the position information of the image acquisition device includes:
according to an iteration strategy, matching camera coordinates of the feature points with world coordinates of the sampling points to obtain a target rotation relationship and a target translation relationship of a camera coordinate system relative to the world coordinate system;
determining the orientation of the image acquisition equipment in the point cloud map according to the target rotation relation;
and determining world coordinates of the image acquisition equipment according to the target translation relationship and the target rotation relationship.
12. The method of claim 11, wherein matching the camera coordinates of the plurality of feature points with the world coordinates of the plurality of sampling points according to an iterative strategy to obtain a target rotational relationship and a target translational relationship of the camera coordinate system relative to the world coordinate system comprises:
selecting an initial target sampling point matched with each characteristic point from the plurality of sampling points;
determining a second rotation relation and a second translation relation of the camera coordinate system relative to the world coordinate system according to the camera coordinates of each feature point and the world coordinates of the corresponding initial target sampling point;
determining a matching error according to the second rotation relation, the second translation relation, the camera coordinates of each feature point and the world coordinates of the corresponding initial target sampling point;
if the matching error is larger than a preset threshold value, re-selecting an initial target sampling point and re-determining the matching error;
and determining the second rotation relation and the second translation relation obtained when the re-determined matching error is smaller than or equal to the preset threshold as the target rotation relation and the target translation relation, respectively.
13. The method of claim 12, wherein selecting an initial target sampling point from the plurality of sampling points that matches each of the feature points comprises:
acquiring a third rotation relationship and a third translation relationship of the camera coordinate system relative to the world coordinate system;
determining a first world coordinate of a jth feature point according to the third rotation relation, the third translation relation and the camera coordinate of the jth feature point, wherein j is an integer greater than 0;
and matching the first world coordinate of each feature point with the world coordinates of the plurality of sampling points to obtain a corresponding initial target sampling point.
14. The method of claim 12, wherein said determining a match error based on said second rotational relationship, said second translational relationship, camera coordinates of each of said feature points, and world coordinates of a corresponding initial target sample point, comprises:
determining a second world coordinate of a jth feature point according to the second rotation relation, the second translation relation and the camera coordinate of the jth feature point, wherein j is an integer greater than 0;
and determining the matching error according to the second world coordinate of each feature point and the world coordinate of the corresponding initial target sampling point.
15. The method of claim 14, wherein said determining the match error based on the second world coordinates of each of the feature points and the world coordinates of the corresponding initial target sample point comprises:
determining the distance between the second world coordinate of each feature point and the world coordinate of the corresponding initial target sampling point;
and determining the matching error according to each distance.
16. The method of claim 13, wherein if the match error is greater than a preset threshold, re-selecting the initial target sample point comprises:
and if the matching error is larger than a preset threshold, taking the second translation relation as the third translation relation, taking the second rotation relation as the third rotation relation, and re-selecting an initial target sampling point.
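Claims 11 to 16 describe an iterative matching strategy that closely resembles iterative closest point (ICP): an initial (third) rotation and translation project the feature points into the world frame, each projected point picks an initial target sampling point, a second rotation and translation are re-estimated from the pairs, and the loop repeats until the matching error drops to the preset threshold or below. The Python sketch below is one possible reading, using the convention x_world = R @ x_cam + t implied by claims 13 and 14, a Kabsch alignment as the per-iteration solver, and the mean point-to-point distance as a stand-in for the unspecified matching error.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares R, t such that R @ src[i] + t ~= dst[i] (Kabsch/SVD)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, dst_c - R @ src_c

def iterative_matching(cam_pts, world_pts, R0, t0, threshold=0.05, max_iters=50):
    """cam_pts (M, 3): camera coordinates of the feature points.
    world_pts (N, 3): world coordinates of the sampling points.
    (R0, t0): initial (third) rotation and translation guess.
    Returns the target rotation and translation of the camera coordinate
    system relative to the world coordinate system."""
    R, t = R0, t0
    for _ in range(max_iters):
        # Claim 13: first world coordinates of the feature points under the
        # current (third) relation, then nearest sampling point for each.
        first_world = cam_pts @ R.T + t
        d2 = ((first_world[:, None, :] - world_pts[None, :, :]) ** 2).sum(axis=-1)
        matched = world_pts[d2.argmin(axis=1)]

        # Claim 12: second rotation/translation re-estimated from the pairs.
        R, t = rigid_transform(cam_pts, matched)

        # Claims 14 and 15: second world coordinates, matching error as the
        # mean distance to the matched initial target sampling points.
        second_world = cam_pts @ R.T + t
        error = np.linalg.norm(second_world - matched, axis=1).mean()

        # Claims 12 and 16: stop when the error is small enough; otherwise the
        # second relation becomes the next third relation and the initial
        # target sampling points are re-selected on the next pass.
        if error <= threshold:
            break
    return R, t
```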
17. A positioning device, the device comprising:
the first positioning module is configured to determine a target identifier for representing the current area of the image acquisition equipment according to the network characteristics of the network where the image acquisition equipment is located and the fingerprint map;
the map acquisition module is configured to acquire a target local map corresponding to the target identifier from a plurality of local maps included in the point cloud map, wherein the local map includes attribute information of a plurality of sampling points, and world coordinates in the attribute information of the sampling points are world coordinates of the sampling points in the sample image;
the image acquisition module is configured to acquire an image to be processed acquired by the image acquisition equipment;
the second positioning module is configured to position the image acquisition equipment according to the to-be-processed image and attribute information of a plurality of sampling points in the target local map; the construction process of each local map comprises the following steps: determining world coordinates of at least one sampling point according to camera coordinates and image characteristics of the sampling points in the plurality of sample images; determining a first data set according to the world coordinates and the image characteristics of each sampling point; determining a corresponding second data set according to camera coordinates and image characteristics of sampling points in the obtained kth other sample image; wherein the plurality of sample images and each of the other sample images are different images acquired in an ith physical area, and k and i are integers greater than 0; determining world coordinates of sampling points in the other sample images according to the first data set and each second data set; and constructing a local map corresponding to the ith physical area at least according to the determined world coordinates of each sampling point.
18. An electronic device comprising a memory and a processor, the memory storing a computer program executable on the processor, characterized in that the processor implements the steps of the positioning method of any of claims 1 to 16 when the program is executed.
19. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, carries out the steps of the positioning method according to any of claims 1 to 16.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910921542.5A CN110738143B (en) | 2019-09-27 | 2019-09-27 | Positioning method and device, equipment and storage medium |
PCT/CN2020/116928 WO2021057744A1 (en) | 2019-09-27 | 2020-09-22 | Positioning method and apparatus, and device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910921542.5A CN110738143B (en) | 2019-09-27 | 2019-09-27 | Positioning method and device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110738143A CN110738143A (en) | 2020-01-31 |
CN110738143B true CN110738143B (en) | 2023-06-02 |
Family
ID=69269672
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910921542.5A Active CN110738143B (en) | 2019-09-27 | 2019-09-27 | Positioning method and device, equipment and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110738143B (en) |
WO (1) | WO2021057744A1 (en) |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110738143B (en) * | 2019-09-27 | 2023-06-02 | Oppo广东移动通信有限公司 | Positioning method and device, equipment and storage medium |
CN111511017B (en) * | 2020-04-09 | 2022-08-16 | Oppo广东移动通信有限公司 | Positioning method and device, equipment and storage medium |
CN111506687B (en) * | 2020-04-09 | 2023-08-08 | 北京华捷艾米科技有限公司 | Map point data extraction method, device, storage medium and equipment |
CN111801664A (en) * | 2020-05-11 | 2020-10-20 | 蜂图科技有限公司 | Live-action map generation method, device, equipment and readable storage medium |
CN111932611B (en) * | 2020-05-26 | 2024-05-10 | 阿波罗智联(北京)科技有限公司 | Object position acquisition method and device |
CN112001947A (en) * | 2020-07-30 | 2020-11-27 | 海尔优家智能科技(北京)有限公司 | Shooting position determining method and device, storage medium and electronic device |
CN111935641B (en) * | 2020-08-14 | 2022-08-19 | 上海木木聚枞机器人科技有限公司 | Indoor self-positioning realization method, intelligent mobile device and storage medium |
CN112284394A (en) * | 2020-10-23 | 2021-01-29 | 北京三快在线科技有限公司 | Map construction and visual positioning method and device |
CN112362047A (en) * | 2020-11-26 | 2021-02-12 | 浙江商汤科技开发有限公司 | Positioning method and device, electronic equipment and storage medium |
CN112432636B (en) * | 2020-11-30 | 2023-04-07 | 浙江商汤科技开发有限公司 | Positioning method and device, electronic equipment and storage medium |
CN112598732B (en) * | 2020-12-10 | 2024-07-05 | Oppo广东移动通信有限公司 | Target equipment positioning method, map construction method and device, medium and equipment |
CN113129339B (en) * | 2021-04-28 | 2023-03-10 | 北京市商汤科技开发有限公司 | Target tracking method and device, electronic equipment and storage medium |
CN113124880B (en) * | 2021-05-18 | 2023-06-13 | 杭州迦智科技有限公司 | Map building and positioning method and device based on two sensor data fusion |
WO2022246605A1 (en) * | 2021-05-24 | 2022-12-01 | 华为技术有限公司 | Key point calibration method and apparatus |
CN113435462B (en) * | 2021-07-16 | 2022-06-28 | 北京百度网讯科技有限公司 | Positioning method, positioning device, electronic equipment and medium |
CN113747349A (en) * | 2021-08-12 | 2021-12-03 | 广东博智林机器人有限公司 | Positioning method, positioning device, electronic equipment and storage medium |
CN113658278A (en) * | 2021-08-25 | 2021-11-16 | 优奈柯恩(北京)科技有限公司 | Method and device for spatial positioning |
CN114754764A (en) * | 2022-06-15 | 2022-07-15 | 上海维智卓新信息科技有限公司 | Navigation method and device based on augmented reality |
CN115910240B (en) * | 2022-11-03 | 2023-09-26 | 广东科云诚新材料有限公司 | Performance test data processing method and system for polyester plasticizer |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105828296A (en) * | 2016-05-25 | 2016-08-03 | 武汉域讯科技有限公司 | Indoor positioning method based on convergence of image matching and WI-FI |
CN105974357A (en) * | 2016-04-29 | 2016-09-28 | 北京小米移动软件有限公司 | Method and device for positioning terminal |
CN108495259A (en) * | 2018-03-26 | 2018-09-04 | 上海工程技术大学 | A kind of gradual indoor positioning server and localization method |
CN109029417A (en) * | 2018-05-21 | 2018-12-18 | 南京航空航天大学 | Unmanned plane SLAM method based on mixing visual odometry and multiple dimensioned map |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104066058B (en) * | 2014-07-14 | 2017-07-11 | 大连理工大学 | A kind of WLAN indoor orientation methods based on double set fingerprint superpositions |
US10262331B1 (en) * | 2016-01-29 | 2019-04-16 | Videomining Corporation | Cross-channel in-store shopper behavior analysis |
CN109099929B (en) * | 2018-07-13 | 2021-10-15 | 武汉理工大学 | Intelligent vehicle positioning device and method based on scene fingerprints |
CN109996175B (en) * | 2019-05-15 | 2021-04-30 | 苏州矽典微智能科技有限公司 | Indoor positioning system and method |
CN110738143B (en) * | 2019-09-27 | 2023-06-02 | Oppo广东移动通信有限公司 | Positioning method and device, equipment and storage medium |
- 2019-09-27: CN application CN201910921542.5A (CN110738143B), status: Active
- 2020-09-22: WO application PCT/CN2020/116928 (WO2021057744A1), status: Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN110738143A (en) | 2020-01-31 |
WO2021057744A1 (en) | 2021-04-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110738143B (en) | Positioning method and device, equipment and storage medium | |
CN110705574B (en) | Positioning method and device, equipment and storage medium | |
CN110675457B (en) | Positioning method and device, equipment and storage medium | |
CN110728717B (en) | Positioning method and device, equipment and storage medium | |
TWI777538B (en) | Image processing method, electronic device and computer-readable storage media | |
CN110704563B (en) | Map fusion method and device, equipment and storage medium | |
Chen et al. | Crowd map: Accurate reconstruction of indoor floor plans from crowdsourced sensor-rich videos | |
CN110866079B (en) | Generation and auxiliary positioning method of intelligent scenic spot live-action semantic map | |
CN110704562B (en) | Map fusion method and device, equipment and storage medium | |
CN111323024B (en) | Positioning method and device, equipment and storage medium | |
Devarajan et al. | Distributed metric calibration of ad hoc camera networks | |
TWI745818B (en) | Method and electronic equipment for visual positioning and computer readable storage medium thereof | |
CN110276768B (en) | Image segmentation method, image segmentation device, image segmentation apparatus, and medium | |
CN108322724A (en) | Image solid matching method and binocular vision equipment | |
JPWO2019098318A1 (en) | 3D point cloud data generation method, position estimation method, 3D point cloud data generation device, and position estimation device | |
CN112148742A (en) | Map updating method and device, terminal and storage medium | |
CN110889349A (en) | VSLAM-based visual positioning method for sparse three-dimensional point cloud chart | |
Feng et al. | Visual Map Construction Using RGB‐D Sensors for Image‐Based Localization in Indoor Environments | |
Liang et al. | Reduced-complexity data acquisition system for image-based localization in indoor environments | |
CN114092564A (en) | External parameter calibration method, system, terminal and medium of non-overlapping view field multi-camera system | |
Sui et al. | An accurate indoor localization approach using cellphone camera | |
Zhao et al. | CrowdOLR: Toward object location recognition with crowdsourced fingerprints using smartphones | |
CN114494612A (en) | Method, device and equipment for constructing point cloud map | |
WO2024083010A1 (en) | Visual localization method and related apparatus | |
Yang | Indoor Positioning System for Smart Devices |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||