CN116429082A - A Visual SLAM Method Based on ST-ORB Feature Extraction - Google Patents
- Publication number
- CN116429082A (application CN202310172971.3A)
- Authority
- CN
- China
- Prior art keywords
- orb
- algorithm
- image
- points
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G01C21/3804 — Electronic maps specially adapted for navigation; creation or updating of map data
- G01C21/20 — Instruments for performing navigational calculations
- G06T7/13 — Image analysis; edge detection
- G06T7/33 — Image registration using feature-based methods
- G06T7/73 — Determining position or orientation of objects or cameras using feature-based methods
- Y02T10/40 — Engine management systems (climate change mitigation, transport)
Abstract
A visual SLAM method based on ST-ORB (Shi-Tomasi-ORB) feature extraction. A multi-scale space is first constructed with an image pyramid, and Shi-Tomasi corner detection is applied to the grayscale image at each level to improve feature-point quality; a quadtree algorithm then distributes the feature points uniformly, and the binary descriptor BRIEF marks the points for feature matching between consecutive frames; finally, the algorithm is applied in the ORB-SLAM2 system for pose estimation of a moving platform.
Description
Technical Field
The invention relates to the field of visual SLAM, and in particular to a visual SLAM method based on ST-ORB (Shi-Tomasi-ORB) feature extraction.
Background Art
V-SLAM (Visual Simultaneous Localization and Mapping) uses visual sensors to map the environment while simultaneously localizing an unmanned system within it. With the rapid growth of the UAV and autonomous-driving industries, visual SLAM has become a major research focus. The feature points detected by the extraction methods of existing visual SLAM schemes are redundant and poorly distributed; because such points cannot describe the image accurately, keypoints are lost during tracking, which degrades pose-estimation accuracy and ultimately the accuracy of trajectory localization and map construction. Moreover, on platforms with limited computing resources, such as unmanned aerial vehicles and unmanned ground vehicles, the commonly used feature detectors are computationally inefficient and cannot meet the real-time requirements of SLAM algorithms.
Feature extraction is one of the key stages of V-SLAM and can generally be divided into learning-based and design-based methods. In recent years, machine-learning methods have been widely applied to image feature extraction: Arnfred et al. proposed a general framework for image feature extraction and matching without geometric constraints, and Zeng et al. proposed an extraction-and-matching method using parametric Koopmans-Beckmann supervised learning. However, because machine-learning methods demand substantial computing resources and generalize poorly, they are difficult to apply in complex, changeable environments. Bay proposed the speeded-up robust features algorithm on the basis of the scale-invariant feature algorithm; it accelerates the extraction process with a Hessian-matrix-based detector and a distribution-based descriptor, but still struggles to meet the real-time requirements of visual SLAM.
The ORB algorithm proposed by Rublee et al. extracts feature points with the FAST detector and performs feature description and matching with a rotation-invariant descriptor. ORB greatly improves computational efficiency, but the feature points it extracts are unstable and unevenly distributed, which in turn lowers pose-estimation accuracy. It is therefore necessary to design a new visual SLAM method that addresses these shortcomings of the prior art.
Summary of the Invention
The object of the present invention is to provide a visual SLAM method based on ST-ORB feature extraction that is stable and improves the trajectory accuracy of visual SLAM.
To achieve the above object, the technical solution adopted by the present invention is a visual SLAM method based on ST-ORB feature extraction. A multi-scale space is first constructed with an image pyramid, and the Shi-Tomasi algorithm is applied to the grayscale image for corner detection to improve feature-point quality; a quadtree algorithm then distributes the feature points uniformly, and the binary descriptor BRIEF marks each point so that features can be matched between consecutive frames; finally, the algorithm is applied in the ORB-SLAM2 system to estimate the pose of a moving platform.
Further, the method specifically comprises the following steps:
1) To preserve the scale invariance of image features, set the number of image pyramid levels and down-sample each frame with a Gaussian image pyramid, constructing a multi-scale space that raises the feature-matching accuracy in the matching stage;
2) Extract the feature points of each pyramid level with the Shi-Tomasi algorithm;
3) Distribute the feature points uniformly with a quadtree algorithm;
4) Describe the feature points with the BRIEF descriptor;
5) Match features between consecutive frames according to the binary BRIEF descriptors, select keyframes, and generate map points (a matching sketch follows this list);
6) Apply the above algorithm in the ORB-SLAM2 system to realize pose estimation for visual SLAM.
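By way of illustration only (not part of the original patent text), the descriptor matching of step 5) can be sketched with OpenCV's brute-force Hamming matcher; the function name and the ratio-test threshold are assumptions. Hamming distance is the natural metric because BRIEF descriptors are binary strings, so comparison reduces to XOR-and-popcount:

```python
import cv2

def match_brief_descriptors(desc_prev, desc_curr, ratio=0.75):
    """Match binary BRIEF descriptors between consecutive frames.

    Lowe's ratio test (threshold assumed, not from the patent) rejects
    ambiguous matches before keyframe selection and map-point creation.
    """
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    knn = matcher.knnMatch(desc_prev, desc_curr, k=2)
    good = [m for m, n in knn if m.distance < ratio * n.distance]
    return good
```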
Further, the Shi-Tomasi algorithm judges corners from the grayscale variation trend of the pixels in a sliding window. Specifically:
Place a window on the image and move it. If the image intensity I(x, y) at point (x, y) is shifted by an offset (u, v), the resulting intensity difference E(u, v) is

$$E(u,v)=\sum_{(x,y)\in S}h(x,y)\left[I(x+u,\,y+v)-I(x,y)\right]^{2}\qquad(1)$$

where S is the moving window region and h(x, y) is a Gaussian weighting function. Approximating I(x+u, y+v) with a first-order Taylor expansion gives

$$I(x+u,\,y+v)\approx I(x,y)+I_{x}u+I_{y}v\qquad(2)$$

where I_x and I_y denote the partial derivatives of the image intensity. Substituting equation (2) into equation (1) yields

$$E(u,v)\approx\sum_{(x,y)\in S}h(x,y)\left(I_{x}u+I_{y}v\right)^{2}\qquad(3)$$

Let the matrix M be

$$M=\sum_{(x,y)\in S}h(x,y)\begin{bmatrix}I_{x}^{2}&I_{x}I_{y}\\I_{x}I_{y}&I_{y}^{2}\end{bmatrix}\qquad(4)$$

After simplification, the intensity difference is expressed as

$$E(u,v)\approx\begin{bmatrix}u&v\end{bmatrix}M\begin{bmatrix}u\\v\end{bmatrix}\qquad(5)$$

In equation (5), E(u, v) is a quadratic (second-order) form in the offset whose matrix is M. According to the eigenvalues of M, a pixel is classified into one of three cases: corner, straight line (edge), or flat plane. If the point is a corner, moving the window in any direction produces a large intensity change inside the window; in that case both eigenvalues of M are large.

A corner response value R is usually used to judge the quality of the corners detected by the Shi-Tomasi algorithm:

$$R=\min(\lambda_{1},\lambda_{2})\qquad(6)$$

When R is greater than a set threshold, the pixel is judged to be a corner.
Compared with the prior art, the present invention has the following beneficial effects: it provides a visual SLAM method based on ST-ORB feature extraction that overcomes the uneven distribution and poor stability of the feature points extracted by the ORB algorithm. The ORB-SLAM2 system using ST-ORB feature extraction is more stable, reduces the absolute trajectory error to a certain extent, and improves the trajectory accuracy of visual SLAM.
Description of the Drawings
Fig. 1 is a flow chart of the method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of FAST feature points in an embodiment of the present invention.
Fig. 3 is a schematic diagram of the image pyramid structure in an embodiment of the present invention.
Fig. 4 is a schematic diagram of ORB feature extraction in an embodiment of the present invention.
Fig. 5 is a schematic diagram of ST-ORB feature extraction in an embodiment of the present invention.
Fig. 6 is a comparison chart of trajectory errors in an embodiment of the present invention.
Fig. 7 is an absolute trajectory error (APE) analysis chart in an embodiment of the present invention.
Detailed Description of the Embodiments
The present invention is further described below with reference to the accompanying drawings and embodiments.
It should be pointed out that the following detailed description is exemplary and is intended to provide further explanation of the present application. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It should be noted that the terminology used here is only for describing specific embodiments and is not intended to limit the exemplary embodiments of the present application. As used herein, unless the context clearly dictates otherwise, the singular forms are intended to include the plural forms as well; furthermore, it should be understood that the terms "comprising" and/or "including", when used in this specification, indicate the presence of features, steps, operations, devices, components, and/or combinations thereof.
As shown in Fig. 1, this embodiment provides a visual SLAM method based on ST-ORB feature extraction. A multi-scale space is first constructed with an image pyramid, and the Shi-Tomasi algorithm is applied to the grayscale image for corner detection to improve feature-point quality; a quadtree algorithm then distributes the feature points uniformly, and the binary descriptor BRIEF marks each point so that features can be matched between consecutive frames; finally, the algorithm is applied in the ORB-SLAM2 system for pose estimation of a moving platform. Specifically, the method includes the following steps:
1) To preserve the scale invariance of image features, set the number of image pyramid levels and down-sample each frame with a Gaussian image pyramid, constructing a multi-scale space that raises the feature-matching accuracy in the matching stage.
2) Extract the feature points of each pyramid level with the Shi-Tomasi algorithm.
3) Distribute the feature points uniformly with a quadtree algorithm.
4) Describe the feature points with the BRIEF descriptor.
5) Match features between consecutive frames according to the binary BRIEF descriptors, select keyframes, and generate map points.
6) Apply the above algorithm in the ORB-SLAM2 system to realize pose estimation for visual SLAM (a condensed sketch of the whole pipeline follows this list).
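As an illustrative aid (not the patented implementation), the six steps can be condensed into the following OpenCV sketch. `cv2.goodFeaturesToTrack` provides the Shi-Tomasi detector, the quadtree balancing of step 3) is only approximated by a per-level corner cap, and all parameter values are assumptions:

```python
import cv2

def st_orb_extract(gray, n_levels=8, scale=1.2, max_per_level=200):
    """Sketch of ST-ORB extraction: Shi-Tomasi corners detected on an
    image pyramid, then described with BRIEF-style binary descriptors.

    ORB's compute() stands in for the rotated-BRIEF description step;
    all parameter values are illustrative assumptions.
    """
    orb = cv2.ORB_create()                  # reused only for its descriptor computation
    keypoints = []
    level_img, level_scale = gray, 1.0
    for _ in range(n_levels):
        corners = cv2.goodFeaturesToTrack(level_img, maxCorners=max_per_level,
                                          qualityLevel=0.01, minDistance=7)
        if corners is not None:
            for x, y in corners.reshape(-1, 2):
                # map coordinates back to the base level of the pyramid
                keypoints.append(cv2.KeyPoint(float(x) * level_scale,
                                              float(y) * level_scale, 31))
        level_img = cv2.resize(level_img, None, fx=1.0 / scale, fy=1.0 / scale)
        level_scale *= scale
    # compute binary descriptors for all keypoints on the full-resolution image
    keypoints, descriptors = orb.compute(gray, keypoints)
    return keypoints, descriptors
```

In the actual embodiment a quadtree redistributes the corners of each level, which this sketch only approximates with the per-level cap.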
This method avoids the rotation-invariance problem that arises in the FAST corner detection of the ORB algorithm as well as the dense clustering of ORB feature points, and further improves the stability of the feature points of the input image.
The Shi-Tomasi algorithm judges corners from the grayscale variation trend of the pixels in a sliding window. Specifically:

Place a window on the image and move it. If the image intensity I(x, y) at point (x, y) is shifted by an offset (u, v), the resulting intensity difference E(u, v) is

$$E(u,v)=\sum_{(x,y)\in S}h(x,y)\left[I(x+u,\,y+v)-I(x,y)\right]^{2}\qquad(1)$$

where S is the moving window region and h(x, y) is a Gaussian weighting function. Approximating I(x+u, y+v) with a first-order Taylor expansion gives

$$I(x+u,\,y+v)\approx I(x,y)+I_{x}u+I_{y}v\qquad(2)$$

where I_x and I_y denote the partial derivatives of the image intensity. Substituting equation (2) into equation (1) yields

$$E(u,v)\approx\sum_{(x,y)\in S}h(x,y)\left(I_{x}u+I_{y}v\right)^{2}\qquad(3)$$

Let the matrix M be

$$M=\sum_{(x,y)\in S}h(x,y)\begin{bmatrix}I_{x}^{2}&I_{x}I_{y}\\I_{x}I_{y}&I_{y}^{2}\end{bmatrix}\qquad(4)$$

After simplification, the intensity difference is expressed as

$$E(u,v)\approx\begin{bmatrix}u&v\end{bmatrix}M\begin{bmatrix}u\\v\end{bmatrix}\qquad(5)$$

In equation (5), E(u, v) is a quadratic form in the offset whose matrix is M. According to the eigenvalues of M, a pixel is classified into one of three cases: corner, straight line (edge), or flat plane. If the point is a corner, moving the window in any direction produces a large intensity change inside the window; in that case both eigenvalues of M are large.

A corner response value R is usually used to judge the quality of the corners detected by the Shi-Tomasi algorithm:

$$R=\min(\lambda_{1},\lambda_{2})\qquad(6)$$

When R is greater than a set threshold, the pixel is judged to be a corner.
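To make equations (1)–(6) concrete, here is a minimal NumPy/SciPy sketch (an editorial illustration; the window weighting and the derivative operators are assumptions) that computes the response R = min(λ1, λ2) at every pixel:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def shi_tomasi_response(image, sigma=1.5):
    """Compute R = min(lambda1, lambda2) of the structure matrix M at
    every pixel, per equations (4)-(6).

    Gaussian filtering plays the role of the weighting h(x, y) summed
    over the window S.
    """
    I = image.astype(np.float64)
    Ix = sobel(I, axis=1)            # partial derivative along x
    Iy = sobel(I, axis=0)            # partial derivative along y
    # Gaussian-weighted entries of the 2x2 structure matrix M
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    # eigenvalues of the symmetric matrix [[Sxx, Sxy], [Sxy, Syy]]
    trace_half = 0.5 * (Sxx + Syy)
    root = np.sqrt((0.5 * (Sxx - Syy)) ** 2 + Sxy ** 2)
    lam1, lam2 = trace_half + root, trace_half - root
    return np.minimum(lam1, lam2)    # R of equation (6)
```

`cv2.goodFeaturesToTrack` wraps exactly this criterion; the sketch is only meant to make the derivation tangible.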
The relevant content involved in the present invention is further described below.
1. The ORB-SLAM2 Algorithm
The visual SLAM algorithm framework mainly comprises three parts: front-end visual odometry, back-end nonlinear optimization, and loop-closure detection. A traditional visual SLAM algorithm reads environmental information through visual sensors, completes local mapping through the visual odometry, then applies nonlinear optimization to the local map in the back end, and further improves trajectory-localization accuracy through loop-closure detection.
ORB-SLAM is a visual SLAM algorithm published by Raúl Mur-Artal et al. in IEEE Transactions on Robotics in 2015. It is a real-time, feature-point-based SLAM system that works in large-scale, small-scale, indoor, and outdoor environments. The system is robust to aggressive motion and supports wide-baseline loop closing and relocalization, including fully automatic initialization. It contains the modules common to all SLAM systems: tracking, mapping, relocalization, and loop-closure detection. Because ORB-SLAM is based on the feature-point method, it can compute the camera trajectory in real time and generate a sparse 3D reconstruction of the scene.
ORB-SLAM2 builds on ORB-SLAM and supports not only monocular cameras but also stereo and RGB-D cameras. Compared with a monocular camera, a depth camera can actively measure the distance between each image pixel and the camera by the infrared structured-light principle, avoiding the complex computations otherwise needed to recover depth and reducing trajectory-localization error to a certain extent. Compared with traditional visual SLAM, the ORB-SLAM2 framework mainly comprises three threads, Tracking, LocalMapping, and LoopClosing; this multi-threaded parallel structure greatly shortens running time and improves the real-time performance of the visual SLAM system.
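The three-thread layout can be pictured with the following toy Python sketch; it mirrors only the data flow between the Tracking, LocalMapping, and LoopClosing threads detailed below and is an assumption-laden illustration, not ORB-SLAM2's actual C++ code:

```python
import queue
import threading

keyframes, mapped = queue.Queue(), queue.Queue()

def tracking(frames):
    for frame in frames:
        # ...extract features, estimate pose, decide whether frame is a keyframe...
        keyframes.put(frame)
    keyframes.put(None)                      # sentinel: stream finished

def local_mapping():
    while (kf := keyframes.get()) is not None:
        # ...insert keyframe, cull/create map points, run local bundle adjustment...
        mapped.put(kf)
    mapped.put(None)

def loop_closing():
    while (kf := mapped.get()) is not None:
        pass                                 # ...BoW loop detection, Sim3, graph optimization...

threads = [threading.Thread(target=tracking, args=(range(10),)),
           threading.Thread(target=local_mapping),
           threading.Thread(target=loop_closing)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```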
The main work of the Tracking thread is to extract ORB features from the image, estimate the pose from the previous frame or initialize it through global relocalization, then track the already reconstructed local map, optimize the pose, and finally select new keyframes according to a set of rules.
The LocalMapping thread performs local map construction: it inserts keyframes, verifies and screens recently generated map points, generates new map points, applies local adjustment, and finally screens the inserted keyframes to remove redundant keyframe information.
The LoopClosing thread is divided into two processes, loop detection and loop correction. Loop detection first searches for loop candidates with a bag-of-words (BoW) model and then computes a similarity transformation with the Sim3 algorithm. Loop correction mainly consists of loop fusion and the graph optimization of the Essential Graph.
2. Feature Extraction
2.1 ORB Feature Extraction
ORB (Oriented FAST and Rotated BRIEF) is an algorithm for fast feature-point extraction and description. It comprises two parts, feature-point extraction and feature-point description; the extraction part is developed mainly from the FAST (Features from Accelerated Segment Test) algorithm. FAST defines a feature point as follows: within a neighborhood centered on a pixel, if enough surrounding pixels lie in a different intensity region from the center, the center may be a corner. For a grayscale image, this means that if the gray value of a point is sufficiently larger or smaller than those of enough pixels in its neighborhood, the point may be a corner.
2.1.1 FAST Corner Detection
The FAST corner detection process can be divided into the following steps:
1) Select a pixel P from the grayscale image, as shown in Fig. 2.
2) Set a threshold n.
3) Taking P as the center, extract the 16 pixels lying on a circle of radius 3 pixels and classify each of them into one of three classes according to equation (1.1):

$$S_{p\rightarrow x}=\begin{cases}a, & I_{p\rightarrow x}\le I_{p}-n\\ b, & I_{p}-n<I_{p\rightarrow x}<I_{p}+n\\ c, & I_{p\rightarrow x}\ge I_{p}+n\end{cases}\qquad(1.1)$$

In equation (1.1), I_p is the gray value of pixel P; I_{p→x} is the gray value of the x-th pixel on the circle around P, with x ranging from 1 to 16; n is the detection threshold, generally an empirical value; class a means the circle point is darker than P, class b means its brightness is similar to that of P, and class c means it is brighter than P.
4) When at least 12 consecutive pixels on the surrounding circle are classified as a or c, P is judged to be a corner. In actual detection, directly traversing all 16 points on the circle is inefficient. To improve speed, the gray values of the four points numbered 1, 5, 9, and 13 on the circle can be checked first: if P is a corner, then at least 3 of these four points must all be greater than I_p + n or all be less than I_p − n; if this condition is not met, the point is rejected directly. This pre-test quickly eliminates most non-corners, as sketched below.
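The classification of equation (1.1) and the 1/5/9/13 pre-test can be sketched as follows (an editorial illustration; the threshold value and the circle indexing are assumptions, and `img` is a 2-D grayscale array):

```python
# (dx, dy) offsets of the 16 pixels on the radius-3 circle around P,
# numbered 1..16 starting from the top and going clockwise
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_fast_corner(img, x, y, n=20):
    """FAST-12 test at pixel (x, y); the threshold n of equation (1.1)
    is an assumed illustrative value.

    The cheap pre-test on circle points 1, 5, 9, 13 (indices 0, 4, 8, 12)
    rejects most non-corners before the full contiguity check.
    """
    p = int(img[y, x])
    ring = [int(img[y + dy, x + dx]) for dx, dy in CIRCLE]
    quad = [ring[i] for i in (0, 4, 8, 12)]
    if sum(v > p + n for v in quad) < 3 and sum(v < p - n for v in quad) < 3:
        return False                       # pre-test failed: cannot be a corner
    # full test: 12 contiguous circle pixels all brighter (c) or all darker (a)
    labels = [1 if v > p + n else (-1 if v < p - n else 0) for v in ring]
    doubled = labels + labels              # duplicate to handle wrap-around runs
    for sign in (1, -1):
        run = 0
        for v in doubled:
            run = run + 1 if v == sign else 0
            if run >= 12:
                return True
    return False
```

Applied to every pixel of each pyramid level, this is the detector that ORB accelerates; in ST-ORB it is replaced by the Shi-Tomasi response described above.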
2.1.2 Image Pyramid Method
FAST feature points have neither scale invariance nor rotation invariance. To address scale invariance, the ORB algorithm builds an image pyramid of several levels. According to the up-sampling or down-sampling scheme, image pyramids are divided into Laplacian image pyramids and Gaussian image pyramids.
The Gaussian image pyramid is a multi-scale structural representation of the original image: given a scale factor, the image is down-sampled level by level, i.e. repeatedly rescaled, and the sampled images are stacked by resolution from high to low to form the pyramid. The (n+1)-th level is obtained by mean-filtering and down-sampling the n-th level:

$$G_{n+1}=\operatorname{down}\left(h_{a\times a}*G_{n}\right)\qquad(1.2)$$

where G_n and G_{n+1} are the images of the n-th and (n+1)-th pyramid levels, h is a mean-filter template of size a × a with a being the row/column down-sampling interval, * denotes convolution, and down(·) denotes sub-sampling. The ORB-SLAM2 framework constructs an 8-level image pyramid for the input image; the constructed Gaussian pyramid is shown in Fig. 3.
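As a small illustration of equation (1.2) (not taken from the patent; the kernel size is an assumption, and the 1.2 scale factor matches ORB-SLAM2's default setting), the pyramid can be built as follows:

```python
import cv2

def gaussian_pyramid(image, levels=8, scale=1.2):
    """Build the image pyramid of equation (1.2): each level is a
    mean-filtered, down-sampled copy of the previous one.

    cv2.blur is the a-by-a mean-filter template h (a=3 assumed here).
    """
    pyramid = [image]
    for _ in range(1, levels):
        smoothed = cv2.blur(pyramid[-1], (3, 3))       # h * G_n
        h, w = smoothed.shape[:2]
        down = cv2.resize(smoothed, (int(w / scale), int(h / scale)))
        pyramid.append(down)                           # G_{n+1}
    return pyramid
```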
2.1.3 Gray-Scale Centroid Method
The lack of rotation invariance of the feature points obtained by FAST corner detection can be remedied with the gray-scale centroid method.
The basic principle of the gray-scale centroid method is as follows: assuming an offset exists between the intensity of a feature point and the centroid of its neighborhood, the dominant orientation of the feature point is computed from the vector pointing from the feature point to the centroid. The centroid is computed from image moments, defined as

$$m_{pq}=\sum_{x,y\in r}x^{p}y^{q}I(x,y),\quad p,q\in\{0,1\}\qquad(1.3)$$

In equation (1.3), I(x, y) is the gray value of the image and r is the neighborhood radius. The centroid of the moment is then

$$C=\left(\frac{m_{10}}{m_{00}},\ \frac{m_{01}}{m_{00}}\right)\qquad(1.4)$$

So, from (1.4), the orientation of the feature point is

$$\theta=\arctan\left(m_{01}/m_{10}\right)\qquad(1.5)$$
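A minimal NumPy sketch of equations (1.3)–(1.5) follows (an editorial illustration; the square-patch convention and centering are assumptions). It uses arctan2, the quadrant-safe form of equation (1.5):

```python
import numpy as np

def intensity_centroid_angle(patch):
    """Keypoint orientation from the moments of equations (1.3)-(1.5):
    theta = arctan2(m01, m10), measured from the patch center.

    `patch` is a square grayscale neighborhood centered on the keypoint.
    """
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    xs -= (w - 1) / 2.0                      # x coordinates relative to center
    ys -= (h - 1) / 2.0                      # y coordinates relative to center
    m10 = np.sum(xs * patch)                 # first-order moment in x
    m01 = np.sum(ys * patch)                 # first-order moment in y
    return np.arctan2(m01, m10)              # equation (1.5), quadrant-safe
```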
The ORB feature extraction stage thus consists of FAST corner detection, the image pyramid method, and the gray-scale centroid method. To address the uneven distribution of image feature points, the ORB algorithm additionally applies a quadtree to distribute the feature points of each Gaussian pyramid level uniformly. The image pyramid and the gray-scale centroid method solve the scale-change and rotation-invariance problems of the FAST algorithm, improve the stability of feature points in the visual SLAM system, and effectively prevent the keyframe loss caused by mismatches.
It is precisely because of this feature-point stability problem, and because keyframes are easily lost in visual SLAM, that the present invention optimizes the ORB feature extraction algorithm and proposes a new feature-point extraction method, ST-ORB (Shi-Tomasi-ORB) feature extraction, which is applied within the ORB-SLAM2 framework to study the trajectory localization of indoor mobile robots.
3. Experimental Results and Analysis
To evaluate the quality of the feature points produced by the ST-ORB extraction method and the trajectory-localization performance of the optimized ORB-SLAM2, this embodiment performs feature-extraction experiments on indoor images, analyzing the results of the traditional ORB and ST-ORB extraction algorithms, and uses the TUM dataset to verify the localization accuracy of the improved ORB-SLAM2 algorithm. The PC platform used is an Intel Core i5-7200U CPU @ 2.5 GHz with 12 GB of memory, running Ubuntu 20.04.
3.1 Feature-Point Extraction
With the number of target feature points set to 200, feature-extraction experiments were carried out with the ORB and ST-ORB methods in indoor and outdoor environments; the results are shown in Fig. 4 and Fig. 5, respectively.
As Figs. 4 and 5 show, the feature points extracted by the ORB algorithm are too densely clustered: they capture only local information in the image, which has a certain negative impact on the pose estimation of visual SLAM. With the ST-ORB extraction algorithm, the feature points in Fig. 5 are distributed more uniformly; compared with the ORB algorithm, their quality is higher, which can avoid keyframe loss to a certain extent.
3.2 Trajectory Estimation
To verify the influence of the ORB and ST-ORB feature extraction methods on the accuracy of the visual SLAM method, monocular ORB-SLAM2 is adopted as the base SLAM scheme, using the TUM dataset. Table 1 gives the root-mean-square errors of the trajectories obtained experimentally by the present method and the ORB algorithm on four TUM sequences.
Table 1. Trajectory root-mean-square error (RMSE) / (m)
The results in Table 1 show that the ORB method achieves a lower absolute trajectory error on the fr1_rpy sequence: on that sequence the SLAM method using ST-ORB lost tracking of some keyframes during the run, so its trajectory-localization accuracy fell below the ORB scheme. On the other three sequences the present method is clearly stronger than ORB; in particular, the ORB method suffered severe tracking loss on fr1_desk2, while the ST-ORB method remained stable and reduced the trajectory error considerably. Judging from the performance of the two methods on the TUM dataset, the ST-ORB feature extraction method proposed by the present invention is clearly superior to the traditional ORB method in SLAM pose estimation.
Fig. 6 compares the trajectory errors computed by the two methods on the fr1_desk sequence. The color bar indicates the trajectory error increasing from bottom to top, the dashed line indicates the reference (ground-truth) trajectory, and the other lines indicate the estimated trajectories. Compared with the traditional ORB-SLAM2 algorithm, the trajectory estimated by the present method is closer to the reference trajectory.
Fig. 7 shows the absolute trajectory error (APE) analysis of the two methods on the fr1_desk sequence. ORB-SLAM2 with the ORB method yields a root-mean-square error of 0.135 m, while with the ST-ORB method it is 0.071 m; since (0.135 − 0.071)/0.135 ≈ 0.47, trajectory-localization accuracy on this sequence improves by 47%, so the method proposed by the present invention is superior.
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.

The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data-processing device to produce a machine, such that the instructions executed by the processor create means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data-processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data-processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above is only a preferred embodiment of the present invention and does not limit the present invention to other forms. Any person skilled in the art may use the technical content disclosed above to make changes or modifications into equivalent embodiments. However, any simple modification, equivalent change, or adaptation made to the above embodiments according to the technical essence of the present invention, without departing from the content of the technical solution of the present invention, still falls within the protection scope of the technical solution of the present invention.
Claims (3)
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202310172971.3A | 2023-02-28 | 2023-02-28 | A Visual SLAM Method Based on ST-ORB Feature Extraction
Publications (1)

Publication Number | Publication Date
---|---
CN116429082A | 2023-07-14
Family
ID=87087979

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202310172971.3A | A Visual SLAM Method Based on ST-ORB Feature Extraction | 2023-02-28 | 2023-02-28

Country Status (1)

Country | Link
---|---
CN | CN116429082A (en)
Cited By (5)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN117315274A | 2023-11-28 | 2023-12-29 | 淄博纽氏达特机器人系统技术有限公司 | Visual SLAM method based on self-adaptive feature extraction
CN117315274B | 2023-11-28 | 2024-03-19 | 淄博纽氏达特机器人系统技术有限公司 | Visual SLAM method based on self-adaptive feature extraction
CN117671011A | 2024-01-31 | 2024-03-08 | 山东大学 | AGV positioning accuracy improvement method and system based on improved ORB algorithm
CN117671011B | 2024-01-31 | 2024-05-28 | 山东大学 | AGV positioning accuracy improvement method and system based on improved ORB algorithm
CN118864303A | 2024-09-26 | 2024-10-29 | 合肥工业大学 | A dynamic object removal method based on average blur
Legal Events

Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination