CN111462135A - Semantic Mapping Method Based on Visual SLAM and 2D Semantic Segmentation - Google Patents
Semantic Mapping Method Based on Visual SLAM and 2D Semantic Segmentation
- Publication number
- CN111462135A CN111462135A CN202010246158.2A CN202010246158A CN111462135A CN 111462135 A CN111462135 A CN 111462135A CN 202010246158 A CN202010246158 A CN 202010246158A CN 111462135 A CN111462135 A CN 111462135A
- Authority
- CN
- China
- Prior art keywords
- semantic
- camera
- image
- map
- point
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06T7/11—Region-based segmentation
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
- G06T7/50—Depth or shape recovery
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
- G06T7/90—Determination of colour characteristics
- G06T2207/10024—Color image
- G06T2207/20084—Artificial neural networks [ANN]
- Y02T10/40—Engine management systems
Abstract
The invention relates to the intersection of computer vision and deep learning, and more specifically to a semantic mapping method based on visual SLAM and two-dimensional semantic segmentation. The method of the invention comprises the following steps: S1, calibrate the camera parameters and correct camera distortion; S2, acquire the image frame sequence; S3, image preprocessing; S4, judge whether the current image frame is a keyframe, and if so, go to step S6, otherwise go to step S5; S5, motion-blur compensation; S6, semantic segmentation, in which ORB feature points are extracted from the image frame and semantic segmentation is performed with a Mask R-CNN (mask region convolutional neural network) model; S7, pose calculation, in which the camera pose is computed with a sparse SLAM model; S8, use the semantic information to assist dense semantic map construction, realizing three-dimensional semantic mapping of the global point-cloud map. The invention can improve the performance of a UAV semantic mapping system and significantly improve the robustness of feature-point extraction and matching in dynamic scenes.
Description
Technical Field
The invention relates to the intersection of computer vision and deep learning, and more specifically to a semantic mapping method based on visual SLAM and two-dimensional semantic segmentation.
Background
A UAV generally consists of three modules: intelligent decision-making, environmental perception, and motion control, of which environmental perception is the foundation of everything.
To perceive its surroundings, a UAV needs a stable, high-performance sensor suite to act as its "eyes", together with suitable algorithms and a powerful processing unit to "understand" what it sees.
Within the environmental-perception module of a UAV, the visual sensor is indispensable. The visual sensor can be a camera; compared with lidar and millimeter-wave radar, a camera offers higher resolution and captures richer environmental detail, for example describing the appearance and shape of objects or reading signs.
Although the Global Positioning System (GPS) assists localization, interference from tall trees, buildings, tunnels, and the like can make GPS unreliable, so visual sensors cannot be replaced by GPS.
Simultaneous Localization and Mapping (SLAM) refers to an agent carrying specific sensors that, without prior information, estimates the trajectory of its own motion from the image frames those sensors acquire while building a map of the surrounding environment. SLAM is widely used in robotics, UAVs, autonomous driving, augmented reality, virtual reality, and other applications.
SLAM can be divided into two categories: laser SLAM and visual SLAM.
Owing to its early start, laser SLAM is relatively mature in both theory and engineering practice, but it has a fatal drawback in robotic applications: the structural information a lidar perceives is two-dimensional and carries little information, so a great deal of environmental information is lost. Its high cost, large size, and lack of semantic information further limit it in certain application scenarios.
The perceptual information source of visual SLAM is the camera image.
By camera type, visual SLAM can be divided into three kinds: monocular, stereo, and depth (RGB-D) SLAM. Similar to lidar, a depth camera can directly compute distances to obstacles by collecting point clouds. Depth cameras are structurally simple, easy to install and operate, inexpensive, and usable in a wide range of scenarios.
With the rise of deep learning, visual SLAM has also made great strides in recent years.
Most visual SLAM schemes operate at the feature-point or pixel level; to complete a specific task, or to interact intelligently with its surroundings, a UAV needs semantic information.
A visual SLAM system should be able to select useful information and discard invalid information.
With the development of deep learning, many mature object detection and semantic segmentation methods now make accurate semantic mapping possible. Semantic maps help improve the autonomy and robustness of UAVs, enable more complex tasks, and shift the problem from path planning to mission planning.
With increasing hardware computing power and better algorithm structures, deep learning has achieved ever more remarkable results.
Computer vision in particular has taken a huge leap; RGB image segmentation can be roughly divided into object detection and semantic segmentation.
Early work focused on proposing object detection frameworks, achieving increasingly accurate detection.
Mainstream deep-learning detection frameworks are mostly based on CNNs (Convolutional Neural Networks); among the more efficient are the YOLO (You Only Look Once) series and the R-CNN (Region-CNN, region convolutional neural network) series.
Object perception in 3D imagery is maturing, and the need for 3D understanding is becoming more pressing. Because point clouds are irregular, most researchers convert the points into regular voxel or mesh models and use deep neural networks for prediction.
Direct semantic segmentation of the point-cloud space consumes enormous computing resources, and the relationships between spatial points are weakened.
PointNet, proposed in 2017, was the first deep neural network able to process raw 3D point clouds directly.
The dense mapping methods adopted by most existing visual SLAM systems lack semantic information and cannot meet the demands of intelligent operation.
A typical assumption of visual SLAM algorithms is a static scene; the appearance of dynamic objects not only corrupts camera pose estimation but also leaves ghosting artifacts in the map, degrading map quality.
Images captured by a camera moving at high speed are prone to motion blur, which severely affects feature-point extraction and matching.
Summary of the Invention
The purpose of the present invention is to provide a semantic mapping method based on visual SLAM and two-dimensional semantic segmentation, solving the technical problem that fast-moving dynamic objects degrade the quality of the constructed map.
To achieve the above purpose, the present invention provides a semantic mapping method based on visual SLAM and two-dimensional semantic segmentation, comprising the following steps:
S1. Calibrate the camera parameters and correct camera distortion;
S2. Acquire the image frame sequence, which includes RGB images and depth images;
S3. Image preprocessing: using the pinhole camera model, obtain the coordinates of the real three-dimensional point corresponding to each pixel of the RGB image;
S4. Judge whether the current image frame is a keyframe; if so, go to step S6, otherwise go to step S5;
S5. Motion-blur compensation: compute the image-block centroids of the current frame as semantic feature points, supplementing the ORB feature points;
S6. Semantic segmentation: extract ORB feature points from the image frame and perform semantic segmentation with a Mask R-CNN (mask region convolutional neural network) model, obtaining the semantic information of every pixel of the frame;
S7. Pose calculation: compute the camera pose with the sparse SLAM model;
S8. Feed the semantic information into the sparse SLAM model to assist dense semantic map construction; completing the traversal of the keyframes realizes the three-dimensional semantic mapping of the global point-cloud map.
In one embodiment, the camera distortion correction of step S1 further comprises the following steps:
S11. Project the three-dimensional point P(X, Y, Z) in the camera coordinate system onto the normalized image plane, giving the normalized coordinates [x, y]^T of the point;
S12. Correct the point [x, y]^T on the normalized plane for radial and tangential distortion, via the following formulas:

$$x_{corrected} = x\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + 2p_1 xy + p_2(r^2 + 2x^2)$$
$$y_{corrected} = y\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + p_1(r^2 + 2y^2) + 2p_2 xy$$

where [x_corrected, y_corrected]^T are the corrected point coordinates, p_1 and p_2 are the tangential distortion coefficients of the camera, k_1, k_2, k_3 are the radial distortion coefficients of the camera, and r is the distance of the point P from the origin of the coordinate system (r^2 = x^2 + y^2);
S13. Project the corrected point [x_corrected, y_corrected]^T through the intrinsic parameter matrix onto the pixel plane to obtain its correct position [u, v]^T in the image, via the following formulas:

$$u = f_x\, x_{corrected} + c_x, \qquad v = f_y\, y_{corrected} + c_y$$

where f_x, f_y, c_x, c_y are the intrinsic parameters of the camera.
In one embodiment, the image preprocessing of step S3 further comprises: the mapping from a pixel [u, v]^T to the real three-dimensional point P(X, Y, Z) satisfies the following formula:

$$Z\begin{bmatrix}u\\v\\1\end{bmatrix} = \begin{bmatrix}f_x & 0 & c_x\\ 0 & f_y & c_y\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}X\\Y\\Z\end{bmatrix} = KP$$

where K is called the intrinsic parameter matrix, f_x, f_y, c_x, c_y are the intrinsic parameters of the camera, P is the real three-dimensional point, and [u, v]^T are the pixel coordinates.
In one embodiment, the keyframes of step S4 are screened with the sparse SLAM model.
In one embodiment, the image-block centroid of step S5 is obtained through the following steps:
Label every object in the frame as a specific class;
Each segmented object has a corresponding labeled region, and the segmented image is called an image block;
Calculate the moments of the image block B for p, q ∈ {0, 1}:

$$m_{pq} = \sum_{x,y\in B} x^p y^q I(x, y)$$

Calculate the corresponding centroid C as a semantic feature point to supplement the ORB feature points, where

$$C = \left(\frac{m_{10}}{m_{00}},\ \frac{m_{01}}{m_{00}}\right)$$
In one embodiment, step S6 further comprises:
the semantic information of each pixel includes a semantic class label, bounding-box coordinates, and the confidence score of that class;
based on the semantic segmentation result, the ORB feature points extracted from regions whose class is designated as a dynamic object are removed.
In one embodiment, the semantic segmentation with the Mask R-CNN model in step S6 further comprises:
extracting features at different levels of the input image with a feature pyramid network;
proposing regions of interest with a region proposal network;
aligning the proposal regions with region-of-interest alignment;
performing mask segmentation with a fully convolutional network;
determining the region coordinates and the class labels with fully connected layers.
In one embodiment, the sparse SLAM model further comprises a tracking thread, a local mapping thread, and a loop-closure detection thread:
the tracking thread localizes the camera for every frame by matching features against the local map and minimizing the reprojection error with motion-only bundle adjustment;
the local mapping thread manages and optimizes the local map by running local bundle adjustment, maintains the covisibility relationships between keyframes through map points, and optimizes the poses of the covisible keyframes and the map points with local bundle adjustment;
the loop-closure detection thread detects large loops and corrects the drift error by performing pose-graph optimization, accelerates the screening of loop-closing candidate frames, optimizes the scale, and optimizes the essential graph and the map points with global bundle adjustment.
In one embodiment, the sparse SLAM model further comprises a global bundle adjustment optimization thread, triggered after the loop-closure detection thread confirms a loop; after the pose-graph optimization it computes the optimal structure and motion result for the whole system.
In one embodiment, the pose calculation of step S7 further comprises: computing an initial camera pose by solving PnP, refining the camera pose with back-end pose-graph optimization, and constructing the minimized reprojection error of camera pose estimation:

$$\xi^{*} = \arg\min_{\xi}\ \frac{1}{2}\sum_{i=1}^{n}\left\| u_i - \frac{1}{s_i} K \exp(\xi^{\wedge}) P_i \right\|_2^2$$

where u_i are the pixel coordinates, P_i are the camera coordinates, ξ^ is the Lie algebra element corresponding to the camera pose, s_i is the depth of the feature point, K is the camera intrinsic parameter matrix, and n is the number of points.
The semantic mapping method based on visual SLAM and two-dimensional semantic segmentation provided by the present invention builds a dense semantic map with dynamic objects removed, based on a Mask R-CNN model over ORB feature points and a sparse SLAM model; it uses inter-frame information together with the semantic information on the image frames to improve the performance of a UAV semantic mapping system and the robustness of feature-point extraction and matching in dynamic scenes.
Brief Description of the Drawings
The above and other features, properties, and advantages of the present invention will become more apparent from the following description taken in conjunction with the accompanying drawings and embodiments, in which like reference numerals denote like features throughout:
FIG. 1 shows a flowchart of the method according to an embodiment of the present invention;
FIG. 2 shows a calibration board for camera calibration according to an embodiment of the present invention;
FIG. 3a shows the pinhole imaging model of a pinhole camera according to an embodiment of the present invention;
FIG. 3b shows the similar-triangles principle of a pinhole camera according to an embodiment of the present invention;
FIG. 4 shows a system flowchart of Mask R-CNN according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention, not to limit it.
A semantic map is a map containing rich semantic information: it abstracts the spatial geometric relationships of the environment and semantic information such as the types and locations of the objects present. A semantic map contains both the spatial information and the semantic information of the environment, so that a mobile robot, like a human, knows both that there are objects in the environment and what those objects are.
Addressing the problems and shortcomings of the prior art, the present invention proposes a semantic mapping system based on visual SLAM and two-dimensional semantic segmentation, performing semantic segmentation over ORB (Oriented FAST and Rotated BRIEF) feature points and combining it with a sparse SLAM model, so that semantic mapping is completed while localization is carried out.
FIG. 1 shows a flowchart of the semantic mapping method based on visual SLAM and two-dimensional semantic segmentation according to an embodiment of the present invention. In the embodiment shown in FIG. 1, the specific steps of the method are as follows:
S1. Calibrate the camera parameters and correct camera distortion;
S2. Acquire the image frame sequence, which includes RGB images and depth images;
S3. Image preprocessing: using the pinhole camera model, obtain the coordinates of the real three-dimensional point corresponding to each pixel of the RGB image;
S4. Judge whether the current image frame is a keyframe; if so, go to step S6, otherwise go to step S5;
S5. Motion-blur compensation: compute the image-block centroids of the current frame as semantic feature points, supplementing the ORB feature points;
S6. Semantic segmentation: extract ORB feature points from the image frame and perform semantic segmentation with the Mask R-CNN model, obtaining the semantic information of every pixel of the frame;
S7. Pose calculation: compute the camera pose with the sparse SLAM model;
S8. Feed the semantic information into the sparse SLAM model to assist dense semantic map construction; completing the traversal of the keyframes realizes the three-dimensional semantic mapping of the global point-cloud map.
Each step is described in detail below.
Step S1: calibrate the camera parameters and correct camera distortion.
In image measurement and machine-vision applications, determining the relationship between the three-dimensional geometric position of a point on the surface of an object in space and its corresponding point in the image requires a geometric model of camera imaging; the parameters of this geometric model are the camera parameters.
The distortion coefficients are one kind of camera parameter, corresponding to the phenomenon of camera distortion. Under most conditions these parameters can only be obtained through experiment and computation, and this parameter-solving process is called camera calibration.
Camera distortion includes radial distortion and tangential distortion.
Radial distortion is caused by the shape of the lens.
More specifically, in the pinhole model a straight line projects onto the pixel plane as a straight line.
In actual photographs, however, the camera lens often turns a straight line in the real environment into a curve in the picture; this distortion is called radial distortion.
Tangential distortion arises during camera assembly because the lens cannot be made strictly parallel to the imaging plane.
Because of the way light is projected, the actual object is inconsistent with its image projected onto the 2D plane. This inconsistency is stable, so distortion correction of subsequent images can be achieved by calibrating the camera and computing the distortion parameters.
For radial distortion, correction uses quadratic and higher-order polynomial functions of the distance from the center:

$$x_{corrected} = x\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6), \qquad y_{corrected} = y\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6)$$

where [x, y]^T are the coordinates of the uncorrected point, [x_corrected, y_corrected]^T are the corrected point coordinates, k_1, k_2, k_3 are the radial distortion coefficients of the camera, and r is the distance of the point P from the origin of the coordinate system.
For tangential distortion, two further parameters p_1, p_2 are used for correction:

$$x_{corrected} = x + 2p_1 xy + p_2(r^2 + 2x^2), \qquad y_{corrected} = y + p_1(r^2 + 2y^2) + 2p_2 xy$$

where [x, y]^T are the coordinates of the uncorrected point, [x_corrected, y_corrected]^T are the corrected point coordinates, p_1, p_2 are the tangential distortion coefficients of the camera, and r is the distance of the point P from the origin of the coordinate system.
Before the camera is used, its radial and tangential distortion coefficients are calibrated so that three-dimensional information can be recovered from two-dimensional images, enabling distortion correction, object measurement, three-dimensional reconstruction, and so on.
FIG. 2 shows a calibration board for camera calibration according to an embodiment of the present invention. The board shown in FIG. 2 is placed within the field of view of the camera; for each photograph taken, the board is moved to a new position and orientation. The feature points in the images are detected, the intrinsic and extrinsic parameters of the camera are solved, and the distortion coefficients are then obtained.
Preferably, the Camera Calibrator toolbox in MATLAB is used to solve for the camera parameters.
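The same calibration can also be scripted with OpenCV; the following is a minimal sketch, in which the chessboard size and the image file pattern are assumptions rather than values from the patent:

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)  # assumed inner-corner count of the chessboard
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points, img_size = [], [], None
for path in glob.glob("calib/*.png"):      # hypothetical image location
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        # refine corner locations to sub-pixel accuracy
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)
        img_size = gray.shape[::-1]

# K is the 3x3 intrinsic matrix; dist holds the distortion coefficients
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, img_size, None, None)
```

Note that OpenCV orders the distortion vector as (k1, k2, p1, p2, k3).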
For a point P(X, Y, Z) in the camera coordinate system, step S1 of the present invention performs camera distortion correction with the five distortion coefficients, finding the correct position of this point on the pixel plane.
The camera distortion is corrected as follows:
S11. Project the three-dimensional point onto the normalized image plane; let its normalized coordinates be [x, y]^T.
S12. Correct the point on the normalized plane for radial and tangential distortion:

$$x_{corrected} = x\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + 2p_1 xy + p_2(r^2 + 2x^2)$$
$$y_{corrected} = y\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + p_1(r^2 + 2y^2) + 2p_2 xy$$

where [x_corrected, y_corrected]^T are the corrected point coordinates, p_1, p_2 are the tangential distortion coefficients of the camera, k_1, k_2, k_3 are the radial distortion coefficients of the camera, and r is the distance of the point P from the origin of the coordinate system.
S13. Project the corrected point [x_corrected, y_corrected]^T through the intrinsic parameter matrix onto the pixel plane, obtaining the point's correct position coordinates [u, v]^T in the image:

$$u = f_x\, x_{corrected} + c_x, \qquad v = f_y\, y_{corrected} + c_y$$

where f_x, f_y, c_x, c_y are the intrinsic parameters of the camera.
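A minimal sketch of steps S11 through S13 applied to one normalized point follows; OpenCV's cv2.projectPoints applies the same distortion model internally:

```python
def distort_and_project(x, y, k1, k2, k3, p1, p2, fx, fy, cx, cy):
    """Steps S11-S13: apply the radial + tangential distortion model to
    a normalized point [x, y]^T, then project it with the intrinsics."""
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    x_c = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_c = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return fx * x_c + cx, fy * y_c + cy   # pixel coordinates [u, v]^T
```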
Step S2: acquire the image frame sequence.
A Kinect camera is used to acquire the RGB-D image frame sequence, which includes RGB images and depth images.
Step S3: image preprocessing.
In one embodiment, an RGB-D camera serves as the main sensor, acquiring RGB and depth images simultaneously; the pinhole camera model maps the pixels of the RGB image into real three-dimensional space.
FIG. 3a shows the pinhole imaging model of a pinhole camera according to an embodiment of the present invention, and FIG. 3b shows its similar-triangles principle. As shown in FIGS. 3a and 3b, a camera coordinate system O-x-y-z is established, with the optical center of the camera as the origin O of the coordinate system; the arrow directions are taken as positive.
Through the similar-triangle mapping shown in FIG. 3b, a coordinate system O'-x'-y'-z' is established on the imaging plane of the camera, again with the arrow directions taken as positive.
Suppose the coordinates of point P are [X, Y, Z]^T and the focal length of the camera lens is f, the focal length being the distance from the optical center of the camera to the physical imaging plane.
Point P projects through the optical center to the point P' on the imaging plane; the coordinates of the pixel P' are [u, v]^T.
From the corresponding relationships, the mapping amounts to a scaling and a translation, and derivation yields:

$$Z\begin{bmatrix}u\\v\\1\end{bmatrix} = \begin{bmatrix}f_x & 0 & c_x\\ 0 & f_y & c_y\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}X\\Y\\Z\end{bmatrix} = KP$$

where K is called the camera intrinsic parameter matrix, an inherent parameter already calibrated in step S1; f_x, f_y, c_x, c_y are the intrinsic parameters of the camera; P is the real three-dimensional point; and [u, v]^T are the pixel coordinates.
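Inverting this model gives the pixel-to-3D mapping used in step S3; a minimal sketch, assuming the depth image is registered to the RGB frame and supplies the metric Z of each pixel:

```python
import numpy as np

def pixel_to_point(u, v, depth, K):
    """Invert Z [u, v, 1]^T = K P for one pixel: given the metric depth
    Z from the depth image and the intrinsic matrix K, recover P."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    return np.array([(u - cx) * depth / fx,
                     (v - cy) * depth / fy,
                     depth])
```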
Step S4: judge whether the current frame is a keyframe; if so, go to step S6, otherwise go to step S5.
Using every frame for the visual SLAM and semantic segmentation computations would be far too expensive, so the high-quality frames among them are selected as keyframes.
In the present invention, a sparse SLAM model based on ORB (Oriented FAST and Rotated BRIEF) feature points is used to screen the keyframes.
Each keyframe contains one RGB image and one depth image.
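The patent delegates the keyframe test to the sparse SLAM model; as an illustration only, ORB-SLAM2-style systems typically gate keyframe insertion on how well the current frame tracks relative to its reference keyframe. A sketch of such a heuristic, with assumed thresholds:

```python
def is_keyframe(n_tracked, n_ref_tracked, frames_since_kf,
                min_ratio=0.9, max_gap=30, min_tracked=50):
    """Illustrative keyframe test: insert a keyframe when tracking has
    weakened relative to the reference keyframe, or when too many
    frames have passed since the last one. Thresholds are assumptions."""
    if n_tracked < min_tracked:
        return False          # tracking too weak to anchor a new keyframe
    weak = n_tracked < min_ratio * n_ref_tracked
    stale = frames_since_kf >= max_gap
    return weak or stale
```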
Step S5: motion-blur compensation.
Since dynamic objects may appear in any frame, certain target classes are designated as dynamic each time the semantic mapping task is executed. If such a dynamic target is recognized in a frame of the image sequence, the present invention removes the corresponding point cloud during the conversion from two-dimensional pixels to three-dimensional coordinates, preventing dynamic objects from leaving ghosting artifacts in the map and degrading the mapping quality.
In step S5 of the present invention, if the frame is not a keyframe, ORB feature extraction is insufficient because of motion blur, and the following operations are performed as a supplement before the semantic segmentation of step S6:
Every object in the frame is labeled as a specific class; each segmented object has a corresponding labeled region, and the segmented image is called an image block. The moment m_pq of an image block B is computed:

$$m_{pq} = \sum_{x,y\in B} x^p y^q I(x, y), \quad p, q \in \{0, 1\}$$

The centroid position C is:

$$C = \left(\frac{m_{10}}{m_{00}},\ \frac{m_{01}}{m_{00}}\right)$$

This centroid serves as a semantic feature point, compensating for the shortage of ORB feature points.
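A minimal sketch of this centroid computation using OpenCV's image moments, assuming the input is one binary instance mask from the segmentation:

```python
import cv2
import numpy as np

def patch_centroid(mask):
    """Centroid C = (m10/m00, m01/m00) of one binary instance mask (an
    'image block'), used as a semantic feature point when motion blur
    starves the ORB extractor."""
    m = cv2.moments(mask.astype(np.uint8), binaryImage=True)
    if m["m00"] == 0:
        return None           # empty mask: no centroid to report
    return m["m10"] / m["m00"], m["m01"] / m["m00"]
```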
Supplementing semantic feature points for blurred images whose ORB feature points are badly depleted suppresses the tracking algorithm's use of matches belonging to dynamic objects; keyframes are then screened comprehensively and the camera pose is estimated, preventing the mapping algorithm from including moving objects as part of the 3D map.
Step S6: semantic segmentation.
ORB feature points are extracted from each image frame, and the Mask R-CNN (Mask Region-CNN, mask region convolutional neural network) model performs semantic segmentation, obtaining the semantic information of every pixel of the frame.
Based on the semantic segmentation result, if a dynamic target is recognized, the ORB feature points extracted from the regions whose class is designated as dynamic are removed.
This prevents the visual SLAM algorithm from including moving objects as part of the 3D map during mapping.
In step S6 of the present invention, the Mask R-CNN model is trained on the COCO dataset.
COCO (Common Objects in COntext) is a dataset provided by the Microsoft team for image recognition, from which classification information for 80 categories can be obtained.
FIG. 4 shows a system flowchart of Mask R-CNN according to an embodiment of the present invention. As shown in FIG. 4, RGB-image semantic segmentation of an image frame is realized with the Mask R-CNN model; the steps of its convolutional-neural-network framework are as follows:
extract features at different levels of the input image with an FPN (Feature Pyramid Network);
propose regions of interest with an RPN (Region Proposal Network);
align the proposal regions with RoI Align (Region of Interest Align);
perform mask segmentation with an FCN (Fully Convolutional Network);
determine the region coordinates and the class labels with FC (fully connected) layers.
Processed by the Mask R-CNN model, the frame yields pixel-level semantic classification results, i.e. a semantic class label for every pixel, together with the bounding-box coordinates and the confidence score of each classification.
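A minimal sketch of the dynamic-point culling this enables, assuming the Mask R-CNN outputs are available as HxW boolean instance masks with COCO class names; the dynamic-class list is an assumed, task-specific example:

```python
import cv2
import numpy as np

DYNAMIC_CLASSES = {"person", "car", "bicycle"}   # assumed per-task choice

def filter_dynamic_keypoints(gray, masks, labels):
    """Extract ORB keypoints and drop those falling inside any instance
    mask whose class is designated dynamic; masks are HxW booleans and
    labels their COCO class names, as produced by Mask R-CNN."""
    orb = cv2.ORB_create(nfeatures=1000)
    keypoints = orb.detect(gray, None)
    dynamic = np.zeros(gray.shape[:2], dtype=bool)
    for mask, label in zip(masks, labels):
        if label in DYNAMIC_CLASSES:
            dynamic |= mask
    kept = [kp for kp in keypoints
            if not dynamic[int(round(kp.pt[1])), int(round(kp.pt[0]))]]
    return orb.compute(gray, kept)   # (keypoints, descriptors) of static points
```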
The present invention uses ORB feature points for the tracking, mapping, and place-recognition tasks. ORB feature points have the advantages of rotation invariance and scale invariance, can be extracted and matched quickly enough to meet the demands of real-time operation, and show good precision in bag-of-words place recognition.
Step S7: pose calculation.
Visual-odometry pose estimation is defined over two adjacent image frames; it is easy to see that accumulating many such inter-frame pose estimates yields the motion trajectory of the camera.
The camera pose is computed with the sparse SLAM model based on ORB (Oriented FAST and Rotated BRIEF) feature points.
After the feature points of an image frame are extracted, PnP is used on the keyframes to estimate the camera pose.
PnP is short for Perspective-n-Point, a method for solving the motion between 3D and 2D point pairs: given n points in 3D space and their projected positions, solve for the pose of the camera.
Suppose that at time k the position of the camera is x_k, the camera input data is u_k, and w_k is noise; the motion equation is constructed:
x_k = f(x_{k-1}, u_k, w_k).
A landmark y_j is observed at position x_k, producing a series of observations z_{k,j}, with v_{k,j} the observation noise; the observation equation is constructed:
z_{k,j} = h(y_j, x_k, v_{k,j}).
In step S7 of the present invention, solving the PnP problem gives a preliminary camera pose, and back-end pose-graph optimization is then used to compute a more accurate camera pose.
In step S7 of the present invention, the PnP problem of camera pose estimation is formulated as a nonlinear least-squares problem whose domain is a Lie algebra.
Further, the camera pose estimation of step S7 is formulated as a BA (Bundle Adjustment) problem, constructing the minimized reprojection error of camera pose estimation:

$$\xi^{*} = \arg\min_{\xi}\ \frac{1}{2}\sum_{i=1}^{n}\left\| u_i - \frac{1}{s_i} K \exp(\xi^{\wedge}) P_i \right\|_2^2$$

where u_i are the pixel coordinates, P_i are the camera coordinates, ξ^ is the Lie algebra element corresponding to the camera pose, s_i is the depth of the feature point, K is the camera intrinsic parameter matrix, and n is the number of points.
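A minimal sketch of the initial PnP step with OpenCV's RANSAC solver; the reprojection threshold is an assumed value, and the back-end refinement the patent describes (pose-graph / bundle adjustment, e.g. with g2o) is not shown:

```python
import cv2
import numpy as np

def estimate_pose(points_3d, points_2d, K):
    """Seed pose from 3D-2D correspondences via RANSAC PnP; the result
    would then be refined by the back-end pose-graph / BA optimization."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(points_3d, dtype=np.float32),
        np.asarray(points_2d, dtype=np.float32),
        K.astype(np.float64), distCoeffs=None,
        reprojectionError=3.0)        # assumed inlier threshold, in pixels
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)        # axis-angle -> rotation matrix
    return R, tvec
```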
Further, step S7 of the present invention includes relocalization with an embedded place-recognition model based on DBoW (a bag-of-words model), to guard against tracking failure, to reinitialize in a known map scene, to detect loop closures, and so on.
The present invention screens keyframes and computes the camera pose with a sparse SLAM model obtained by improving on ORB-SLAM2 (Oriented FAST and Rotated BRIEF Simultaneous Localization and Mapping 2).
The SLAM model consists of four parallel threads: a tracking thread, a local mapping thread, a loop-closure detection thread, and a global BA optimization thread.
Further, the global BA optimization thread executes only after confirmation by the loop-closure detection thread.
The first three are concurrent threads, defined as follows:
1) Tracking thread.
It localizes the camera for every frame by matching features against the local map and minimizing the reprojection error with motion-only BA.
Preferably, a constant-velocity model is used for matching.
2) Local mapping thread.
It manages and optimizes the local map by running local BA, maintains the covisibility relationships between keyframes through MapPoints, and optimizes the poses of the covisible keyframes and the MapPoints with local BA.
3) Loop-closure detection thread.
It detects large loops and corrects the drift error by performing pose-graph optimization; loop-closing candidate frames are screened faster with BoW, the scale is optimized with Sim3, and the Essential Graph and MapPoints are optimized with global BA. The Sim3 transformation is a similarity transformation.
The loop-closure detection thread triggers the global BA optimization thread.
The global BA thread, after the pose-graph optimization, computes the optimal structure and motion result for the whole system.
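The four-thread layout can be pictured with the following illustrative Python skeleton; it is a structural sketch only (ORB-SLAM2 itself is C++), with placeholder bodies and assumed queue plumbing:

```python
import queue
import threading

frames, keyframes, loop_events = queue.Queue(), queue.Queue(), queue.Queue()

def tracking():
    while True:                 # locate every frame against the local map
        frame = frames.get()
        # ... match local-map features, motion-only BA, keyframe decision ...
        keyframes.put(frame)

def local_mapping():
    while True:                 # maintain covisibility graph, run local BA
        kf = keyframes.get()
        # ... insert keyframe, cull/triangulate MapPoints, local BA ...

def loop_closing():
    while True:                 # detect loops, then trigger global BA
        candidate = loop_events.get()
        # ... BoW screening, Sim3 check, pose-graph optimization, global BA ...

for worker in (tracking, local_mapping, loop_closing):
    threading.Thread(target=worker, daemon=True).start()
```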
Compared with prior-art dense SLAM models, the sparse SLAM model of the present invention fuses semantic information, so that the final mapping process is enriched with the semantic segmentation information of the images.
Step S8: three-dimensional semantic mapping.
Using the semantic segmentation results of step S6, together with the inter-frame pose information obtained in step S7 and the real three-dimensional coordinates of the frame's pixels, the semantic information is fed into the sparse SLAM model, and objects of the same semantic class in a frame are projected into the three-dimensional point-cloud map in the same label color, assisting dense semantic map construction; completing the traversal of the keyframes realizes the three-dimensional semantic mapping of the global point-cloud map.
Step S8 of the present invention further includes:
S81. Project the three-dimensional pixels generated from the first keyframe into an initial point cloud;
S82. From the three-dimensional coordinates of every pixel of the current keyframe, computed with the pinhole model, generate a point-cloud map;
S83. Compute the pose change between the current keyframe and the previous keyframe;
S84. Superimpose and fuse the three-dimensional points of the two point-cloud maps through the pose transformation matrix, generating a point-cloud map carrying more information;
S85. Iterate the above steps; when the traversal of all keyframes is complete, the construction of the global point-cloud map is realized (a sketch of this fusion loop follows below).
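A minimal sketch of the fusion of steps S83 through S85, assuming each keyframe's points arrive as an Nx3 NumPy array in the camera frame and T_wc is the 4x4 camera-to-world pose obtained in step S83 (both names are assumptions):

```python
import numpy as np

def fuse_keyframe(global_cloud, kf_points, T_wc):
    """Steps S84-S85: transform one keyframe's camera-frame points into
    the world frame with the 4x4 pose T_wc and append them to the
    accumulating global cloud."""
    homog = np.hstack([kf_points, np.ones((kf_points.shape[0], 1))])
    world = (T_wc @ homog.T).T[:, :3]
    if global_cloud is None or len(global_cloud) == 0:
        return world
    return np.vstack([global_cloud, world])
```

In practice a voxel-grid downsampling filter (such as the VoxelGrid filter of the Point Cloud Library used in the experiments) would keep the accumulated cloud tractable.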
The experimental results of the UAV semantic mapping method based on visual SLAM and two-dimensional semantic segmentation of the present invention are described in further detail below in conjunction with a concrete experiment.
The experiment ran on the Ubuntu 16.04 operating system with an Nvidia GeForce GTX 1050 graphics card, using software tools such as TensorFlow, OpenCV, g2o, and the Point Cloud Library, taking real scenes as the experimental conditions and using data captured with a Kinect V1 camera.
For the evaluation of the three-dimensional semantic mapping, Q_1 denotes the number of correctly detected objects; Q_2 denotes the number of objects that were detected but misclassified plus the objects actually present but not detected; Q_3 denotes the number of detections where no object was present; and P denotes the correct detection rate of three-dimensional objects, computed as:
P = Q_1 / (Q_1 + Q_2 + Q_3)
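As a worked example with hypothetical counts: a run that correctly detects Q_1 = 10 objects, with Q_2 = 8 misclassified or missed objects and Q_3 = 3 spurious detections, scores P = 10/21 ≈ 47.6%, close to the average reported below.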
Experimental records were taken over nine dense semantic mapping runs, and the average correct detection rate of three-dimensional objects in the map was computed as 48.1086%. The detailed experimental results are shown in the following table:
Table 1
The semantic mapping method based on visual SLAM and two-dimensional semantic segmentation provided by the present invention builds a dense semantic map with dynamic objects removed, based on a Mask R-CNN model over ORB feature points and a sparse SLAM model; it uses inter-frame information together with the semantic information on the image frames to improve the performance of a UAV semantic mapping system and the robustness of feature-point extraction and matching in dynamic scenes.
Although the above methods are, for simplicity of explanation, illustrated and described as a series of acts, it should be understood and appreciated that these methods are not limited by the order of the acts, since in accordance with one or more embodiments some acts may occur in a different order and/or concurrently with other acts illustrated and described herein, or not illustrated and described herein but understandable to those skilled in the art.
As used in this application and the claims, unless the context clearly indicates an exception, the words "a", "an", "one", and/or "the" do not specifically denote the singular and may include the plural. In general, the terms "comprising" and "including" only indicate that the explicitly identified steps and elements are included; these steps and elements do not constitute an exclusive list, and a method or device may also include other steps or elements.
The above embodiments are provided to enable those familiar with the art to implement or use the present invention, and those skilled in the art may make various modifications or changes to the above embodiments without departing from the inventive concept of the present invention; the protection scope of the present invention is therefore not limited by the above embodiments but should be the broadest scope consistent with the innovative features mentioned in the claims.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010246158.2A CN111462135B (en) | 2020-03-31 | 2020-03-31 | Semantic mapping method based on visual SLAM and two-dimensional semantic segmentation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111462135A true CN111462135A (en) | 2020-07-28 |
CN111462135B CN111462135B (en) | 2023-04-21 |
Family
ID=71680957
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010246158.2A Active CN111462135B (en) | 2020-03-31 | 2020-03-31 | Semantic mapping method based on visual SLAM and two-dimensional semantic segmentation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111462135B (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019169540A1 (en) * | 2018-03-06 | 2019-09-12 | 斯坦德机器人(深圳)有限公司 | Method for tightly-coupling visual slam, terminal and computer readable storage medium |
CN110097553A (en) * | 2019-04-10 | 2019-08-06 | Semantic mapping system based on simultaneous localization and mapping and three-dimensional semantic segmentation |
Non-Patent Citations (2)
Title |
---|
BIAN Xianzhang et al., "Augmented Reality Image Registration Technology Based on Semantic Segmentation", Electronic Technology & Software Engineering *
WANG Tingyin et al., "Emergency Communication Method for Nuclear Radiation Monitoring Based on BeiDou RDSS", Computer Systems & Applications *
Cited By (87)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109559320B (en) * | 2018-09-18 | 2022-11-18 | 华东理工大学 | Method and system for implementing visual SLAM semantic mapping function based on dilated convolutional deep neural network |
CN109559320A (en) * | 2018-09-18 | 2019-04-02 | Method and system for implementing visual SLAM semantic mapping function based on dilated convolutional deep neural network |
WO2022021739A1 (en) * | 2020-07-30 | 2022-02-03 | 国网智能科技股份有限公司 | Humanoid inspection operation method and system for semantic intelligent substation robot |
CN111950561A (en) * | 2020-08-25 | 2020-11-17 | 桂林电子科技大学 | A Semantic Segmentation-Based Method for Eliminating Semantic SLAM Dynamic Points |
CN112132893A (en) * | 2020-08-31 | 2020-12-25 | 同济人工智能研究院(苏州)有限公司 | Visual SLAM method suitable for indoor dynamic environment |
CN112132893B (en) * | 2020-08-31 | 2024-01-09 | 同济人工智能研究院(苏州)有限公司 | Visual SLAM method suitable for indoor dynamic environment |
CN112017188A (en) * | 2020-09-09 | 2020-12-01 | 上海航天控制技术研究所 | Space non-cooperative target semantic identification and reconstruction method |
CN112017188B (en) * | 2020-09-09 | 2024-04-09 | 上海航天控制技术研究所 | Space non-cooperative target semantic recognition and reconstruction method |
CN112258575A (en) * | 2020-10-13 | 2021-01-22 | Method for rapid object identification during simultaneous localization and mapping |
CN112344922A (en) * | 2020-10-26 | 2021-02-09 | 中国科学院自动化研究所 | Monocular vision odometer positioning method and system |
CN112183476A (en) * | 2020-10-28 | 2021-01-05 | 深圳市商汤科技有限公司 | Obstacle detection method and device, electronic equipment and storage medium |
CN112446882A (en) * | 2020-10-28 | 2021-03-05 | 北京工业大学 | Robust visual SLAM method based on deep learning in dynamic scene |
CN112446882B (en) * | 2020-10-28 | 2025-01-10 | 北京工业大学 | A robust visual SLAM method based on deep learning in dynamic scenes |
CN112348868A (en) * | 2020-11-06 | 2021-02-09 | 养哇(南京)科技有限公司 | Method and system for recovering monocular SLAM scale through detection and calibration |
CN112308921B (en) * | 2020-11-09 | 2024-01-12 | 重庆大学 | Combined optimization dynamic SLAM method based on semantics and geometry |
CN112308921A (en) * | 2020-11-09 | 2021-02-02 | 重庆大学 | A dynamic SLAM method for joint optimization based on semantics and geometry |
CN112396595A (en) * | 2020-11-27 | 2021-02-23 | 广东电网有限责任公司肇庆供电局 | Semantic SLAM method based on point-line characteristics in dynamic environment |
CN112465021B (en) * | 2020-11-27 | 2022-08-05 | 南京邮电大学 | Pose track estimation method based on image frame interpolation method |
CN112396595B (en) * | 2020-11-27 | 2023-01-24 | 广东电网有限责任公司肇庆供电局 | Semantic SLAM method based on point-line characteristics in dynamic environment |
CN112381841A (en) * | 2020-11-27 | 2021-02-19 | 广东电网有限责任公司肇庆供电局 | Semantic SLAM method based on GMS feature matching in dynamic scene |
CN112465021A (en) * | 2020-11-27 | 2021-03-09 | 南京邮电大学 | Pose track estimation method based on image frame interpolation method |
CN112571415A (en) * | 2020-12-03 | 2021-03-30 | 哈尔滨工业大学(深圳) | Robot autonomous door opening method and system based on visual guidance |
CN112571415B (en) * | 2020-12-03 | 2022-03-01 | 哈尔滨工业大学(深圳) | A method and system for autonomous robot door opening based on vision guidance |
CN112465858A (en) * | 2020-12-10 | 2021-03-09 | 武汉工程大学 | Semantic vision SLAM method based on probability grid filtering |
CN114627300A (en) * | 2020-12-11 | 2022-06-14 | Lifelong semantic SLAM system and method based on generative adversarial networks |
CN112509051A (en) * | 2020-12-21 | 2021-03-16 | 华南理工大学 | Bionic-based autonomous mobile platform environment sensing and mapping method |
CN112507056A (en) * | 2020-12-21 | 2021-03-16 | 华南理工大学 | Map construction method based on visual semantic information |
CN112734845A (en) * | 2021-01-08 | 2021-04-30 | 浙江大学 | Outdoor monocular synchronous mapping and positioning method fusing scene semantics |
CN112990195A (en) * | 2021-03-04 | 2021-06-18 | 佛山科学技术学院 | SLAM loop detection method for integrating semantic information in complex environment |
CN112991436B (en) * | 2021-03-25 | 2022-09-06 | 中国科学技术大学 | Monocular vision SLAM method based on object size prior information |
CN112991436A (en) * | 2021-03-25 | 2021-06-18 | 中国科学技术大学 | Monocular vision SLAM method based on object size prior information |
CN113034584A (en) * | 2021-04-16 | 2021-06-25 | 广东工业大学 | Mobile robot visual positioning method based on object semantic road sign |
CN113192200A (en) * | 2021-04-26 | 2021-07-30 | 泰瑞数创科技(北京)有限公司 | Method for constructing urban real scene three-dimensional model based on space-three parallel computing algorithm |
CN113516692A (en) * | 2021-05-18 | 2021-10-19 | 上海汽车集团股份有限公司 | Multi-sensor fusion SLAM method and device |
CN113537208B (en) * | 2021-05-18 | 2024-06-11 | 杭州电子科技大学 | Visual positioning method and system based on semantic ORB-SLAM technology |
CN113537208A (en) * | 2021-05-18 | 2021-10-22 | 杭州电子科技大学 | Visual positioning method and system based on semantic ORB-SLAM technology |
CN113269831A (en) * | 2021-05-19 | 2021-08-17 | 北京能创科技有限公司 | Visual repositioning method, system and device based on scene coordinate regression network |
CN115409910A (en) * | 2021-05-28 | 2022-11-29 | 阿里巴巴新加坡控股有限公司 | A semantic map construction method, visual positioning method and related equipment |
CN113674340A (en) * | 2021-07-05 | 2021-11-19 | 北京物资学院 | A method and device for binocular vision navigation based on landmarks |
CN113610763A (en) * | 2021-07-09 | 2021-11-05 | 北京航天计量测试技术研究所 | Rocket engine structural member pose motion compensation method in vibration environment |
CN113628334B (en) * | 2021-07-16 | 2024-11-15 | 中国科学院深圳先进技术研究院 | Visual SLAM method, device, terminal equipment and storage medium |
CN113628334A (en) * | 2021-07-16 | 2021-11-09 | 中国科学院深圳先进技术研究院 | Visual SLAM method, device, terminal equipment and storage medium |
CN113808251A (en) * | 2021-08-09 | 2021-12-17 | 杭州易现先进科技有限公司 | Dense reconstruction method, system, device and medium based on semantic segmentation |
CN113808251B (en) * | 2021-08-09 | 2024-04-12 | 杭州易现先进科技有限公司 | Dense reconstruction method, system, device and medium based on semantic segmentation |
WO2023015566A1 (en) * | 2021-08-13 | 2023-02-16 | 深圳市大疆创新科技有限公司 | Control method, control device, movable platform, and storage medium |
CN113658257A (en) * | 2021-08-17 | 2021-11-16 | 广州文远知行科技有限公司 | Unmanned equipment positioning method, device, equipment and storage medium |
CN113674416A (en) * | 2021-08-26 | 2021-11-19 | 中国电子科技集团公司信息科学研究院 | Three-dimensional map construction method and device, electronic equipment and storage medium |
CN113674416B (en) * | 2021-08-26 | 2024-04-26 | 中国电子科技集团公司信息科学研究院 | Three-dimensional map construction method and device, electronic equipment and storage medium |
CN113903011B (en) * | 2021-10-26 | 2024-06-11 | 江苏大学 | Semantic map construction and positioning method suitable for indoor parking lot |
CN113903011A (en) * | 2021-10-26 | 2022-01-07 | 江苏大学 | Semantic map construction and positioning method suitable for indoor parking lot |
CN114202579B (en) * | 2021-11-01 | 2024-07-16 | 东北大学 | Dynamic scene-oriented real-time multi-body SLAM system |
CN114202579A (en) * | 2021-11-01 | 2022-03-18 | 东北大学 | A real-time multi-body SLAM system for dynamic scenes |
CN114132360A (en) * | 2021-11-08 | 2022-03-04 | Method, device and storage medium for preventing turnout trailing based on image discrimination of switch state |
CN114132360B (en) * | 2021-11-08 | 2023-09-08 | Method, device and storage medium for preventing turnout trailing based on image discrimination of switch state |
CN114359493A (en) * | 2021-12-20 | 2022-04-15 | 中国船舶重工集团公司第七0九研究所 | Method and system for generating three-dimensional semantic map for unmanned ship |
CN114529800A (en) * | 2022-01-12 | 2022-05-24 | 华南理工大学 | Obstacle avoidance method, system, device and medium for rotor unmanned aerial vehicle |
CN114708321A (en) * | 2022-01-12 | 2022-07-05 | 北京航空航天大学 | Semantic-based camera pose estimation method and system |
CN114581616A (en) * | 2022-01-28 | 2022-06-03 | Visual-inertial SLAM system based on a multi-task feature extraction network |
CN114612525A (en) * | 2022-02-09 | 2022-06-10 | 浙江工业大学 | Robot RGB-D SLAM method based on grid segmentation and double-map coupling |
CN114612525B (en) * | 2022-02-09 | 2025-06-20 | 浙江工业大学 | Robot RGB-D SLAM method based on grid segmentation and dual map coupling |
CN114488244A (en) * | 2022-02-16 | 2022-05-13 | 东南大学 | Wearable blind-person aided navigation device and method based on semantic VISLAM and GNSS positioning |
CN116977189A (en) * | 2022-04-15 | 2023-10-31 | Simultaneous localization and mapping method, device and storage medium |
CN114550186A (en) * | 2022-04-21 | 2022-05-27 | 北京世纪好未来教育科技有限公司 | Method and device for correcting document image, electronic equipment and storage medium |
CN115128628A (en) * | 2022-06-01 | 2022-09-30 | 北京理工大学 | Construction method of road grid map based on laser SLAM and monocular vision |
CN114972470A (en) * | 2022-07-22 | 2022-08-30 | 北京中科慧眼科技有限公司 | Road surface environment obtaining method and system based on binocular vision |
CN115451939A (en) * | 2022-08-19 | 2022-12-09 | 中国人民解放军国防科技大学 | A Parallel SLAM Method Based on Detection Segmentation in Dynamic Scenes |
CN115451939B (en) * | 2022-08-19 | 2024-05-07 | 中国人民解放军国防科技大学 | Parallel SLAM method under dynamic scene based on detection segmentation |
CN115164918A (en) * | 2022-09-06 | 2022-10-11 | 联友智连科技有限公司 | Semantic point cloud map construction method and device and electronic equipment |
CN115564731A (en) * | 2022-09-30 | 2023-01-03 | 华东理工大学 | A method and system for manipulating deformable objects based on visual feedback |
CN115731385A (en) * | 2022-11-22 | 2023-03-03 | 中国电子科技南湖研究院 | Image Feature Extraction Method, Device and SLAM System Based on Semantic Segmentation |
CN116681755A (en) * | 2022-12-29 | 2023-09-01 | 广东美的白色家电技术创新中心有限公司 | Pose prediction method and device |
CN116681755B (en) * | 2022-12-29 | 2024-02-09 | 广东美的白色家电技术创新中心有限公司 | Pose prediction method and device |
CN116342800A (en) * | 2023-02-21 | 2023-06-27 | 中国航天员科研训练中心 | Semantic three-dimensional reconstruction method and system for multi-mode pose optimization |
CN116342800B (en) * | 2023-02-21 | 2023-10-24 | 中国航天员科研训练中心 | Semantic three-dimensional reconstruction method and system for multi-mode pose optimization |
CN116339336A (en) * | 2023-03-29 | 2023-06-27 | 北京信息科技大学 | Method, device and system for collaborative operation of electric agricultural machinery cluster |
CN116817887B (en) * | 2023-06-28 | 2024-03-08 | 哈尔滨师范大学 | Semantic visual SLAM map construction method, electronic equipment and storage medium |
CN116817887A (en) * | 2023-06-28 | 2023-09-29 | 哈尔滨师范大学 | Semantic visual SLAM map construction method, electronic equipment and storage medium |
CN117392347B (en) * | 2023-10-13 | 2024-04-30 | 苏州煋海图科技有限公司 | Map construction method, device, computer equipment and readable storage medium |
CN117392347A (en) * | 2023-10-13 | 2024-01-12 | 苏州煋海图科技有限公司 | Map construction method, device, computer equipment and readable storage medium |
CN117611762B (en) * | 2024-01-23 | 2024-04-30 | 常熟理工学院 | Multi-level map construction method, system and electronic equipment |
CN117611762A (en) * | 2024-01-23 | 2024-02-27 | 常熟理工学院 | A multi-level map construction method, system and electronic device |
CN118447320A (en) * | 2024-05-13 | 2024-08-06 | 华智清创(苏州)农业科技有限公司 | Visual multitasking mounted agricultural inspection method and device based on deep learning |
CN118447320B (en) * | 2024-05-13 | 2024-09-27 | 华智清创(苏州)农业科技有限公司 | Visual multitasking mounted agricultural inspection method and device based on deep learning |
CN118710609A (en) * | 2024-06-20 | 2024-09-27 | 广东省科学院智能制造研究所 | Capsule endoscope gastrointestinal tract positioning method based on visual SLAM |
CN118887288A (en) * | 2024-07-16 | 2024-11-01 | 中山大学 | A multi-feature indoor visual positioning method and system based on ground segmentation network |
CN119066885A (en) * | 2024-11-04 | 2024-12-03 | 中国科学院长春光学精密机械与物理研究所 | A method for processing measured surface data suitable for optical modeling |
CN119323772A (en) * | 2024-12-19 | 2025-01-17 | 杭州智元研究院有限公司 | Semantic map construction method based on RGBD visual segmentation algorithm |
Also Published As
Publication number | Publication date |
---|---|
CN111462135B (en) | 2023-04-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111462135B (en) | Semantic mapping method based on visual SLAM and two-dimensional semantic segmentation | |
CN109166149B (en) | Positioning and three-dimensional wireframe structure reconstruction method and system integrating a binocular camera and an IMU | |
CN110070615B (en) | Multi-camera cooperation-based panoramic vision SLAM method | |
CN112902953B (en) | An autonomous pose measurement method based on SLAM technology | |
CN107292965B (en) | Virtual-real occlusion handling method based on depth image data streams | |
CN113327296B (en) | Laser radar and camera online combined calibration method based on depth weighting | |
CN103247075B (en) | Indoor environment three-dimensional reconstruction method based on a variational mechanism | |
Ma et al. | CRLF: Automatic calibration and refinement based on line feature for LiDAR and camera in road scenes | |
CN107610175A (en) | Monocular visual SLAM algorithm optimized based on the semi-direct method and a sliding window | |
CN110097553A (en) | Semantic mapping system based on simultaneous localization and mapping and three-dimensional semantic segmentation | |
CN111724439A (en) | A visual positioning method and device in a dynamic scene | |
CN111998862B (en) | BNN-based dense binocular SLAM method | |
CN112085790A (en) | Point-line combined multi-camera visual SLAM method, equipment and storage medium | |
CN111882602B (en) | Visual odometry implementation method based on ORB feature points and GMS matching filter | |
CN113744315B (en) | Semi-direct visual odometry based on binocular vision | |
CN114140527A (en) | Dynamic environment binocular vision SLAM method based on semantic segmentation | |
CN111161334B (en) | Semantic map construction method based on deep learning | |
CN114332394B (en) | A dynamic scene 3D reconstruction method based on semantic information | |
CN115187737A (en) | A Semantic Map Construction Method Based on Laser and Vision Fusion | |
CN113920191B (en) | 6D data set construction method based on depth camera | |
US20240062415A1 (en) | Terminal device localization method and related device therefor | |
CN110533716A (en) | A Semantic SLAM System and Method Based on 3D Constraints | |
CN114419259B (en) | A visual positioning method and system based on physical model imaging simulation | |
CN116128966A (en) | A Semantic Localization Method Based on Environmental Objects | |
CN117036484A (en) | Visual positioning and mapping method, system, equipment and medium based on geometry and semantics |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||