
CN109341703B - Visual SLAM algorithm adopting CNNs characteristic detection in full period - Google Patents

Visual SLAM algorithm adopting CNNs characteristic detection in full period

Info

Publication number
CN109341703B
CN109341703B (application CN201811087509.9A)
Authority
CN
China
Prior art keywords
training, layer, data set, CNNs, visual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811087509.9A
Other languages
Chinese (zh)
Other versions
CN109341703A (en)
Inventor
赵永嘉
张宁
雷小永
戴树岭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN201811087509.9A
Publication of CN109341703A
Application granted
Publication of CN109341703B
Active legal status (current)
Anticipated expiration

Links

Images

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/28 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network with correlation of data from several navigational instruments
    • G01C21/30 Map- or contour-matching
    • G01C21/32 Structuring or formatting of map data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a visual SLAM algorithm that uses CNN feature detection over the full SLAM cycle. At the front end, the raw image data are first pre-trained with an unsupervised model; the pre-trained data are then passed through a CNN architecture that associates the joint representation of motion and depth with local changes in velocity and direction, thereby performing visual odometry; finally, path prediction is carried out. The invention also uses an OverFeat neural network model for the loop-closure detection stage, eliminating the accumulated error introduced by the front end and building a deep-learning-based visual SLAM framework. In addition, temporal and spatial continuity filters are constructed to verify the matching results, improving matching accuracy and eliminating mismatches. The invention offers clear advantages and potential for improving the accuracy of both visual odometry and loop-closure detection.

Description

Visual SLAM algorithm adopting CNNs characteristic detection in full period
Technical Field
The invention belongs to the technical field of simultaneous localization and mapping (SLAM) algorithms in computer vision, and in particular relates to a visual SLAM algorithm that employs CNN feature detection over the full period.
Background
SLAM stands for Simultaneous Localization and Mapping. It is an attractive research area and has found widespread use in robotics, navigation, and many other applications. Visual SLAM estimates camera motion from visual sensor information, e.g. sequential frames from one or more cameras, and attempts to construct a map of the surrounding environment. Current SLAM research mainly mounts several types of sensors on the robot body to estimate the motion of the robot and the features of the unknown environment, and fuses this information to accurately estimate the robot pose and model the scene in space. Although SLAM uses many types of sensors, including laser and vision, its processing generally comprises three parts: front-end visual odometry, back-end optimization and loop-closure detection.
A typical visual SLAM algorithm takes estimating the camera pose as its main goal and reconstructs a 3D map through multi-view geometry. To improve data-processing speed, some visual SLAM algorithms first extract sparse image features and realize visual odometry and loop-closure detection through matching between feature points, such as visual SLAM based on SIFT (scale-invariant feature transform) features [13] and visual SLAM based on ORB (oriented FAST and rotated BRIEF) features. SIFT and ORB features are widely used in visual SLAM owing to their robustness, discriminative power and fast processing speed. However, manually designed sparse image features currently have many limitations: on the one hand, how to design sparse features that optimally represent image information remains an open problem in computer vision; on the other hand, sparse features still struggle with illumination changes, dynamic target motion, camera parameter changes, and environments that lack texture or have repetitive texture. Conventional visual odometry (VO) estimates motion from visual information, such as sequential frames from one or more cameras. A common trait of most of these approaches is that they rely on keypoint detection and tracking, together with camera geometry, to estimate the visual range.
In recent years, learning-based methods have shown promising results in many areas of computer vision and can overcome the shortcomings of conventional visual SLAM algorithms (sparse image features struggle with illumination changes, dynamic target motion, camera parameter changes, and texture-less or single-texture environments). Models such as convolutional neural networks (CNNs) have proven very effective in various visual tasks, such as classification, localization and depth estimation. Unsupervised feature-learning models have demonstrated the ability to learn representations of local transformations in data through multiplicative interactions. Research shows that feeding data pre-trained with an unsupervised model into a CNN filters noise well and helps prevent overfitting.
Visual odometry (VO) is responsible for estimating the initial values of the trajectory and the map. By design, VO considers only the relation between adjacent frames. Unavoidable errors therefore accumulate over time, so the whole system drifts, long-term estimates become unreliable, and globally consistent trajectories and maps cannot be constructed. The loop-closure detection module provides constraints over longer time intervals beyond adjacent frames. The key to loop-closure detection is how to effectively detect that the camera has passed through the same place, which determines the correctness of the estimated trajectory and map over long periods. Consequently, loop-closure detection significantly improves the accuracy and robustness of the whole SLAM system.
Appearance-based loop-closure detection is essentially an image-similarity matching task: features of two images are matched to verify whether they were taken at the same place. The traditional loop-detection approach uses a "bag-of-words model" to generate a dictionary for feature matching. CNNs now show state-of-the-art performance in various classification tasks. Existing landmark test results show that deep features from different CNN layers consistently outperform SIFT in descriptor matching, indicating that SIFT or SURF may no longer be the preferred descriptors for matching tasks. The invention is therefore inspired by the excellent performance of CNNs in image classification and by their demonstrated feasibility for feature matching. The traditional bag-of-words method is abandoned, and loop-closure detection is carried out with a hierarchical image-feature extraction method based on CNNs, representative of deep learning. Deep learning is the mainstream recognition approach in the current computer vision field; it relies on a multi-layer neural network to learn hierarchical feature representations of an image and achieves higher accuracy in feature extraction and place recognition than traditional recognition methods.
Disclosure of Invention
In view of the above problems, the invention provides a visual SLAM algorithm that employs CNN feature detection over the full period: a SLAM system in which convolutional neural networks handle both the front end (VO) and the loop-closure detection of the SLAM pipeline, realizing a full-period deep-learning algorithm.
The invention discloses a visual SLAM algorithm adopting CNNs characteristic detection in a full period, which comprises the following steps:
Step 1: scan the surrounding environment with a binocular camera; use part of the collected video stream as a training data set and part as a test data set.
Step 2: pre-train the training data set from the video stream acquired in step 1 using a synchrony detection method.
Step 3: perform visual odometry using convolutional neural network training to obtain local changes in velocity and local changes in direction.
Step 4: recover the motion path of the camera using the local velocity change and local direction change information obtained in step 3.
Step 5: perform loop-closure detection with a convolutional neural network to eliminate the accumulated error of the path prediction.
The invention has the following advantages:
1. The full-period CNN-feature-detection visual SLAM algorithm uses a convolutional neural network at the front end. Compared with traditional front-end algorithms, it replaces complicated formula-based computation with learning, needs no manual feature extraction and matching, is concise and intuitive, and runs fast online.
2. The algorithm abandons the traditional bag-of-words approach to loop-closure detection and obtains better place-recognition results through accurate feature matching.
3. The algorithm learns deep-level features of the image through a neural network, so the recognition rate can reach a high level. Compared with traditional visual SLAM algorithms it improves loop-closure detection accuracy, expresses image information more fully, is more robust to environmental changes such as illumination and season, and can compute the similarity between two frames of images, thereby realizing a more concise visual odometer; moreover, by pre-training the neural network on a database, the design of the matching classifier is completed together with the feature design.
Drawings
FIG. 1 is an overall flow chart of the visual SLAM algorithm with full-period CNN feature detection according to the present invention;
FIG. 2 is a flow chart of the loop-closure detection stage in the visual SLAM algorithm with full-period CNN feature detection according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
As shown in FIG. 1, the visual SLAM algorithm with full-period CNN feature detection comprises the following steps:
Step 1: scan the surrounding environment.
Move the binocular camera within the square area, acquire image information of the real scene, and transmit the resulting video stream to a host computer in real time. The camera is moved for 1-2 loops so that a closed loop is formed, which facilitates compensation of the accumulated error in the subsequent loop-closure detection stage. This process is repeated; part of the collected video streams is used as a training data set and part as a test data set.
Step 2: pre-train the training data set from the video stream acquired in step 1 using a synchrony detection method.
To obtain a joint representation of camera motion and image depth information, the training data set is pre-trained with an unsupervised learning model (SAE-D) using stochastic gradient descent. The synchrony-based SAE-D is a single-layer model that extracts features from the training data set through local, Hebbian-type learning.
The SAE-D model is trained on randomly cropped local binocular patches from the training data set, each of size 16 × 5 (space × time), yielding feature information that jointly represents motion and depth. The jointly represented motion and depth features are then de-whitened and mapped back to image space, which completes the pre-training of the training data set. The pre-trained training data set is used to initialize the first layer of the CNN (convolutional neural network).
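As a rough illustration of this pre-training step, the sketch below implements a single-layer synchrony-style autoencoder trained with stochastic gradient descent. It is not the patent's exact implementation: the hidden size, the 16 × 16 × 5 patch shape (one reading of "16 × 5 (space × time)"), and the plain reconstruction loss are assumptions made for the example.

```python
import torch
import torch.nn as nn

class SynchronyAutoencoder(nn.Module):
    """Single-layer synchrony-style autoencoder (SAE-D-like sketch).

    Left and right spatio-temporal patches are encoded with linear filters,
    combined multiplicatively (the "synchrony" interaction that couples
    motion and depth), and decoded back to the input space.
    """

    def __init__(self, patch_dim=16 * 16 * 5, hidden_dim=256):
        # patch_dim assumes a 16 x 16 spatial window over 5 frames;
        # hidden_dim is an arbitrary choice for this sketch.
        super().__init__()
        self.enc_left = nn.Linear(patch_dim, hidden_dim, bias=False)
        self.enc_right = nn.Linear(patch_dim, hidden_dim, bias=False)
        self.dec_left = nn.Linear(hidden_dim, patch_dim, bias=False)
        self.dec_right = nn.Linear(hidden_dim, patch_dim, bias=False)

    def forward(self, x_left, x_right):
        # Multiplicative interaction: the hidden code responds to how the
        # left/right filter responses co-vary, i.e. a joint motion/depth code.
        h = self.enc_left(x_left) * self.enc_right(x_right)
        return self.dec_left(h), self.dec_right(h), h

def pretrain(model, patch_loader, epochs=10, lr=1e-3):
    """Pre-train on randomly cropped, whitened stereo patches with SGD."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    mse = nn.MSELoss()
    for _ in range(epochs):
        for x_l, x_r in patch_loader:        # tensors of shape (batch, patch_dim)
            rec_l, rec_r, _ = model(x_l, x_r)
            loss = mse(rec_l, x_l) + mse(rec_r, x_r)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```

The learned encoder filters are what would be used to initialize the first CNN layer in step 3.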
Step 3: perform visual odometry using convolutional neural network (CNN) training.
A convolutional neural network (CNN) is a supervised learning model. The CNN is trained to associate the local depth and motion representations with local changes in speed and direction, thereby learning to perform visual odometry. Through the CNN architecture, the acquired joint representation of motion and depth is trained to correlate with the desired labels (direction and velocity changes).
The features obtained from SAE-D training in step 2 are fed into the first layers of two CNNs with identical architectures to initialize those first layers; the two CNNs output local changes in velocity and direction, respectively, and the remaining layers of the two CNNs associate the local velocity and direction changes with the desired labels.
Each CNN has 6 layers in total. The first layer is a 5 × 5 convolutional layer that learns features from the left and right images. The second layer multiplies the features extracted from the left and right convolutional layers element-wise. The third layer is a 1 × 1 convolutional layer, the fourth a pooling layer, the fifth a fully connected layer, and the final output layer is a Softmax layer.
The input to both CNNs is a 5-frame subsequence, and the target output is a vector representation of the local velocity and direction changes. The effectiveness and accuracy of the local velocity and direction change estimates can be evaluated on the binocular data set KITTI.
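The sketch below mirrors the 6-layer structure just described (5 × 5 convolutions on the left and right views, element-wise multiplication, 1 × 1 convolution, pooling, fully connected layer, Softmax). The channel counts, input resolution and number of output bins are illustrative assumptions, not values given in the patent.

```python
import torch
import torch.nn as nn

class VOBranch(nn.Module):
    """One of the two 6-layer CNNs: predicts either velocity or direction change."""

    def __init__(self, in_frames=5, feat_ch=64, n_bins=20):
        super().__init__()
        # Layer 1: 5x5 convolutions over the left and right 5-frame subsequences
        # (these filters can be initialized from the SAE-D features of step 2).
        self.conv_left = nn.Conv2d(in_frames, feat_ch, kernel_size=5, padding=2)
        self.conv_right = nn.Conv2d(in_frames, feat_ch, kernel_size=5, padding=2)
        # Layer 3: 1x1 convolution; layer 4: pooling.
        self.conv1x1 = nn.Conv2d(feat_ch, feat_ch, kernel_size=1)
        self.pool = nn.AdaptiveAvgPool2d(4)
        # Layer 5: fully connected; layer 6: Softmax over discretized changes.
        self.fc = nn.Linear(feat_ch * 4 * 4, n_bins)

    def forward(self, left_seq, right_seq):
        # Layer 2: element-wise product of left and right feature maps.
        f = self.conv_left(left_seq) * self.conv_right(right_seq)
        f = self.pool(self.conv1x1(f))
        return torch.softmax(self.fc(f.flatten(1)), dim=1)

# Two independent branches, trained against discretized velocity-change and
# direction-change labels respectively.
velocity_net = VOBranch()
direction_net = VOBranch()
```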
And 4, step 4: camera motion path prediction
And for the whole video stream, discretely recovering the motion path of the camera by using the speed and direction change information of each 5-frame subsequence obtained in the step 3.
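A minimal dead-reckoning sketch of this path recovery is shown below. The patent only states that the path is recovered discretely from the per-subsequence changes, so the planar motion model, the initial state and the unit time step are assumptions of this example.

```python
import numpy as np

def recover_path(velocity_changes, direction_changes, v0=0.0, theta0=0.0, dt=1.0):
    """Discretely integrate per-subsequence changes into a 2-D camera path.

    velocity_changes, direction_changes: one value per 5-frame subsequence.
    """
    x, y, v, theta = 0.0, 0.0, v0, theta0
    path = [(x, y)]
    for dv, dtheta in zip(velocity_changes, direction_changes):
        v += dv            # apply the predicted local velocity change
        theta += dtheta    # apply the predicted local direction change
        x += v * dt * np.cos(theta)
        y += v * dt * np.sin(theta)
        path.append((x, y))
    return np.array(path)
```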
Step 5: perform loop-closure detection with a CNN to eliminate the accumulated error of the path prediction.
The local velocity and direction change information used in step 4 cannot be completely accurate and contains some error. As the error accumulates, the difference between the predicted path and the real path from the start point to the end point grows. A subsequent loop-closure detection stage is therefore needed to eliminate the accumulated error and reduce the gap between the predicted path and the real path. This stage consists of two parts: feature extraction with a convolutional neural network, and spatio-temporal filtering of place-matching hypotheses by comparing feature responses.
In the loop-closure detection stage a CNN-based algorithm is likewise used, but the CNN model applied here differs from the one above; the two are applied independently to visual odometry and to loop-closure detection. The goal is to eliminate the accumulated error of the camera motion path prediction obtained in step 4 and to achieve autonomous loop closure. The specific method is as follows.
and extracting image features by using a pre-trained convolutional neural network. The invention adopts overfeat convolution neural network to extract image characteristics.
The overfeat convolutional neural network was pre-trained on the ImageNet 2012 dataset, which consisted of 120 million images and 1000 classes. The overfeat convolutional neural network includes five convolution stages and three fully connected stages. The first two convolution stages consist of a convolution layer, a max-pooling layer and a rectifying (ReLU) nonlinear layer. The third and fourth convolution stages consist of convolutional layers, zero-padding layers, and ReLU nonlinear layers. The fifth stage comprises a convolutional layer, a zero-padding layer, a ReLU layer and a Maxpooling layer. Finally, the sixth and seventh fully connected stages contain one fully connected layer and one ReLU layer, while the eighth stage is an output layer containing only fully connected layers. The whole convolutional neural network has 21 layers.
When an image I is input into the network, it produces a series of hierarchical activations. In the invention, L_k(I), k = 1, …, 21, denotes the output of the k-th layer of the network for a given input image I. The output feature vector L_k(I) of each layer is a deep-learned representation of the image I; place recognition is performed by comparing the corresponding feature vectors of different images. The network can process any image of size 231 × 231 pixels or larger, so the input to the OverFeat convolutional neural network is an image resized to 256 × 256 pixels.
Therefore, the training data set and the test data set collected in step 1 are used as input, and features are extracted with the pre-trained OverFeat convolutional neural network.
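The sketch below illustrates collecting the per-layer responses L_k(I) of a pretrained network. OverFeat weights are not distributed with torchvision, so a pretrained AlexNet is used here purely as a stand-in; the network choice and the helper name layer_features are assumptions for illustration, not part of the patent.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T

# Stand-in for the OverFeat network: a pretrained ImageNet classifier whose
# intermediate activations play the role of L_k(I).
net = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()

preprocess = T.Compose([
    T.Resize((256, 256)),                      # images resized to 256 x 256 as in step 5
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def layer_features(image):
    """Return one flattened activation vector per layer, i.e. L_k(I), for a PIL image I."""
    layers = list(net.features) + [net.avgpool, torch.nn.Flatten(1)] + list(net.classifier)
    feats = []
    x = preprocess(image).unsqueeze(0)         # add batch dimension
    with torch.no_grad():
        for layer in layers:
            x = layer(x)
            feats.append(x.flatten().clone())  # deep representation of the image at this layer
    return feats
```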
Step 6: generate a mixing matrix by feature matching and carry out spatio-temporal continuity detection.
As shown in FIG. 2, the features extracted by the OverFeat convolutional neural network from the pictures in each test data set are matched against the features extracted from each training data set.
To compare how well the image features from each layer of the OverFeat convolutional neural network perform on scene recognition, a mixing matrix is further constructed from the features of each layer:
M_k(i, j) = d(L_k(I_i), L_k(I_j)),  i = 1, …, R,  j = 1, …, T
where I_i denotes the i-th frame image input from the training data set, I_j denotes the j-th frame image input from the test data set, L_k(I_i) denotes the k-th layer output corresponding to I_i, and M_k(i, j) denotes the Euclidean distance between the k-th layer features of training sample i and test sample j, i.e. it describes the degree of matching between the two; R and T denote the number of training images and the number of test images, respectively. Each column of the mixing matrix stores the feature vector differences between the j-th frame test image and all the training images.
To find the strongest place-matching hypothesis, the element with the lowest feature vector difference in each column of the mixing matrix is searched:
î(j) = arg min_{i = 1, …, R} M_k(i, j),  j = 1, …, T
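For concreteness, the mixing-matrix construction and the per-column minimum search given above might be realized as in the NumPy sketch below; the array shapes and function names are assumptions for illustration.

```python
import numpy as np

def mixing_matrix(train_feats, test_feats):
    """M_k(i, j): Euclidean distance between layer-k features of training
    image i and test image j.

    train_feats: (R, D) array, layer-k features of the R training images.
    test_feats:  (T, D) array, layer-k features of the T test images.
    """
    diff = train_feats[:, None, :] - test_feats[None, :, :]
    return np.linalg.norm(diff, axis=2)        # shape (R, T)

def best_matches(M):
    """For each test frame j, return the training frame index with the
    smallest feature-vector difference (strongest place-matching hypothesis)."""
    return np.argmin(M, axis=0)
```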
For the candidate place matches in the mixing matrix, spatial and temporal continuity filters are further constructed for joint verification, which improves the matching accuracy. At the same time, the performance of the features trained at each network layer is explored; the feature descriptions of the middle layers are found to work well for matching images with similar viewpoints, while the middle and later layers are more adaptive and robust to changes of scene viewpoint.
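The patent does not spell out the exact form of these filters; the snippet below is only one plausible reading of the temporal-continuity check, in which consecutive test frames are required to match nearby training frames (the threshold max_jump is an assumed parameter).

```python
def temporally_consistent(matches, max_jump=3):
    """Keep only test frames whose best match moves smoothly along the
    training sequence; max_jump is an assumed threshold."""
    keep = [0]
    for j in range(1, len(matches)):
        if abs(int(matches[j]) - int(matches[j - 1])) <= max_jump:
            keep.append(j)
    return keep
```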
With accurate place matching, the accumulated error caused by loop-free visual odometry can be compensated and a globally consistent trajectory can be constructed.

Claims (3)

1. A visual SLAM method with full-period CNN feature detection, comprising the following steps:
step 1: scanning surrounding environment information with a binocular camera; using part of the collected video streams as a training data set and part as a test data set;
step 2: pre-training the training data set from the video stream acquired in step 1 by a synchrony detection method; pre-training the training data set with an unsupervised learning model and stochastic gradient descent to obtain feature information jointly representing motion and depth, and then de-whitening the jointly represented motion and depth features and mapping them back to image space;
step 3: performing visual odometry using convolutional neural network training to obtain local changes in velocity and local changes in direction; inputting the features obtained from the unsupervised learning model training in step 2 into the first layers of two CNNs with identical architectures to initialize those first layers; the two CNNs output local changes in velocity and direction, respectively, and the remaining layers of the two CNNs associate the local velocity and direction changes with the desired labels;
each CNN has 6 layers in total: the first layer is a 5 × 5 convolutional layer that learns features from the left and right images separately; the second layer multiplies the features extracted from the left and right convolutional layers element-wise; the third layer is a 1 × 1 convolutional layer, the fourth a pooling layer, the fifth a fully connected layer, and the final output layer is a Softmax layer;
the input to both CNNs is a 5-frame subsequence, and the target output is a vector representation of the local velocity and direction changes; the effectiveness and accuracy of the local velocity and direction change information are evaluated on the binocular data set KITTI;
step 4: recovering the motion path of the camera using the local velocity change and local direction change information obtained in step 3;
step 5: performing loop-closure detection with a convolutional neural network to eliminate the accumulated error of the path prediction;
first, using the training data set and the test data set acquired in step 1 as input, extracting features with an OverFeat convolutional neural network pre-trained on the ImageNet data set; subsequently, matching the features extracted from the pictures in each test data set with the features extracted from each training data set;
constructing a mixing matrix using the features of each layer of the OverFeat convolutional neural network:
M_k(i, j) = d(L_k(I_i), L_k(I_j)),  i = 1, …, R,  j = 1, …, T
where I_i denotes the i-th frame image input from the training data set, I_j denotes the j-th frame image input from the test data set, L_k(I_i) denotes the k-th layer output corresponding to I_i, and M_k(i, j) denotes the Euclidean distance between the k-th layer features of training sample i and test sample j, i.e. it describes the degree of matching between the two; R and T denote the number of training images and the number of test images, respectively; each column of the mixing matrix stores the feature vector differences between the j-th frame test image and all the training images;
searching the element with the lowest feature vector difference in each column of the mixing matrix;
î(j) = arg min_{i = 1, …, R} M_k(i, j),  j = 1, …, T.
2. The visual SLAM method with full-period CNN feature detection according to claim 1, wherein in step 1 the binocular camera is moved along an annular area for 1-2 loops, forming a closed loop.
3. The visual SLAM method with full-period CNN feature detection according to claim 1, wherein in step 3 the features obtained from the training in step 2 are input into the first CNN layers with the same structure, and the obtained joint representation of motion and depth is trained through the CNN architecture to be associated with the desired labels.
CN201811087509.9A 2018-09-18 2018-09-18 Visual SLAM algorithm adopting CNNs characteristic detection in full period Active CN109341703B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811087509.9A CN109341703B (en) 2018-09-18 2018-09-18 Visual SLAM algorithm adopting CNNs characteristic detection in full period

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811087509.9A CN109341703B (en) 2018-09-18 2018-09-18 Visual SLAM algorithm adopting CNNs characteristic detection in full period

Publications (2)

Publication Number Publication Date
CN109341703A CN109341703A (en) 2019-02-15
CN109341703B true CN109341703B (en) 2022-07-01

Family

ID=65305452

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811087509.9A Active CN109341703B (en) 2018-09-18 2018-09-18 Visual SLAM algorithm adopting CNNs characteristic detection in full period

Country Status (1)

Country Link
CN (1) CN109341703B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109840598B (en) * 2019-04-29 2019-08-09 深兰人工智能芯片研究院(江苏)有限公司 A kind of method for building up and device of deep learning network model
CN110146099B (en) * 2019-05-31 2020-08-11 西安工程大学 Synchronous positioning and map construction method based on deep learning
CN110296705B (en) * 2019-06-28 2022-01-25 苏州瑞久智能科技有限公司 Visual SLAM loop detection method based on distance metric learning
CN110399821B (en) * 2019-07-17 2023-05-30 上海师范大学 Customer satisfaction acquisition method based on facial expression recognition
CN110487274B (en) * 2019-07-30 2021-01-29 中国科学院空间应用工程与技术中心 SLAM method and system for weak texture scene, navigation vehicle and storage medium
CN110738128A (en) * 2019-09-19 2020-01-31 天津大学 repeated video detection method based on deep learning
CN110659619A (en) * 2019-09-27 2020-01-07 昆明理工大学 Depth space-time information-based correlation filtering tracking method
CN111144550A (en) * 2019-12-27 2020-05-12 中国科学院半导体研究所 Simplex deep neural network model based on homologous continuity and construction method
CN111243021A (en) * 2020-01-06 2020-06-05 武汉理工大学 Vehicle-mounted visual positioning method and system based on multiple combined cameras and storage medium
CN111241986B (en) * 2020-01-08 2021-03-30 电子科技大学 Visual SLAM closed loop detection method based on end-to-end relationship network
CN111753789A (en) * 2020-07-01 2020-10-09 重庆邮电大学 Robot vision SLAM closed loop detection method based on stack type combined self-encoder
CN113066152B (en) * 2021-03-18 2022-05-27 内蒙古工业大学 AGV map construction method and system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106651830A (en) * 2016-09-28 2017-05-10 华南理工大学 Image quality test method based on parallel convolutional neural network
CN106780631B (en) * 2017-01-11 2020-01-03 山东大学 Robot closed-loop detection method based on deep learning
CN107563430A (en) * 2017-08-28 2018-01-09 昆明理工大学 A kind of convolutional neural networks algorithm optimization method based on sparse autocoder and gray scale correlation fractal dimension
CN107808132A (en) * 2017-10-23 2018-03-16 重庆邮电大学 A kind of scene image classification method for merging topic model
CN107944386B (en) * 2017-11-22 2019-11-22 天津大学 Visual scene recognition methods based on convolutional neural networks

Also Published As

Publication number Publication date
CN109341703A (en) 2019-02-15

Similar Documents

Publication Publication Date Title
CN109341703B (en) Visual SLAM algorithm adopting CNNs characteristic detection in full period
Zhou et al. To learn or not to learn: Visual localization from essential matrices
Schönberger et al. Semantic visual localization
CN111311666B (en) Monocular vision odometer method integrating edge features and deep learning
CN110781262B (en) Semantic map construction method based on visual SLAM
CN110135249B (en) Human behavior identification method based on time attention mechanism and LSTM (least Square TM)
CN113313763B (en) Monocular camera pose optimization method and device based on neural network
Vaquero et al. Dual-branch CNNs for vehicle detection and tracking on LiDAR data
Tinchev et al. Skd: Keypoint detection for point clouds using saliency estimation
Saleem et al. Neural network-based recent research developments in SLAM for autonomous ground vehicles: A review
CN113781563B (en) Mobile robot loop detection method based on deep learning
Getahun et al. A deep learning approach for lane detection
CN112767546B (en) Binocular image-based visual map generation method for mobile robot
Feng et al. Localization and mapping using instance-specific mesh models
Tsintotas et al. The revisiting problem in simultaneous localization and mapping
Felton et al. Deep metric learning for visual servoing: when pose and image meet in latent space
Jo et al. Mixture density-PoseNet and its application to monocular camera-based global localization
Xi et al. Multi-motion segmentation: Combining geometric model-fitting and optical flow for RGB sensors
CN111862147B (en) Tracking method for multiple vehicles and multiple lines of human targets in video
Esfahani et al. From local understanding to global regression in monocular visual odometry
Tsintotas et al. Online Appearance-Based Place Recognition and Mapping: Their Role in Autonomous Navigation
CN111578956A (en) Visual SLAM positioning method based on deep learning
CN116958057A (en) Strategy-guided visual loop detection method
Han et al. BASL-AD SLAM: A Robust Deep-Learning Feature-Based Visual SLAM System With Adaptive Motion Model
CN114140524A (en) Closed loop detection system and method for multi-scale feature fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant