CN109341703B - Visual SLAM algorithm adopting CNNs characteristic detection in full period - Google Patents
Info
- Publication number
- CN109341703B CN109341703B CN201811087509.9A CN201811087509A CN109341703B CN 109341703 B CN109341703 B CN 109341703B CN 201811087509 A CN201811087509 A CN 201811087509A CN 109341703 B CN109341703 B CN 109341703B
- Authority
- CN
- China
- Prior art keywords
- training
- layer
- data set
- cnns
- visual
- Prior art date
- Legal status: Active (an assumption, not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/26—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
- G01C21/28—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network with correlation of data from several navigational instruments
- G01C21/30—Map- or contour-matching
- G01C21/32—Structuring or formatting of map data
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Remote Sensing (AREA)
- Radar, Positioning & Navigation (AREA)
- General Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- General Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Automation & Control Theory (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a visual SLAM algorithm that applies CNN feature detection throughout the full pipeline. At the front end, raw image data are first pre-trained with an unsupervised model; the pre-trained data are then passed through a CNN architecture that associates a joint representation of motion and depth with local changes in velocity and direction, performing visual odometry; finally, path prediction is executed. The invention also uses an OverFeat neural network model for the loop-closure detection stage, eliminating the accumulated error introduced at the front end and constructing a deep-learning-based visual SLAM framework. In addition, temporal and spatial continuity filters are constructed to verify matching results, improving matching accuracy and eliminating mismatches. The invention offers clear advantages and potential in improving both visual odometry accuracy and loop-closure detection accuracy.
Description
Technical Field
The invention belongs to the technical field of simultaneous localization and mapping (SLAM) algorithms in computer vision, and particularly relates to a visual SLAM algorithm that uses CNN feature detection throughout the full pipeline.
Background
SLAM stands for Simultaneous Localization and Mapping. It is an attractive research area with widespread use in robotics, navigation, and many other applications. Visual SLAM estimates camera motion from visual sensor information, e.g., sequential frames from one or more cameras, while constructing a map of the surrounding environment. Current research on the SLAM problem mainly mounts several types of sensors on a robot body to estimate the robot's motion and the features of the unknown environment, fusing this information to accurately estimate the robot pose and build a spatial model of the scene. Although SLAM uses many types of sensors, including laser and vision, its processing generally comprises three parts: front-end visual odometry, back-end optimization, and loop-closure detection.
A typical visual SLAM algorithm takes estimating the camera pose as its main goal and reconstructs a 3D map through multi-view geometry. To improve data-processing speed, some visual SLAM algorithms first extract sparse image features and realize visual odometry and loop-closure detection by matching feature points, such as visual SLAM based on SIFT (scale-invariant feature transform) features [13] and visual SLAM based on ORB (oriented FAST and rotated BRIEF) features. SIFT and ORB features are widely applied in the visual SLAM field thanks to their robustness, discriminative power, and fast processing speed. Manually designed sparse image features nonetheless have many limitations: on one hand, how to design sparse image features that optimally represent image information remains an unsolved problem in computer vision; on the other hand, sparse features still struggle with illumination changes, dynamic target motion, camera parameter changes, and environments that lack texture or have uniform texture. Conventional visual odometry (VO) estimates motion from visual information, such as sequential frames from one or more cameras. A common trait of most of these approaches is that they rely on keypoint detection and tracking together with camera geometry to estimate visual motion.
In recent years, learning-based methods have shown promising results in many fields of computer vision and can overcome the shortcomings of conventional visual SLAM algorithms (sparse image features struggle with illumination changes, dynamic target motion, camera parameter changes, and texture-poor or texture-uniform environments). Models like convolutional neural networks (CNNs) have proven very effective in various visual tasks, such as classification, localization, and depth estimation. Unsupervised feature-learning models have demonstrated the ability to learn representations of local transformations in data through multiplicative interactions. Research shows that feeding data pre-trained with an unsupervised model into a CNN filters noise well and helps prevent overfitting.
Visual odometry (VO) is responsible for estimating initial values of the trajectory and the map. In principle, VO considers only the relation between adjacent frames. Its inevitable errors accumulate over time, so the overall system drifts and long-term estimates become unreliable; in other words, globally consistent trajectories and maps cannot be constructed. A loop-closure detection module can add constraints over longer time intervals beyond adjacent frames. The key to loop detection is effectively detecting that the camera has passed through the same place, which determines the correctness of the estimated trajectory and the map over long time scales. Loop-closure detection therefore improves the precision and robustness of the whole SLAM system very noticeably.
Appearance-based loop detection is essentially an image-similarity matching task: two images are feature-matched to verify whether they were taken at the same place. The traditional approach to loop detection uses a 'bag-of-words model' to generate a dictionary for feature matching. CNNs have now shown state-of-the-art performance in various classification tasks. Existing benchmark results show that deep features from different CNN layers consistently outperform SIFT in descriptor matching, indicating that SIFT or SURF may no longer be the preferred descriptors for matching tasks. The present invention is therefore inspired by the excellent performance of CNNs in image classification and their demonstrated feasibility in feature matching. It abandons the traditional bag-of-words method and performs loop detection with hierarchical image-feature extraction based on CNNs, representative of deep-learning techniques. Deep learning is the mainstream recognition approach in computer vision today; it relies on multi-layer neural networks to learn hierarchical feature representations of images, and can achieve higher accuracy in feature extraction and place recognition than traditional recognition methods.
Disclosure of Invention
To address these problems, the invention provides a visual SLAM algorithm that uses CNN feature detection throughout the full pipeline: a SLAM system in which a convolutional neural network processes both the front end (VO) and the loop-closure detection of the SLAM algorithm, realizing a full-pipeline deep-learning algorithm.
The visual SLAM algorithm of the invention, adopting CNN feature detection throughout the full pipeline, comprises the following steps:
Step 1: scan surrounding environment information with a binocular camera; use part of the collected video stream as a training data set and part as a test data set.
Step 2: pre-train the training data set from the video stream acquired in step 1 with a synchrony-detection method.
Step 3: perform visual odometry using convolutional neural network training to derive local changes in velocity and direction.
Step 4: recover the motion path of the camera using the local velocity and direction changes obtained in step 3.
Step 5: perform loop-closure detection with a convolutional neural network to eliminate the accumulated error of path prediction.
The invention has the advantages that:
1. The full-pipeline CNN-feature-detection visual SLAM algorithm uses a convolutional neural network at the front end. Compared with traditional front-end algorithms, it replaces complicated formula-based computation with learning, requires no manual feature extraction and matching, is concise and intuitive, and runs fast online.
2. The algorithm abandons the traditional bag-of-words model for loop-closure detection and achieves better place recognition through accurate feature matching.
3. The algorithm can learn deep-level features in images through a neural network, reaching a high recognition rate. Compared with traditional visual SLAM algorithms, it improves loop-closure detection accuracy, expresses image information more fully, and is more robust to environmental changes such as illumination and seasons. It can compute the similarity between two frames of images, realizing a more concise visual odometry; and by pre-training the neural network on a database, the design of the matching classifier is completed synchronously with the feature design.
Drawings
FIG. 1 is an overall flow chart of the full-pipeline CNN-feature-detection visual SLAM algorithm of the present invention;
FIG. 2 is a flow chart of the loop-closure detection method in the visual SLAM algorithm of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The visual SLAM algorithm using CNN feature detection throughout the full pipeline comprises the following steps, as shown in FIG. 1:
Step 1: scan surrounding environment information.
Move the binocular camera within the square area, collecting environment image information of the real scene and transmitting the resulting video stream to an upper computer in real time. The binocular camera travels 1-2 loops, forming a closed loop, which facilitates compensation of accumulated error in the subsequent loop-closure detection stage. This process is repeated; part of the collected video stream is used as the training data set and part as the test data set.
Step 2: pre-train the training data set from the video stream acquired in step 1 with a synchrony-detection method.
To obtain a joint representation of camera motion and image depth information, the training data set is pre-trained with an unsupervised learning model (SAE-D) using stochastic gradient descent. The synchrony-based SAE-D is a single-layer model that extracts features from the training data set through local, Hebbian-type learning.
The SAE-D model is trained on local binocular blocks randomly cropped from the training data set, each block of size 16 × 5 (space × time), to obtain feature information jointly representing motion and depth. The jointly represented motion and depth features are then de-whitened back into image space, completing the pre-training of the training data set. The pre-trained result is used to initialize the first layer of the CNN (convolutional neural network).
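For illustration, the following is a minimal NumPy sketch of this pre-training step: whiten randomly cropped stereo space-time patches, train a single-layer model by stochastic gradient descent, and de-whiten the learned filters back to image space. The patent does not give the SAE-D update equations, so a tied-weight ReLU autoencoder stands in for it here; the patch count, hidden size, and learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def whiten(X, eps=1e-5):
    """PCA-whiten patch vectors (rows = samples); also return the
    de-whitening matrix used to map learned filters back to image space."""
    X = X - X.mean(axis=0)
    cov = X.T @ X / len(X)
    vals, vecs = np.linalg.eigh(cov)
    W = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T   # whitening
    W_inv = vecs @ np.diag(np.sqrt(vals + eps)) @ vecs.T     # de-whitening
    return X @ W, W_inv

def train_autoencoder(X, n_hidden=128, lr=0.01, epochs=10, batch=64):
    """Tied-weight autoencoder trained with stochastic gradient descent.
    X: (n_samples, n_dims) whitened stereo space-time patch vectors.
    Stand-in for SAE-D, whose exact updates are not given in the text."""
    n_dims = X.shape[1]
    W = rng.normal(0, 0.01, (n_dims, n_hidden))
    for _ in range(epochs):
        for i in range(0, len(X), batch):
            x = X[i:i + batch]
            h = np.maximum(x @ W, 0)        # encoder (ReLU)
            x_hat = h @ W.T                  # tied-weight decoder
            err = x_hat - x
            # gradient through decoder (err.T @ h) and encoder (chain rule)
            grad = x.T @ (err @ W * (h > 0)) + err.T @ h
            W -= lr * grad / len(x)
    return W

# stereo blocks of size 16 x 5 (space x time), two eyes, flattened
patches = rng.normal(size=(10000, 16 * 5 * 2))   # illustrative data
Xw, W_dewhiten = whiten(patches)
filters = train_autoencoder(Xw)
# de-whiten the filters back to image space; these initialise CNN layer 1
filters_img = W_dewhiten @ filters
```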
Step 3: perform visual odometry using convolutional neural network (CNN) training.
Convolutional neural networks (CNNs) are supervised learning models. The CNN is trained to associate local depth and motion representations with local changes in speed and direction, thereby learning to perform visual odometry. The joint representation of motion and depth is trained through the CNN architecture to correlate with the desired labels (direction and velocity changes).
The features obtained from SAE-D training in step 2 are input to the first CNN layers, which share the same architecture, to initialize them. Two CNNs output local changes in velocity and direction respectively, and the remaining layers of each CNN associate those local changes with the desired labels.
The full CNN has 6 layers. The first layer is a 5 × 5 convolutional layer that learns features from the left and right images separately. The second layer element-wise multiplies the features extracted by the left and right convolutional layers. The third layer is a 1 × 1 convolutional layer, the fourth a pooling layer, the fifth a fully connected layer, and the final output layer is a Softmax layer.
The inputs to both CNNs are 5-frame subsequences, and the target output is a vector representation of the local velocity and direction changes. The effect and accuracy of this local velocity and direction information can be evaluated on the binocular KITTI data set.
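One possible reading of this six-layer architecture, sketched in PyTorch: the 5-frame subsequence is stacked along the channel axis of each branch, and the Softmax output is taken over discretised bins of velocity (or direction) change. Channel counts, image size, pooling size, and the number of output bins are assumptions not specified above.

```python
import torch
import torch.nn as nn

class StereoMotionCNN(nn.Module):
    """Sketch of the six-layer network described above."""
    def __init__(self, n_filters=64, n_bins=20):
        super().__init__()
        # layer 1: 5x5 convolutions, one branch per eye (weights may be
        # initialised from the unsupervised SAE-D filters)
        self.conv_left = nn.Conv2d(5, n_filters, kernel_size=5)
        self.conv_right = nn.Conv2d(5, n_filters, kernel_size=5)
        self.conv1x1 = nn.Conv2d(n_filters, n_filters, kernel_size=1)  # layer 3
        self.pool = nn.MaxPool2d(4)                                    # layer 4
        self.fc = nn.LazyLinear(n_bins)                                # layer 5
        # layer 6: Softmax, folded into the loss at training time

    def forward(self, left, right):
        fl = self.conv_left(left)       # layer 1, left branch
        fr = self.conv_right(right)     # layer 1, right branch
        x = fl * fr                     # layer 2: element-wise product
        x = self.conv1x1(x)             # layer 3
        x = self.pool(x)                # layer 4
        x = torch.flatten(x, 1)
        return self.fc(x)               # logits; softmax applied by the loss

# one network per output quantity, as in the text
vel_net, dir_net = StereoMotionCNN(), StereoMotionCNN()
left = torch.randn(8, 5, 64, 64)    # batch of 5-frame left subsequences
right = torch.randn(8, 5, 64, 64)
loss = nn.CrossEntropyLoss()(vel_net(left, right), torch.randint(0, 20, (8,)))
```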
Step 4: camera motion path prediction.
For the whole video stream, the camera's motion path is discretely recovered using the velocity and direction change information of each 5-frame subsequence obtained in step 3.
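Recovering the path from these local changes amounts to dead reckoning. The sketch below assumes planar 2-D motion and forward-Euler integration, neither of which is specified in the text.

```python
import numpy as np

def integrate_path(d_speed, d_theta, dt=1.0, v0=0.0, theta0=0.0):
    """Discretely recover a 2-D camera path from per-subsequence local
    changes in speed and heading (dead reckoning sketch)."""
    x, y, v, theta = 0.0, 0.0, v0, theta0
    path = [(x, y)]
    for dv, dth in zip(d_speed, d_theta):
        v += dv                       # accumulate local speed change
        theta += dth                  # accumulate local heading change
        x += v * dt * np.cos(theta)   # forward-Euler position update
        y += v * dt * np.sin(theta)
        path.append((x, y))
    return np.array(path)
```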
Step 5: perform loop-closure detection with the CNN, eliminating the accumulated error of path prediction.
The local velocity and direction change information obtained in step 4 cannot be fully accurate; certain errors exist. As errors accumulate, the gap between the predicted path and the real path from start to end grows. A subsequent loop-closure detection stage is therefore required to eliminate the accumulated error and reduce this gap. The algorithm consists of two parts: feature extraction with a convolutional neural network, and spatio-temporal filtering of place-match hypotheses by comparing feature responses.
The loop-closure detection stage also uses a CNN-based algorithm, but the CNN model applied differs from the one above; the two are applied independently to visual odometry and loop-closure detection. This eliminates the accumulated error in the camera motion path predicted in step 4 and realizes autonomous loop closure. The specific method is as follows:
and extracting image features by using a pre-trained convolutional neural network. The invention adopts overfeat convolution neural network to extract image characteristics.
The OverFeat convolutional neural network is pre-trained on the ImageNet 2012 data set, which consists of 1.2 million images in 1000 classes. It comprises five convolutional stages and three fully connected stages. The first two convolutional stages consist of a convolutional layer, a max-pooling layer, and a rectifying (ReLU) nonlinearity. The third and fourth convolutional stages consist of a convolutional layer, a zero-padding layer, and a ReLU nonlinearity. The fifth stage comprises a convolutional layer, a zero-padding layer, a ReLU layer, and a max-pooling layer. Finally, the sixth and seventh fully connected stages each contain one fully connected layer and one ReLU layer, while the eighth stage is an output layer containing only a fully connected layer. The whole convolutional neural network has 21 layers.
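A structural sketch of the eight stages as enumerated above, in PyTorch. Channel widths, kernel sizes, and strides are illustrative assumptions; only the per-stage layer ordering follows the description.

```python
import torch.nn as nn

# OverFeat-like stage structure; 21 trainable/nonlinear layers as listed,
# with an nn.Flatten inserted only for bookkeeping before the FC stages.
overfeat_like = nn.Sequential(
    # stages 1-2: conv -> max-pool -> ReLU
    nn.Conv2d(3, 96, 11, stride=4), nn.MaxPool2d(2), nn.ReLU(),
    nn.Conv2d(96, 256, 5), nn.MaxPool2d(2), nn.ReLU(),
    # stages 3-4: conv -> zero-pad -> ReLU
    nn.Conv2d(256, 512, 3), nn.ZeroPad2d(1), nn.ReLU(),
    nn.Conv2d(512, 1024, 3), nn.ZeroPad2d(1), nn.ReLU(),
    # stage 5: conv -> zero-pad -> ReLU -> max-pool
    nn.Conv2d(1024, 1024, 3), nn.ZeroPad2d(1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    # stages 6-7: fully connected -> ReLU
    nn.LazyLinear(3072), nn.ReLU(),
    nn.Linear(3072, 4096), nn.ReLU(),
    # stage 8: output layer (fully connected only)
    nn.Linear(4096, 1000),
)
```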
When an image I is input into the network, it produces a series of hierarchical activations. The invention uses L_k(I), k = 1, …, 21, to denote the output of the k-th layer for a given input image I. The output feature vector L_k(I) of each layer is a deep-learned representation of image I; place recognition is performed by comparing the corresponding feature vectors of different images. The network can process any image of size 231 × 231 pixels or larger, so the OverFeat input uses images resized to 256 × 256 pixels.
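Per-layer feature extraction might look like the sketch below. OverFeat itself is not distributed with torchvision, so an ImageNet-pretrained AlexNet stands in here; the point is only that L_k(I) is the flattened activation of layer k for the resized input.

```python
import torch
import torchvision.transforms as T
from torchvision import models
from PIL import Image

# stand-in for the pre-trained OverFeat network
net = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()

preprocess = T.Compose([
    T.Resize((256, 256)),   # the text resizes inputs to 256 x 256 pixels
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def layer_features(img):
    """Return the flattened activations L_k(I) of each convolutional-stage
    layer k of the stand-in network, for a PIL image I."""
    x = preprocess(img).unsqueeze(0)
    feats = []
    with torch.no_grad():
        for layer in net.features:
            x = layer(x)
            feats.append(x.flatten())
    return feats

# usage (hypothetical file name):
# feats = layer_features(Image.open("frame.png").convert("RGB"))
```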
Accordingly, the training and test data sets collected in step 1 are used as input, and features are extracted with the pre-trained OverFeat convolutional neural network.
Step 6: generate a mixing matrix by feature matching and perform spatio-temporal continuity detection.
As shown in FIG. 2, the features extracted by the OverFeat convolutional neural network from the pictures in each test data set are matched against the features extracted from each training data set.
To compare how the image features of each OverFeat layer perform in scene recognition, a mixing matrix is constructed from the features of each layer:
M_k(i, j) = d(L_k(I_i), L_k(I_j)),  i = 1, …, R,  j = 1, …, T

where I_i denotes the image of the i-th frame input from the training data set, I_j the image of the j-th frame input from the test data set, and L_k(I_i) the k-th layer output corresponding to I_i. M_k(i, j) is the Euclidean distance between the layer-k features of training sample i and test sample j, i.e., it describes the degree of match between the two. R and T denote the number of training images and the number of test images, respectively. Each column of the mixing matrix thus stores the feature-vector differences between the j-th frame test image and all training images.
To find the strongest place-match hypothesis, the element with the lowest feature-vector difference is sought in each column of the mixing matrix.
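A compact NumPy sketch of building the layer-k mixing matrix and taking the per-column minimum, assuming the features have already been flattened into fixed-length vectors:

```python
import numpy as np

def mixing_matrix(train_feats, test_feats):
    """M_k(i, j) = Euclidean distance between layer-k features of training
    image i and test image j. train_feats: (R, D), test_feats: (T, D)."""
    diff = train_feats[:, None, :] - test_feats[None, :, :]
    return np.linalg.norm(diff, axis=2)          # shape (R, T)

def best_matches(M):
    """Strongest place-match hypothesis per test frame: the training index
    with the smallest distance in each column, plus that distance."""
    idx = M.argmin(axis=0)
    return idx, M[idx, np.arange(M.shape[1])]
```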
For the candidate place matches in the mixing matrix, spatial and temporal continuity filters are further constructed for combined verification, improving matching accuracy. At the same time, the feature performance trained by each network layer is explored: the feature descriptions of the middle layers work well for matching images with similar viewpoints, while the middle-to-late layers are more adaptive and robust to changes in scene viewpoint.
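The exact form of these continuity filters is not given in the text, so the following is only an illustrative temporal check: a hypothesis for test frame j is kept when neighbouring test frames match training frames at consistent temporal offsets.

```python
def temporally_consistent(matches, window=2, tol=3):
    """Illustrative temporal-continuity filter over per-frame match indices.
    matches: list of training-frame indices, one per test frame (argmin of
    each mixing-matrix column). Returns a keep/reject flag per test frame."""
    keep = []
    for j in range(len(matches)):
        ok = True
        for t in range(max(0, j - window), min(len(matches), j + window + 1)):
            if t == j:
                continue
            # a neighbouring test frame should match a training frame near
            # matches[j], shifted by the same temporal offset (t - j)
            if abs(matches[t] - matches[j] - (t - j)) > tol:
                ok = False
                break
        keep.append(ok)
    return keep
```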
With accurate place matching, the accumulated error of loop-free visual odometry can be compensated and a globally consistent trajectory constructed.
Claims (3)
1. A visual SLAM method with full-pipeline CNN feature detection, comprising the following steps:
step 1: scanning surrounding environment information with a binocular camera; using part of the collected video stream as a training data set and part as a test data set;
step 2: pre-training the training data set from the video stream acquired in step 1 with a synchrony-detection method; pre-training the training data set with an unsupervised learning model using stochastic gradient descent to obtain feature information jointly representing motion and depth, then de-whitening that feature information back into image space;
step 3: performing visual odometry using convolutional neural network training to obtain local changes in velocity and direction; inputting the features obtained from the unsupervised learning model in step 2 into the first CNN layers, which share the same architecture, to initialize them; two CNNs outputting local changes in velocity and direction respectively, the remaining layers of the two CNNs associating those local changes with desired labels;
the full CNN having 6 layers: the first layer a 5 × 5 convolutional layer learning features from the left and right images separately; the second layer element-wise multiplying the features extracted by the left and right convolutional layers; the third layer a 1 × 1 convolutional layer, the fourth a pooling layer, the fifth a fully connected layer, and the final output layer a Softmax layer;
the inputs to both CNNs being 5-frame subsequences and the target output a vector representation of local velocity and direction changes; the effect and accuracy of the local velocity and direction information being evaluated on the binocular KITTI data set;
step 4: recovering the camera's motion path using the local velocity and direction changes obtained in step 3;
step 5: performing loop-closure detection with a convolutional neural network to eliminate accumulated path-prediction error;
first, using the training and test data sets acquired in step 1 as input and extracting features with an OverFeat convolutional neural network pre-trained on the ImageNet data set; subsequently, matching the features extracted from the pictures in each test data set with the features extracted from each training data set;
constructing a mixing matrix from the features of each layer of the OverFeat convolutional neural network:
M_k(i, j) = d(L_k(I_i), L_k(I_j)),  i = 1, …, R,  j = 1, …, T
where I_i denotes the image of the i-th frame input from the training data set, I_j the image of the j-th frame input from the test data set, and L_k(I_i) the k-th layer output corresponding to I_i; M_k(i, j) denotes the Euclidean distance between the layer-k features of training sample i and test sample j, i.e., the degree of match between the two; R and T denote the number of training images and the number of test images, respectively; each column of the mixing matrix storing the feature-vector differences between the j-th frame test image and all training images;
searching each column of the mixing matrix for the element with the lowest feature-vector difference.
2. The visual SLAM method with full-pipeline CNN feature detection of claim 1, characterized in that: in step 1, the binocular camera moves along an annular area for 1-2 loops, forming a closed loop.
3. The visual SLAM method with full-pipeline CNN feature detection of claim 1, characterized in that: in step 3, the features trained in step 2 are input to the first CNN layers of the same structure, and the resulting joint representation of motion and depth is trained through the CNN architecture to correlate with the desired labels.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811087509.9A CN109341703B (en) | 2018-09-18 | 2018-09-18 | Visual SLAM algorithm adopting CNNs characteristic detection in full period |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811087509.9A CN109341703B (en) | 2018-09-18 | 2018-09-18 | Visual SLAM algorithm adopting CNNs characteristic detection in full period |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109341703A CN109341703A (en) | 2019-02-15 |
CN109341703B true CN109341703B (en) | 2022-07-01 |
Family
ID=65305452
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811087509.9A Active CN109341703B (en) | 2018-09-18 | 2018-09-18 | Visual SLAM algorithm adopting CNNs characteristic detection in full period |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109341703B (en) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109840598B (en) * | 2019-04-29 | 2019-08-09 | 深兰人工智能芯片研究院(江苏)有限公司 | A kind of method for building up and device of deep learning network model |
CN110146099B (en) * | 2019-05-31 | 2020-08-11 | 西安工程大学 | Synchronous positioning and map construction method based on deep learning |
CN110296705B (en) * | 2019-06-28 | 2022-01-25 | 苏州瑞久智能科技有限公司 | Visual SLAM loop detection method based on distance metric learning |
CN110399821B (en) * | 2019-07-17 | 2023-05-30 | 上海师范大学 | Customer satisfaction acquisition method based on facial expression recognition |
CN110487274B (en) * | 2019-07-30 | 2021-01-29 | 中国科学院空间应用工程与技术中心 | SLAM method and system for weak texture scene, navigation vehicle and storage medium |
CN110738128A (en) * | 2019-09-19 | 2020-01-31 | 天津大学 | repeated video detection method based on deep learning |
CN110659619A (en) * | 2019-09-27 | 2020-01-07 | 昆明理工大学 | Depth space-time information-based correlation filtering tracking method |
CN111144550A (en) * | 2019-12-27 | 2020-05-12 | 中国科学院半导体研究所 | Simplex deep neural network model based on homologous continuity and construction method |
CN111243021A (en) * | 2020-01-06 | 2020-06-05 | 武汉理工大学 | Vehicle-mounted visual positioning method and system based on multiple combined cameras and storage medium |
CN111241986B (en) * | 2020-01-08 | 2021-03-30 | 电子科技大学 | Visual SLAM closed loop detection method based on end-to-end relationship network |
CN111753789A (en) * | 2020-07-01 | 2020-10-09 | 重庆邮电大学 | Robot vision SLAM closed loop detection method based on stack type combined self-encoder |
CN113066152B (en) * | 2021-03-18 | 2022-05-27 | 内蒙古工业大学 | AGV map construction method and system |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106651830A (en) * | 2016-09-28 | 2017-05-10 | 华南理工大学 | Image quality test method based on parallel convolutional neural network |
CN106780631B (en) * | 2017-01-11 | 2020-01-03 | 山东大学 | Robot closed-loop detection method based on deep learning |
CN107563430A (en) * | 2017-08-28 | 2018-01-09 | 昆明理工大学 | A kind of convolutional neural networks algorithm optimization method based on sparse autocoder and gray scale correlation fractal dimension |
CN107808132A (en) * | 2017-10-23 | 2018-03-16 | 重庆邮电大学 | A kind of scene image classification method for merging topic model |
CN107944386B (en) * | 2017-11-22 | 2019-11-22 | 天津大学 | Visual scene recognition methods based on convolutional neural networks |
- 2018-09-18: Application CN201811087509.9A filed; granted as CN109341703B (status: Active)
Also Published As
Publication number | Publication date |
---|---|
CN109341703A (en) | 2019-02-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109341703B (en) | Visual SLAM algorithm adopting CNNs characteristic detection in full period | |
Zhou et al. | To learn or not to learn: Visual localization from essential matrices | |
Schönberger et al. | Semantic visual localization | |
CN111311666B (en) | Monocular vision odometer method integrating edge features and deep learning | |
CN110781262B (en) | Semantic map construction method based on visual SLAM | |
CN110135249B (en) | Human behavior identification method based on time attention mechanism and LSTM (least Square TM) | |
CN113313763B (en) | Monocular camera pose optimization method and device based on neural network | |
Vaquero et al. | Dual-branch CNNs for vehicle detection and tracking on LiDAR data | |
Tinchev et al. | Skd: Keypoint detection for point clouds using saliency estimation | |
Saleem et al. | Neural network-based recent research developments in SLAM for autonomous ground vehicles: A review | |
CN113781563B (en) | Mobile robot loop detection method based on deep learning | |
Getahun et al. | A deep learning approach for lane detection | |
CN112767546B (en) | Binocular image-based visual map generation method for mobile robot | |
Feng et al. | Localization and mapping using instance-specific mesh models | |
Tsintotas et al. | The revisiting problem in simultaneous localization and mapping | |
Felton et al. | Deep metric learning for visual servoing: when pose and image meet in latent space | |
Jo et al. | Mixture density-PoseNet and its application to monocular camera-based global localization | |
Xi et al. | Multi-motion segmentation: Combining geometric model-fitting and optical flow for RGB sensors | |
CN111862147B (en) | Tracking method for multiple vehicles and multiple lines of human targets in video | |
Esfahani et al. | From local understanding to global regression in monocular visual odometry | |
Tsintotas et al. | Online Appearance-Based Place Recognition and Mapping: Their Role in Autonomous Navigation | |
CN111578956A (en) | Visual SLAM positioning method based on deep learning | |
CN116958057A (en) | Strategy-guided visual loop detection method | |
Han et al. | BASL-AD SLAM: A Robust Deep-Learning Feature-Based Visual SLAM System With Adaptive Motion Model | |
CN114140524A (en) | Closed loop detection system and method for multi-scale feature fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||