Abstract
Loop closure detection (LCD) is crucial to the simultaneous localization and mapping (SLAM) system of an autonomous robot. Image features from convolutional neural networks (CNNs) have been widely used for LCD in recent years. Instead of directly using the feature vectors to compute image similarity, we propose a novel and easy-to-implement method that organizes CNN features to improve performance. In this method, the elements of feature maps from a higher layer of the CNN are clustered to generate CNN words (CNNWs). To encode the spatial information of CNNWs, we build word pairs (CNNWPs) from single words, which further improves performance. In addition, traditional techniques from bag-of-words (BoW) methods are integrated into our approach. We also demonstrate that feature maps from lower layers can serve as descriptors for local region matching between images, which allows us to perform geometric verification of candidate loop closures, as BoW methods do. The experimental results demonstrate that our method substantially outperforms state-of-the-art methods that directly use CNN features for LCD.
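The pipeline the abstract outlines — quantizing higher-layer feature-map activations into CNN words, pairing neighboring words to capture spatial layout, and comparing images by their word histograms — can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the k-means routine, the neighbor-pairing rule, and the cosine-similarity scoring are all assumptions chosen for clarity.

```python
import numpy as np

def kmeans(X, k, iters=10, seed=0):
    # Toy k-means over rows of X; a real vocabulary would be trained
    # with a proper clustering library on many images.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def cnn_words(feature_map, centers):
    # feature_map: (C, H, W) activations from a higher conv layer.
    # Each spatial cell is a C-dim vector quantized to its nearest
    # cluster center, yielding an (H, W) grid of CNN-word indices.
    C, H, W = feature_map.shape
    vecs = feature_map.reshape(C, H * W).T              # (H*W, C)
    d = np.linalg.norm(vecs[:, None] - centers[None], axis=2)
    return d.argmin(axis=1).reshape(H, W)

def word_histogram(words_grid, k):
    # Normalized bag-of-CNN-words descriptor for one image.
    hist = np.bincount(words_grid.ravel(), minlength=k).astype(float)
    return hist / hist.sum()

def word_pairs(words_grid):
    # CNNWP sketch: pair each cell's word with its right and lower
    # neighbors to encode coarse spatial layout (assumed pairing rule).
    H, W = words_grid.shape
    pairs = []
    for i in range(H):
        for j in range(W):
            if j + 1 < W:
                pairs.append((words_grid[i, j], words_grid[i, j + 1]))
            if i + 1 < H:
                pairs.append((words_grid[i, j], words_grid[i + 1, j]))
    return pairs

def similarity(h1, h2):
    # Cosine similarity between two word histograms.
    return float(h1 @ h2 / (np.linalg.norm(h1) * np.linalg.norm(h2)))
```

In this sketch, two images whose feature maps quantize to similar word histograms (and similar pair statistics) would be flagged as a candidate loop closure, to be confirmed by geometric verification with lower-layer descriptors.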
Acknowledgments
The authors express sincere appreciation to the editors and reviewers for their efforts to improve this paper. We also want to thank Arren Glover, Mark Cummins, and Blanco Jose Luis for providing the Garden Point, City Center, New College, and Malaga Parking 6L datasets.
Liu, Q., Duan, F. Loop closure detection using CNN words. Intel Serv Robotics 12, 303–318 (2019). https://doi.org/10.1007/s11370-019-00284-9