Abstract
A shared vocabulary between humans and robots for describing spatial concepts is essential for effective human-robot interaction. Towards this goal, we present PLISS (Place Labeling through Image Sequence Segmentation), a novel technique for place categorization from visual cues. PLISS differs from existing place categorization systems in two major ways: it inherently works on video and image streams rather than single images, and it can detect “unknown” place labels, i.e., place categories that it does not know about. PLISS uses changepoint detection to temporally segment image sequences, and the resulting segments are subsequently labeled. Both changepoint detection and labeling are performed within a systematic probabilistic framework. Unknown place labels are detected by using a probabilistic classifier and keeping track of its label uncertainty. We present experiments and comparisons on the large Visual Place Categorization (VPC) dataset. We also demonstrate results using models learned from images downloaded from Google’s image search.
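To make the two ideas in the abstract concrete, the following is a minimal sketch, not the PLISS implementation. It assumes a scalar feature stream with Gaussian noise of known variance, a constant changepoint hazard, and an entropy threshold for declaring a label "unknown". None of these modeling choices is stated in the abstract, and the names `map_run_lengths` and `label_or_unknown` are hypothetical. The first function runs a generic Bayesian online changepoint recursion over run lengths; the second rejects a classifier posterior whose uncertainty is too high.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def map_run_lengths(stream, hazard=0.01, prior_mean=0.0, prior_var=100.0, noise_var=1.0):
    """Return the most probable run length after each observation.

    Assumed model (illustrative only): each segment has an unknown mean with a
    N(prior_mean, prior_var) prior, and observations add N(0, noise_var) noise.
    """
    run_probs = [1.0]                    # P(run length = k) for k = 0..t
    params = [(prior_mean, prior_var)]   # posterior over the segment mean, per run length
    history = []
    for x in stream:
        # Predictive probability of x under every run-length hypothesis.
        pred = [gaussian_pdf(x, m, v + noise_var) for m, v in params]
        joint = [p * q for p, q in zip(run_probs, pred)]
        # Either the current run grows by one, or a changepoint resets it to zero.
        changepoint = hazard * sum(joint)
        growth = [(1.0 - hazard) * j for j in joint]
        run_probs = [changepoint] + growth
        total = sum(run_probs)
        run_probs = [p / total for p in run_probs]
        # Conjugate update of each run's posterior; a fresh run starts from the prior.
        updated = []
        for m, v in params:
            new_v = 1.0 / (1.0 / v + 1.0 / noise_var)
            updated.append((new_v * (m / v + x / noise_var), new_v))
        params = [(prior_mean, prior_var)] + updated
        history.append(max(range(len(run_probs)), key=run_probs.__getitem__))
    return history


def label_or_unknown(posterior, max_entropy=0.8):
    """Return the top label, or "unknown" if the posterior entropy (in nats) is too high.

    The entropy criterion and its threshold are assumptions made for this sketch.
    """
    entropy = -sum(p * math.log(p) for p in posterior.values() if p > 0.0)
    if entropy > max_entropy:
        return "unknown"
    return max(posterior, key=posterior.get)


# Synthetic stream: 50 samples near 0 followed by 50 samples near 5.
stream = [0.2 * (-1) ** i for i in range(50)] + [5.0 + 0.2 * (-1) ** i for i in range(50)]
history = map_run_lengths(stream)
confident = label_or_unknown({"kitchen": 0.9, "bedroom": 0.05, "bathroom": 0.05})
uncertain = label_or_unknown({"kitchen": 1 / 3, "bedroom": 1 / 3, "bathroom": 1 / 3})
```

On the synthetic stream, the most probable run length grows steadily within the first segment and collapses to a small value right after the jump, which marks a segment boundary. In a place-labeling setting each resulting segment, rather than each frame, would then be passed to the classifier.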
Electronic Supplementary Material
A supplementary video (MOV, 8.3 MB) accompanies the online version of this article.
Cite this article
Ranganathan, A. PLISS: labeling places using online changepoint detection. Auton Robot 32, 351–368 (2012). https://doi.org/10.1007/s10514-012-9273-4
DOI: https://doi.org/10.1007/s10514-012-9273-4