Web-enhanced object category learning for domestic robots

Christian I Penaloza¹,
Yasushi Mae¹,
Kenichi Ohara¹,
Tomohito Takubo¹ &
…
Tatsuo Arai¹

Abstract

We present a system architecture for domestic robots that allows them to learn an object category after being taught a single sample object. We explore the situation in which a human teaches a robot a novel object and the robot then enhances that learning using a large amount of image data from the Internet. The main goal of this research is to give a robot the ability to enhance its own learning while minimizing the time and effort a human needs to train it. Our active learning approach consists of learning the object’s name through a speech interface and creating a visual object model with a depth-based attention model adapted to the robot’s personal space. Given the object’s name (keyword), a large number of object-related images are collected from two main image sources (Google Images and the LabelMe website). We address the problem of separating good training samples from noisy images in two steps: (1) similar image selection using a Simile Selector Classifier, and (2) non-real image filtering using a variant of Gaussian Discriminant Analysis. After web image selection, object category classifiers are trained and then tested on different objects of the same category. Our experiments demonstrate the effectiveness of our robot learning approach.
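
As a rough illustration of step (2), the sketch below fits a two-class Gaussian Discriminant Analysis model and uses it to discard images classified as non-real. The abstract does not say which GDA variant or which image features the authors use, so everything here is an assumption: the diagonal-covariance simplification, the two-dimensional feature vectors, the toy data, and the function names `fit_gaussian`, `fit_gda`, and `is_real` are all hypothetical.

```python
import math


def fit_gaussian(samples):
    """Estimate per-feature mean and variance for one class."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    # A small floor keeps every variance strictly positive.
    variances = [
        max(sum((s[d] - means[d]) ** 2 for s in samples) / n, 1e-6)
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(x, means, variances):
    """Log density of x under an axis-aligned (diagonal) Gaussian."""
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        for xi, m, v in zip(x, means, variances)
    )


def fit_gda(real, non_real):
    """Fit one Gaussian per class plus log class priors."""
    total = len(real) + len(non_real)
    return {
        "real": (fit_gaussian(real), math.log(len(real) / total)),
        "non_real": (fit_gaussian(non_real), math.log(len(non_real) / total)),
    }


def is_real(model, x):
    """Keep x if the 'real' class has the higher posterior score."""
    scores = {
        label: log_likelihood(x, *params) + log_prior
        for label, (params, log_prior) in model.items()
    }
    return scores["real"] > scores["non_real"]


# Hypothetical 2-D feature vectors (assumed for illustration only,
# not the features used in the paper).
real_images = [(0.80, 0.60), (0.75, 0.65), (0.90, 0.55), (0.85, 0.70)]
non_real_images = [(0.20, 0.10), (0.15, 0.20), (0.25, 0.15), (0.10, 0.05)]

model = fit_gda(real_images, non_real_images)
candidates = [(0.82, 0.62), (0.18, 0.12)]
kept = [c for c in candidates if is_real(model, c)]
print(kept)
```

In this toy setup only the first candidate survives the filter. A full-covariance or shared-covariance model would be the more standard form of GDA; the diagonal version is used here only to keep the sketch free of external dependencies.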

Acknowledgments

This work was supported in part by Grant-in-Aid for Scientific Research (C) 23500242.

Author information

Authors and Affiliations

Graduate School of Engineering Science, Osaka University, Machikaneyama-cho 1-3, Toyonaka, Osaka, 560-8531, Japan
Christian I Penaloza, Yasushi Mae, Kenichi Ohara, Tomohito Takubo & Tatsuo Arai

Corresponding author

Correspondence to Christian I Penaloza.

About this article

Cite this article

Penaloza, C.I., Mae, Y., Ohara, K. et al. Web-enhanced object category learning for domestic robots. Intel Serv Robotics 6, 53–67 (2013). https://doi.org/10.1007/s11370-012-0126-y

Received: 30 March 2012
Accepted: 07 November 2012
Published: 27 November 2012
Issue Date: January 2013
DOI: https://doi.org/10.1007/s11370-012-0126-y
