Abstract
Many applications of 3D object recognition, such as augmented reality or robotic manipulation, require an accurate solution for the 3D pose of the recognized objects. This is best accomplished by building a metrically accurate 3D model of the object and all its feature locations, and then fitting this model to features detected in new images. In this chapter, we describe a system for constructing 3D metric models from multiple images taken with an uncalibrated handheld camera, recognizing these models in new images, and precisely solving for object pose. This is demonstrated in an augmented reality application where objects must be recognized, tracked, and superimposed on new images taken from arbitrary viewpoints without perceptible jitter. This approach not only yields accurate pose but also integrates features from multiple training images into a single model, which makes recognition more reliable.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Gordon, I., Lowe, D.G. (2006). What and Where: 3D Object Recognition with Accurate Pose. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds) Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol 4170. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11957959_4
DOI: https://doi.org/10.1007/11957959_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68794-8
Online ISBN: 978-3-540-68795-5
eBook Packages: Computer Science, Computer Science (R0)