Abstract
This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Arya, S. and Mount, D.M. 1993. Approximate nearest neighbor queries in fixed dimensions. In Fourth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'93),pp. 271–280.
Arya, S., Mount, D.M., Netanyahu, N.S., Silverman, R., and Wu, A.Y. 1998. Anoptimal algorithm for approximate nearest neighbor searching. Journal of the ACM, 45:891–923.
Ballard, D.H. 1981. Generalizing the Hough transform to detect arbitrary patterns. Pattern Recognition, 13(2):111–122.
Basri, R. and Jacobs, D.W. 1997. Recognition using region correspondences. International Journal of Computer Vision, 25(2):145–166.
Baumberg, A. 2000. Reliable feature matching across widely separated views. In Conference on ComputerVision andPattern Recognition, Hilton Head, South Carolina, pp. 774–781.
Beis, J. and Lowe, D.G. 1997. Shape indexing using approximate nearest-neighbour search in high-dimensional spaces. In Conference on Computer Vision and Pattern Recognition, Puerto Rico, pp. 1000–1006.
Brown, M. and Lowe, D.G. 2002. Invariant features from interest point groups. In British Machine Vision Conference, Cardiff, Wales, pp. 656–665.
Carneiro, G. and Jepson, A.D. 2002. Phase-based local features. In European Conference on Computer Vision (ECCV), Copenhagen, Denmark, pp. 282–296.
Crowley, J.L. and Parker, A.C. 1984. Arepresentation for shape based on peaks and ridges in the difference of low-pass transform. IEEE Trans. on Pattern Analysis and Machine Intelligence, 6(2):156–170.
Edelman, S., Intrator, N., and Poggio, T. 1997. Complex cells and object recognition. Unpublished manuscript: http://kybele.psych.cornell.edu/~edelman/archive.html
Fergus, R., Perona, P., and Zisserman, A. 2003. Object class recognition by unsupervised scale-invariant learning. In IEEE Conference on Computer Vision and Pattern Recognition, Madison, Wisconsin, pp. 264–271.
Friedman, J.H., Bentley, J.L., and Finkel, R.A. 1977. An algorithm for finding best matches in logarithmic expected time. ACMTransactions on Mathematical Software, 3(3):209–226.
Funt, B.V. and Finlayson, G.D. 1995. Color constant color indexing. IEEE Trans. on Pattern Analysis and Machine Intelligence, 17(5):522–529.
Grimson, E. 1990. Object Recognition by Computer: The Role of Geometric Constraints, The MIT Press: Cambridge, MA.
Harris, C. 1992. Geometry from visual motion. In Active Vision, A. Blake and A. Yuille (Eds.), MIT Press, pp. 263–284.
Harris, C. and Stephens, M. 1988. Acombined corner and edge detector. In Fourth Alvey Vision Conference, Manchester, UK, pp. 147–151.
Hartley, R. and Zisserman, A. 2000. Multiple view geometry in computer vision, Cambridge University Press: Cambridge, UK.
Hough, P.V.C. 1962. Method and means for recognizing complex patterns. U.S. Patent 3069654.
Koenderink, J.J. 1984. The structure of images. Biological Cybernetics, 50:363–396.
Lindeberg, T. 1993. Detecting salient blob-like image structures and their scales with a scale-space primal sketch: A method for focus-of-attention. International Journal of Computer Vision, 11(3):283–318.
Lindeberg, T. 1994. Scale-space theory: A basic tool for analysing structures at different scales. Journal of Applied Statistics, 21(2):224–270.
Lowe, D.G. 1991. Fitting parameterized three-dimensional models to images. IEEE Trans. on Pattern Analysis and Machine Intelligence, 13(5):441–450.
Lowe, D.G. 1999. Object recognition from local scale-invariant features. In International Conference on Computer Vision, Corfu, Greece, pp. 1150–1157.
Lowe, D.G. 2001. Local feature view clustering for 3D object recognition. IEEE Conference on Computer Vision and Pattern Recognition, Kauai, Hawaii, pp. 682–688.
Luong, Q.T. and Faugeras, O.D. 1996. The fundamental matrix: Theory, algorithms, and stability analysis. International Journal of Computer Vision, 17(1):43–76.
Matas, J., Chum, O., Urban, M., and Pajdla, T. 2002. Robust wide baseline stereo from maximally stable extremal regions. In British Machine Vision Conference, Cardiff, Wales, pp. 384–393.
Mikolajczyk, K. 2002. Detection of local features invariant to affine transformations, Ph.D. thesis, Institut National Polytechnique de Grenoble, France.
Mikolajczyk, K. and Schmid, C. 2002. An affine invariant interest point detector. In European Conference on Computer Vision (ECCV), Copenhagen, Denmark, pp. 128–142.
Mikolajczyk, K., Zisserman, A., and Schmid, C. 2003. Shape recognition with edge-based features. In Proceedings of the British Machine Vision Conference, Norwich, U.K.
Moravec, H. 1981. Rover visual obstacle avoidance. In International Joint Conference on Artificial Intelligence, Vancouver, Canada, pp. 785–790.
Nelson, R.C. and Selinger, A. 1998. Large-scale tests of a keyed, appearance-based 3-D object recognition system. Vision Research, 38(15):2469–2488.
Pope, A.R. and Lowe, D.G. 2000. Probabilistic models of appearance for 3-D object recognition. International Journal of Computer Vision, 40(2):149–167.
Pritchard, D. and Heidrich,W. 2003. Cloth motion capture. Computer Graphics Forum (Eurographics 2003), 22(3):263–271.
Schaffalitzky, F. and Zisserman, A. 2002. Multi-view matching for unordered image sets, or 'How do I organize my holiday snaps?'” In European Conference on Computer Vision, Copenhagen, Denmark, pp. 414–431.
Schiele, B. and Crowley, J.L. 2000. Recognition without correspondence using multidimensional receptive field histograms. International Journal of Computer Vision, 36(1):31–50.
Schmid, C. and Mohr, R. 1997. Local grayvalue invariants for image retrieval. IEEE Trans. on Pattern Analysis and Machine Intelligence, 19(5):530–534.
Se, S., Lowe, D.G., and Little, J. 2001. Vision-based mobile robot localization and mapping using scale-invariant features. In International Conference on Robotics and Automation, Seoul, Korea, pp. 2051–2058.
Se, S., Lowe, D.G., and Little, J. 2002. Global localization using distinctive visual features. In International Conference on Intelligent Robots and Systems, IROS 2002, Lausanne, Switzerland, pp. 226–231.
Shokoufandeh, A., Marsic, I., and Dickinson, S.J. 1999. View-based object recognition using saliency maps. Image and Vision Computing, 17:445–460.
Torr, P. 1995. Motion segmentation and outlier detection, Ph.D. Thesis, Dept. of Engineering Science, University of Oxford, UK.
Tuytelaars, T. and Van Gool, L. 2000. Wide baseline stereo based on local, affinely invariant regions. In British Machine Vision Conference, Bristol, UK, pp. 412–422.
Weber, M., Welling, M., and Perona, P. 2000. Unsupervised learning of models for recognition. In European Conference on Computer Vision, Dublin, Ireland, pp. 18–32.
Witkin, A.P. 1983. Scale-space filtering. In International Joint Conference on Artificial Intelligence, Karlsruhe, Germany, pp. 1019–1022.
Zhang, Z., Deriche, R., Faugeras, O., and Luong, Q.T. 1995. Arobust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry. Artificial Intelligence, 78:87–119.
Author information
Authors and Affiliations
Rights and permissions
About this article
Cite this article
Lowe, D.G. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision 60, 91–110 (2004). https://doi.org/10.1023/B:VISI.0000029664.99615.94
Issue Date:
DOI: https://doi.org/10.1023/B:VISI.0000029664.99615.94