Abstract
Methods based on distinguished regions (transformation-covariant detectable regions) have achieved considerable success in object recognition, retrieval and matching problems in both still images and videos. The chapter focuses on a method exploiting local coordinate systems (local affine frames) established on maximally stable extremal regions. We provide a taxonomy of affine-covariant constructions of local coordinate systems, prove their affine covariance and present algorithmic details of their computation. Tentative region-to-region correspondences are established using the procedures proposed for computing affine-invariant local frames of reference. Object recognition is formulated as a problem of finding a maximal set of geometrically consistent matches.
State-of-the-art results are reported on standard, publicly available object recognition tests (COIL-100, ZuBuD, FOCUS). The test problems include changes of scale and illumination conditions, out-of-plane rotation, occlusion, locally anisotropic scale change and 3D translation of the viewpoint.
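As an illustration of what affine covariance means for a local coordinate system, the following is a minimal sketch and not the chapter's own algorithm. It assumes a frame built from a region's centroid and covariance matrix, which is only one possible construction; the function names, the sample polygon and the transformation `A`, `t` are all hypothetical. The sketch checks that a region and its affinely transformed copy, once normalised, differ only by a rotation or reflection.

```python
import math


def centroid_and_covariance(points):
    """Return the centroid (mx, my) and covariance terms (sxx, sxy, syy)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return (mx, my), (sxx, sxy, syy)


def normalising_matrix(cov):
    """Return the inverse of the lower Cholesky factor L, where cov = L L^T."""
    sxx, sxy, syy = cov
    l11 = math.sqrt(sxx)
    l21 = sxy / l11
    l22 = math.sqrt(syy - l21 ** 2)
    # Inverse of the lower-triangular matrix [[l11, 0], [l21, l22]].
    return ((1.0 / l11, 0.0), (-l21 / (l11 * l22), 1.0 / l22))


def normalise(points):
    """Map points into a frame with zero centroid and identity covariance."""
    (mx, my), cov = centroid_and_covariance(points)
    (a, b), (c, d) = normalising_matrix(cov)
    return [(a * (x - mx) + b * (y - my), c * (x - mx) + d * (y - my))
            for x, y in points]


def apply_affine(points, A, t):
    """Apply x -> A x + t to every point."""
    (a, b), (c, d) = A
    return [(a * x + b * y + t[0], c * x + d * y + t[1]) for x, y in points]


# Hypothetical region outline and affine transformation (assumed values).
region = [(0.0, 0.0), (4.0, 0.0), (5.0, 2.0), (3.0, 5.0), (1.0, 4.0), (0.0, 2.0)]
A = ((1.5, 0.4), (-0.3, 0.8))
t = (7.0, -2.0)
warped = apply_affine(region, A, t)

norm_a = normalise(region)
norm_b = normalise(warped)

# Corresponding normalised points are related by an orthogonal matrix,
# so their distances from the origin must agree.
residual = max(abs(math.hypot(*p) - math.hypot(*q))
               for p, q in zip(norm_a, norm_b))
```

The residual rotation is left unresolved in this sketch; fixing it requires an additional covariant measurement on the region.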
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Obdržálek, Š., Matas, J. (2006). Object Recognition Using Local Affine Frames on Maximally Stable Extremal Regions. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds) Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol 4170. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11957959_5
DOI: https://doi.org/10.1007/11957959_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68794-8
Online ISBN: 978-3-540-68795-5
eBook Packages: Computer Science (R0)