Abstract
We present a new class of statistical models for part-based object recognition. These models are explicitly parametrized according to the degree of spatial structure that they can represent. This provides a way of relating different spatial priors that have been used in the past, such as joint Gaussian models and tree-structured models. By providing explicit control over the degree of spatial structure, our models make it possible to study questions such as the extent to which additional spatial constraints among parts help detection and localization, and the tradeoff between representational power and computational cost. We consider these questions for object classes that have substantial geometric structure, such as airplanes, faces, and motorbikes, using datasets employed by other researchers to facilitate evaluation. We find that for these classes of objects, a model with a relatively small amount of spatial structure can provide recognition performance that is statistically indistinguishable from that of more powerful models, at a substantially lower computational cost.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Crandall, D., Felzenszwalb, P., Huttenlocher, D. (2006). Object Recognition by Combining Appearance and Geometry. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds) Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol 4170. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11957959_24
DOI: https://doi.org/10.1007/11957959_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68794-8
Online ISBN: 978-3-540-68795-5
eBook Packages: Computer Science (R0)