Abstract
One of the difficulties of object recognition stems from the need to overcome the variability in object appearance caused by pose and other factors, such as illumination. The influence of these factors can be countered by learning to interpolate between stored views of the target object, taken under representative combinations of viewing conditions. Difficulties of another kind arise in daily life situations that require categorization, rather than recognition, of objects. Although categorization cannot rely on interpolation between stored examples, we show that knowledge of several representative members, or prototypes, of each of the categories of interest can provide the necessary computational substrate for the categorization of new instances. We describe a system that represents input shapes by their similarities to several prototypical objects, and show that it can recognize new views of the familiar objects, discriminate among views of previously unseen shapes, and attribute the latter to familiar categories.
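As a minimal sketch of the idea of representing a shape by its similarities to several prototypes, the toy example below maps a feature vector to a vector of similarities and attributes it to a category. Everything concrete here is an assumption made for illustration and is not taken from the article: the 2D feature vectors, the Gaussian similarity function, the `sigma` parameter, the category labels, and the summed-similarity decision rule are all hypothetical stand-ins.

```python
import math

# Hypothetical prototypes: toy 2D feature vectors, two per category.
PROTOTYPES = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
LABELS = ["A", "A", "B", "B"]


def similarity(x, prototype, sigma=1.0):
    """Gaussian similarity (assumed form) between x and one prototype."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, prototype))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def represent(x, prototypes, sigma=1.0):
    """Describe x by its similarities to all prototypes, in a fixed order."""
    return [similarity(x, p, sigma) for p in prototypes]


def categorize(x, prototypes, labels, sigma=1.0):
    """Pick the category whose prototypes have the largest summed similarity to x."""
    totals = {}
    for s, label in zip(represent(x, prototypes, sigma), labels):
        totals[label] = totals.get(label, 0.0) + s
    return max(totals, key=totals.get)


if __name__ == "__main__":
    new_instance = (0.5, 0.2)  # a previously unseen toy input
    print(represent(new_instance, PROTOTYPES))
    print(categorize(new_instance, PROTOTYPES, LABELS))
```

The point of the sketch is only that a new instance never needs to match a stored example exactly: its position relative to a handful of prototypes is enough to place it in a category.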
Cite this article
Duvdevani-Bar, S., Edelman, S. Visual Recognition and Categorization on the Basis of Similarities to Multiple Class Prototypes. International Journal of Computer Vision 33, 201–228 (1999). https://doi.org/10.1023/A:1008102413960