Joint Image and Word Sense Discrimination for Image Retrieval

Aurelien Lucchi & Jason Weston

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7572)

Included in the following conference series:

European Conference on Computer Vision

Abstract

We study the task of learning to rank images given a text query, a problem that is complicated by the issue of multiple senses. That is, the senses of interest are typically the visually distinct concepts that a user wishes to retrieve. In this paper, we propose to learn a ranking function that optimizes the ranking cost of interest and simultaneously discovers the disambiguated senses of the query that are optimal for the supervised task. Note that no supervised information is given about the senses. Experiments performed on web images and the ImageNet dataset show that using our approach leads to a clear gain in performance.
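The idea of ranking with unsupervised, per-query senses can be illustrated with a minimal sketch. It is not taken from the chapter text and rests on several assumptions: each query keeps one weight vector per latent sense, an image is scored by its best-matching sense, and training uses a pairwise hinge loss with a subgradient step. The names `rank_score` and `pairwise_hinge_step`, the learning rate, and the margin are all hypothetical.

```python
import numpy as np

def rank_score(W_q, x):
    """Score image features x for one query.

    W_q has shape (n_senses, dim): one weight vector per latent sense
    (an assumed parameterization). The score is the best-matching sense.
    """
    return float(np.max(W_q @ x))

def pairwise_hinge_step(W_q, x_pos, x_neg, lr=0.1, margin=1.0):
    """One subgradient step on max(0, margin - f(x_pos) + f(x_neg)).

    No sense labels are used: the sense assigned to each image is simply
    the argmax under the current weights.
    """
    s_pos = int(np.argmax(W_q @ x_pos))  # sense currently explaining the relevant image
    s_neg = int(np.argmax(W_q @ x_neg))  # sense currently scoring the irrelevant image highest
    loss = margin - W_q[s_pos] @ x_pos + W_q[s_neg] @ x_neg
    if loss > 0:
        W_q = W_q.copy()
        W_q[s_pos] += lr * x_pos  # pull the chosen sense toward the relevant image
        W_q[s_neg] -= lr * x_neg  # push the offending sense away from the irrelevant one
    return W_q, max(0.0, float(loss))

# Toy usage: two senses, three-dimensional features.
W = np.zeros((2, 3))
x_relevant = np.array([1.0, 0.0, 0.0])
x_irrelevant = np.array([0.0, 1.0, 0.0])
W, loss = pairwise_hinge_step(W, x_relevant, x_irrelevant)
```

Taking the maximum over senses means only the sense that best explains an image needs to score it highly, which is one way a single query could cover several visually distinct concepts.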

Author information

Authors and Affiliations

Google, New York, USA
Aurelien Lucchi & Jason Weston
EPFL, Lausanne, Switzerland
Aurelien Lucchi

Editor information

Editors and Affiliations

Microsoft Research Ltd., CB3 0FB, Cambridge, UK
Andrew Fitzgibbon
Dept. of Computer Science, University of North Carolina, 27599, Chapel Hill, NC, USA
Svetlana Lazebnik
California Institute of Technology, 91125, Pasadena, CA, USA
Pietro Perona
Institute of Industrial Science, The University of Tokyo, 153-8505, Tokyo, Japan
Yoichi Sato
INRIA, 38330, Montbonnot, France
Cordelia Schmid

About this paper

Cite this paper

Lucchi, A., Weston, J. (2012). Joint Image and Word Sense Discrimination for Image Retrieval. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7572. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33718-5_10

DOI: https://doi.org/10.1007/978-3-642-33718-5_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33717-8
Online ISBN: 978-3-642-33718-5
eBook Packages: Computer Science, Computer Science (R0)
