Article

Automatic image annotation and retrieval using cross-media relevance models

Authors:

R. Manmatha

SIGIR '03: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval

Pages 119 - 126

https://doi.org/10.1145/860435.860459

Published: 28 July 2003

Abstract

Libraries have traditionally used manual image annotation for indexing and then later retrieving their image collections. However, manual image annotation is an expensive and labor-intensive procedure, and hence there has been great interest in coming up with automatic ways to retrieve images based on content. Here, we propose an automatic approach to annotating and retrieving images based on a training set of images. We assume that regions in an image can be described using a small vocabulary of blobs. Blobs are generated from image features using clustering. Given a training set of images with annotations, we show that probabilistic models allow us to predict the probability of generating a word given the blobs in an image. This may be used to automatically annotate and retrieve images given a word as a query. We show that relevance models allow us to derive these probabilities in a natural way. Experiments show that the annotation performance of this cross-media relevance model is almost six times as good (in terms of mean precision) as a model based on word-blob co-occurrence and twice as good as a state-of-the-art model derived from machine translation. Our approach shows the usefulness of using formal information retrieval models for the task of image annotation and retrieval.
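To make the idea of "the probability of generating a word given the blobs in an image" concrete, here is a minimal Python sketch. The abstract does not specify the estimator, so everything below is an assumption for illustration: the toy training set, the uniform prior over training images, the linear smoothing against collection frequencies, the smoothing weights `ALPHA` and `BETA`, and the function names `smoothed` and `annotate`. It marginalizes over training images J, combining P(w | J) with the product of P(b | J) over the query blobs.

```python
from collections import Counter

# Hypothetical toy training set: each image is a bag of blob IDs plus caption words.
TRAIN = [
    {"blobs": [1, 1, 2], "words": ["tiger", "grass"]},
    {"blobs": [2, 3, 3], "words": ["grass", "sky"]},
    {"blobs": [3, 4, 4], "words": ["sky", "water"]},
]

ALPHA = 0.1  # smoothing weight for words (assumed value)
BETA = 0.9   # smoothing weight for blobs (assumed value)


def smoothed(count_in_image, image_size, count_in_all, total_size, lam):
    """Interpolate an image-level relative frequency with the collection-level one."""
    return (1 - lam) * count_in_image / image_size + lam * count_in_all / total_size


def annotate(query_blobs, train=TRAIN, alpha=ALPHA, beta=BETA):
    """Return P(w | query blobs) for every vocabulary word, normalized to sum to 1."""
    word_totals = Counter(w for img in train for w in img["words"])
    blob_totals = Counter(b for img in train for b in img["blobs"])
    n_words = sum(word_totals.values())
    n_blobs = sum(blob_totals.values())

    joint = Counter()
    prior = 1.0 / len(train)  # uniform prior over training images (assumption)
    for img in train:
        w_counts, b_counts = Counter(img["words"]), Counter(img["blobs"])
        # Likelihood of the query blobs under this training image.
        blob_lik = 1.0
        for b in query_blobs:
            blob_lik *= smoothed(b_counts[b], len(img["blobs"]),
                                 blob_totals[b], n_blobs, beta)
        # Accumulate the joint probability of each word with the query blobs.
        for w in word_totals:
            p_w = smoothed(w_counts[w], len(img["words"]),
                           word_totals[w], n_words, alpha)
            joint[w] += prior * p_w * blob_lik

    total = sum(joint.values())
    if total == 0:  # a query blob never seen in training zeroes every term
        return {}
    return {w: p / total for w, p in joint.items()}


if __name__ == "__main__":
    ranked = sorted(annotate([1, 1]).items(), key=lambda kv: kv[1], reverse=True)
    for word, prob in ranked:
        print(f"{word}: {prob:.3f}")
```

Annotation would then keep the top-ranked words for an unlabeled image. For retrieval with a one-word query, the same probabilities could be computed for every test image and the images ranked by the probability of the query word.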

Index Terms

Automatic image annotation and retrieval using cross-media relevance models
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object recognition
2. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Information

Published In

SIGIR '03: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval

July 2003

490 pages

ISBN:1581136463

DOI:10.1145/860435

General Chairs: Charles Clarke (University of Waterloo, Canada), Gordon Cormack (University of Waterloo, Canada)

Program Chairs: Jamie Callan (Carnegie Mellon University, Pittsburgh, PA), David Hawking (Australian National University, Australia), Alan Smeaton (Dublin City University, Ireland)

Copyright © 2003 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 28 July 2003

Qualifiers

Article

Conference

SIGIR03

Sponsor:

SIGIR

SIGIR03: The 26th ACM/SIGIR International Symposium on Information Retrieval

July 28 - August 1, 2003

Toronto, Canada

Acceptance Rates

SIGIR '03 Paper Acceptance Rate 46 of 266 submissions, 17%;

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics

Total Citations: 573
Total Downloads: 4,678
Downloads (Last 12 months): 33
Downloads (Last 6 weeks): 1

Reflects downloads up to 13 Dec 2024

Citations

Cited By

Mangalika U (2024). Object Recognition to Content Based Image Retrieval: A Study of the Developments and Applications of Computer Vision. Journal of Computing and Natural Science, pages 41-52. Online publication date: 5-Jan-2024.
https://doi.org/10.53759/181X/JCNS202404005
Yue K, Chen B, Geiping J, Li H, Goldstein T, Lim S (2024). Object Recognition as Next Token Prediction. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 16645-16656. Online publication date: 16-Jun-2024.
https://doi.org/10.1109/CVPR52733.2024.01575
Sun J, Min X (2024). Research on image caption generation method based on multi-modal pre-training model and text mixup optimization. Signal, Image and Video Processing, 18:8-9, pages 5743-5761. Online publication date: 28-May-2024.
https://doi.org/10.1007/s11760-024-03268-0
Salar A, Ahmadi A (2024). Improving loss function for deep convolutional neural network applied in automatic image annotation. The Visual Computer: International Journal of Computer Graphics, 40:3, pages 1617-1629. Online publication date: 1-Mar-2024.
https://dl.acm.org/doi/10.1007/s00371-023-02873-3
Avasthi S, Tripathi S, Sanwal T, Mahmud M (2024). Deep Learning Models for Image Annotation Application in a 6G Network Environment. Development of 6G Networks and Technology, pages 201-221. Online publication date: 15-Nov-2024.
https://doi.org/10.1002/9781394230686.ch8
Agarwal L, Verma B (2023). Comparison of Deep Learning Models for Automatic Image Descriptors. 2023 IEEE 20th India Council International Conference (INDICON), pages 914-919. Online publication date: 14-Dec-2023.
https://doi.org/10.1109/INDICON59947.2023.10440731
Kender J, Dube P, Han Z, Bhattacharjee B (2023). G2L: A High-Dimensional Geometric Approach for Automatic Generation of Highly Accurate Pseudo-labels. 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), pages 1085-1094. Online publication date: 2-Oct-2023.
https://doi.org/10.1109/ICCVW60793.2023.00117
Agarwal L, Verma B (2023). From methods to datasets: A survey on Image-Caption Generators. Multimedia Tools and Applications, 83:9, pages 28077-28123. Online publication date: 31-Aug-2023.
https://doi.org/10.1007/s11042-023-16560-x
Mohammadi Kashani M, Amiri S (2022). Scalable Image Annotation by Summarizing Training Samples into Labeled Prototypes. Signal and Data Processing, 18:4, pages 49-68. Online publication date: 1-Mar-2022.
https://doi.org/10.52547/jsdp.18.4.49
Li Z, Tang J (2022). A survey on social image semantic analysis. Chinese Science Bulletin, 68:25, pages 3368-3384. Online publication date: 11-Nov-2022.
https://doi.org/10.1360/TB-2022-0938
