Enhancing relevance feedback in image retrieval using unlabeled data

Authors: Zhi-Hua Zhou, Ke-Jia Chen, Hong-Bin Dai

ACM Transactions on Information Systems (TOIS), Volume 24, Issue 2

Pages 219 - 244

https://doi.org/10.1145/1148020.1148023

Published: 01 April 2006

Abstract

Relevance feedback is an effective scheme bridging the gap between high-level semantics and low-level features in content-based image retrieval (CBIR). In contrast to previous methods which rely on labeled images provided by the user, this article attempts to enhance the performance of relevance feedback by exploiting unlabeled images existing in the database. Concretely, this article integrates the merits of semisupervised learning and active learning into the relevance feedback process. In detail, in each round of relevance feedback two simple learners are trained from the labeled data, that is, images from user query and user feedback. Each learner then labels some unlabeled images in the database for the other learner. After retraining with the additional labeled data, the learners reclassify the images in the database and then their classifications are merged. Images judged to be positive with high confidence are returned as the retrieval result, while those judged with low confidence are put into the pool which is used in the next round of relevance feedback. Experiments show that using semisupervised learning and active learning simultaneously in CBIR is beneficial, and the proposed method achieves better performance than some existing methods.
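The round described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The `score` learner, the way the two learners' scores are normalized and averaged, and the parameters `n_add` and `n_pool` are all assumptions made for the example. The choice of Minkowski distances of order 1 and 2 follows the review further down, which says the two learners differ in the order of that distance.

```python
import numpy as np

def score(X, pos, neg, p):
    """Toy learner (an assumption, not the paper's): higher means more relevant.

    Score = distance to nearest negative minus distance to nearest positive,
    using Minkowski distance of order p, scaled to [-1, 1].
    Needs at least one positive and one negative example.
    """
    d_pos = np.linalg.norm(X[:, None, :] - pos[None, :, :], ord=p, axis=2).min(axis=1)
    d_neg = np.linalg.norm(X[:, None, :] - neg[None, :, :], ord=p, axis=2).min(axis=1)
    s = d_neg - d_pos
    return s / (np.abs(s).max() + 1e-12)

def feedback_round(X, pos_idx, neg_idx, orders=(1, 2), n_add=1, n_pool=1):
    """One feedback round with exactly two learners (hypothetical helper)."""
    labeled = set(pos_idx) | set(neg_idx)
    unlabeled = np.array([i for i in range(len(X)) if i not in labeled])

    # Each learner labels its most confident unlabeled images for the OTHER learner.
    extra = {}
    for k, p in enumerate(orders):
        s = score(X[unlabeled], X[pos_idx], X[neg_idx], p)
        ranked = unlabeled[np.argsort(s)]  # ascending: most negative first
        extra[1 - k] = (ranked[-n_add:].tolist(), ranked[:n_add].tolist())

    # Retrain with the extra labels, reclassify every image, and merge by averaging.
    merged = np.zeros(len(X))
    for k, p in enumerate(orders):
        add_pos, add_neg = extra[k]
        merged += score(X, X[list(pos_idx) + add_pos], X[list(neg_idx) + add_neg], p)
    merged /= len(orders)

    # Low-confidence images go to the pool for the next round;
    # the remaining images judged positive are returned, best first.
    pool = unlabeled[np.argsort(np.abs(merged[unlabeled]))[:n_pool]].tolist()
    result = [int(i) for i in np.argsort(-merged) if merged[i] > 0 and i not in pool]
    return result, pool

# Made-up 2-D features: three images near the query, three far away, one in between.
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [5.0, 5.0], [5.2, 4.9], [4.8, 5.1],
              [2.5, 2.5]])
result, pool = feedback_round(X, pos_idx=[0], neg_idx=[3])
print("returned:", result, "pool for next round:", pool)
```

In this toy data the image halfway between the two groups is the one both learners are least sure about, so it is the one held back for user feedback in the next round.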

Index Terms

Enhancing relevance feedback in image retrieval using unlabeled data
1. Computing methodologies
  1. Machine learning
2. Information systems

Reviews

Reviewer: Richard CHBEIR

This paper addresses relevance feedback in content-based image retrieval (CBIR) and presents an interesting approach based on a preliminary method: semi-supervised active image retrieval with asymmetry (SSAIRA). The approach addresses three issues: a small sample size, an asymmetric training sample, and a real-time requirement.

The authors propose a learning method that uses two learners trained from the labeled data. The user query is treated as the labeled positive example, while the image database is initially treated as a set of unlabeled data. Both learners are defined in terms of the Minkowski distance and differ in the order of that distance. They are easy to update, which makes the relevance feedback process more efficient.

The learning algorithm also handles negative image examples. The authors consider each image to be representative of a semantic class, so images close to a negative example may belong to the same class. To define the representative of the class, they calculate the k-nearest neighbors of negative examples. One may wonder why they use the Euclidean distance in the neighborhood calculation when other, more adaptive methods could be used instead. Nevertheless, the authors conducted several rich and satisfactory experiments to validate their approach and to compare it with current approaches.

Online Computing Reviews Service
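To make the two distance-related points in the review concrete, here is a small sketch of a Minkowski distance of arbitrary order and of a Euclidean k-nearest-neighbor lookup around a negative example. The function names are hypothetical, and averaging the neighbors into a single representative is an assumption made for illustration; the review does not say how the paper combines them.

```python
import numpy as np

def minkowski(a, b, p):
    """Minkowski distance of order p (p=1 is Manhattan, p=2 is Euclidean)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sum(np.abs(a - b) ** p) ** (1.0 / p))

def negative_representative(X, neg_index, k):
    """Average a negative example with its k nearest neighbors (Euclidean).

    Averaging is an assumption for this sketch, not a detail from the review.
    """
    d = np.linalg.norm(X - X[neg_index], axis=1)
    nearest = np.argsort(d)[: k + 1]  # the example itself (distance 0) plus k neighbors
    return X[nearest].mean(axis=0)

# Made-up feature vectors: three close together, one far away.
features = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]])
print(minkowski([0, 0], [3, 4], p=1), minkowski([0, 0], [3, 4], p=2))
print(negative_representative(features, neg_index=0, k=2))
```

Two learners that differ only in `p` rank the same images slightly differently, which is what lets each one contribute labels the other would not have chosen.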

Published In

ACM Transactions on Information Systems Volume 24, Issue 2

April 2006

150 pages

ISSN: 1046-8188

EISSN: 1558-2868

DOI: 10.1145/1148020

Publisher

Association for Computing Machinery

New York, NY, United States

Article Metrics

Total Citations: 115
Total Downloads: 1,488
Downloads (Last 12 months): 10
Downloads (Last 6 weeks): 1

Reflects downloads up to 04 Jan 2025

Cited By

Li H, Fang M, He B, Dong D, Tian J (2024). Enhancing Prognostic Prediction of Gastrointestinal Stromal Tumors Using Semi-Supervised Regression Based on CT Imaging Data. 2024 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (1-4). Online publication date: 15-Jul-2024.
https://doi.org/10.1109/EMBC53108.2024.10782508
Zhang R, Hu S, Ma C, Zhang T, Li H (2024). Laser-induced breakdown spectroscopy (LIBS) in biomedical analysis. TrAC Trends in Analytical Chemistry 181 (117992). Online publication date: Dec-2024.
https://doi.org/10.1016/j.trac.2024.117992
Murel J, Smith D (2024). Self-training and Active Learning with Pseudo-relevance Feedback for Handwriting Detection in Historical Print. Document Analysis and Recognition - ICDAR 2024 (305-324). Online publication date: 30-Aug-2024.
https://dl.acm.org/doi/10.1007/978-3-031-70543-4_18
Liu L, Huang P, Yu H, Min F (2023). Safe co-training for semi-supervised regression. Intelligent Data Analysis 27:4 (959-975). Online publication date: 20-Jul-2023.
https://doi.org/10.3233/IDA-226718
Liu L, Zhang J, Min F (2022). Semi-supervised Regression with Data Partitioning and Feature Mapping. 2022 IEEE 9th International Conference on Data Science and Advanced Analytics (DSAA) (1-10). Online publication date: 13-Oct-2022.
https://doi.org/10.1109/DSAA54385.2022.10032446
Liu S, Li S (2022). A semi-supervised soft sensor method based on vine copula regression and tri-training algorithm for complex chemical processes. Journal of Process Control 120 (115-128). Online publication date: Dec-2022.
https://doi.org/10.1016/j.jprocont.2022.11.004
Gong Y, Yi J, Chen D, Zhang J, Zhou J, Zhou Z, Shen H, Zhuang Y, Smith J, Yang Y, Cesar P, Metze F, Prabhakaran B (2021). Inferring the Importance of Product Appearance with Semi-supervised Multi-modal Enhancement. Proceedings of the 29th ACM International Conference on Multimedia (1120-1128). Online publication date: 17-Oct-2021.
https://dl.acm.org/doi/10.1145/3474085.3481538
Gu B, Zhai Z, Deng C, Huang H (2021). Efficient Active Learning by Querying Discriminative and Representative Samples and Fully Exploiting Unlabeled Data. IEEE Transactions on Neural Networks and Learning Systems 32:9 (4111-4122). Online publication date: Sep-2021.
https://doi.org/10.1109/TNNLS.2020.3016928
Ge Z (2021). Semi-supervised data modeling and analytics in the process industry: Current research status and challenges. IFAC Journal of Systems and Control (100150). Online publication date: Mar-2021.
https://doi.org/10.1016/j.ifacsc.2021.100150
Dammak F, Kammoun H (2021). Combining semi-supervised and active learning to rank algorithms: application to Document Retrieval. Information Retrieval Journal. Online publication date: 4-Oct-2021.
https://doi.org/10.1007/s10791-021-09396-2
