research-article

A new web-supervised method for image dataset constructions

Authors:

Zhenmin Tang

Neurocomputing, Volume 236, Issue C

Pages 23 - 31

Published: 02 May 2017

Abstract

The goal of this work is to automatically collect a large number of highly relevant natural images from the Internet for given queries. A novel automatic image dataset construction framework is proposed that employs multiple query expansions. Specifically, the given queries are first expanded by searching the Google Books Ngrams Corpora to obtain richer semantic descriptions, from which the visually non-salient and less relevant expansions are then filtered out. After retrieving images from the Internet with the filtered expansions, we further remove noisy images using clustering and progressive Convolutional Neural Network (CNN) based methods. To evaluate the performance of the proposed method, we build an image dataset with 10 categories. We then run object detection on our dataset and on three other image datasets constructed by weakly supervised, web-supervised and fully supervised learning. The experimental results indicate that our method outperforms state-of-the-art weakly supervised and web-supervised methods. In addition, we perform cross-dataset classification to compare our dataset with two publicly available, manually labelled datasets, STL-10 and CIFAR-10.
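The filtering of less relevant query expansions can be illustrated with a minimal sketch. The abstract does not state which relevance measure is used, so this example assumes the Normalized Google Distance (NGD) computed from occurrence counts. The function names, the counts, the corpus size and the threshold are all hypothetical and chosen only for illustration.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Normalized Google Distance from occurrence counts.

    fx, fy: counts of each term alone; fxy: co-occurrence count;
    n: total number of indexed items (assumed larger than fx and fy).
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # Terms that never co-occur are treated as infinitely distant.
        return float("inf")
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


def filter_expansions(query_count, expansions, n, max_distance):
    """Keep expansions whose NGD to the query is at most max_distance.

    expansions maps each expansion to (its count, its co-occurrence
    count with the query). Returns (expansion, distance) pairs sorted
    from most to least relevant. The threshold is an assumption.
    """
    kept = []
    for term, (count, joint) in expansions.items():
        distance = normalized_google_distance(query_count, count, joint, n)
        if distance <= max_distance:
            kept.append((term, distance))
    return sorted(kept, key=lambda pair: pair[1])


# Hypothetical counts for the query "dog" and two candidate expansions.
candidates = {"sleeping dog": (100, 10), "hot dog": (1000, 1)}
relevant = filter_expansions(1000, candidates, 10**6, max_distance=0.6)
```

A smaller distance means the expansion co-occurs with the query more often than chance would suggest, so a threshold on the distance acts as a simple relevance filter. The visual-saliency filtering and the clustering and CNN stages described in the abstract are not covered by this sketch.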

Published In

Neurocomputing Volume 236, Issue C

May 2017

160 pages

ISSN:0925-2312

Publisher

Elsevier Science Publishers B. V.

Netherlands
