research-article

A new web-supervised method for image dataset constructions

Published: 02 May 2017

Abstract

The goal of this work is to automatically collect a large number of highly relevant natural images from the Internet for given queries. We propose a novel automatic image dataset construction framework that employs multiple query expansions. Specifically, the given queries are first expanded by searching the Google Books Ngrams Corpora to obtain richer semantic descriptions, from which visually non-salient and less relevant expansions are then filtered out. After retrieving images from the Internet with the filtered expansions, we further filter noisy images using clustering and progressive Convolutional Neural Network (CNN) based methods. To evaluate the performance of the proposed method, we build an image dataset with 10 categories. We then run object detection on our image dataset and on three other image datasets, which were constructed by weakly supervised, web-supervised, and fully supervised learning; the experimental results indicate that our method is superior to state-of-the-art weakly supervised and web-supervised methods. In addition, we perform cross-dataset classification to compare our dataset with two publicly available, manually labelled datasets, STL-10 and CIFAR-10.
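The staged filtering described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the `Expansion` fields, the 0.5 thresholds, and the `expand`, `retrieve`, and `score` callables are all hypothetical stand-ins, and a single score threshold is swapped in for the clustering and progressive CNN filtering, which the abstract does not specify in detail.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Expansion:
    """A query expansion with two hypothetical scores in [0, 1]."""
    phrase: str
    saliency: float   # assumed visual-saliency score
    relevance: float  # assumed semantic-relevance score


def filter_expansions(
    expansions: Iterable[Expansion],
    min_saliency: float = 0.5,   # assumed threshold
    min_relevance: float = 0.5,  # assumed threshold
) -> List[Expansion]:
    """Keep only expansions that pass both thresholds."""
    return [
        e for e in expansions
        if e.saliency >= min_saliency and e.relevance >= min_relevance
    ]


def filter_images(
    scored: Iterable[Tuple[str, float]],
    min_score: float = 0.5,  # assumed threshold
) -> List[str]:
    """Drop images whose score falls below min_score.

    Stand-in for the clustering and CNN-based noise filtering.
    """
    return [image_id for image_id, s in scored if s >= min_score]


def build_dataset(
    query: str,
    expand: Callable[[str], Iterable[Expansion]],
    retrieve: Callable[[str], Iterable[str]],
    score: Callable[[str], float],
) -> Dict[str, List[str]]:
    """Expand the query, filter expansions, retrieve images, filter images."""
    dataset: Dict[str, List[str]] = {}
    for expansion in filter_expansions(expand(query)):
        candidates = retrieve(expansion.phrase)
        scored = [(image_id, score(image_id)) for image_id in candidates]
        dataset[expansion.phrase] = filter_images(scored)
    return dataset


# Toy stand-ins for the corpus lookup, image search, and image scorer.
def toy_expand(query: str) -> List[Expansion]:
    return [
        Expansion(f"running {query}", saliency=0.9, relevance=0.8),
        Expansion(f"{query} theory", saliency=0.1, relevance=0.7),
    ]


def toy_retrieve(phrase: str) -> List[str]:
    return [f"{phrase}/img{i}" for i in range(3)]


def toy_score(image_id: str) -> float:
    return 0.9 if image_id.endswith(("img0", "img2")) else 0.2


if __name__ == "__main__":
    print(build_dataset("horse", toy_expand, toy_retrieve, toy_score))
```

Passing the three stages in as callables keeps the sketch independent of any particular corpus, search engine, or classifier; each could be replaced without touching the pipeline itself.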

Published In

Neurocomputing  Volume 236, Issue C
May 2017
160 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Author Tags

  1. Image dataset construction
  2. Multiple query expansions
  3. Web-supervised

Qualifiers

  • Research-article
