Abstract
Training sets of images for object recognition are the pillars on which classifiers build their performance. We have built a framework that supports the entire process of image and textual retrieval from search engines: given an input keyword, it performs a statistical and a semantic analysis and automatically builds a training set. We focus on textual information and explore, through several experiments, three different approaches to automatically discriminating between positive and negative images: keyword position, tag frequency, and semantic analysis. We present the best results for each approach.
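To make the three criteria concrete, the following minimal sketch (not the authors' implementation) shows one plausible way to score a retrieved image's tag list against the query keyword: by the keyword's position in the tag list, by tag frequency over the whole retrieved set, and by a WordNet-based semantic similarity. NLTK's Wu-Palmer measure is used here purely as a stand-in for whichever similarity the paper adopts; all function names, weights, and data are illustrative assumptions.

```python
# Hypothetical sketch of the three discrimination criteria discussed in the
# abstract (keyword position, tag frequency, WordNet-based semantic analysis).
# Not the paper's actual pipeline; names and weighting are illustrative.
from collections import Counter
from nltk.corpus import wordnet as wn


def position_score(tags, keyword):
    """Earlier occurrence of the query keyword in the tag list scores higher."""
    try:
        return 1.0 / (tags.index(keyword) + 1)
    except ValueError:
        return 0.0


def frequency_scores(all_tag_lists):
    """Relative frequency of each tag across the whole retrieved set."""
    counts = Counter(t for tags in all_tag_lists for t in tags)
    total = sum(counts.values())
    return {tag: c / total for tag, c in counts.items()}


def semantic_score(tags, keyword):
    """Mean Wu-Palmer similarity between the keyword and the image's tags."""
    key_syn = wn.synsets(keyword)
    if not key_syn:
        return 0.0
    sims = []
    for tag in tags:
        tag_syn = wn.synsets(tag)
        if tag_syn:
            sims.append(key_syn[0].wup_similarity(tag_syn[0]) or 0.0)
    return sum(sims) / len(sims) if sims else 0.0


# Toy usage: rank candidate images as positive/negative examples for "dog".
retrieved = [["dog", "puppy", "park"], ["car", "street", "dog"], ["cat", "sofa"]]
freq = frequency_scores(retrieved)
for tags in retrieved:
    score = (position_score(tags, "dog")
             + sum(freq.get(t, 0.0) for t in tags)
             + semantic_score(tags, "dog"))
    print(tags, round(score, 3))
```

In such a scheme, images whose combined score falls below some threshold would be treated as negative examples; the paper evaluates each criterion separately rather than a fixed combination.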
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Abdulhak, S.A., Riviera, W., Zeni, N., Cristani, M., Ferrario, R., Cristani, M. (2015). Semantic-Analysis Object Recognition: Automatic Training Set Generation Using Textual Tags. In: Agapito, L., Bronstein, M., Rother, C. (eds) Computer Vision - ECCV 2014 Workshops. ECCV 2014. Lecture Notes in Computer Science, vol 8926. Springer, Cham. https://doi.org/10.1007/978-3-319-16181-5_22
DOI: https://doi.org/10.1007/978-3-319-16181-5_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16180-8
Online ISBN: 978-3-319-16181-5