
On Generating Content-Oriented Geo Features for Sensor-Rich Outdoor Video Search

Published: 01 October 2015 in IEEE Transactions on Multimedia, Volume 17, Issue 10 (IEEE Press)

Abstract

Advanced technologies in consumer electronics have enabled individual users to record, share, and view videos on mobile devices. With the volume of video on the Internet growing tremendously, fast and accurate video search has attracted much research attention. A good similarity measure is a key component of a video retrieval system. Most existing solutions rely only on low-level visual features or on surrounding textual annotations. These approaches often suffer from low recall, as they are highly susceptible to changes in viewpoint and illumination and to noisy tags. By leveraging geo-metadata, more reliable and precise search results can be obtained. However, two issues remain challenging: (1) how to combine the spatial relevance of videos with their visual similarity to generate a pertinent ranking of results according to users' needs, and (2) how to design a compact video representation that supports efficient indexing for fast video retrieval. In this study, we propose a novel video description that consists of (a) determining the geographic coverage of a video based on the camera's field-of-view and a pre-constructed geo-codebook, and (b) fusing video spatial relevance with region-aware visual similarities to achieve a robust video similarity measure. Toward a better encoding of a video's geo-coverage, we construct the geo-codebook by semantically segmenting a map into a collection of coherent regions. To evaluate the proposed technique, we developed a video retrieval prototype. Experiments show that our proposed method improves the mean average precision by 4.6% to 10.5% compared with existing approaches.
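The abstract outlines two steps that a small sketch can make concrete: estimating which map regions a camera's field-of-view (FOV) covers, and fusing that spatial relevance with visual similarity. Below is a minimal Python sketch under strong simplifying assumptions: geo-codebook regions are approximated as circles rather than the paper's semantically segmented map regions, coverage is estimated by sampling points inside the FOV sector, and fusion is a fixed-weight linear combination rather than the paper's region-aware scheme. All names here (REGIONS, fov_coverage, fused_similarity, alpha) are hypothetical.

import math

# Hypothetical geo-codebook: each "word" is a coherent map region,
# approximated here as a circle (center lat/lng in degrees, radius in meters).
REGIONS = [
    {"name": "riverfront", "lat": 1.2905, "lng": 103.8520, "radius_m": 250},
    {"name": "old_town",   "lat": 1.2860, "lng": 103.8545, "radius_m": 300},
]

METERS_PER_DEG = 111_320.0  # rough meters per degree of latitude

def dist_m(lat1, lng1, lat2, lng2):
    """Approximate planar distance in meters (adequate at city scale)."""
    dy = (lat2 - lat1) * METERS_PER_DEG
    dx = (lng2 - lng1) * METERS_PER_DEG * math.cos(math.radians(lat1))
    return math.hypot(dx, dy)

def fov_coverage(lat, lng, heading_deg, angle_deg, range_m, n=500):
    """Histogram over geo-codebook regions covered by one FOV sector.

    The FOV is modeled as a circular sector (camera position, compass
    heading, viewable angle, visible range); coverage is estimated by
    sampling points inside the sector and binning them by region.
    """
    hist = [0.0] * len(REGIONS)
    for i in range(n):
        frac = (i + 0.5) / n
        r = range_m * math.sqrt(frac)                 # area-uniform radius
        a = heading_deg + angle_deg * (((i * 7) % n) / n - 0.5)
        p_lat = lat + (r * math.cos(math.radians(a))) / METERS_PER_DEG
        p_lng = lng + (r * math.sin(math.radians(a))) / (
            METERS_PER_DEG * math.cos(math.radians(lat)))
        for j, reg in enumerate(REGIONS):
            if dist_m(p_lat, p_lng, reg["lat"], reg["lng"]) <= reg["radius_m"]:
                hist[j] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]                  # normalized coverage

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def fused_similarity(cov_q, cov_v, visual_sim, alpha=0.5):
    """Late fusion of spatial relevance and visual similarity.

    alpha weights the geographic term; the paper fuses spatial relevance
    with region-aware visual similarities, so this is only a sketch.
    """
    return alpha * cosine(cov_q, cov_v) + (1 - alpha) * visual_sim

# Example: compare a query video's FOV against a candidate's.
cov_q = fov_coverage(1.2900, 103.8525, heading_deg=90, angle_deg=60, range_m=300)
cov_v = fov_coverage(1.2895, 103.8530, heading_deg=80, angle_deg=60, range_m=300)
print(round(fused_similarity(cov_q, cov_v, visual_sim=0.42), 3))

Point sampling stands in for an exact polygon-sector intersection; with real semantically segmented regions one would instead intersect the FOV sector with each region's polygon, but the resulting normalized coverage histogram plays the same role as the compact geo feature the abstract describes.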

