research-article

SEVA: Sensor-enhanced video annotation

Authors:

Prashant Shenoy

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 5, Issue 3

Article No.: 24, Pages 1 - 26

https://doi.org/10.1145/1556134.1556141

Published: 14 August 2009

Abstract

In this article, we study how a sensor-rich world can be exploited by digital recording devices such as cameras and camcorders to improve a user's ability to search through a large repository of image and video files. We design and implement a digital recording system that records identities and locations of objects (as advertised by their sensors) along with visual images (as recorded by a camera). The process, which we refer to as Sensor-Enhanced Video Annotation (SEVA), combines a series of correlation, interpolation, and extrapolation techniques. It produces a tagged stream that later can be used to efficiently search for videos or frames containing particular objects or people. We present detailed experiments with a prototype of our system using both stationary and mobile objects as well as GPS and ultrasound. Our experiments show that: (i) SEVA has zero error rates for static objects, except very close to the boundary of the viewable area; (ii) for moving objects or a moving camera, SEVA only misses objects leaving or entering the viewable area by 1--2 frames; (iii) SEVA can scale to 10 fast-moving objects using current sensor technology; and (iv) SEVA runs online using relatively inexpensive hardware.
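The abstract describes tagging frames with the objects inside the camera's viewable area, using interpolation between sensor reports. The following is a minimal illustrative sketch of that idea, not the authors' implementation: it assumes 2D positions, linear interpolation between timestamped location samples, and a simple wedge-shaped viewable area. All names (`interpolate_position`, `in_field_of_view`, `tag_frames`) and the camera parameters are hypothetical.

```python
import math
from bisect import bisect_left


def interpolate_position(samples, t):
    """Linearly interpolate a 2D position at time t.

    samples: list of (time, x, y) tuples sorted by time (assumed format).
    Returns None when t lies outside the sampled interval; this sketch
    does not extrapolate.
    """
    if not samples:
        return None
    times = [s[0] for s in samples]
    if t < times[0] or t > times[-1]:
        return None
    i = bisect_left(times, t)
    if times[i] == t:
        return samples[i][1], samples[i][2]
    (t0, x0, y0), (t1, x1, y1) = samples[i - 1], samples[i]
    w = (t - t0) / (t1 - t0)
    return x0 + w * (x1 - x0), y0 + w * (y1 - y0)


def in_field_of_view(camera_xy, heading_deg, fov_deg, max_range, obj_xy):
    """Return True if obj_xy lies inside a wedge-shaped viewable area."""
    dx = obj_xy[0] - camera_xy[0]
    dy = obj_xy[1] - camera_xy[1]
    dist = math.hypot(dx, dy)
    if dist == 0 or dist > max_range:
        return False
    bearing = math.degrees(math.atan2(dy, dx))
    # Signed angular offset from the camera heading, wrapped to [-180, 180).
    offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= fov_deg / 2.0


def tag_frames(frame_times, camera, tracks):
    """Map each frame timestamp to the sorted names of visible objects.

    camera: dict with keys "xy", "heading_deg", "fov_deg", "max_range"
            (a stationary camera is assumed for simplicity).
    tracks: dict mapping object name to its list of (time, x, y) samples.
    """
    tags = {}
    for t in frame_times:
        visible = []
        for name, samples in tracks.items():
            pos = interpolate_position(samples, t)
            if pos is not None and in_field_of_view(
                camera["xy"], camera["heading_deg"],
                camera["fov_deg"], camera["max_range"], pos,
            ):
                visible.append(name)
        tags[t] = sorted(visible)
    return tags
```

With a tagged stream of this shape, searching for frames containing a given object reduces to filtering the per-frame tag lists. A real system would also need to handle a moving camera, sensor noise, and 3D geometry, which this sketch leaves out.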

Index Terms

SEVA: Sensor-enhanced video annotation
1. Information systems
  1. Information retrieval
    1. Document representation

Information & Contributors

Information

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 5, Issue 3

August 2009

204 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/1556134

Copyright © 2009 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 August 2009

Accepted: 01 May 2008

Revised: 01 December 2007

Received: 01 September 2006

Published in TOMM Volume 5, Issue 3

Qualifiers

Research-article
Research
Refereed

Citations

Cited By

Alfarrarjeh A, Kim S, Yoon J (2025). A framework for automatically generating composite keywords for geo-tagged street images. Kuwait Journal of Science 52:1 (100333). Online publication date: Jan-2025.
https://doi.org/10.1016/j.kjs.2024.100333
Khan U (2020). Semantic Analysis of Videos for Tags Prediction and Segmentation. Industrial Internet of Things and Cyber-Physical Systems (296-307). Online publication date: 2020.
https://doi.org/10.4018/978-1-7998-2803-7.ch014
Shao J, Hu G, Song J, Liu X, Shen H (2019). Towards Accurate Georeferenced Video Search With Camera Field of View Modeling. IEEE Transactions on Circuits and Systems for Video Technology 29:6 (1844-1855). Online publication date: Jun-2019.
https://doi.org/10.1109/TCSVT.2018.2848200
Panta F, Qodseya M, Roman-Jimenez G, Péninou A, Sèdes F (2018). Spatio-Temporal Metadata Querying for CCTV Video Retrieval. Proceedings of the 9th ACM SIGSPATIAL International Workshop on Indoor Spatial Awareness (7-14). Online publication date: 6-Nov-2018.
https://dl.acm.org/doi/10.1145/3282461.3282465
Dongbo Liu, He Ma, Jianhua Li (2017). Error distribution modeling of embedded sensors on smartphones by using laser ranger. 2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW) (387-392). Online publication date: Jul-2017.
https://doi.org/10.1109/ICMEW.2017.8026218
Chen B, Huang S (2015). An Advanced Visibility Restoration Algorithm for Single Hazy Images. ACM Transactions on Multimedia Computing, Communications, and Applications 11:4 (1-21). Online publication date: 2-Jun-2015.
https://dl.acm.org/doi/10.1145/2726947
Venkatagiri S, Chan M, Ooi W, Chiam J (2015). On Demand Retrieval of Crowdsourced Mobile Video. IEEE Sensors Journal 15:5 (2632-2642). Online publication date: May-2015.
https://doi.org/10.1109/JSEN.2014.2336292
Codreanu D, Peninou A, Sedes F (2015). Video Spatio-Temporal Filtering Based on Cameras and Target Objects Trajectories -- Videosurveillance Forensic Framework. Proceedings of the 2015 10th International Conference on Availability, Reliability and Security (611-617). Online publication date: 24-Aug-2015.
https://dl.acm.org/doi/10.1109/ARES.2015.102
Narayanan R, Ye Y, Kaul A, Shah M (2014). Mobile Video Streaming. Advanced Content Delivery, Streaming, and Cloud Services (141-158). Online publication date: 3-Oct-2014.
https://doi.org/10.1002/9781118909690.ch7
Yu X, Ganz A, Candan K, Panchanathan S, Prabhakaran B, Sundaram H, Feng W, Sebe N (2011). Detecting and identifying people in mobile videos. Proceedings of the 19th ACM international conference on Multimedia (1017-1020). Online publication date: 28-Nov-2011.
https://dl.acm.org/doi/10.1145/2072298.2071927
