Research article, DOI: 10.1145/3152178.3152185

Automatic Spatio-temporal Indexing to Integrate and Analyze the Data of an Organization

Published: 07 November 2017

Abstract

Organizations are awash in data. In many cases, they do not know what data exists within the organization and much information is not available when needed, or worse, information gets recreated from other sources. In this paper, we present an automatic approach to spatio-temporal indexing of the datasets within an organization. The indexing process automatically identifies the spatial and temporal fields, normalizes and cleans those fields, and then loads them into a big data store where the information can be efficiently searched, queried, and analyzed. We evaluated our approach on 600 datasets published by the City of Los Angeles and show that we can automatically process their data and can efficiently access and analyze the indexed data.
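The pipeline described in the abstract (identify spatial and temporal fields, normalize them, load them into a searchable store) can be illustrated with a minimal sketch. This is not the authors' implementation: the value-based heuristics, the accepted date formats, the 0.9 threshold, and every function name below are assumptions chosen for illustration, and a plain Python list stands in for the big data store.

```python
from datetime import datetime

# Assumed input date formats; a real system would need many more.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y-%m-%dT%H:%M:%S")


def parse_date(value):
    """Return an ISO 8601 string if value matches an assumed format, else None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).isoformat()
        except ValueError:
            continue
    return None


def parse_float(value):
    """Return value as a float, or None if it is not numeric."""
    try:
        return float(value)
    except ValueError:
        return None


def classify_column(values, threshold=0.9):
    """Label a column 'temporal', 'latitude', 'longitude', or None.

    Toy heuristic: looks only at values, so any small-number column
    (for example a count) would be mislabeled as latitude.
    """
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return None
    dates = sum(parse_date(v) is not None for v in non_empty)
    if dates / len(non_empty) >= threshold:
        return "temporal"
    nums = [n for n in map(parse_float, non_empty) if n is not None]
    if len(nums) / len(non_empty) < threshold:
        return None
    if all(-90 <= n <= 90 for n in nums):
        return "latitude"
    if all(-180 <= n <= 180 for n in nums):
        return "longitude"
    return None


def index_dataset(name, rows):
    """Detect the spatial/temporal columns of rows and emit normalized documents."""
    columns = {key: [row.get(key, "") for row in rows] for key in rows[0]}
    by_label = {}
    for key, values in columns.items():
        label = classify_column(values)
        if label:
            by_label[label] = key
    docs = []
    for row in rows:
        doc = {"dataset": name}
        if "temporal" in by_label:
            doc["timestamp"] = parse_date(row.get(by_label["temporal"], ""))
        if "latitude" in by_label and "longitude" in by_label:
            lat = parse_float(row.get(by_label["latitude"], ""))
            lon = parse_float(row.get(by_label["longitude"], ""))
            if lat is not None and lon is not None:
                doc["location"] = {"lat": lat, "lon": lon}
        docs.append(doc)
    return docs


def query(docs, south, west, north, east, start, end):
    """Return docs inside the bounding box with start <= timestamp < end.

    Timestamps are compared as ISO 8601 strings, which sort chronologically.
    """
    hits = []
    for doc in docs:
        loc, ts = doc.get("location"), doc.get("timestamp")
        if loc is None or ts is None:
            continue
        in_box = south <= loc["lat"] <= north and west <= loc["lon"] <= east
        if in_box and start <= ts < end:
            hits.append(doc)
    return hits


rows = [
    {"id": "A17", "opened": "11/07/2017", "lat": "34.05", "lng": "-118.24"},
    {"id": "B2", "opened": "2017-11-08", "lat": "34.10", "lng": "-118.30"},
]
docs = index_dataset("demo", rows)
hits = query(docs, 34.0, -118.3, 34.07, -118.2, "2017-11-01", "2017-12-01")
```

Because every dataset is reduced to the same `timestamp` and `location` fields, a single bounding-box and time-range query can run across records that originally used different column names and date formats.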

Published In

UrbanGIS'17: Proceedings of the 3rd ACM SIGSPATIAL Workshop on Smart Cities and Urban Analytics
November 2017
118 pages
ISBN:9781450354950
DOI:10.1145/3152178
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data cleaning
  2. efficient querying and analysis
  3. large-scale integration
  4. spatio-temporal indexing
  5. urban data

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SIGSPATIAL'17

Cited By

  • (2022) Efficient Querying and Indexing of Moving Data Objects. 2022 International Conference on Futuristic Technologies (INCOFT), 1-6. DOI: 10.1109/INCOFT55651.2022.10094348. Online publication date: 25-Nov-2022
  • (2022) GeoCloud4SDI: a cloud enabled open framework for development of spatial data infrastructure at city level. Earth Science Informatics 16:1, 481-500. DOI: 10.1007/s12145-022-00893-6. Online publication date: 11-Nov-2022
  • (2019) Local geographic information storing and querying using Elasticsearch. Proceedings of the 13th Workshop on Geographic Information Retrieval, 1-4. DOI: 10.1145/3371140.3371144. Online publication date: 28-Nov-2019
  • (2019) Learning Data Transformations with Minimal User Effort. 2019 IEEE International Conference on Big Data (Big Data), 657-664. DOI: 10.1109/BigData47090.2019.9006350. Online publication date: Dec-2019
  • (2018) Aging in Place. Proceedings of the 1st ACM SIGSPATIAL Workshop on Advances on Resilient and Intelligent Cities, 1-2. DOI: 10.1145/3284566.3284567. Online publication date: 6-Nov-2018
