DOI: 10.1145/2948674.2948677
research-article

Space odyssey: efficient exploration of scientific data

Published: 26 June 2016

Abstract

Advances in data acquisition, whether through more powerful supercomputers for simulation or sensors with better resolution, help scientists tremendously to understand natural phenomena. At the same time, however, these advances leave scientists with a plethora of data and the challenge of analysing it. Ingesting all the data into a database or indexing it up front for efficient analysis is unlikely to pay off because scientists rarely need to analyse all of the data. Not knowing a priori which parts of the datasets need to be analysed makes the problem challenging.
Tools and methods for analysing only subsets of this data are rare. In this paper we therefore present Space Odyssey, a novel approach that enables scientists to efficiently explore multiple spatial datasets of massive size. Without any prior information, Space Odyssey incrementally indexes the datasets and optimizes access to datasets that are frequently queried together. As our experiments show, by incrementally indexing and changing the data layout on disk, Space Odyssey accelerates exploratory analysis of spatial data by substantially reducing query-to-insight time compared to the state of the art.
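
To make the idea of incremental, query-driven indexing concrete, the following is a minimal sketch of a lazily refined quadtree over 2D points. It is not the Space Odyssey algorithm, whose details the abstract does not give; the names `Cell`, `IncrementalQuadIndex`, `leaf_capacity`, and `splits` are hypothetical, and the sketch does not cover the reorganisation of datasets that are frequently queried together. It only illustrates the general principle that no index exists up front and that each query refines just the region it touches.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), closed


def overlaps(a: Box, b: Box) -> bool:
    """True if the two closed boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(box: Box, p: Point) -> bool:
    """True if point p lies inside the closed box."""
    return box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]


@dataclass
class Cell:
    box: Box
    points: List[Point] = field(default_factory=list)
    children: Optional[List["Cell"]] = None  # None: not refined yet


class IncrementalQuadIndex:
    """Hypothetical illustration: a quadtree built lazily by the queries."""

    def __init__(self, points: List[Point], box: Box, leaf_capacity: int = 4):
        # No upfront indexing: all points start in one unrefined cell.
        self.root = Cell(box, list(points))
        self.leaf_capacity = leaf_capacity
        self.splits = 0  # counts refinement steps, for inspection only

    def _split(self, cell: Cell) -> None:
        xmin, ymin, xmax, ymax = cell.box
        xm, ym = (xmin + xmax) / 2, (ymin + ymax) / 2
        cell.children = [
            Cell((xmin, ymin, xm, ym)), Cell((xm, ymin, xmax, ym)),
            Cell((xmin, ym, xm, ymax)), Cell((xm, ym, xmax, ymax)),
        ]
        for p in cell.points:
            # Each point goes to exactly one quadrant.
            i = (1 if p[0] >= xm else 0) + (2 if p[1] >= ym else 0)
            cell.children[i].points.append(p)
        cell.points = []
        self.splits += 1

    def query(self, window: Box) -> List[Point]:
        """Return points inside window, refining touched cells by one level."""
        hits: List[Point] = []
        stack = [(self.root, True)]  # (cell, may this query refine it?)
        while stack:
            cell, may_refine = stack.pop()
            if not overlaps(cell.box, window):
                continue
            if cell.children is not None:
                stack.extend((c, True) for c in cell.children)
            elif may_refine and len(cell.points) > self.leaf_capacity:
                # Refine one level only, so per-query indexing work stays small.
                self._split(cell)
                stack.extend((c, False) for c in cell.children)
            else:
                hits.extend(p for p in cell.points if contains(window, p))
        return hits
```

Repeating a query over the same region deepens the tree only there, while regions that are never queried stay as a single unindexed cell. A real system would also need to bound the depth for heavily duplicated points and to work against data on disk, both of which this sketch leaves out.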

Published In

ExploreDB '16: Proceedings of the Third International Workshop on Exploratory Search in Databases and the Web
June 2016
38 pages
ISBN: 9781450343121
DOI: 10.1145/2948674
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

  • LogicBlox: LogicBlox Inc.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Funding Sources

  • European Union

Conference

SIGMOD/PODS'16
Sponsor:
  • LogicBlox
SIGMOD/PODS'16: International Conference on Management of Data
June 26 - July 1, 2016
San Francisco, California

Acceptance Rates

ExploreDB '16 Paper Acceptance Rate 5 of 11 submissions, 45%;
Overall Acceptance Rate 11 of 21 submissions, 52%

Cited By

  • (2024) Optimizing Dataflow Systems for Scalable Interactive Visualization. Proceedings of the ACM on Management of Data 2:1 (1-25). DOI: 10.1145/3639276. Online publication date: 26-Mar-2024
  • (2021) RawVis: A System for Efficient In-situ Visual Analytics. Proceedings of the 2021 International Conference on Management of Data (2760-2764). DOI: 10.1145/3448016.3452764. Online publication date: 9-Jun-2021
  • (2021) Novel approaches on bulk‐loading of large scale spatial datasets. Concurrency and Computation: Practice and Experience 34:9. DOI: 10.1002/cpe.6596. Online publication date: 10-Sep-2021
  • (2019) Identifying the Most Interactive Object in Spatial Databases. 2019 IEEE 35th International Conference on Data Engineering (ICDE) (1286-1297). DOI: 10.1109/ICDE.2019.00117. Online publication date: Apr-2019
