DOI: 10.1145/2484838.2484875
research-article

Astronomical data processing in EXTASCID

Published: 29 July 2013

Abstract

Scientific data have a dual structure. Raw data are predominantly ordered multi-dimensional arrays or sequences, while metadata and derived data are best represented as unordered relations. Scientific data processing requires complex operations over both arrays and relations. These operations cannot be expressed using only standard linear algebra and relational algebra operators, respectively. Existing scientific data processing systems are designed for a single data model and handle complex processing at the application level.
EXTASCID is a complete and extensible system for scientific data processing. It supports both array and relational data natively. Complex processing is handled by a metaoperator that can execute any user code. As a result, EXTASCID can process full scientific workflows inside the system, with minimal data movement and application code. We illustrate the overall process on a real dataset and workflow from astronomy: starting with a set of sky images, the goal is to identify and classify transient astrophysical objects.
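
The abstract does not describe the metaoperator's interface, so the following is only a minimal sketch of the general idea: one driver that runs arbitrary user code over partitions of either array chunks or relational tuples. The `accumulate`/`merge`/`terminate` methods, the `run_metaoperator` driver, and both example classes are hypothetical names chosen for illustration and are not EXTASCID's actual API.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical data shapes, assumed for this sketch only.
Chunk = List[List[float]]   # a 2-D tile of pixel intensities (array data)
Row = Tuple[str, float]     # an (object_id, magnitude) tuple (relational data)


class BrightPixelCount:
    """User code over array chunks: count pixels above a threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.count = 0

    def accumulate(self, chunk: Chunk) -> None:
        # Fold one image tile into the running state.
        self.count += sum(1 for row in chunk for px in row if px > self.threshold)

    def merge(self, other: "BrightPixelCount") -> None:
        # Combine the partial state computed on another partition.
        self.count += other.count

    def terminate(self) -> int:
        return self.count


class MeanMagnitude:
    """User code over relational tuples: average magnitude of all objects."""

    def __init__(self) -> None:
        self.total = 0.0
        self.n = 0

    def accumulate(self, row: Row) -> None:
        self.total += row[1]
        self.n += 1

    def merge(self, other: "MeanMagnitude") -> None:
        self.total += other.total
        self.n += other.n

    def terminate(self) -> Optional[float]:
        # None signals that no tuples were seen.
        return self.total / self.n if self.n else None


def run_metaoperator(make_state: Callable[[], object],
                     partitions: Iterable[Iterable[object]]):
    """Run the same user code on every partition, then merge the partial states."""
    result = make_state()
    for part in partitions:
        state = make_state()          # one independent state per partition
        for item in part:
            state.accumulate(item)
        result.merge(state)
    return result.terminate()


if __name__ == "__main__":
    tiles = [[[[0.1, 0.9], [0.8, 0.2]]], [[[0.95, 0.3], [0.4, 0.5]]]]
    print(run_metaoperator(lambda: BrightPixelCount(0.7), tiles))   # 3
    rows = [[("a", 18.0), ("b", 20.0)], [("c", 19.0)]]
    print(run_metaoperator(MeanMagnitude, rows))                    # 19.0
```

The driver never inspects the items it passes along, which is why the same loop can serve both data models in this sketch. Each partition could in principle be processed in parallel, since partial states are only combined in `merge`.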



Published In

SSDBM '13: Proceedings of the 25th International Conference on Scientific and Statistical Database Management
July 2013
401 pages
ISBN: 9781450319218
DOI: 10.1145/2484838
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

Overall Acceptance Rate 56 of 146 submissions, 38%

