DOI: 10.1145/3548785.3548802
research-article
Open access

Provenance in Spatial Queries

Published: 13 September 2022

Abstract

Data growth has been a known problem for several years, yet ever more people, tools and devices create and share data, so the need for tools that infer the provenance and quality of data is greater than before. Research on data provenance focuses on W3C PROV and on databases (where-, why- and how-provenance). For spatial data, however, research has mainly addressed provenance derived from documents and workflows, and there is no literature on spatial data provenance in DBMSs and queries.
This paper deals with the computation of How-, Why- and Where-provenance in spatial database queries. It evaluates how the formalisms and methods proposed for general-purpose database queries behave when applied to spatial data. Two tools for managing provenance in databases are used, and a discussion of the results and guidelines for future work are presented. This is a first contribution towards handling spatial data provenance by tuple, attribute and query, whereas previous work has only managed provenance at a coarser level, namely documents and workflows.
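To make the three notions concrete, the following is a minimal sketch of Why-, How- and Where-provenance for a single spatial join. It is an illustration only and not the method or tooling evaluated in the paper: the relations, the tuple identifiers (`p1`, `r1`, ...), the bounding-box containment predicate and all function names are assumptions introduced for this example.

```python
from collections import defaultdict

# Hypothetical toy relations; the keys (p1, r1, ...) act as tuple annotations.
points = {
    "p1": {"name": "school", "xy": (1.0, 1.0)},
    "p2": {"name": "clinic", "xy": (2.0, 3.0)},
    "p3": {"name": "depot", "xy": (8.0, 8.0)},
}
regions = {
    "r1": {"district": "north", "bbox": (0.0, 0.0, 4.0, 4.0)},
    "r2": {"district": "north", "bbox": (1.5, 2.0, 9.0, 9.0)},
}


def contains(bbox, xy):
    """Assumed spatial predicate: axis-aligned box contains the point."""
    xmin, ymin, xmax, ymax = bbox
    x, y = xy
    return xmin <= x <= xmax and ymin <= y <= ymax


def districts_with_points(points, regions):
    """Roughly: SELECT DISTINCT district FROM regions JOIN points ON contains(bbox, xy)."""
    why = defaultdict(set)    # output value -> set of witnesses (sets of tuple ids)
    where = defaultdict(set)  # output value -> source cells the value was copied from
    for rid, region in regions.items():
        for pid, point in points.items():
            if contains(region["bbox"], point["xy"]):
                why[region["district"]].add(frozenset({rid, pid}))
                where[region["district"]].add((rid, "district"))
    return dict(why), dict(where)


def how_polynomial(witnesses):
    """Render witnesses as a sum of products. In this query every derivation
    uses a distinct (region, point) pair, so no coefficients or exponents arise."""
    return " + ".join(sorted("*".join(sorted(w)) for w in witnesses))


why, where = districts_with_points(points, regions)
how = {district: how_polynomial(w) for district, w in why.items()}
print(how["north"])  # p1*r1 + p2*r1 + p2*r2 + p3*r2
```

In this sketch the join contributes the products and the duplicate-eliminating projection contributes the sum, which mirrors how semiring-style annotations combine; a real spatial DBMS would evaluate the predicate on geometry types rather than on bounding boxes.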



Published In

IDEAS '22: Proceedings of the 26th International Database Engineered Applications Symposium
August 2022
174 pages
ISBN: 9781450397094
DOI: 10.1145/3548785
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Data provenance
  2. Spatial data & Queries

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Fundação para a Ciência e a Tecnologia

Conference

IDEAS'22

Acceptance Rates

Overall Acceptance Rate 74 of 210 submissions, 35%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 226
  • Downloads (Last 12 months): 98
  • Downloads (Last 6 weeks): 12

Reflects downloads up to 10 Dec 2024
