[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article
Open access

Translating canonical SQL to imperative code in Coq

Published: 29 April 2022 Publication History

Abstract

SQL is by far the most widely used and implemented query language. Yet, on some key features, such as correlated queries and NULL value semantics, many implementations diverge or contain bugs. We leverage recent advances in the formalization of SQL and query compilers to develop DBCert, the first mechanically verified compiler from SQL queries written in a canonical form to imperative code. Building DBCert required several new contributions which are described in this paper. First, we specify and mechanize a complete translation from SQL to the Nested Relational Algebra which can be used for query optimization. Second, we define Imp, a small imperative language sufficient to express SQL and which can target several execution languages including JavaScript. Finally, we develop a mechanized translation from the nested relational algebra to Imp, using the nested relational calculus as an intermediate step.

References

[1]
Serge Abiteboul, Richard Hull, and Victor Vianu. 1995. Foundations of Databases. Addison-Wesley.
[2]
2022. AlaSQL JavaScript SQL Database Library. http://alasql.org.
[3]
Joshua S. Auerbach, Martin Hirzel, Louis Mandel, Avraham Shinnar, and Jérôme Siméon. 2017. Handling Environments in a Nested Relational Algebra with Combinators and an Implementation in a Verified Query Compiler. 1555–1569. isbn:978-1-4503-4197-4 https://doi.org/10.1145/3035918.3035961
[4]
Joshua S. Auerbach, Martin Hirzel, Louis Mandel, Avraham Shinnar, and Jérôme Siméon. 2017. Q*cert: A Platform for Implementing and Verifying Query Compilers. 1703–1706. isbn:978-1-4503-4197-4 https://doi.org/10.1145/3035918.3056447
[5]
Véronique Benzaken and Évelyne Contejean. 2019. A Coq mechanised formal semantics for realistic SQL queries: formally reconciling SQL and bag relational algebra. In Proceedings of the 8th ACM SIGPLAN International Conference on Certified Programs and Proofs, CPP 2019, Cascais, Portugal, January 14-15, 2019. 249–261. https://doi.org/10.1145/3293880.3294107
[6]
Véronique Benzaken, Éveleyne Contejean, Mohammed Houssem Hachmaoui, Chantal Keller, Louis Mandel, Avraham Shinnar, and Jérôme Siméon. 2022. Translating Canonical SQL to Imperative Code in Coq. https://doi.org/10.5281/zenodo.6366579
[7]
Véronique Benzaken, Évelyne Contejean, Mohammed Houssem Hachmaoui, Chantal Keller, Louis Mandel, Avraham Shinnar, and Jérôme Siméon. 2022. Translating Canonical SQL to Imperative Code in Coq – extended. CoRR, abs/2203.08941 (2022), https://doi.org/10.48550/arXiv.2203.08941
[8]
Dariusz Biernacki, Jean-Louis Colaço, Grégoire Hamon, and Marc Pouzet. 2008. Clock-directed modular code generation for synchronous data-flow languages. In LCTES. ACM, 121–130. https://doi.org/10.1145/1375657.1375674
[9]
Martin Bodin, Arthur Charguéraud, Daniele Filaretti, Philippa Gardner, Sergio Maffeis, Daiva Naudziuniene, Alan Schmitt, and Gareth Smith. 2014. A trusted mechanised JavaScript specification. In The 41st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’14, San Diego, CA, USA, January 20-21, 2014. 87–100. https://doi.org/10.1145/2535838.2535876
[10]
Bin Cao and Antonio Badia. 2007. SQL query optimization through nested relational algebra. ACM Trans. Database Syst., 32, 3 (2007), 18. https://doi.org/10.1145/1272743.1272748
[11]
Shumo Chu, Konstantin Weitz, Alvin Cheung, and Dan Suciu. 2017. HoTTSQL: Proving Query Rewrites with Univalent SQL Semantics. In PLDI 2017. ACM, New York, NY, USA. 510–524. https://doi.org/10.1145/3062341.3062348
[12]
Jens Clauß en, Alfons Kemper, Guido Moerkotte, and Klaus Peithner. 1997. Optimizing Queries with Universal Quantification in Object-Oriented and Object-Relational Databases. In Conference on Very Large Data Bases (VLDB). 286–295.
[13]
Jens Clauß en, Alfons Kemper, Guido Moerkotte, Klaus Peithner, and Michael Steinbrunn. 2000. Optimization and Evaluation of Disjunctive Queries. IEEE Trans. Knowl. Data Eng., 12, 2 (2000), 238–260. https://doi.org/10.1109/69.842265
[14]
Sophie Cluet and Guido Moerkotte. 1993. Nested Queries in Object Bases. In Database Programming Languages (DBPL-4), Manhattan, New York City, USA, 30 August - 1 September 1993. 226–242.
[15]
Todd J. Green, Gregory Karvounarakis, and Val Tannen. 2007. Provenance semirings. In Proceedings of the Twenty-Sixth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 11-13, 2007, Beijing, China. 31–40. https://doi.org/10.1145/1265530.1265535
[16]
Akshay Grover, Jay Gholap, Vandana P. Janeja, Yelena Yesha, Raghu Chintalapati, Harsh Marwaha, and Kunal Modi. 2015. SQL-like big data environments: Case study in clinical trial analytics. In 2015 IEEE International Conference on Big Data, Big Data 2015, Santa Clara, CA, USA, October 29 - November 1, 2015. IEEE Computer Society, 2680–2689. isbn:978-1-4799-9926-2 https://doi.org/10.1109/BigData.2015.7364068
[17]
Paolo Guagliardo and Leonid Libkin. 2017. A Formal Semantics of SQL Queries, Its Validation, and Applications. PVLDB, 11, 1 (2017), 27–39. https://doi.org/10.14778/3151113.3151116
[18]
TU Jin-De. 2010. StreamSQL: A Query Language for Stream Data. Computer Systems & Applications, 26.
[19]
Jacques-Henri Jourdan, François Pottier, and Xavier Leroy. 2012. Validating LR(1) Parsers. In Programming Languages and Systems - 21st European Symposium on Programming, ESOP 2012, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2012, Tallinn, Estonia, March 24 - April 1, 2012. Proceedings, Helmut Seidl (Ed.) (Lecture Notes in Computer Science, Vol. 7211). Springer, 397–416. isbn:978-3-642-28868-5 https://doi.org/10.1007/978-3-642-28869-2_20
[20]
Taewhi Lee, Moonyoung Chung, Sung-Soo Kim, Hyewon Song, and Jongho Won. 2016. Partial Materialization for Data Integration in SQL-on-Hadoop Engines. In 6th International Conference on IT Convergence and Security, ICITCS 2016, Prague, Czech Republic, September 26, 2016. IEEE Computer Society, 1–2. isbn:978-1-5090-3765-0 https://doi.org/10.1109/ICITCS.2016.7740361
[21]
J. Gregory Malecha, Greg Morrisett, Avraham Shinnar, and Ryan Wisnesky. 2010. Toward a Verified Relational Database Management System. In ACM Int. Conf. POPL. https://doi.org/10.1145/1706299.1706329
[22]
Guido Moerkotte. 2020. Building Query Compilers. Univ. Mannheim. https://pi3.informatik.uni-mannheim.de/~moer/querycompiler.pdf
[23]
2017. Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017, Chicago, IL, USA, May 14-19, 2017, Semih Salihoglu, Wenchao Zhou, Rada Chirkova, Jun Yang, and Dan Suciu (Eds.). ACM. isbn:978-1-4503-4197-4 http://dl.acm.org/citation.cfm?id=3035918
[24]
Amir Shaikhha, Yannis Klonatos, Lionel Parreaux, Lewis Brown, Mohammad Dashti, and Christoph Koch. 2016. How to Architect a Query Compiler. In SIGMOD Conference. ACM, 1907–1922. https://doi.org/10.1145/2882903.2915244
[25]
Avraham Shinnar, Jérôme Siméon, and Martin Hirzel. 2015. A Pattern Calculus for Rule Languages: Expressiveness, Compilation, and Mechanization. In 29th European Conference on Object-Oriented Programming, ECOOP 2015, July 5-10, 2015, Prague, Czech Republic. 542–567. https://doi.org/10.4230/LIPIcs.ECOOP.2015.542
[26]
2021. SQL Alchemy: The Python SQL Toolkit and Object Relational Mapper. https://www.sqlalchemy.org.
[27]
2022. SQLite compiled to JavaScript. https://sql.js.org/.
[28]
Jeffrey D. Ullman. 1982. Principles of Database Systems, 2nd Edition. Computer Science Press. isbn:0-914894-36-6
[29]
Jan Van den Bussche and Stijn Vansummeren. 2007. Polymorphic type inference for the named nested relational calculus. Transactions on Computational Logic (TOCL), 9, 1 (2007), https://doi.org/10.1145/1297658.1297661

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Programming Languages
Proceedings of the ACM on Programming Languages  Volume 6, Issue OOPSLA1
April 2022
687 pages
EISSN:2475-1421
DOI:10.1145/3534679
Issue’s Table of Contents
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 April 2022
Published in PACMPL Volume 6, Issue OOPSLA1

Permissions

Request permissions for this article.

Check for updates

Badges

Author Tags

  1. Coq
  2. JavaScript
  3. Query compiler
  4. SQL
  5. Semantics preserving compiler

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 519
    Total Downloads
  • Downloads (Last 12 months)163
  • Downloads (Last 6 weeks)29
Reflects downloads up to 11 Dec 2024

Other Metrics

Citations

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media