More Web Proxy on the site http://driver.im/

research-article

Open access

Understanding and Reusing Test Suites Across Database Systems

Authors:

Manuel RiggerAuthors Info & Claims

Proceedings of the ACM on Management of Data, Volume 2, Issue 6

Article No.: 253, Pages 1 - 26

https://doi.org/10.1145/3698829

Published: 20 December 2024 Publication History

Abstract

Database Management System (DBMS) developers have implemented extensive test suites to test their DBMSs. For example, the SQLite test suites contain over 92 million lines of code. Despite these extensive efforts, test suites are not systematically reused across DBMSs, leading to wasted effort. Integration is challenging, as test suites use various test case formats and rely on unstandardized test runner features. We present a unified test suite, SQuaLity, in which we integrated test cases from three widely-used DBMSs, SQLite, PostgreSQL, and DuckDB. In addition, we present an empirical study to determine the potential of reusing these systems' test suites. Our results indicate that reusing test suites is challenging: First, test formats and test runner commands vary widely; for example, SQLite has 4 test runner commands, while MySQL has 112 commands with additional features, to, for example, execute file operations or interact with a shell. Second, while some test suites contain mostly standard-compliant statements (e.g., 99% in SQLite), other test suites mostly test non-standardized functionality (e.g., 31% of statements in the PostgreSQL test suite are nonstandardized). Third, test reuse is complicated by various explicit and implicit dependencies, such as the need to set variables and configurations, certain test cases requiring extensions not present by default, and query results depending on specific clients. Despite the above findings, we have identified 3 crashes, 3 hangs, and multiple compatibility issues across four different DBMSs by executing test suites across DBMSs, indicating the benefits of reuse. Overall, this work represents the first step towards test-case reuse in the context of DBMSs, and we hope that it will inspire follow-up work on this important topic.

References

[1]

2022. Sqllogictest. https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki

[2]

2023. ISO/IEC 9075 "Information technology - Database languages - SQL".

[3]

2023. MySQL Test Commands. https://dev.mysql.com/doc/dev/mysql-server/latest/PAGE_MYSQL_TEST_COMMANDS.html Accessed: 2023--8--8.

[4]

Jinsheng Ba and Manuel Rigger. 2023. Testing Database Engines via Query Plan Guidance. In Proceedings of the 45th International Conference on Software Engineering (Melbourne, Victoria, Australia) (ICSE '23). IEEE Press, 2060--2071. https://doi.org/10.1109/ICSE48619.2023.00174

Digital Library

[5]

Jinsheng Ba and Manuel Rigger. 2024. CERT: Finding Performance Issues in Database Systems Through the Lens of Cardinality Estimation. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (, Lisbon, Portugal,) (ICSE '24). Association for Computing Machinery, New York, NY, USA, Article 133, 13 pages. https://doi.org/10.1145/3597503.3639076

Digital Library

[6]

Jinsheng Ba and Manuel Rigger. 2024. An Exploratory Case Study of Query Plan Representations. arXiv preprint arXiv:2408.07857 (2024).

[7]

Sebastian Baltes and Paul Ralph. 2022. Sampling in software engineering research: A critical review and guidelines. Empirical Software Engineering 27, 4 (2022), 94.

Digital Library

[8]

Earl T. Barr, Mark Harman, Yue Jia, Alexandru Marginean, and Justyna Petke. 2015. Automated Software Transplantation. In Proceedings of the 2015 International Symposium on Software Testing and Analysis (Baltimore, MD, USA) (ISSTA 2015). Association for Computing Machinery, New York, NY, USA, 257--269. https://doi.org/10.1145/2771783.2771796

Digital Library

[9]

Alexander Böhm, Maria Christakis, Eric Lo, and Manuel Rigger. 2022. Ensuring the Reliability and Robustness of Database Management Systems (Dagstuhl Seminar 21442). In Dagstuhl Reports, Vol. 11. Schloss Dagstuhl-Leibniz- Zentrum für Informatik.

[10]

Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. 2008. Bigtable: A Distributed Storage System for Structured Data. ACM Trans. Comput. Syst. 26, 2, Article 4 (jun 2008), 26 pages. https://doi.org/10.1145/1365815.1365816

Digital Library

[11]

Ziyu Cui, Wensheng Dou, Qianwang Dai, Jiansen Song, Wei Wang, Jun Wei, and Dan Ye. 2023. Differentially Testing Database Transactions for Fun and Profit. In Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (Rochester, MI, USA) (ASE '22). Association for Computing Machinery, New York, NY, USA, Article 35, 12 pages. https://doi.org/10.1145/3551349.3556924

Digital Library

[12]

Davide Falessi, Natalia Juristo, Claes Wohlin, Burak Turhan, Jürgen Münch, Andreas Jedlitschka, and Markku Oivo. 2018. Empirical software engineering experts on the use of students and professionals in experiments. Empirical Software Engineering 23 (2018), 452--489.

Digital Library

[13]

Jingzhou Fu, Jie Liang, Zhiyong Wu, and Yu Jiang. 2024. Sedar: Obtaining High-Quality Seeds for DBMS Fuzzing via Cross-DBMS SQL Transfer. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (Lisbon, Portugal) (ICSE '24). Association for Computing Machinery, New York, NY, USA, Article 146, 12 pages. https://doi.org/10.1145/3597503.3639210

Digital Library

[14]

Jingzhou Fu, Jie Liang, Zhiyong Wu, Mingzhe Wang, and Yu Jiang. 2023. Griffin: Grammar-Free DBMS Fuzzing. In Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (Rochester, MI, USA) (ASE '22). Association for Computing Machinery, New York, NY, USA, Article 49, 12 pages. https://doi.org/10.1145/3551349.3560431

Digital Library

[15]

Goetz Graefe et al. 2011. Modern B-tree techniques. Foundations and Trends® in Databases 3, 4 (2011), 203--402.

[16]

Paolo Guagliardo and Leonid Libkin. 2017. A Formal Semantics of SQL Queries, Its Validation, and Applications. Proc. VLDB Endow. 11, 1 (sep 2017), 27--39. https://doi.org/10.14778/3151113.3151116

Digital Library

[17]

Surabhi Gupta and Karthik Ramachandra. 2021. Procedural Extensions of SQL: Understanding their usage in the wild. Proceedings of the VLDB Endowment 14, 8 (2021), 1378--1391.

Digital Library

[18]

Zu-Ming Jiang, Jia-Ju Bai, and Zhendong Su. 2023. DynSQL: Stateful Fuzzing for Database Management Systems with Complex and Valid SQL Query Generation. In 32nd USENIX Security Symposium (USENIX Security 23). USENIX Association, Anaheim, CA, 4949--4965. https://www.usenix.org/conference/usenixsecurity23/presentation/jiang-zuming

[19]

Zu-Ming Jiang, Si Liu, Manuel Rigger, and Zhendong Su. 2023. Detecting Transactional Bugs in Database Engines via Graph-Based Oracle Construction. In 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI . USENIX Association, Boston, MA, 397--417. https://www.usenix.org/conference/osdi23/presentation/jiang

[20]

Jinho Jung, Hong Hu, Joy Arulraj, Taesoo Kim, andWoonhak Kang. 2019. APOLLO: Automatic detection and diagnosis of performance regressions in database systems. Proceedings of the VLDB Endowment 13, 1 (2019), 57--70.

Digital Library

[21]

Kyle Kingsbury and Peter Alvaro. 2020. Elle: Inferring Isolation Anomalies from Experimental Observations. Proc. VLDB Endow. 14, 3 (nov 2020), 268--280. https://doi.org/10.14778/3430915.3430918

Digital Library

[22]

Vu Le, Mehrdad Afshari, and Zhendong Su. 2014. Compiler validation via equivalence modulo inputs. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (Edinburgh, United Kingdom) (PLDI '14). Association for Computing Machinery, New York, NY, USA, 216--226. https://doi.org/10.1145/2594291.2594334

Digital Library

[23]

Yu Liang, Song Liu, and Hong Hu. 2022. Detecting Logical Bugs of DBMS with Coverage-based Guidance. In 31st USENIX Security Symposium (USENIX Security 22). USENIX Association, Boston, MA, 4309--4326. https://www.usenix.org/conference/usenixsecurity22/presentation/liang

[24]

Xinyu Liu, Qi Zhou, Joy Arulraj, and Alessandro Orso. 2022. Automatic Detection of Performance Bugs in Database Systems Using Equivalent Queries. In Proceedings of the 44th International Conference on Software Engineering (Pittsburgh, Pennsylvania) (ICSE '22). Association for Computing Machinery, New York, NY, USA, 225--236. https://doi.org/10.1145/3510003.3510093

Digital Library

[25]

Leonardo Mariani, Ali Mohebbi, Mauro Pezzè, and Valerio Terragni. 2021. Semantic Matching of GUI Events for Test Reuse: Are We There Yet?. In Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis (Virtual, Denmark) (ISSTA 2021). Association for Computing Machinery, New York, NY, USA, 177--190. https://doi.org/10.1145/3460319.3464827

Digital Library

[26]

Kayvan Memarian, Victor B. F. Gomes, Brooks Davis, Stephen Kell, Alexander Richardson, Robert N. M. Watson, and Peter Sewell. 2019. Exploring C Semantics and Pointer Provenance. Proc. ACM Program. Lang. 3, POPL, Article 67 (jan 2019), 32 pages. https://doi.org/10.1145/3290380

Digital Library

[27]

Kayvan Memarian, Justus Matthiesen, James Lingard, Kyndylan Nienhuis, David Chisnall, Robert N. M. Watson, and Peter Sewell. 2016. Into the Depths of C: Elaborating the de Facto Standards. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (Santa Barbara, CA, USA) (PLDI '16). Association for Computing Machinery, New York, NY, USA, 1--15. https://doi.org/10.1145/2908080.2908081

Digital Library

[28]

C. Mohan, Don Haderle, Bruce Lindsay, Hamid Pirahesh, and Peter Schwarz. 1992. ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging. ACM Trans. Database Syst. 17, 1 (mar 1992), 94--162. https://doi.org/10.1145/128765.128770

Digital Library

[29]

Alex Monahan. 2022. Friendlier SQL with DuckDB. https://duckdb.org/2022/05/04/friendlier-sql.html

[30]

Thomas Neumann and Bernhard Radke. 2018. Adaptive Optimization of Very Large Join Queries. In Proceedings of the 2018 International Conference on Management of Data (Houston, TX, USA) (SIGMOD '18). Association for Computing Machinery, New York, NY, USA, 677--692. https://doi.org/10.1145/3183713.3183733

Digital Library

[31]

Mark Raasveldt. 2022. DuckDB Testing - Present and Future. Keynote speech at DBTest '22: Proceedings of the 2022 workshop on 9th International Workshop of Testing Database Systems.

[32]

Mark Raasveldt and Hannes Mühleisen. 2019. DuckDB: an embeddable analytical database. In Proceedings of the 2019 International Conference on Management of Data. 1981--1984.

Digital Library

[33]

Manuel Rigger and Zhendong Su. 2020. Detecting optimization bugs in database engines via non-optimizing reference engine construction. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Virtual Event, USA) (ESEC/FSE 2020). Association for Computing Machinery, New York, NY, USA, 1140--1152. https://doi.org/10.1145/3368089.3409710

Digital Library

[34]

Manuel Rigger and Zhendong Su. 2020. Finding bugs in database systems via query partitioning. Proc. ACM Program. Lang. 4, OOPSLA, Article 211 (nov 2020), 30 pages. https://doi.org/10.1145/3428279

Digital Library

[35]

Manuel Rigger and Zhendong Su. 2020. Testing database engines via pivoted query synthesis. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20). 667--682.

[36]

P. Griffiths Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. 1979. Access Path Selection in a Relational Database Management System. In Proceedings of the 1979 ACM SIGMOD International Conference on Management of Data (Boston, Massachusetts) (SIGMOD '79). Association for Computing Machinery, New York, NY, USA, 23--34. https://doi.org/10.1145/582095.582099

Digital Library

[37]

Andreas Seltenreich. 2022. Sqlsmith. https://github.com/anse1/sqlsmith

[38]

Donald R. Slutz. 1998. Massive Stochastic Testing of SQL. In Proceedings of the 24rd International Conference on Very Large Data Bases (VLDB '98). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 618--622.

Digital Library

[39]

solid IT. 2023. DB-Engines Ranking. https://db-engines.com/en/ranking

[40]

Xiu Tang, Sai Wu, Dongxiang Zhang, Feifei Li, and Gang Chen. 2023. Detecting Logic Bugs of Join Optimizations in DBMS. Proc. ACM Manag. Data 1, 1, Article 55 (may 2023), 26 pages. https://doi.org/10.1145/3588909

Digital Library

[41]

Rajeev Tiwari and Noopur Goel. 2013. Reuse: Reducing Test Effort. SIGSOFT Softw. Eng. Notes 38, 2 (mar 2013), 1--11. https://doi.org/10.1145/2439976.2439982

Digital Library

[42]

Etienne Toussaint, Paolo Guagliardo, Leonid Libkin, and Juan Sequeda. 2022. Troubles with Nulls, Views from the Users. Proc. VLDB Endow. 15, 11 (jul 2022), 2613--2625. https://doi.org/10.14778/3551793.3551818

Digital Library

[43]

Adrian Vogelsgesang, Michael Haubenschild, Jan Finis, Alfons Kemper, Viktor Leis, Tobias Muehlbauer, Thomas Neumann, and Manuel Then. 2018. Get Real: How Benchmarks Fail to Represent the Real World. In Proceedings of the Workshop on Testing Database Systems (Houston, TX, USA) (DBTest'18). Association for Computing Machinery, New York, NY, USA, Article 1, 6 pages. https://doi.org/10.1145/3209950.3209952

Digital Library

[44]

Chenggang Wu, Alekh Jindal, Saeed Amizadeh, Hiren Patel, Wangchao Le, Shi Qiao, and Sriram Rao. 2018. Towards a Learning Optimizer for Shared Clouds. Proc. VLDB Endow. 12, 3 (nov 2018), 210--222. https://doi.org/10.14778/3291264.3291267

Digital Library

[45]

Tianyin Xu, Jiaqi Zhang, Peng Huang, Jing Zheng, Tianwei Sheng, Ding Yuan, Yuanyuan Zhou, and Shankar Pasupathy. 2013. Do Not Blame Users for Misconfigurations. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles (Farminton, Pennsylvania) (SOSP '13). Association for Computing Machinery, New York, NY, USA, 244--259. https://doi.org/10.1145/2517349.2522727

Digital Library

[46]

YouTube. 2020. DuckDB -- the SQLite for analytics (Mark Raasveldt, CWI). https://www.youtube.com/watch?v= PFUZlNQIndo. Accessed: 2024-07--19.

[47]

Andreas Zeller. 1999. Yesterday, My Program Worked. Today, It Does Not. Why? SIGSOFT Softw. Eng. Notes 24, 6 (oct 1999), 253--267. https://doi.org/10.1145/318774.318946

Digital Library

[48]

A. Zeller and R. Hildebrandt. 2002. Simplifying and isolating failure-inducing input. IEEE Transactions on Software Engineering 28, 2 (2002), 183--200. https://doi.org/10.1109/32.988498

Digital Library

[49]

Yixue Zhao, Justin Chen, Adriana Sejfia, Marcelo Schmitt Laser, Jie Zhang, Federica Sarro, Mark Harman, and Nenad Medvidovic. 2020. FrUITeR: A Framework for Evaluating UI Test Reuse. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Virtual Event, USA) (ESEC/FSE 2020). Association for Computing Machinery, New York, NY, USA, 1190--1201. https://doi.org/10.1145/3368089.3409708

Digital Library

[50]

Rui Zhong, Yongheng Chen, Hong Hu, Hangfan Zhang, Wenke Lee, and Dinghao Wu. 2020. SQUIRREL: Testing Database Management Systems with Language Validity and Coverage Feedback. In Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security (Virtual Event, USA) (CCS '20). Association for Computing Machinery, New York, NY, USA, 955--970. https://doi.org/10.1145/3372297.3417260

Digital Library

Index Terms

Understanding and Reusing Test Suites Across Database Systems
1. Information systems
  1. Data management systems
2. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        Software testing and debugging

Recommendations

Integration testing through reusing representative unit test cases for high-confidence medical software

As medical software is getting larger-sized, complex, and connected with other devices, finding faults in integrated software modules gets more difficult and time consuming. Existing integration testing typically takes a black-box approach, which treats ...
Prioritizing random combinatorial test suites
SAC '17: Proceedings of the Symposium on Applied Computing

The behaviour of a system under test can be influenced by several factors, such as system configurations, user inputs, and so on. It has also been observed that many failures are caused by only a small number of factors. Combinatorial testing aims at ...
Reusing Component Test Cases for Integration Testing of Retarding Embedded System Components
VALID '09: Proceedings of the 2009 First International Conference on Advances in System Testing and Validation Lifecycle

The integration of retarding components represents a crucial challenge for reusing component test cases for integration testing. Failing to synchronize both test stimulation and test evaluation easily results in useless test executions missing the ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Management of Data

Proceedings of the ACM on Management of Data Volume 2, Issue 6

SIGMOD

December 2024

792 pages

EISSN:2836-6573

DOI:10.1145/3709598

Editor:
Divyakant Agrawal
University of California, Santa Barbara, United States

Issue’s Table of Contents

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 December 2024

Published in PACMMOD Volume 2, Issue 6

Author Tags

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
61
Total Downloads

Downloads (Last 12 months)61
Downloads (Last 6 weeks)61

Reflects downloads up to 29 Jan 2025

Other Metrics

View Author Metrics

Citations

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Figures

Tables

Media

View Issue’s Table of Contents