Designing an open framework for query optimization and compilation

M Jungmair, A Kohn, J Giceva - Proceedings of the VLDB Endowment, 2022 - dl.acm.org
Since its invention, data-centric code generation has been adopted for query compilation by various database systems in academia and industry. These database systems are fast but maximize performance at the expense of developer friendliness, flexibility, and extensibility. Recent advances in the field of compiler construction identified similar issues for domain-specific compilers and introduced a solution with MLIR, a generic infrastructure for domain-specific dialects.
We propose a layered query compilation stack based on MLIR with open intermediate representations that can be combined at each layer. We further propose moving query optimization into the query compiler to benefit from the existing optimization infrastructure and make cross-domain optimization viable. With LingoDB, we demonstrate that this approach significantly decreases the implementation effort and is highly flexible and extensible. At the same time, LingoDB achieves high performance and low compilation latencies.