Compiled query execution engine using JVM
J Rao, H Pirahesh, C Mohan…
22nd International Conference on Data Engineering (ICDE'06), 2006
A conventional query execution engine in a database system essentially uses a SQL virtual machine (SVM) to interpret a dataflow tree in which each node is associated with a relational operator. During query evaluation, a single tuple at a time is processed and passed among the operators. Such a model is popular because of its efficiency for pipelined processing. However, since each operator is implemented statically, it has to be very generic in order to deal with all possible queries. Such generality tends to introduce significant runtime inefficiency, especially in the context of memory-resident systems, because the granularity of data processing (a tuple) is too small compared with the associated overhead. Another disadvantage in such an engine is that each operator code is compiled statically, so query-specific optimization cannot be applied. To improve runtime efficiency, we propose a compiled execution engine, which, for a given query, generates new query-specific code on the fly, and then dynamically compiles and executes the code. The Java platform makes our approach particularly interesting for several reasons: (1) modern Java Virtual Machines (JVM) have Just-In-Time (JIT) compilers that optimize code at runtime based on the execution pattern, a key feature that SVMs lack; (2) because of Java’s continued popularity, JVMs keep improving at a faster pace than SVMs, allowing us to exploit new advances in the Java runtime in the future; (3) Java is a dynamic language, which makes it convenient to load a piece of new code on the fly. In this paper, we develop both an interpreted and a compiled query execution engine in a relational, Java-based, in-memory database prototype, and perform an experimental study. Our experimental results on the TPC-H data set show that, despite both engines benefiting from JIT, the compiled engine runs on average about twice as fast as the interpreted one, and significantly faster than an in-memory commercial system, using SVM.
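The contrast the abstract draws can be illustrated with a minimal sketch. It is not the paper's prototype: every class and method name below is hypothetical, and the example query (summing a price column for rows whose quantity is under a threshold) is an assumption chosen for brevity. The `interpreted` method walks a generic tuple-at-a-time operator tree, while `compiled` shows the kind of fused, typed loop a query-specific code generator might emit. The steps of generating that source at run time, compiling it, and loading it are deliberately left out.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical names throughout; nothing here is taken from the paper's prototype.
final class QueryEngineSketch {

    // Generic operator interface: one tuple per call, as in an interpreted engine.
    interface Operator {
        Object[] next(); // returns null when the input is exhausted
    }

    // Scan over an in-memory table of untyped tuples.
    static final class Scan implements Operator {
        private final Iterator<Object[]> it;
        Scan(List<Object[]> table) { this.it = table.iterator(); }
        public Object[] next() { return it.hasNext() ? it.next() : null; }
    }

    // Generic selection: must work for any predicate over any tuple layout.
    static final class Select implements Operator {
        private final Operator child;
        private final Predicate<Object[]> predicate;
        Select(Operator child, Predicate<Object[]> predicate) {
            this.child = child;
            this.predicate = predicate;
        }
        public Object[] next() {
            Object[] tuple;
            while ((tuple = child.next()) != null) {
                if (predicate.test(tuple)) return tuple;
            }
            return null;
        }
    }

    // Interpreted evaluation of: SELECT SUM(price) FROM t WHERE qty < maxQty.
    // Each tuple passes through virtual calls, boxing, and casts.
    static long interpreted(List<Object[]> table, int maxQty) {
        Operator plan = new Select(new Scan(table), t -> ((Integer) t[0]) < maxQty);
        long sum = 0;
        Object[] tuple;
        while ((tuple = plan.next()) != null) {
            sum += (Long) tuple[1];
        }
        return sum;
    }

    // Roughly what generated query-specific code could look like (an assumption):
    // the scan, filter, and aggregate are fused into one loop over typed columns.
    static long compiled(int[] qty, long[] price, int maxQty) {
        long sum = 0;
        for (int i = 0; i < qty.length; i++) {
            if (qty[i] < maxQty) sum += price[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Object[]> rows = List.of(
                new Object[]{5, 100L}, new Object[]{30, 250L}, new Object[]{10, 40L});
        long a = interpreted(rows, 24);
        long b = compiled(new int[]{5, 30, 10}, new long[]{100L, 250L, 40L}, 24);
        System.out.println("interpreted=" + a + " compiled=" + b);
    }
}
```

Both methods compute the same answer; the difference is the per-tuple overhead. The generic version pays for interface dispatch and boxed values on every tuple, which is the overhead the abstract attributes to statically implemented, generic operators.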