VLDB: Vol 33, No 4

Volume 33, Issue 4, Jul 2024

Publisher: Springer-Verlag, Berlin, Heidelberg

ISSN: 1066-8888

editorial

Special issue on “Machine learning and databases”

Page 901. https://doi.org/10.1007/s00778-024-00848-x

research-article

F-IVM: analytics over relational databases under updates

Pages 903–929. https://doi.org/10.1007/s00778-023-00817-w

Abstract

This article describes F-IVM, a unified approach for maintaining analytics over changing relational data. We exemplify its versatility in four disciplines: processing queries with group-by aggregates and joins; learning linear regression models ...

research-article

Efficient and robust active learning methods for interactive database exploration

Pages 931–956. https://doi.org/10.1007/s00778-023-00816-x

Abstract

There is an increasing gap between fast growth of data and the limited human ability to comprehend data. Consequently, there has been a growing demand for data management tools that can bridge this gap and help the user retrieve high-value content ...

research-article

AutoML in heavily constrained applications

Pages 957–979. https://doi.org/10.1007/s00778-023-00820-1

Abstract

Optimizing a machine learning pipeline for a task at hand requires careful configuration of various hyperparameters, typically supported by an AutoML system that optimizes the hyperparameters for the given training dataset. Yet, depending on the ...

research-article

Alfa: active learning for graph neural network-based semantic schema alignment

Pages 981–1011. https://doi.org/10.1007/s00778-023-00822-z

Abstract

Semantic schema alignment aims to match elements across a pair of schemas based on their semantic representation. It is a key primitive for data integration that facilitates the creation of a common data fabric across heterogeneous data sources. ...

research-article

Givens rotations for QR decomposition, SVD and PCA over database joins

Pages 1013–1037. https://doi.org/10.1007/s00778-023-00818-9

Abstract

This article introduces FiGaRo, an algorithm for computing the upper-triangular matrix in the QR decomposition of the matrix defined by the natural join over relational data. FiGaRo’s main novelty is that it pushes the QR decomposition past the ...

research-article

A multi-facet analysis of BERT-based entity matching models

Pages 1039–1064. https://doi.org/10.1007/s00778-023-00824-x

Abstract

State-of-the-art Entity Matching approaches rely on transformer architectures, such as BERT, for generating highly contextualized embeddings of terms. The embeddings are then used to predict whether pairs of entity descriptions refer to the same ...

research-article

Morphtree: a polymorphic main-memory learned index for dynamic workloads

Pages 1065–1084. https://doi.org/10.1007/s00778-023-00823-y

Abstract

Modern database systems rely on indexes to accelerate data access. The recently proposed learned indexes can offer higher search performance with lower space costs than traditional indexes like B+-tree. We observe that existing main-memory learned ...

research-article

DB-BERT: making database tuning tools “read” the manual

Immanuel Trummer

Pages 1085–1104. https://doi.org/10.1007/s00778-023-00831-y

Abstract

DB-BERT is a database tuning tool that exploits information gained via natural language analysis of manuals and other relevant text documents. It uses text to identify database system parameters to tune as well as recommended parameter values. DB-...

research-article

Towards flexibility and robustness of LSM trees

Pages 1105–1128. https://doi.org/10.1007/s00778-023-00826-9

Abstract

Log-structured merge trees (LSM trees) are increasingly used as part of the storage engine behind several data systems, and are frequently deployed in the cloud. As the number of applications relying on LSM-based storage backends increases, the ...

research-article

Assisted design of data science pipelines

Pages 1129–1153. https://doi.org/10.1007/s00778-024-00835-2

Abstract

When designing data science (DS) pipelines, end-users can get overwhelmed by the large and growing set of available data preprocessing and modeling techniques. Intelligent discovery assistants (IDAs) and automated machine learning (AutoML) ...

research-article

A learning-based framework for spatial join processing: estimation, optimization and tuning

Pages 1155–1177. https://doi.org/10.1007/s00778-024-00836-1

Abstract

The importance and complexity of spatial join operation resulted in the availability of many join algorithms, some of which are tailored for big-data platforms like Hadoop and Spark. The choice among them is not trivial and depends on different ...

research-article

Speech-to-SQL: toward speech-driven SQL query generation from natural language question

Pages 1179–1201. https://doi.org/10.1007/s00778-024-00837-0

Abstract

Speech-based inputs have been gaining significant momentum with the popularity of smartphones and tablets in our daily lives, since voice is the most popular and efficient way for human–computer interaction. This paper works toward designing more ...

research-article

Reliability evaluation of individual predictions: a data-centric approach

Pages 1203–1230. https://doi.org/10.1007/s00778-024-00857-w

Abstract

Machine learning models only provide probabilistic guarantees on the expected loss of random samples from the distribution represented by their training data. As a result, a model with high accuracy may or may not be reliable for predicting an ...

The VLDB Journal — The International Journal on Very Large Data Bases
