research-article
F-IVM: analytics over relational databases under updates
Abstract

This article describes F-IVM, a unified approach for maintaining analytics over changing relational data. We exemplify its versatility in four disciplines: processing queries with group-by aggregates and joins; learning linear regression models ...

research-article
Efficient and robust active learning methods for interactive database exploration
Abstract

There is an increasing gap between the fast growth of data and the limited human ability to comprehend data. Consequently, there has been a growing demand for data management tools that can bridge this gap and help the user retrieve high-value content ...

research-article
AutoML in heavily constrained applications
Abstract

Optimizing a machine learning pipeline for a task at hand requires careful configuration of various hyperparameters, typically supported by an AutoML system that optimizes the hyperparameters for the given training dataset. Yet, depending on the ...

research-article
Alfa: active learning for graph neural network-based semantic schema alignment
Abstract

Semantic schema alignment aims to match elements across a pair of schemas based on their semantic representation. It is a key primitive for data integration that facilitates the creation of a common data fabric across heterogeneous data sources. ...

research-article
Givens rotations for QR decomposition, SVD and PCA over database joins
Abstract

This article introduces FiGaRo, an algorithm for computing the upper-triangular matrix in the QR decomposition of the matrix defined by the natural join over relational data. FiGaRo's main novelty is that it pushes the QR decomposition past the ...

research-article
A multi-facet analysis of BERT-based entity matching models
Abstract

State-of-the-art Entity Matching approaches rely on transformer architectures, such as BERT, for generating highly contextualized embeddings of terms. The embeddings are then used to predict whether pairs of entity descriptions refer to the same ...

research-article
Morphtree: a polymorphic main-memory learned index for dynamic workloads
Abstract

Modern database systems rely on indexes to accelerate data access. The recently proposed learned indexes can offer higher search performance with lower space costs than traditional indexes like B+-tree. We observe that existing main-memory learned ...

research-article
DB-BERT: making database tuning tools “read” the manual
Abstract

DB-BERT is a database tuning tool that exploits information gained via natural language analysis of manuals and other relevant text documents. It uses text to identify database system parameters to tune as well as recommended parameter values. DB-...

research-article
Towards flexibility and robustness of LSM trees
Abstract

Log-structured merge trees (LSM trees) are increasingly used as part of the storage engine behind several data systems, and are frequently deployed in the cloud. As the number of applications relying on LSM-based storage backends increases, the ...

research-article
Assisted design of data science pipelines
Abstract

When designing data science (DS) pipelines, end-users can get overwhelmed by the large and growing set of available data preprocessing and modeling techniques. Intelligent discovery assistants (IDAs) and automated machine learning (AutoML) ...

research-article
A learning-based framework for spatial join processing: estimation, optimization and tuning
Abstract

The importance and complexity of the spatial join operation have resulted in the availability of many join algorithms, some of which are tailored for big-data platforms like Hadoop and Spark. The choice among them is not trivial and depends on different ...

research-article
Speech-to-SQL: toward speech-driven SQL query generation from natural language question
Abstract

Speech-based inputs have been gaining significant momentum with the popularity of smartphones and tablets in our daily lives, since voice is the most popular and efficient way for human–computer interaction. This paper works toward designing more ...

research-article
Reliability evaluation of individual predictions: a data-centric approach
Abstract

Machine learning models only provide probabilistic guarantees on the expected loss of random samples from the distribution represented by their training data. As a result, a model with high accuracy may or may not be reliable for predicting an ...
