Collective influence maximization for multiple competing products with an awareness-to-influence model
Influence maximization (IM) is a fundamental task in social network analysis. Typically, IM aims at selecting a set of seeds for the network that influences the maximum number of individuals. Motivated by practical applications, in this paper we focus ...
Finding group Steiner trees in graphs with both vertex and edge weights
Given an undirected graph and a number of vertex groups, the group Steiner trees problem is to find a tree such that (i) this tree contains at least one vertex in each vertex group; and (ii) the sum of vertex and edge weights in this tree is minimized. ...
Optimizing bipartite matching in real-world applications by incremental cost computation
The Kuhn-Munkres (KM) algorithm is a classical combinatorial optimization algorithm that is widely used for minimum cost bipartite matching in many real-world applications, such as transportation. For example, a ride-hailing service may use it to find ...
The case for NLP-enhanced database tuning: towards tuning tools that "read the manual"
A large body of knowledge on database tuning is available in the form of natural language text. We propose to leverage natural language processing (NLP) to make that knowledge accessible to automated tuning tools. We describe multiple avenues to exploit ...
Errata for "Unifying consensus and atomic commitment for effective cloud data management"
This errata article discusses and corrects a minor error in our work published in VLDB 2019. The discrepancy specifically pertains to Algorithms 3 and 4. The algorithms presented in the paper are biased towards a commit decision in a specific failure ...
Software-defined data protection: low overhead policy compliance at the storage layer is within reach!
Most modern data processing pipelines run on top of a distributed storage layer, and securing the whole system, and the storage layer in particular, against accidental or malicious misuse is crucial to ensuring compliance with rules and regulations. ...
TRACE: real-time compression of streaming trajectories in road networks
The deployment of vehicle location services generates increasingly massive vehicle trajectory data, which incurs high storage and transmission costs. A range of studies target offline compression to reduce the storage cost. However, to enable online ...
Shortest paths and centrality in uncertain networks
Computing the shortest path between a pair of nodes is a fundamental graph primitive, which has critical applications in vehicle routing, finding functional pathways in biological networks, survivable network design, among many others. In this work, we ...
Adaptive data augmentation for supervised learning over missing data
Real-world data is dirty, which causes serious problems in (supervised) machine learning (ML). The widely used practice in such scenarios is to first repair the labeled source (a.k.a. train) data using rule-, statistical- or ML-based methods and then use ...
KLL± approximate quantile sketches over dynamic datasets
Recently, the long-standing problem of optimal construction of quantile sketches was resolved by Karnin, Lang, and Liberty using the KLL sketch (FOCS 2016). The KLL algorithm supports only online insert operations and no delete operations. For ...
Distributed numerical and machine learning computations via two-phase execution of aggregated join trees
When numerical and machine learning (ML) computations are expressed relationally, classical query execution strategies (hash-based joins and aggregations) can do a poor job distributing the computation. In this paper, we propose a two-phase execution ...
An inquiry into machine learning-based automatic configuration tuning services on real-world database management systems
Dana Van Aken, Dongsheng Yang, Sebastien Brillard, Ari Fiorino, Bohan Zhang, Christian Bilien, Andrew Pavlo
Modern database management systems (DBMS) expose dozens of configurable knobs that control their runtime behavior. Setting these knobs correctly for an application's workload can improve the performance and efficiency of the DBMS. But because of their ...