Visually aware recommendation with aesthetic features
Visual information plays a critical role in the human decision-making process. Recent developments in visually aware recommender systems have taken the product image into account. We argue that the aesthetic factor is very important in modeling and ...
Mis-categorized entities detection
Entity categorization, the process of categorizing entities into groups, is an important problem with many applications. However, in practice, many entities are mis-categorized, for example in Google Scholar and among Amazon products. In this paper, we study ...
A cost model for random access queries in document stores
Document stores have become one of the key NoSQL storage solutions. They have been widely adopted in different domains due to their ability to store semi-structured data and expressive query capabilities. However, implementations differ in terms ...
Distributed detection of sequential anomalies in univariate time series
The automated detection of sequential anomalies in time series is an essential task for many applications, such as the monitoring of technical systems, fraud detection in high-frequency trading, or the early detection of disease symptoms. All ...
Location- and keyword-based querying of geo-textual data: a survey
With the broad adoption of mobile devices, notably smartphones, keyword-based search for content has seen increasing use by mobile users, who are often interested in content related to their geographical location. We have also witnessed a ...
Micro-architectural analysis of in-memory OLTP: Revisited
Micro-architectural behavior of traditional disk-based online transaction processing (OLTP) systems has been investigated extensively over the past couple of decades. Results show that traditional OLTP systems mostly under-utilize the available ...
In-memory interval joins
The interval join is a popular operation in temporal, spatial, and uncertain databases. The majority of interval join algorithms assume that input data reside on disk, so their focus is on minimizing I/O accesses. Recently, an in-memory ...
Model averaging in distributed machine learning: a case study with Apache Spark
The increasing popularity of Apache Spark has attracted many users to put their data into its ecosystem. On the other hand, it has been witnessed in the literature that Spark is slow when it comes to distributed machine learning (ML). One resort ...