- research-article, July 2021
Investigating Automatic Parameter Tuning for SQL-on-Hadoop Systems
Abstract: SQL-on-Hadoop engines such as Hive provide a declarative interface for processing large-scale data over computing frameworks such as Hadoop. The underlying frameworks contain a large number of configuration parameters that can ...
- research-article, January 2020
CirroData: Yet Another SQL-on-Hadoop Data Analytics Engine with High Performance
Journal of Computer Science and Technology (JCST), Volume 35, Issue 1, Pages 194–208, https://doi.org/10.1007/s11390-020-9536-z
Abstract: This paper presents CirroData, a high-performance SQL-on-Hadoop system designed for Big Data analytics workloads. As a home-grown enterprise-level online analytical processing (OLAP) system with more than seven-year research and development (R&D) ...
- research-article, September 2017
No data left behind: real-time insights from a complex data ecosystem
SoCC '17: Proceedings of the 2017 Symposium on Cloud Computing, Pages 108–120, https://doi.org/10.1145/3127479.3131208
The typical enterprise data architecture consists of several actively updated data sources (e.g., NoSQL systems, data warehouses), and a central data lake such as HDFS, in which all the data is periodically loaded through ETL processes. To simplify ...
- research-article, July 2017
Evaluating SQL-on-Hadoop for Big Data Warehousing on Not-So-Good Hardware
- Maribel Yasmina Santos,
- Carlos Costa,
- João Galvão,
- Carina Andrade,
- Bruno Augusto Martinho,
- Francisca Vale Lima,
- Eduarda Costa
IDEAS '17: Proceedings of the 21st International Database Engineering & Applications Symposium, Pages 242–252, https://doi.org/10.1145/3105831.3105842
Big Data is currently conceptualized as data whose volume, variety or velocity impose significant difficulties in traditional techniques and technologies. Big Data Warehousing is emerging as a new concept for Big Data analytics. In this context, SQL-on-...
- research-article, November 2016
Building a Hybrid Warehouse: Efficient Joins between Data Stored in HDFS and Enterprise Warehouse
ACM Transactions on Database Systems (TODS), Volume 41, Issue 4, Article No.: 21, Pages 1–38, https://doi.org/10.1145/2972950
The Hadoop Distributed File System (HDFS) has become an important data repository in the enterprise as the center for all business analytics, from SQL queries and machine learning to reporting. At the same time, enterprise data warehouses (EDWs) continue ...
- research-article, October 2016
Adaptive Caching in Big SQL using the HDFS Cache
SoCC '16: Proceedings of the Seventh ACM Symposium on Cloud Computing, Pages 321–333, https://doi.org/10.1145/2987550.2987553
The memory and storage hierarchy in database systems is currently undergoing a radical evolution in the context of Big Data systems. SQL-on-Hadoop systems share data with other applications in the Big Data ecosystem by storing their data in HDFS, using ...
- research-article, June 2016
RBAS: A Real-Time User Behavior Analysis System for Internet TV in Cloud Computing
CFI '16: Proceedings of the 11th International Conference on Future Internet Technologies, Pages 36–42, https://doi.org/10.1145/2935663.2935664
The characteristic of Internet TV user behavior is quite essential for designers to optimize resource schedule and improve user experience. With the rapid development of Internet, both Internet TV users and STB (set top boxes) models are booming. This ...
- research-article, June 2016
VectorH: Taking SQL-on-Hadoop to the Next Level
- Andrei Costea,
- Adrian Ionescu,
- Bogdan Răducanu,
- Michał Switakowski,
- Cristian Bârca,
- Juliusz Sompolski,
- Alicja Łuszczak,
- Michał Szafrański,
- Giel de Nijs,
- Peter Boncz
SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data, Pages 1105–1117, https://doi.org/10.1145/2882903.2903742
Actian Vector in Hadoop (VectorH for short) is a new SQL-on-Hadoop system built on top of the fast Vectorwise analytical database system. VectorH achieves fault tolerance and storage scalability by relying on HDFS, and extends the state-of-the-art in ...
- research-article, April 2016
Take me to SSD: a hybrid block-selection method on HDFS based on storage type
SAC '16: Proceedings of the 31st Annual ACM Symposium on Applied Computing, Pages 965–971, https://doi.org/10.1145/2851613.2851658
As the era of Big-data has risen, the importance of big data technologies is also increasing day by day. Especially, Hadoop has become a critical part of the overall Big-data system because of its ability to store, process, and analyze thousands of ...
- short-paper, October 2015
Flying KIWI: Design of Approximate Query Processing Engine for Interactive Data Analytics at Scale
BigDAS '15: Proceedings of the 2015 International Conference on Big Data Applications and Services, Pages 206–207, https://doi.org/10.1145/2837060.2837096
This paper introduces the design of hybrid SQL-on-Hadoop system, which supports dual-mode (interactive and deep) analytics. We present an architecture of approximate query processing engine using horizontal and vertical sampling of the original database ...
- Article, July 2015
Database Architectures: Current State and Development
DATA 2015: Proceedings of 4th International Conference on Data Management Technologies and Applications, Pages 152–161, https://doi.org/10.5220/0005512001520161
The paper presents shortly a history and development of database management tools in last decade. The movement towards a higher database performance and database scalability is discussed in the context to requirements of practice. These include Big Data ...