Abstract
The work presented in this paper aims to build OLAP cubes from big data warehouses implemented with the columnar NoSQL model. The use of NoSQL models is motivated by the difficulty of scaling the relational model, which is usually used to implement data warehouses. Indeed, the columnar NoSQL model is well suited to storing and managing massive data, especially for decision-support queries. However, column-oriented NoSQL DBMSs do not offer online analytical processing (OLAP) operators. Our main contribution is to define a new cube operator called MC-CUBE (MapReduce Columnar CUBE), which builds columnar NoSQL cubes by taking into account the non-relational and distributed nature of the stored data warehouses.
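To make the idea of a cube operator over column-wise data concrete, here is a minimal Python sketch. It is not the MC-CUBE algorithm, which the abstract does not detail; it only illustrates the general CUBE semantics (one aggregate per combination of dimensions) expressed as a map step and a reduce step. The column names, the sample values, and the `ALL` placeholder are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical columnar layout: one list per column, aligned by row position.
columns = {
    "city": ["Lyon", "Lyon", "Paris", "Paris"],
    "year": [2015, 2016, 2015, 2015],
    "sales": [10, 20, 5, 7],
}
DIMENSIONS = ("city", "year")  # assumed dimension columns
MEASURE = "sales"              # assumed measure column
ALL = "ALL"                    # placeholder for an aggregated-away dimension


def map_phase(columns, dimensions, measure):
    """Emit one (group key, measure value) pair per row and per dimension subset."""
    n_rows = len(columns[measure])
    for i in range(n_rows):
        for k in range(len(dimensions) + 1):
            for kept in combinations(dimensions, k):
                # Keep the value of retained dimensions, replace the others by ALL.
                key = tuple(
                    columns[d][i] if d in kept else ALL for d in dimensions
                )
                yield key, columns[measure][i]


def reduce_phase(pairs):
    """Sum the measure values that share the same group key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


cube = reduce_phase(map_phase(columns, DIMENSIONS, MEASURE))
for key in sorted(cube, key=str):
    print(key, cube[key])
```

In a distributed setting the map step would run on each data partition and the reduce step would receive the pairs grouped by key; this single-process version only shows the data flow.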
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Dehdouh, K. (2016). Building OLAP Cubes from Columnar NoSQL Data Warehouses. In: Bellatreche, L., Pastor, Ó., Almendros Jiménez, J., Aït-Ameur, Y. (eds) Model and Data Engineering. MEDI 2016. Lecture Notes in Computer Science, vol 9893. Springer, Cham. https://doi.org/10.1007/978-3-319-45547-1_14
DOI: https://doi.org/10.1007/978-3-319-45547-1_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45546-4
Online ISBN: 978-3-319-45547-1
eBook Packages: Computer Science (R0)