Online Balancing of aR-Tree Indexed Distributed Spatial Data Warehouse

Marcin Gorawski &
Robert Chechelski

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3911)

Included in the following conference series:

International Conference on Parallel Processing and Applied Mathematics

Abstract

Query response time is one of the key requirements of a data warehouse. Among the methods of improving query performance, parallel processing (especially in the shared-nothing class) offers practically unlimited scalability. Data warehouse systems are highly complex in their structure, data model, and the many mechanisms they use, all of which strongly influence overall performance. The main problem in balancing a parallel data warehouse is the allocation of data among system nodes, and the problem grows when the nodes have different computational characteristics. In this paper we present an algorithm for balancing a distributed data warehouse built on a shared-nothing architecture. Balancing is achieved by iteratively adjusting the size of the dataset stored in each node. We employ well-known data allocation schemes based on space-filling curves: Hilbert and Peano. We provide a collection of system test results and their analysis, which confirm that a balancing algorithm can be realized in the proposed way.
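
The idea of ordering data along a space-filling curve and then iteratively resizing each node's dataset can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names (`hilbert_index`, `allocate`, `rebalance`), the contiguous-run allocation, and the inverse-response-time update rule are all assumptions chosen for illustration, and only the Hilbert curve is shown.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def hilbert_index(n: int, x: int, y: int) -> int:
    """Position of cell (x, y) on the Hilbert curve over an n x n grid.

    n must be a power of two; this is the standard iterative conversion.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is in canonical orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def allocate(points: Sequence[Point], shares: Sequence[float], n: int) -> List[List[Point]]:
    """Sort points along the curve and cut the sequence into contiguous runs.

    shares[i] is the (hypothetical) fraction of the dataset assigned to node i.
    """
    ordered = sorted(points, key=lambda p: hilbert_index(n, p[0], p[1]))
    total = sum(shares)
    parts: List[List[Point]] = []
    start, acc = 0, 0.0
    for i, share in enumerate(shares):
        acc += share
        # The last node takes whatever remains, so no point is lost to rounding.
        end = len(ordered) if i == len(shares) - 1 else round(len(ordered) * acc / total)
        parts.append(ordered[start:end])
        start = end
    return parts


def rebalance(shares: Sequence[float], times: Sequence[float]) -> List[float]:
    """One balancing iteration (assumed rule, not taken from the paper).

    Each node's share is scaled by the inverse of its measured response
    time, then all shares are renormalized to sum to 1.
    """
    speeds = [s / t for s, t in zip(shares, times)]
    total = sum(speeds)
    return [v / total for v in speeds]
```

In such a scheme, one would call `allocate` with equal shares, measure per-node query response times, pass them to `rebalance`, and reallocate until the times are close enough. A slower node ends up holding a smaller fraction of the data.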

Author information

Authors and Affiliations

Institute of Computer Science, Silesian University of Technology, Akademicka 16, 44-100, Gliwice, Poland
Marcin Gorawski & Robert Chechelski

Editor information

Editors and Affiliations

Institute of Computational and Information Sciences, Czestochowa University of Technology, Poland
Roman Wyrzykowski
Computer Science Department, University of Tennessee, 37996-3450, Knoxville, TN, USA
Jack Dongarra
Poznan Supercomputing and Networking Center, Poland
Norbert Meyer
Informatics & Mathematical Modeling, Technical University of Denmark, 2800, Lyngby, DK, Denmark
Jerzy Waśniewski

About this paper

Cite this paper

Gorawski, M., Chechelski, R. (2006). Online Balancing of aR-Tree Indexed Distributed Spatial Data Warehouse. In: Wyrzykowski, R., Dongarra, J., Meyer, N., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2005. Lecture Notes in Computer Science, vol 3911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11752578_57

DOI: https://doi.org/10.1007/11752578_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34141-3
Online ISBN: 978-3-540-34142-0
eBook Packages: Computer Science (R0)
