Distributed Query Processing on Compressed Graphs Using K2-Trees

Sandra Álvarez-García, Nieves R. Brisaboa, Carlos Gómez-Pantoja & Mauricio Marin

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8214)

Included in the following conference series:

International Symposium on String Processing and Information Retrieval

Abstract

Web and social graphs can be represented compactly and efficiently with the K²-tree, which achieves compression ratios of about 5 bits per link for Web graphs and about 20 bits per link for social graphs. The K²-tree also supports fast processing of relevant queries, such as retrieving the direct and reverse neighbours of a node, directly on the compressed graph. These two properties make the K²-tree suitable for inclusion in Web search engines, which must maintain very large graphs and process on-line queries on them. Typically these search engines are deployed on dedicated clusters of distributed-memory processors, in which the data set is partitioned and replicated to achieve low query response time and high query throughput. In this context a practical strategy is simply to distribute the data across the processors and build local data structures for efficient retrieval in each processor. However, the way the data set is distributed across the processors can have a significant impact on performance. In this paper, we evaluate a number of data distribution strategies that are suitable for the K²-tree and identify the alternative with the best general performance. In our study we consider different data sets and focus on metrics such as overall compression ratio and parallel response time for retrieving direct and reverse neighbours.
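
To make the two query types concrete, the following is a minimal sketch of a K²-tree over a small adjacency matrix, assuming k = 2 and plain Python lists in place of the succinct bitmaps and rank structures a real implementation would use. The class name `K2Tree`, its methods, and the sample graph are hypothetical illustrations, not the authors' implementation, and the sketch says nothing about the distribution strategies evaluated in the paper.

```python
from collections import deque


class K2Tree:
    """Toy K^2-tree: level-order bitmap over an n x n adjacency matrix."""

    def __init__(self, edges, n, k=2):
        self.k = k
        size = k
        while size < n:              # pad the matrix side to a power of k
            size *= k
        self.size = size
        self.bits = []               # internal levels followed by the leaf level
        queue = deque([(0, 0, size, list(set(edges)))])
        while queue:                 # breadth-first split into k*k submatrices
            r0, c0, side, es = queue.popleft()
            child = side // k
            buckets = [[] for _ in range(k * k)]
            for u, v in es:
                buckets[((u - r0) // child) * k + (v - c0) // child].append((u, v))
            for idx, bucket in enumerate(buckets):
                self.bits.append(1 if bucket else 0)
                if bucket and child > 1:   # only non-empty submatrices are expanded
                    queue.append((r0 + (idx // k) * child,
                                  c0 + (idx % k) * child, child, bucket))
        # Prefix counts of ones: rank[p + 1] = number of ones in bits[0..p].
        self.rank = [0]
        for b in self.bits:
            self.rank.append(self.rank[-1] + b)

    def _children(self, pos):
        # Children of the root start at 0; children of a set bit at position
        # pos start at rank(pos) * k^2.
        return 0 if pos < 0 else self.rank[pos + 1] * self.k * self.k

    def _direct(self, side, row, col0, pos, out):
        if pos >= 0 and not self.bits[pos]:
            return                   # empty submatrix: prune
        if side == 1:
            out.append(col0)         # set leaf bit: one edge found
            return
        child = side // self.k
        start = self._children(pos) + (row // child) * self.k
        for j in range(self.k):      # visit the k submatrices crossing this row
            self._direct(child, row % child, col0 + j * child, start + j, out)

    def _reverse(self, side, col, row0, pos, out):
        if pos >= 0 and not self.bits[pos]:
            return
        if side == 1:
            out.append(row0)
            return
        child = side // self.k
        start = self._children(pos) + col // child
        for i in range(self.k):      # visit the k submatrices crossing this column
            self._reverse(child, col % child, row0 + i * child,
                          start + i * self.k, out)

    def direct(self, u):
        """Nodes v such that the edge u -> v exists."""
        out = []
        self._direct(self.size, u, 0, -1, out)
        return out

    def reverse(self, v):
        """Nodes u such that the edge u -> v exists."""
        out = []
        self._reverse(self.size, v, 0, -1, out)
        return out


# Hypothetical 4-node graph used only for illustration.
g = K2Tree([(0, 1), (0, 3), (2, 1), (3, 0)], n=4)
print(g.direct(0))   # successors of node 0
print(g.reverse(1))  # predecessors of node 1
```

Both queries walk the same bitmap: a direct-neighbour query follows the submatrices that intersect one row, and a reverse-neighbour query follows those that intersect one column, so neither needs a second copy of the graph.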

SAG and NB were funded by MICIN (PGE and FEDER) grants TIN2009-14560-C03-02, TIN2010-21246-C02-01, and CDTI CEN-20091048 and Xunta de Galicia (co-funded with FEDER) ref. 2010/17. MM was partially funded by research grant FONDEF IDeA CA12I10314.

The original version of this chapter was revised: The copyright line was incorrect. This has been corrected. The Erratum to this chapter is available at DOI: 10.1007/978-3-319-02432-5_33

Author information

Authors and Affiliations

Database Laboratory, University of Coruña, Spain
Sandra Álvarez-García & Nieves R. Brisaboa
Facultad de Ingeniería, Universidad Andres Bello, Sazié, 2325, Santiago, Chile
Carlos Gómez-Pantoja
Yahoo!Research Latin America, Santiago, Chile
Mauricio Marin

Editor information

Editors and Affiliations

Faculty of Industrial Engineering and Management Technion, Technion Institute of Technology, Bloomfield Hall 308, 32000, Haifa, Israel
Oren Kurland
Bar-Ilan University, Israel
Moshe Lewenstein
Department of Computer Science, Bar-Ilan University, 52900, Ramat-Gan, Israel
Ely Porat

About this paper

Cite this paper

Álvarez-García, S., Brisaboa, N.R., Gómez-Pantoja, C., Marin, M. (2013). Distributed Query Processing on Compressed Graphs Using K2-Trees. In: Kurland, O., Lewenstein, M., Porat, E. (eds) String Processing and Information Retrieval. SPIRE 2013. Lecture Notes in Computer Science, vol 8214. Springer, Cham. https://doi.org/10.1007/978-3-319-02432-5_32

DOI: https://doi.org/10.1007/978-3-319-02432-5_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02431-8
Online ISBN: 978-3-319-02432-5
eBook Packages: Computer Science, Computer Science (R0)
