Large-Scale Graph Processing Using Apache Giraph

Authors:

Sherif Sakr, School of Computer Science and Engineering, The University of New South Wales, Sydney, Australia
Faisal Moeen Orakzai, Department of Computer Science, Aalborg University, Aalborg, Denmark
Ibrahim Abdelaziz, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
Zuhair Khayyat, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia

Describes the fundamental abstractions of Apache Giraph, its programming models and various techniques for using it
Offers step-by-step coverage of the implementation of several popular and advanced graph analytics algorithms, including related optimization details
All source code presented in the book is available for download from an associated GitHub repository
Includes supplementary material: sn.pub/extras

About this book

This book takes its reader on a journey through Apache Giraph, a popular distributed graph processing platform designed to bring the power of big data processing to graph data. Written as a step-by-step self-study guide for anyone interested in large-scale graph processing, it describes the fundamental abstractions of the system, its programming models and various techniques for using the system to process graph data at scale, including the implementation of several popular and advanced graph analytics algorithms.

The book is organized as follows: Chapter 1 provides general background on the big data phenomenon and a general introduction to the Apache Giraph system, its abstraction, programming model and design architecture. Chapter 2 focuses on Giraph as a platform and how to use it. Working from a sample job, it also explains more advanced topics such as monitoring the Giraph application lifecycle and different methods for monitoring Giraph jobs. Chapter 3 then provides an introduction to Giraph programming, introduces the basic Giraph graph model and explains how to write Giraph programs. Chapter 4 discusses in detail the implementation of some popular graph algorithms, including PageRank, connected components, shortest paths and triangle closing. Chapter 5 focuses on advanced Giraph programming, discussing common Giraph algorithmic optimizations, tunable Giraph configurations that determine the system’s utilization of the underlying resources, and how to write a custom graph input and output format. Lastly, Chapter 6 highlights two other systems that have been introduced to tackle the challenge of large-scale graph processing, GraphX and GraphLab, and explains the main commonalities and differences between these systems and Apache Giraph.

This book serves as an essential reference guide for students, researchers and practitioners in the domain of large-scale graph processing. It offers step-by-step guidance, with several code examples and the complete source code available in the related GitHub repository. Students will find a comprehensive introduction to, and hands-on practice with, tackling large-scale graph processing problems using the Apache Giraph system, while researchers will discover thorough coverage of the emerging and ongoing advancements in big graph processing systems.

Table of contents (6 chapters)

All chapters are by Sherif Sakr, Faisal Moeen Orakzai, Ibrahim Abdelaziz and Zuhair Khayyat.

Front Matter, pages i-xxv
Introduction, pages 1-33
Installing and Getting Giraph Ready to Use, pages 35-86
Getting Started with Giraph Programming, pages 87-117
Popular Graph Algorithms on Giraph, pages 119-139
Advanced Giraph Programming, pages 141-173
Related Large-Scale Graph Processing Systems, pages 175-194
Back Matter, pages 195-197

Reviews

“This volume is a cookbook on Giraph. … Its virtue is that it will help newcomers to Giraph to get up and running quickly. … Users who need to bring up Giraph quickly and who have no experience with the Hadoop-Giraph ecosystem will find the volume a helpful introduction to these powerful tools.” (Computing Reviews, October, 2017)

About the authors

Sherif Sakr is currently a professor of computer and information science in the Health Informatics department at King Saud bin Abdulaziz University for Health Sciences. He is also affiliated with the University of New South Wales and DATA61/CSIRO (formerly NICTA). He has held visiting appointments at several academic and research institutes, including Microsoft Research (2011), Alcatel-Lucent Bell Labs (2012), Humboldt University of Berlin (2015), University of Zurich (2016) and TU Dresden (2016). In 2013, Sherif was awarded the Stanford Innovation and Entrepreneurship Certificate.

Faisal Moeen Orakzai is a joint PhD candidate at Université Libre de Bruxelles (ULB), Belgium, and Aalborg University (AAU), Denmark. In addition to doing research, he works as a consultant and helps companies set up their distributed data processing architectures and pipelines. He is a big data management and analytics enthusiast and is currently working on a Giraph-based framework for spatio-temporal pattern mining.

Ibrahim Abdelaziz is a Computer Science PhD candidate at King Abdullah University of Science and Technology (KAUST). Prior to joining KAUST, he worked on pattern recognition and information retrieval in several research organizations in Egypt. His current research interests are data mining over large-scale graphs, distributed systems and machine learning.

Zuhair Khayyat is a PhD candidate in the InfoCloud group at King Abdullah University of Science and Technology (KAUST) focusing on Big Data, Analytics and Graphs.

Bibliographic Information

Book Title: Large-Scale Graph Processing Using Apache Giraph
Authors: Sherif Sakr, Faisal Moeen Orakzai, Ibrahim Abdelaziz, Zuhair Khayyat
DOI: https://doi.org/10.1007/978-3-319-47431-1
Publisher: Springer Cham
eBook Packages: Computer Science, Computer Science (R0)
Copyright Information: The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2016
Hardcover ISBN: 978-3-319-47430-4, Published: 12 January 2017
Softcover ISBN: 978-3-319-83735-2, Published: 07 July 2018
eBook ISBN: 978-3-319-47431-1, Published: 05 January 2017
Edition Number: 1
Number of Pages: XXV, 197
Number of Illustrations: 15 b/w illustrations, 87 illustrations in colour
Topics: Database Management, Big Data/Analytics, Data Structures
