[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/2463676.2467799acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Trinity: a distributed graph engine on a memory cloud

Published: 22 June 2013 Publication History

Abstract

Computations performed by graph algorithms are data driven, and require a high degree of random data access. Despite the great progresses made in disk technology, it still cannot provide the level of efficient random access required by graph computation. On the other hand, memory-based approaches usually do not scale due to the capacity limit of single machines. In this paper, we introduce Trinity, a general purpose graph engine over a distributed memory cloud. Through optimized memory management and network communication, Trinity supports fast graph exploration as well as efficient parallel computing. In particular, Trinity leverages graph access patterns in both online and offline computation to optimize memory and communication for best performance. These enable Trinity to support efficient online query processing and offline analytics on large graphs with just a few commodity machines. Furthermore, Trinity provides a high level specification language called TSL for users to declare data schema and communication protocols, which brings great ease-of-use for general purpose graph management and computing. Our experiments show Trinity's performance in both low latency graph queries as well as high throughput graph analytics on web-scale, billion-node graphs.

References

[1]
http://graphlab.org/.
[2]
http://hadoop.apache.org/.
[3]
http://incubator.apache.org/giraph/.
[4]
http://neo4j.org/.
[5]
http://www.graph500.org/.
[6]
How to partition a billion-scale graph. Technical report, Microsoft Research, 2012.
[7]
M. K. Aguilera, A. Merchant, M. Shah, A. Veitch, and C. Karamanolis. Sinfonia: a new paradigm for building scalable distributed systems. SOSP '07, pages 159--174, 2007.
[8]
D. G. Andersen, J. Franklin, M. Kaminsky, A. Phanishayee, L. Tan, and V. Vasudevan. Fawn: a fast array of wimpy nodes. SOSP '09, pages 1--14.
[9]
J. Berry, B. Hendrickson, S. Kahan, and P. Konecny. Software and algorithms for graph queries on multithreaded architectures. In IPDPS 2007, pages 1--14, 2007.
[10]
D. Borthakur. The Hadoop Distributed File System: Architecture and Design, 2007.
[11]
R. Bramandia, B. Choi, and W. K. Ng. Incremental maintenance of 2-hop labeling of large graphs. TKDE, 22(5):682--698, 2010.
[12]
D. Chakrabarti, Y. Zhan, and C. Faloutsos. R-mat: A recursive model for graph mining. SDM '04, 2004.
[13]
T. D. Chandra, R. Griesemer, and J. Redstone. Paxos made live: an engineering perspective. PODC '07, pages 398--407, 2007.
[14]
J. Cheng, J. X. Yu, B. Ding, P. S. Yu, and H. Wang. Fast graph pattern matching. In ICDE, pages 913--922, 2008.
[15]
J. Dean and S. Ghemawat. Mapreduce: Simplified data processing on large clusters. OSDI '04, pages 137--150.
[16]
E. W. Dijkstra. Shmuel Safra's version of termination detection. Jan. 1987.
[17]
B. Fitzpatrick. Distributed caching with memcached. Linux J., August 2004.
[18]
J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. Powergraph: Distributed graph-parallel computation on natural graphs. In OSDI, pages 17--30, 2012.
[19]
D. Gregor and A. Lumsdaine. The Parallel BGL: A generic library for distributed graph computations. POOSC '05.
[20]
Y. Guo, Z. Pan, and J. Heflin. LUBM: A benchmark for OWL knowledge base systems. Journal of Web Semantics, 3(2-3):158--182, 2005.
[21]
H. Higaki, K. Shima, T. Tachikawa, and M. Takizawa. Checkpoint and rollback in asynchronous distributed systems. INFOCOM '97, pages 998--, 1997.
[22]
B. Iordanov. Hypergraphdb: a generalized graph database. WAIM '10, pages 25--36, 2010.
[23]
U. Kang, C. E. Tsourakakis, and C. Faloutsos. Pegasus: A peta-scale graph mining system implementation and observations. ICDM '09, pages 229--238, 2009.
[24]
G. Karypis and V. Kumar. Parallel multilevel k-way partitioning scheme for irregular graphs. Supercomputing '96.
[25]
A. Kyrola, G. Blelloch, and C. Guestrin. Graphchi: Large-scale graph computation on just a pc. In OSDI, pages 31--46, 2012.
[26]
A. Lumsdaine, D. Gregor, B. Hendrickson, and J. W. Berry. Challenges in parallel graph processing. Parallel Processing Letters, 17(1):5--20, 2007.
[27]
N. A. Lynch. Distributed Algorithms. 1996.
[28]
G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. SIGMOD '10.
[29]
D. Ongaro, S. M. Rumble, R. Stutsman, J. Ousterhout, and M. Rosenblum. Fast crash recovery in ramcloud. SOSP '11, pages 29--41, 2011.
[30]
J. Ousterhout, P. Agrawal, D. Erickson, C. Kozyrakis, J. Leverich, D. Mazières, S. Mitra, A. Narayanan, G. Parulkar, M. Rosenblum, S. M. Rumble, E. Stratmann, and R. Stutsman. The case for ramclouds: scalable high-performance storage entirely in dram. SIGOPS Oper. Syst. Rev., 43:92--105, 2010.
[31]
T. Schütt, F. Schintke, and A. Reinefeld. Scalaris: reliable transactional p2p key/value store. ERLANG '08, pages 41--48, 2008.
[32]
Z. Sun, H. Wang, H. Wang, B. Shao, and J. Li. Efficient subgraph matching on billion node graphs. Proc. VLDB Endow., 5(9):788--799, May 2012.
[33]
W. Wu, H. Li, H. Wang, and K. Zhu. Probase: A probabilistic taxonomy for text understanding. In SIGMOD, 2012.
[34]
M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: cluster computing with working sets. HotCloud'10, pages 10--10, 2010.
[35]
J. Zawodny. Redis: Lightweight key/value store that goes the extra mile. Linux Magazine, 2009.
[36]
K. Zeng, J. Yang, H. Wang, B. Shao, and Z. Wang. A distributed graph engine for web scale RDF data. In VLDB 2013.
[37]
X. Zhao, A. Sala, C. Wilson, H. Zheng, and B. Y. Zhao. Orion: shortest path estimation for large social graphs. WOSN'10, pages 9--9, 2010.

Cited By

View all
  • (2024)A nearly gapless, highly contiguous reference genome for a doubled haploid line of Populus ussuriensis, enabling advanced genomic studiesForestry Research10.48130/forres-0024-00164:1(0-0)Online publication date: 2024
  • (2024)Tele-Trafficking of Virtual Data Storage Obtained from Smart Grid by Replicated Gluster in Syntose EnvironmentEnergies10.3390/en1710234417:10(2344)Online publication date: 13-May-2024
  • (2024)KGFabric: A Scalable Knowledge Graph Warehouse for Enterprise Data InterconnectionProceedings of the VLDB Endowment10.14778/3685800.368581017:12(3841-3854)Online publication date: 1-Aug-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
June 2013
1322 pages
ISBN:9781450320375
DOI:10.1145/2463676
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 June 2013

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. distributed system
  2. graph database
  3. memory cloud

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'13
Sponsor:

Acceptance Rates

SIGMOD '13 Paper Acceptance Rate 76 of 372 submissions, 20%;
Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)47
  • Downloads (Last 6 weeks)4
Reflects downloads up to 11 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)A nearly gapless, highly contiguous reference genome for a doubled haploid line of Populus ussuriensis, enabling advanced genomic studiesForestry Research10.48130/forres-0024-00164:1(0-0)Online publication date: 2024
  • (2024)Tele-Trafficking of Virtual Data Storage Obtained from Smart Grid by Replicated Gluster in Syntose EnvironmentEnergies10.3390/en1710234417:10(2344)Online publication date: 13-May-2024
  • (2024)KGFabric: A Scalable Knowledge Graph Warehouse for Enterprise Data InterconnectionProceedings of the VLDB Endowment10.14778/3685800.368581017:12(3841-3854)Online publication date: 1-Aug-2024
  • (2024)Graph Crypto-Stego System for Securing Graph Data Using Association SchemesJournal of Applied Mathematics10.1155/2024/20843422024(1-12)Online publication date: 2-Mar-2024
  • (2024)Streaming Graph Algorithms in the Massively Parallel Computation ModelProceedings of the 43rd ACM Symposium on Principles of Distributed Computing10.1145/3662158.3662770(496-507)Online publication date: 17-Jun-2024
  • (2024)LSGraph: A Locality-centric High-performance Streaming Graph EngineProceedings of the Nineteenth European Conference on Computer Systems10.1145/3627703.3650076(33-49)Online publication date: 22-Apr-2024
  • (2024)Locality-Preserving Graph Traversal With Split Live MigrationIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2024.343682835:10(1810-1825)Online publication date: Oct-2024
  • (2024)How to Fit the SCC Algorithm Efficiently into Distributed Graph Iterative Computation2024 IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC)10.1109/COMPSAC61105.2024.00043(254-263)Online publication date: 2-Jul-2024
  • (2024)Distributed Knowledge Graph Query Acceleration AlgorithmWeb and Big Data10.1007/978-981-97-2387-4_3(32-47)Online publication date: 28-Apr-2024
  • (2024)Graph Database: Re-engineering Methodologies Relational to NOSQL DatabasesIntelligent Strategies for ICT10.1007/978-981-97-1260-1_12(131-146)Online publication date: 11-May-2024
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media