[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

Using domain-specific languages for analytic graph databases

Published: 01 September 2016 Publication History

Abstract

Recently graph has been drawing lots of attention both as a natural data model that captures fine-grained relationships between data entities and as a tool for powerful data analysis that considers such relationships. In this paper, we present a new graph database system that integrates a robust graph storage with an efficient graph analytics engine. Primarily, our system adopts two domain-specific languages (DSLs), one for describing graph analysis algorithms and the other for graph pattern matching queries. Compared to the API-based approaches in conventional graph processing systems, the DSL-based approach provides users with more flexible and intuitive ways of expressing algorithms and queries. Moreover, the DSL-based approach has significant performance benefits as well, (1) by skipping (remote) API invocation overhead and (2) by applying high-level optimization from the compiler.

References

[1]
AllegroGraph. http://franz.com/agraph/allegrograph/.
[2]
Apache Giraph Project. http://giraph.apache.org.
[3]
Apache TinkerPop. http://tinkerpop.incubator.apache.org.
[4]
Boost Graph Library (BGL). http://www.boost.org/doc/libs/1_55_0/libs/graph/doc/index.html.
[5]
Cypher - the Neo4j query Language. http://www.neo4j.org/learn/cypher.
[6]
InfiniteGraph. http://www.objectivity.com/infinitegraph.
[7]
Intel 64 and IA-32 Architectures Optimization Reference Manual. http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html,2016.
[8]
Java universal network/graph framework. http://jung.sourceforge.net.
[9]
Neo4j graph database. http://www.neo4j.org/.
[10]
NetworkX. https://networkx.github.io.
[11]
Oracle Spatial and Graph, RDF Semantic Graph,. http://www.oracle.com/technetwork/database/options/spatialandgraph/overview/rdfsemantic-graph-1902016.html.
[12]
SPARQL Query Language for RDF. http://www.w3.org/TR/rdf-sparql-query/.
[13]
Stanford network analysis library. http://snap.stanford.edu/snap.
[14]
Tinkerpop, Gremlin. https://github.com/tinkerpop/gremlin/wiki.
[15]
Titan Distributed Graph Database. http://thinkaurelius.github.io/titan/.
[16]
Virtuoso Universal Server. http://virtuoso.openlinksw.com/.
[17]
Lada A. Adamic and Eytan Adar. Friends and neighbors on the web. Social Networks, 25(3):211--230, 2001.
[18]
Pablo Barceló, Leonid Libkin, Anthony W. Lin, and Peter T. Wood. Expressive languages for path queries over graph-structured data. ACM Transactions on Database Systems (TODS), 37(4):31, 2012.
[19]
Thomas H. Cormen, Clifford Stein, Ronald L. Rivest, and Charles E. Leiserson. Introduction to Algorithms. McGraw-Hill Higher Education, 2nd edition, 2001.
[20]
David Ediger, Robert McColl, E. Jason Riedy, and David A. Bader. STINGER: High performance data structure for streaming graphs. In High Performance Extreme Computing (HPEC), pages 1--5. IEEE, 2012.
[21]
Jing Fan, Adalbert Gerald Soosai Raj, and Jignesh M. Patel. The case against specialized graph analytics engines. In CIDR 2015, Seventh Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 4-7, 2015, Online Proceedings, 2015.
[22]
L. C. Freeman. Centrality in social networks: Conceptual clarification. Social Networks, 1(3):215--239, 1979.
[23]
Pankaj Gupta, Ashish Goel, Jimmy Lin, Aneesh Sharma, Dong Wang, and Reza Zadeh. Wtf: The who to follow service at twitter. In Proceedings of the 22nd international conference on World Wide Web, pages 505--514. International World Wide Web Conferences Steering Committee, 2013.
[24]
Sungpack Hong, Hassan Chafi, Edic Sedlar, and Kunle Olukotun. Green-Marl: A DSL for Easy and Efficient Graph Analysis. In ASPLOS, pages 349--362. ACM, 2012.
[25]
Sungpack Hong, Siegfried Depner, Thomas Manhardt, Jan Van Der Lugt, Merijn Verstraaten, and Hassan Chafi. Pgx.d: A fast distributed graph processing engine. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '15, pages 58:1--58:12, New York, NY, USA, 2015. ACM.
[26]
Sungpack Hong, Semih Salihoglu, Jennifer Widom, and Kunle Olukotun. Simplifying scalable graph processing with a domain-specific language. In Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO '14, pages 208:208--208:218, New York, NY, USA, 2014. ACM.
[27]
Sungpack Hong, Jan Van Der Lugt, Adam Welc, Raghavan Raman, and Hassan Chafi. Early experiences in using a domain-specific language for large-scale graph analysis. In First International Workshop on Graph Data Management Experiences and Systems, page 5. ACM, 2013.
[28]
Volker Kaibel and Matthias A. F. Peinhardt. On the bottleneck shortest path problem, technical report, ZIB-Report, 2006.
[29]
U Kang, Charalampos E Tsourakakis, and Christos Faloutsos. Pegasus: A peta-scale graph mining system implementation and observations. In IEEE International Conference on Data Mining (ICDM), pages 229--238, 2009.
[30]
Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue Moon. What is Twitter, a social network or a news media? In Proceedings of the 19th International Conference on World Wide Web, WWW '10, pages 591--600. ACM, 2010.
[31]
Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. Hellerstein. Distributed graphlab: a framework for machine learning and data mining in the cloud. Proc. VLDB Endow., 5(8):716--727, 2012.
[32]
Kamesh Madduri, David Ediger, Karl Jiang, David A. Bader, and Daniel Chavarria-Miranda. A faster parallel algorithm and efficient multithreaded implementations for evaluating betweenness centrality on massive datasets. In Proceedings of the 2009 IEEE IPDPS, IPDPS '09, pages 1--8, Washington, DC, USA, 2009. IEEE Computer Society.
[33]
Grzegorz Malewicz, Matthew H. Austern, Aart J.C Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. Pregel: A System for Large-scale Graph Processing. In SIGMOD. Proceedings of the 2010 international conference on Management of data, pages 135--146. ACM, 2010.
[34]
R. C. McColl, D. Ediger, J. Poovey, D. Campbell, and D. A. Bader. A performance evaluation of open source graph databases. In Proceedings of the First Workshop on PPAA, PPAA '14, pages 11--18, New York, NY, USA, 2014. ACM.
[35]
Donald Nguyen, Andrew Lenharth, and Keshav Pingali. A lightweight infrastructure for graph analytics. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, SOSP '13, pages 456--471. ACM, 2013.
[36]
L. Page. Method for node ranking in a linked database, September 4 2001. US Patent 6,285,999.
[37]
A. Roy, I. Mihailovic, and W. Zwaenepoel. X-stream: Edge-centric graph processing using streaming partitions. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, pages 472--488. ACM, 2013.
[38]
Martin Sevenich, Sungpack Hong, Adam Welc, and Hassan Chafi. Fast in-memory triangle listing for large real-world graphs. In Proceedings of the 8th Workshop on Social Network Mining and Analysis, SNAKDD'14, pages 2:1--2:9, New York, NY, USA, 2014. ACM.
[39]
Jyothish Soman and Ankur Narang. Fast community detection algorithm with gpus and multicore architectures. In IPDPS, pages 568--579. IEEE, 2011.
[40]
Wen Sun, Achille Fokoue, Kavitha Srinivas, Anastasios Kementsietsidis, Gang Hu, and Guotong Xie. Sqlgraph: An efficient relational-based property graph store. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD '15, pages 1887--1901, New York, NY, USA, 2015. ACM.
[41]
S. Suri and S. Vassilvitskii. Counting triangles and the curse of the last reducer. In Proceedings of the 20th international conference on World wide web, pages 607--614. ACM, 2011.
[42]
Robert Endre Tarjan. Depth-first search and linear graph algorithms. SIAM J. Comput., 1(2):146--160, 1972.
[43]
M. Then, M. Kaufmann, F. Chirigati, T. Hoang-Vu, K. Pham, A. Kemper, T. Neumann, and H. T. Vo. The more the merrier: Efficient multi-source graph traversal. Proc. VLDB Endow., 8(4):449--460, December 2014.
[44]
Adam Welc, Raghavan Raman, Zhe Wu, Sungpack Hong, Hassan Chafi, and Jay Banerjee. Graph analysis: do we have to reinvent the wheel? In First International Workshop on Graph Data Management Experiences and Systems, page 7. ACM, 2013.
[45]
M. Wittmann, T. Zeiser, G. Hager, and G. Wellein. Short note on costs of floating point operations on current x86-64 architectures: Denormals, overflow, underflow, and division by zero. CoRR, abs/1506.03997, 2015.

Cited By

View all
  • (2022)A Multi-target, Multi-paradigm DSL Compiler for Algorithmic Graph ProcessingProceedings of the 15th ACM SIGPLAN International Conference on Software Language Engineering10.1145/3567512.3567513(2-15)Online publication date: 29-Nov-2022
  • (2021)Grafs: declarative graph analyticsProceedings of the ACM on Programming Languages10.1145/34735885:ICFP(1-32)Online publication date: 19-Aug-2021
  • (2020)GraphOneACM Transactions on Storage10.1145/336418015:4(1-40)Online publication date: 16-Jan-2020
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the VLDB Endowment
Proceedings of the VLDB Endowment  Volume 9, Issue 13
September 2016
378 pages
ISSN:2150-8097
Issue’s Table of Contents

Publisher

VLDB Endowment

Publication History

Published: 01 September 2016
Published in PVLDB Volume 9, Issue 13

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)17
  • Downloads (Last 6 weeks)3
Reflects downloads up to 03 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2022)A Multi-target, Multi-paradigm DSL Compiler for Algorithmic Graph ProcessingProceedings of the 15th ACM SIGPLAN International Conference on Software Language Engineering10.1145/3567512.3567513(2-15)Online publication date: 29-Nov-2022
  • (2021)Grafs: declarative graph analyticsProceedings of the ACM on Programming Languages10.1145/34735885:ICFP(1-32)Online publication date: 19-Aug-2021
  • (2020)GraphOneACM Transactions on Storage10.1145/336418015:4(1-40)Online publication date: 16-Jan-2020
  • (2019)GRAPHONEProceedings of the 17th USENIX Conference on File and Storage Technologies10.5555/3323298.3323322(249-263)Online publication date: 25-Feb-2019
  • (2018)Start late or finish earlyProceedings of the VLDB Endowment10.14778/3282495.328250112:2(154-168)Online publication date: 1-Oct-2018
  • (2018)Analytics with smart arraysProceedings of the Thirteenth EuroSys Conference10.1145/3190508.3190514(1-15)Online publication date: 23-Apr-2018
  • (2018)G-COREProceedings of the 2018 International Conference on Management of Data10.1145/3183713.3190654(1421-1432)Online publication date: 27-May-2018
  • (2018)A Graph Database for a Virtualized Network InfrastructureProceedings of the 2018 International Conference on Management of Data10.1145/3183713.3190653(1393-1405)Online publication date: 27-May-2018
  • (2017)PGX.D/AsyncProceedings of the Fifth International Workshop on Graph Data-management Experiences & Systems10.1145/3078447.3078454(1-6)Online publication date: 19-May-2017

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media