[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/2187836.2187963acmotherconferencesArticle/Chapter ViewAbstractPublication PagesthewebconfConference Proceedingsconference-collections
research-article

Distributed graph pattern matching

Published: 16 April 2012 Publication History

Abstract

Graph simulation has been adopted for pattern matching to reduce the complexity and capture the need of novel applications. With the rapid development of the Web and social networks, data is typically distributed over multiple machines. Hence a natural question raised is how to evaluate graph simulation on distributed data. To our knowledge, no such distributed algorithms are in place yet. This paper settles this question by providing evaluation algorithms and optimizations for graph simulation in a distributed setting. (1) We study the impacts of components and data locality on the evaluation of graph simulation. (2) We give an analysis of a large class of distributed algorithms, captured by a message-passing model, for graph simulation. We also identify three complexity measures: visit times, makespan and data shipment, for analyzing the distributed algorithms, and show that these measures are essentially controversial with each other. (3) We propose distributed algorithms and optimization techniques that exploit the properties of graph simulation and the analyses of distributed algorithms. (4) We experimentally verify the effectiveness and efficiency of these algorithms, using both real-life and synthetic data.

References

[1]
Data center. http://wikibon.org/blog/inside-ten-of-the-worlds-largest-data-centers.
[2]
S. Abiteboul, P. Buneman, and D. Suciu. Data on the Web: From Relations to Semistructured Data and XML. Morgan Kaufmann, 1999.
[3]
C. C. Aggarwal and H. Wang. Managing and Mining Graph Data. Springer, 2010.
[4]
S. Amer-Yahia, M. Benedikt, and P. Bohannon. Challenges in searching online communities. IEEE Data Eng. Bull., 30(2):23--31, 2007.
[5]
J. Brynielsson, J. Hogberg, L. Kaati, C. Martenson, and P. Svenson. Detecting social positions using simulation. In ASONAM, 2010.
[6]
D. Bustan and O. Grumberg. Simulation-based minimization. TOCL, 4(2), 2003.
[7]
J. Cohen. Graph twiddling in a mapreduce world. Computing in Science and Engineering, 11(4):29--41, 2009.
[8]
G. Cong, W. Fan, and A. Kementsietsidis. Distributed query evaluation with performance guarantees. In SIGMOD, 2007.
[9]
T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. The MIT Press, 2001.
[10]
T. M. Cover and J. A. Thomas. Elements of information theory (2nd ed.) . Wiley, 2006.
[11]
J. Dean and S. Ghemawat. Mapreduce: Simplified data processing on large clusters. In OSDI, 2004.
[12]
W. Fan, J. Li, S. Ma, N. Tang, and Y. Wu. Adding regular expressions to graph reachability and pattern queries. In ICDE, 2011.
[13]
W. Fan, J. Li, S. Ma, N. Tang, Y. Wu, and Y. Wu. Graph pattern matching: From intractable to polynomial time. PVLDB, 3(1), 2010.
[14]
P.-O. Fjallstrom. Algorithms for graph partitioning: A survey. Linkoping Electronic Articles in Computer and Information Science, 3, 1998.
[15]
B. Gallagher. Matching structure and semantics: A survey on graph-based pattern matching. AAAI FS., 2006.
[16]
M. R. Henzinger, T. A. Henzinger, and P. W. Kopke. Computing simulations on finite and infinite graphs. In FOCS, 1995.
[17]
D. Kossmann. The state of the art in distributed query processing.ACM Comput. Surv., 32(4):422--469, 2000.
[18]
C. Liu, C. Chen, J. Han, and P. S. Yu. Gplag: detection of software plagiarism by program dependence graph analysis. In KDD, 2006.
[19]
N. A. Lynch. Distributed Algorithms. Morgan Kaufmann, 1996.
[20]
S. Ma, Y. Cao, W. Fan, J. Huai, and T. Wo. Capturing topology in graph pattern matching. In VLDB, 2012.
[21]
G. Malewicz, M.H. Austern, A.J.C. Bik, J.C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In SIGMOD, 2010.
[22]
M. Naor and L. J. Stockmeyer. What can be computed locally? SIAM J. Comput., 24(6):1259--1277, 1995.
[23]
F. Ranzato and F. Tapparo. An efficient simulation algorithm based on abstract interpretation.Inf. Comput., 208(1):1--22, 2010.
[24]
L. G. Terveen and D. W. McDonald. Social matching: A framework and research agenda. In ACM Trans. Comput.-Hum. Interact., pages 401--434, 2005.
[25]
Y. Tian and J. M. Patel. Tale: A tool for approximate large graph matching. In ICDE, 2008.
[26]
H. Tong, C. Faloutsos, B. Gallagher, and T. Eliassi-Rad. Fast best-effort pattern matching in large attributed graphs. In KDD, 2007.
[27]
J. R. Ullmann. An algorithm for subgraph isomorphism. J. ACM, 23(1):31--42, 1976.
[28]
V. V. Vazirani. Approximation Algorithms. Springer, 2003.

Cited By

View all
  • (2024)On The Soundness of a Language for Large and Distributed Graph Processing2024 IEEE 18th International Conference on Application of Information and Communication Technologies (AICT)10.1109/AICT61888.2024.10740415(1-6)Online publication date: 25-Sep-2024
  • (2024)A Survey of Tax Risk Detection Using Data Mining TechniquesEngineering10.1016/j.eng.2023.07.01434(43-59)Online publication date: Mar-2024
  • (2024)Towards efficient simulation-based constrained temporal graph pattern matchingWorld Wide Web10.1007/s11280-024-01259-227:3Online publication date: 3-Apr-2024
  • Show More Cited By

Index Terms

  1. Distributed graph pattern matching

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Other conferences
    WWW '12: Proceedings of the 21st international conference on World Wide Web
    April 2012
    1078 pages
    ISBN:9781450312295
    DOI:10.1145/2187836
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    • Univ. de Lyon: Universite de Lyon

    In-Cooperation

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 16 April 2012

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. distributed algorithms
    2. graph querying
    3. graph simulation

    Qualifiers

    • Research-article

    Conference

    WWW 2012
    Sponsor:
    • Univ. de Lyon
    WWW 2012: 21st World Wide Web Conference 2012
    April 16 - 20, 2012
    Lyon, France

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)25
    • Downloads (Last 6 weeks)2
    Reflects downloads up to 20 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)On The Soundness of a Language for Large and Distributed Graph Processing2024 IEEE 18th International Conference on Application of Information and Communication Technologies (AICT)10.1109/AICT61888.2024.10740415(1-6)Online publication date: 25-Sep-2024
    • (2024)A Survey of Tax Risk Detection Using Data Mining TechniquesEngineering10.1016/j.eng.2023.07.01434(43-59)Online publication date: Mar-2024
    • (2024)Towards efficient simulation-based constrained temporal graph pattern matchingWorld Wide Web10.1007/s11280-024-01259-227:3Online publication date: 3-Apr-2024
    • (2024)GPU-accelerated relaxed graph pattern matching algorithmsThe Journal of Supercomputing10.1007/s11227-024-06283-780:15(21811-21836)Online publication date: 16-Jun-2024
    • (2023)DecoMine: A Compilation-Based Graph Pattern Mining System with Pattern DecompositionProceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 110.1145/3567955.3567956(47-61)Online publication date: 25-Mar-2023
    • (2023)Efficient Anomaly Detection in Property GraphsDatabase Systems for Advanced Applications10.1007/978-3-031-30675-4_9(120-136)Online publication date: 15-Apr-2023
    • (2022)Optimizing subgraph matching over distributed knowledge graphs using partial evaluationWorld Wide Web10.1007/s11280-022-01075-626:2(751-771)Online publication date: 8-Jul-2022
    • (2021)Approximate computation for big data analyticsACM SIGWEB Newsletter10.1145/3447879.34478832021:Winter(1-8)Online publication date: 19-Feb-2021
    • (2021)SandslashProceedings of the 35th ACM International Conference on Supercomputing10.1145/3447818.3460359(378-391)Online publication date: 3-Jun-2021
    • (2021)A Survey on Distributed Graph Pattern Matching in Massive GraphsACM Computing Surveys10.1145/343972454:2(1-35)Online publication date: 9-Feb-2021
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media