[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

Graph pattern matching: from intractable to polynomial time

Published: 01 September 2010 Publication History

Abstract

Graph pattern matching is typically defined in terms of subgraph isomorphism, which makes it an np-complete problem. Moreover, it requires bijective functions, which are often too restrictive to characterize patterns in emerging applications. We propose a class of graph patterns, in which an edge denotes the connectivity in a data graph within a predefined number of hops. In addition, we define matching based on a notion of bounded simulation, an extension of graph simulation. We show that with this revision, graph pattern matching can be performed in cubic-time, by providing such an algorithm. We also develop algorithms for incrementally finding matches when data graphs are updated, with performance guarantees for dag patterns. We experimentally verify that these algorithms scale well, and that the revised notion of graph pattern matching allows us to identify communities commonly found in real-world networks.

References

[1]
S. Amer-Yahia, M. Benedikt, and P. Bohannon. Challenges in searching online communities. IEEE Data Eng. Bull., 30(2):23--31, 2007.
[2]
J. Bang-Jensen and G. Z. Gutin. Digraphs: Theory, Algorithms and Applications. Springer, 2008.
[3]
T. Y. Berger-Wolf and J. Saia. A framework for analysis of dynamic social networks. In KDD, 2006.
[4]
N. Bruno, N. Koudas, and D. Srivastava. Holistic twig joins: optimal XML pattern matching. In SIGMOD, 2002.
[5]
E. P. Chan and H. Lim. Optimization and evaluation of shortest path queries. VLDB J., 16(3):343--369, 2007.
[6]
L. Chen, A. Gupta, and M. E. Kurul. Stack-based algorithms for pattern matching on dags. In VLDB, 2005.
[7]
J. Cheng, J. X. Yu, B. Ding, P. S. Yu, and H. Wang. Fast graph pattern matching. In ICDE, 2008.
[8]
J. Cheng, J. X. Yu, X. Lin, H. Wang, and P. S. Yu. Fast computing reachability labelings for large graphs with high compression rate. In EDBT, 2008.
[9]
J. Cho, N. Shivakumar, and H. Garcia-Molina. Finding replicated Web collections. SIGMOD Rec., 29(2), 2000.
[10]
E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. SICOMP, 32(5), 2003.
[11]
W. Fan and P. Bohannon. Information preserving XML schema embedding. TODS, 33(1), 2008.
[12]
W. Fan, J. Li, S. Ma, H. Wang, and Y. Wu. Graph homomorphism revisited for graph matching. PVLDB, 3, 2010.
[13]
B. Gallagher. Matching structure and semantics: A survey on graph-based pattern matching. AAAI FS., 2006.
[14]
H. He and A. K. Singh. Graphs-at-a-time: query language and access methods for graph databases. In SIGMOD, 2009.
[15]
M. R. Henzinger, T. Henzinger, and P. Kopke. Computing simulations on finite and infinite graphs. In FOCS, 1995.
[16]
R. Jin, Y. Xiang, N. Ruan, and D. Fuhry. 3-hop: a high-compression indexing scheme for reachability query. In SIGMOD, 2009.
[17]
T. Milo and D. Suciu. Index structures for path expressions. In ICDT, 1999.
[18]
L. D. Nardo, F. Ranzato, and F. Tapparo. The subgraph similarity problem. TKDE, 21(5):748--749, 2009.
[19]
M. Natarajan. Understanding the structure of a drug trafficking organization: a conversational analysis. Crime Prevention Studies, 11:273--298, 2000.
[20]
G. Ramalingam and T. Reps. An incremental algorithm for a generalization of the shortest-path problem. J. Algorithms, 21(2):267--305, 1996.
[21]
G. Ramalingam and T. Reps. On the computational complexity of dynamic graph problems. TCS, 158(1--2), 1996.
[22]
G. Ramalingam and T. W. Reps. A categorized bibliography on incremental computation. In POPL, 1993.
[23]
R. Ronen and O. Shmueli. SoQL: A language for querying and creating data in social networks. In ICDE, 2009.
[24]
D. Saha. An incremental bisimulation algorithm. In FSTTCS, 2007.
[25]
D. Shasha, J. T. L. Wang, and R. Giugno. Algorithmics and applications of tree and graph searching. In PODS, 2002.
[26]
L. Terveen and D. W. McDonald. Social matching: A framework and research agenda. ACM Trans. Comput.-Hum. Interact., 12(3), 2005.
[27]
H. Tong, C. Faloutsos, B. Gallagher, and T. Eliassi-Rad. Fast best-effort pattern matching in large attributed graphs. In KDD, 2007.
[28]
J. R. Ullmann. An algorithm for subgraph isomorphism. J. ACM, 23(1), 1976.
[29]
H. Wang, H. He, J. Yang, P. S. Yu, and J. X. Yu. Dual labeling: Answering graph reachability queries in constant time. In ICDE, 2006.
[30]
K. Yi, H. He, I. Stanoi, and J. Yang. Incremental maintenance of XML structural indexes. In SIGMOD, 2004.
[31]
L. Zou, L. Chen, and M. T. Özsu. Distance-join: Pattern match query in a large graph database. In VLDB, 2009.

Cited By

View all
  1. Graph pattern matching: from intractable to polynomial time

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image Proceedings of the VLDB Endowment
      Proceedings of the VLDB Endowment  Volume 3, Issue 1-2
      September 2010
      1658 pages

      Publisher

      VLDB Endowment

      Publication History

      Published: 01 September 2010
      Published in PVLDB Volume 3, Issue 1-2

      Qualifiers

      • Research-article

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)54
      • Downloads (Last 6 weeks)2
      Reflects downloads up to 17 Dec 2024

      Other Metrics

      Citations

      Cited By

      View all
      • (2024)Linking Entities across Relations and GraphsACM Transactions on Database Systems10.1145/363936349:1(1-50)Online publication date: 3-Jan-2024
      • (2024)Scalable Diversified Top-k Pattern Matching in Big GraphsBig Data Research10.1016/j.bdr.2024.10046436:COnline publication date: 28-May-2024
      • (2024)Towards efficient simulation-based constrained temporal graph pattern matchingWorld Wide Web10.1007/s11280-024-01259-227:3Online publication date: 3-Apr-2024
      • (2024)GPU-accelerated relaxed graph pattern matching algorithmsThe Journal of Supercomputing10.1007/s11227-024-06283-780:15(21811-21836)Online publication date: 1-Oct-2024
      • (2024)Temporal graph patterns by timed automataThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-023-00795-z33:1(25-47)Online publication date: 1-Jan-2024
      • (2023)CommunityAF: An Example-Based Community Search Method via Autoregressive FlowProceedings of the VLDB Endowment10.14778/3603581.360359516:10(2565-2577)Online publication date: 1-Jun-2023
      • (2023)Semi-Supervised Graph Pattern Matching and Rematching for Expert Community LocationACM Transactions on Knowledge Discovery from Data10.1145/353262317:1(1-26)Online publication date: 20-Feb-2023
      • (2023)Event Association Analysis Using Graph RulesArtificial Neural Networks and Machine Learning – ICANN 202310.1007/978-3-031-44216-2_29(352-363)Online publication date: 26-Sep-2023
      • (2023)A Scalable Query Pricing Framework for Incomplete Graph DataDatabase Systems for Advanced Applications10.1007/978-3-031-30637-2_7(97-113)Online publication date: 17-Apr-2023
      • (2022)RapidFlowProceedings of the VLDB Endowment10.14778/3551793.355180315:11(2415-2427)Online publication date: 1-Jul-2022
      • Show More Cited By

      View Options

      Login options

      Full Access

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media