short-paper

VerSaChI: Finding Statistically Significant Subgraph Matches using Chebyshev's Inequality

Authors:

Shubhangi Agarwal,

Sourav Dutta,

Arnab Bhattacharya

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management

Pages 2812 - 2816

https://doi.org/10.1145/3459637.3482217

Published: 30 October 2021

Abstract

Approximate subgraph matching is an important primitive for applications such as question answering, community detection, and motif discovery, and it often involves large labeled graphs such as knowledge graphs, social networks, and protein sequences. Effective methods for extracting subgraphs that match a query in both label and structural similarity should be accurate, computationally efficient, and robust to noise. In this paper, we propose VerSaChI for finding the top-k most similar subgraphs based on 2-hop label and structural overlap similarity with the query. The similarity is characterized using Chebyshev's inequality to compute the chi-square statistical significance, which measures the degree of matching of the subgraphs. Experiments on real-life graph datasets show significant improvements in accuracy over state-of-the-art methods, as well as robustness to noise.
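The abstract names two standard statistical tools without defining them. As a rough illustration only, the sketch below computes the generic Chebyshev tail bound, P(|X - mu| >= k*sigma) <= 1/k^2, and the Pearson chi-square statistic, the sum of (O_i - E_i)^2 / E_i over categories. It is not the VerSaChI algorithm: the abstract does not say how match categories or expected counts are derived, so the function names and the three-category example counts here are hypothetical.

```python
from typing import Sequence


def chebyshev_tail_bound(k: float) -> float:
    """Upper bound on P(|X - mu| >= k * sigma) for any finite-variance X."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Chebyshev gives 1 / k^2; a probability can never exceed 1.
    return min(1.0, 1.0 / (k * k))


def chi_square_statistic(observed: Sequence[float],
                         expected: Sequence[float]) -> float:
    """Pearson chi-square: sum over categories of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical example: neighbours of a candidate vertex bucketed into three
# assumed match-quality categories (strong, partial, none).
observed_counts = [6, 3, 1]   # assumed counts seen around the candidate
expected_counts = [2, 3, 5]   # assumed counts under a random-match null model

bound = chebyshev_tail_bound(2.0)
chi_sq = chi_square_statistic(observed_counts, expected_counts)
print(f"Chebyshev bound for k=2: {bound}")
print(f"Chi-square statistic: {chi_sq:.2f}")
```

A larger chi-square value means the observed counts deviate more from what the null model predicts, which is the general sense in which a candidate match can be called statistically significant.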

Supplementary Material

MP4 File (CIKM21-rgsp2563.mp4)

