[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1109/ICDE.2012.109guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

On Top-k Structural Similarity Search

Published: 01 April 2012 Publication History

Abstract

Search for objects similar to a given query object in a network has numerous applications including web search and collaborative filtering. We use the notion of structural similarity to capture the commonality of two objects in a network, e.g., if two nodes are referenced by the same node, they may be similar. Meeting-based methods including SimRank and P-Rank capture structural similarity very well. Deriving inspiration from PageRank, SimRank has gained popularity by a natural intuition and domain independence. Since it's computationally expensive, subsequent work has focused on optimizing and approximating the computation of SimRank. In this paper, we approach SimRank from a top-k querying perspective where given a query node v, we are interested in finding the top-k nodes that have the highest SimRank score w.r.t. v. The only known approaches for answering such queries are either a naive algorithm of computing the similarity matrix for all node pairs or computing the similarity vector by comparing the query node v with each other node independently, and then picking the top-k. None of these approaches can handle top-k structural similarity search efficiently by scaling to very large graphs consisting of millions of nodes. We propose an algorithmic framework called TopSim based on transforming the top-k SimRank problem on a graph G to one of finding the top-k nodes with highest authority on the product graph G G. We further accelerate Top Sim by merging similarity paths and develop a more efficient algorithm called Top Sim-SM. Two heuristic algorithms, Trun-Top Sim-SM and Prio-Top Sim-SM, are also proposed to approximate Top Sim-SM on scale-free graphs to trade accuracy for speed, based on truncated random walk and prioritizing propagation respectively. We analyze the accuracy and performance of Top Sim family algorithms and report the results of a detailed experimental study.

Cited By

View all
  • (2023)Efficient and Accurate SimRank-Based Similarity Joins: Experiments, Analysis, and ImprovementProceedings of the VLDB Endowment10.14778/3636218.363621917:4(617-629)Online publication date: 1-Dec-2023
  • (2023)Effective and Scalable Manifold Ranking-Based Image Retrieval with Output BoundACM Transactions on Knowledge Discovery from Data10.1145/356557417:5(1-31)Online publication date: 7-Apr-2023
  • (2022)Scaling High-Quality Pairwise Link-Based Similarity Retrieval on Billion-Edge GraphsACM Transactions on Information Systems10.1145/349520940:4(1-45)Online publication date: 11-Jan-2022
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings
ICDE '12: Proceedings of the 2012 IEEE 28th International Conference on Data Engineering
April 2012
1432 pages
ISBN:9780769547473

Publisher

IEEE Computer Society

United States

Publication History

Published: 01 April 2012

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 19 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2023)Efficient and Accurate SimRank-Based Similarity Joins: Experiments, Analysis, and ImprovementProceedings of the VLDB Endowment10.14778/3636218.363621917:4(617-629)Online publication date: 1-Dec-2023
  • (2023)Effective and Scalable Manifold Ranking-Based Image Retrieval with Output BoundACM Transactions on Knowledge Discovery from Data10.1145/356557417:5(1-31)Online publication date: 7-Apr-2023
  • (2022)Scaling High-Quality Pairwise Link-Based Similarity Retrieval on Billion-Edge GraphsACM Transactions on Information Systems10.1145/349520940:4(1-45)Online publication date: 11-Jan-2022
  • (2020)SimTabProceedings of the VLDB Endowment10.14778/3407790.340781913:12(2202-2214)Online publication date: 14-Sep-2020
  • (2020)Realtime index-free single source SimRank processing on web-scale graphsProceedings of the VLDB Endowment10.14778/3384345.338434713:7(966-980)Online publication date: 26-Mar-2020
  • (2020)Personalized PageRank to a Target Node, RevisitedProceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining10.1145/3394486.3403108(657-667)Online publication date: 23-Aug-2020
  • (2020)Exact Single-Source SimRank Computation on Large GraphsProceedings of the 2020 ACM SIGMOD International Conference on Management of Data10.1145/3318464.3389781(653-663)Online publication date: 11-Jun-2020
  • (2019)PRSimProceedings of the 2019 International Conference on Management of Data10.1145/3299869.3319873(1042-1059)Online publication date: 25-Jun-2019
  • (2018)Dynamical SimRank search on time-varying networksThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-017-0488-z27:1(79-104)Online publication date: 1-Feb-2018
  • (2017)ProbesimProceedings of the VLDB Endowment10.14778/3151113.315111511:1(14-26)Online publication date: 1-Sep-2017
  • Show More Cited By

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media