[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

Efficient Tree-SVD for Subset Node Embedding over Large Dynamic Graphs

Published: 30 May 2023 Publication History

Abstract

Subset embedding is the task to learn low-dimensional representations for a subset of nodes according to the graph topology. It has applications when we focus on a subset of users, e.g., young adults, and aim to make better recommendations for these target users. In real-world scenarios, graphs are dynamically changing. Thus, it is more desirable to dynamically maintain the subset embeddings to reflect graph updates. The state-of-the-art methods, e.g., DynPPE, still adopt a hashing-based method, while hashing-based solutions are shown to be less effective than matrix factorization (MF)-based methods in existing studies. At the same time, MF-based methods in the literature are too expensive to update the embedding when the graph changes, making them inapplicable on dynamic graphs.
Motivated by this, we present Tree-SVD, an efficient and effective MF-based method for dynamic subset embedding. If we simply maintain the whole proximity matrix, then we need to re-do the MF, e.g., truncated Singular Value Decomposition (SVD), on the whole matrix after graph updates, which is prohibitive. To tackle this issue, our main idea is to do hierarchical SVD (HSVD) on the proximity matrix of the given subset, which vertically divides the proximity matrix into multiple sub-matrices, and then repeatedly do SVD on sub-matrices and merge the intermediate results to obtain the final embedding. We first present Tree-SVD, which combines a sparse randomized SVD with an HSVD. Our theoretical analysis shows that our Tree-SVD gains the efficiency of sparse randomized SVD and the flexibility of the HSVD with theoretical guarantees. To further reduce update costs, we present a lazy-update strategy. In this strategy, we only update sub-matrices that changes remarkably in terms of the Frobenius norm. We present theoretical analysis to show the guarantees with our lazy-update strategy. Extensive experiments show the efficiency and effectiveness of Tree-SVD on node classification and link prediction tasks.

Supplemental Material

MP4 File
Presentation video for SIGMOD 2023

References

[1]
2023. Technical report and source code. https://github.com/DoThingYo/Tree-Embedding.
[2]
Reid Andersen, Fan Chung, and Kevin Lang. 2006. Local Graph Partitioning using PageRank Vectors. In FOCS. 475--486.
[3]
Xu Chen, Siheng Chen, Jiangchao Yao, Huangjie Zheng, Ya Zhang, and Ivor W Tsang. 2022. Learning on attribute-missing graphs. TPAMI 44, 2 (2022), 740--757.
[4]
Kenneth L Clarkson and David P Woodruff. 2017. Low-rank approximation and regression in input sparsity time. JACM 63, 6 (2017), 1--45.
[5]
Lun Du, Yun Wang, Guojie Song, Zhicong Lu, and Junshan Wang. 2018. Dynamic Network Embedding : An Extended Approach for Skip-gram based Network Embedding. In IJCAI. 2086--2092.
[6]
Xu Feng, Yuyang Xie, Mingye Song, Wenjian Yu, and Jie Tang. 2018. Fast Randomized PCA for Sparse Data. In ACML. 710--725.
[7]
Aditya Grover and Jure Leskovec. 2016. Node2vec: Scalable Feature Learning for Networks. In SIGKDD. 855--864.
[8]
Xingzhi Guo, Baojian Zhou, and Steven Skiena. 2021. Subset Node Representation Learning over Large Dynamic Graphs. In SIGKDD. 516--526.
[9]
Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. In NeurIPS.
[10]
Xiangnan He, Kuan Deng, Xiang Wang, Yan Li, YongDong Zhang, and Meng Wang. 2020. LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation. In SIGIR. 639--648.
[11]
Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Comput. 9, 8 (1997), 1735--1780.
[12]
Guanhao Hou, Xingguang Chen, Sibo Wang, and Zhewei Wei. 2021. Massively parallel algorithms for personalized pagerank. PVLDB 14, 9 (2021), 1668--1680.
[13]
Guanhao Hou, Qintian Guo, Fangyuan Zhang, Sibo Wang, and Zhewei Wei. 2023. Personalized PageRank on Evolving Graphs with an Incremental Index-Update Scheme. PACMMOD 1, 1 (2023), 25:1--25:26.
[14]
Mark A Iwen and BW Ong. 2016. A distributed and incremental SVD algorithm for agglomerative data analysis on large networks. SIMAX 37, 4 (2016), 1699--1718.
[15]
Seyed Mehran Kazemi, Rishab Goel, Kshitij Jain, Ivan Kobyzev, Akshay Sethi, Peter Forsyth, and Pascal Poupart. 2020. Representation Learning for Dynamic Graphs: A Survey. JMLR 21, 70 (2020), 1--73.
[16]
Thomas N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In ICLR.
[17]
Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue Moon. 2010. What is Twitter, a Social Network or a News Media?. In WWW. 591--600.
[18]
Peter Lofgren. 2015. Efficient algorithms for personalized pagerank. Stanford University.
[19]
Yao Ma, Ziyi Guo, Zhaocun Ren, Jiliang Tang, and Dawei Yin. 2020. Streaming Graph Neural Networks. In SIGIR. 719--728.
[20]
Sedigheh Mahdavi, Shima Khoshraftar, and Aijun An. 2018. dynnode2vec: Scalable Dynamic Network Embedding. In ICBD. 3762--3765.
[21]
Franco Manessi, Alessandro Rozza, and Mario Manzo. 2020. Dynamic graph convolutional networks. Pattern Recognition 97 (2020).
[22]
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Distributed Representations of Words and Phrases and their Compositionality. In NeurIPS. 3111--3119.
[23]
Giang Hoang Nguyen, John Boaz Lee, Ryan A. Rossi, Nesreen K. Ahmed, Eunyee Koh, and Sungchul Kim. 2018. Continuous-Time Dynamic Network Embeddings. In WWW Companion. 969--976.
[24]
Naoto Ohsaka, Takanori Maehara, and Ken-ichi Kawarabayashi. 2015. Efficient PageRank Tracking in Evolving Networks. 875--884.
[25]
Mingdong Ou, Peng Cui, Jian Pei, Ziwei Zhang, and Wenwu Zhu. 2016. Asymmetric Transitivity Preserving Graph Embedding. In SIGKDD. 1105--1114.
[26]
Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. 1999. The PageRank citation ranking: bringing order to the web. (1999).
[27]
Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. DeepWalk: Online Learning of Social Representations. In SIGKDD. 701--710.
[28]
Jiezhong Qiu, Yuxiao Dong, Hao Ma, Jian Li, Chi Wang, Kuansan Wang, and Jie Tang. 2019. NetSMF: Large-Scale Network Embedding As Sparse Matrix Factorization. In WWW. 1509--1520.
[29]
Jiezhong Qiu, Yuxiao Dong, Hao Ma, Jian Li, Kuansan Wang, and Jie Tang. 2018. Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and Node2vec. In WSDM. 459--467.
[30]
Mingyue Tang, Pan Li, and Carl Yang. 2022. Graph Auto-Encoder via Neighborhood Wasserstein Reconstruction. In ICLR.
[31]
Rakshit Trivedi, Mehrdad Farajtabar, Prasenjeet Biswal, and Hongyuan Zha. 2019. DyRep: Learning Representations over Dynamic Graphs. In ICLR.
[32]
Anton Tsitsulin, Marina Munkhoeva, Davide Mottin, Panagiotis Karras, Ivan Oseledets, and Emmanuel Müller. 2021. FREDE: Anytime Graph Embeddings. PVLDB 14, 6 (2021), 1102--1110.
[33]
Hanzhi Wang, Zhewei Wei, Junhao Gan, Sibo Wang, and Zengfeng Huang. 2020. Personalized pagerank to a target node, revisited. In SIGKDD. 657--667.
[34]
Sibo Wang, Youze Tang, Xiaokui Xiao, Yin Yang, and Zengxiang Li. 2016. Hubppr: effective indexing for approximate personalized pagerank. PVLDB 10, 3 (2016), 205--216.
[35]
Sibo Wang, Renchi Yang, Runhui Wang, Xiaokui Xiao, Zhewei Wei, Wenqing Lin, Yin Yang, and Nan Tang. 2019. Efficient algorithms for approximate single-source personalized pagerank queries. TODS 44, 4 (2019), 1--37.
[36]
Sibo Wang, Renchi Yang, Xiaokui Xiao, Zhewei Wei, and Yin Yang. 2017. FORA: Simple and Effective Approximate Single-Source Personalized PageRank. In SIGKDD. 505--514.
[37]
Zhewei Wei, Xiaodong He, Xiaokui Xiao, Sibo Wang, Shuo Shang, and Ji-Rong Wen. 2018. Topppr: top-k personalized pagerank queries with precision guarantees on large graphs. In SIGMOD. 441--456.
[38]
Renchi Yang, Jieming Shi, Xiaokui Xiao, Yin Yang, and Sourav S. Bhowmick. 2020. Homogeneous Network Embedding for Massive Graphs via Reweighted Personalized PageRank. PVLDB 13, 5 (2020), 670--683.
[39]
Yuan Yin and Zhewei Wei. 2019. Scalable graph embeddings via sparse transpose proximities. In SIGKDD. 1429--1437.
[40]
Wenchao Yu, Wei Cheng, Charu C Aggarwal, Haifeng Chen, and Wei Wang. 2017. Link Prediction with Spatial and Temporal Consistency in Dynamic Networks. In IJCAI. 3343--3349.
[41]
Hongyang Zhang, Peter Lofgren, and Ashish Goel. 2016. Approximate personalized pagerank on dynamic graphs. In SIGKDD. 1315--1324.
[42]
Xingyi Zhang, Kun Xie, Sibo Wang, and Zengfeng Huang. 2021. Learning Based Proximity Matrix Factorization for Node Embedding. In SIGKDD. 2243--2253.
[43]
Ziwei Zhang, Peng Cui, Haoyang Li, Xiao Wang, and Wenwu Zhu. 2018. Billion-scale network embedding with iterative random projection. In ICDM. 787--796.
[44]
Ziwei Zhang, Peng Cui, Jian Pei, Xiao Wang, and Wenwu Zhu. 2018. TIMERS: Error-Bounded SVD Restart on Dynamic Networks. In AAAI. 224--231.

Cited By

View all
  • (2024)PSNE: Efficient Spectral Sparsification Algorithms for Scaling Network EmbeddingProceedings of the 33rd ACM International Conference on Information and Knowledge Management10.1145/3627673.3679540(1420-1429)Online publication date: 21-Oct-2024
  • (2024)Towards Deeper Understanding of PPR-based Embedding Approaches: A Topological PerspectiveProceedings of the ACM Web Conference 202410.1145/3589334.3645663(969-979)Online publication date: 13-May-2024
  • (2024)TimeSGN: Scalable and Effective Temporal Graph Neural Network2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00255(3297-3310)Online publication date: 13-May-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Management of Data
Proceedings of the ACM on Management of Data  Volume 1, Issue 1
PACMMOD
May 2023
2807 pages
EISSN:2836-6573
DOI:10.1145/3603164
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 May 2023
Published in PACMMOD Volume 1, Issue 1

Permissions

Request permissions for this article.

Author Tags

  1. dynamic graph
  2. proximity matrix factorization
  3. subset node embedding
  4. tree-SVD

Qualifiers

  • Research-article

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)125
  • Downloads (Last 6 weeks)12
Reflects downloads up to 27 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)PSNE: Efficient Spectral Sparsification Algorithms for Scaling Network EmbeddingProceedings of the 33rd ACM International Conference on Information and Knowledge Management10.1145/3627673.3679540(1420-1429)Online publication date: 21-Oct-2024
  • (2024)Towards Deeper Understanding of PPR-based Embedding Approaches: A Topological PerspectiveProceedings of the ACM Web Conference 202410.1145/3589334.3645663(969-979)Online publication date: 13-May-2024
  • (2024)TimeSGN: Scalable and Effective Temporal Graph Neural Network2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00255(3297-3310)Online publication date: 13-May-2024
  • (2024)Incorporating Dynamic Temperature Estimation into Contrastive Learning on Graphs2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00224(2889-2903)Online publication date: 13-May-2024
  • (2024)PlatoD2GL: An Efficient Dynamic Deep Graph Learning System for Graph Neural Network Training on Billion-Scale Graphs2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00191(2421-2434)Online publication date: 13-May-2024
  • (2024)FICOM: an effective and scalable active learning framework for GNNs on semi-supervised node classificationThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-024-00870-z33:5(1723-1742)Online publication date: 22-Jul-2024
  • (2023)Constrained Social Community RecommendationProceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3580305.3599793(5586-5596)Online publication date: 6-Aug-2023

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media