research-article

Multi-Stage Network Embedding for Exploring Heterogeneous Edges

Published: 07 December 2020

Abstract

The relationships between objects in a network are typically diverse and complex, giving rise to heterogeneous edges that carry different semantic information. In this article, we focus on exploring heterogeneous edges for network representation learning. Treating each relationship as a view that depicts a specific type of proximity between nodes, we propose a multi-stage non-negative matrix factorization (MNMF) model that uses the abundant information in multiple views to learn robust network representations. Most existing network embedding methods are closely related to implicitly factorizing a complex proximity matrix. However, the approximation error is usually quite large, because a single low-rank matrix is insufficient to capture the original information. Through a multi-stage matrix factorization process motivated by gradient boosting, our MNMF model achieves a lower approximation error. In addition, the multi-stage structure of MNMF makes it feasible to design two kinds of non-negative matrix factorization (NMF) that preserve network information better: the united NMF aims to preserve the consensus information shared between different views, and the independent NMF aims to preserve the unique information of each view. Experimental results on real-world datasets indicate that our model outperforms three types of baselines in practical applications.
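
The boosting-style idea in the abstract, where each stage factorizes what earlier stages failed to approximate, can be illustrated with a minimal single-view sketch. This is not the authors' MNMF algorithm: the function names `nmf` and `multi_stage_nmf`, the clipping of the residual to non-negative values, and the least-squares step size are all assumptions made for illustration, and the united and independent NMF variants for multiple views are not modeled here.

```python
import numpy as np


def nmf(target, rank, n_iter=200, eps=1e-10, rng=None):
    """Plain NMF with Lee-Seung multiplicative updates (Frobenius loss)."""
    rng = np.random.default_rng(0) if rng is None else rng
    n, m = target.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ target) / (W.T @ W @ H + eps)
        W *= (target @ H.T) / (W @ H @ H.T + eps)
    return W, H


def multi_stage_nmf(X, rank, n_stages, seed=0):
    """Hypothetical multi-stage scheme: each stage fits the current residual."""
    rng = np.random.default_rng(seed)
    residual = X.astype(float).copy()
    stages = []
    errors = [np.linalg.norm(residual)]
    for _ in range(n_stages):
        # Assumption: NMF needs a non-negative target, so negative residual
        # entries are clipped to zero before factorizing.
        W, H = nmf(np.maximum(residual, 0.0), rank, rng=rng)
        approx = W @ H
        # Assumption: a non-negative least-squares step size, analogous to the
        # line search in gradient boosting; it can never increase the error.
        denom = np.sum(approx * approx)
        step = max(0.0, np.sum(residual * approx) / denom) if denom > 0 else 0.0
        residual = residual - step * approx
        stages.append((step * W, H))
        errors.append(np.linalg.norm(residual))
    return stages, errors


# Toy symmetric non-negative "proximity matrix" for 30 nodes.
rng = np.random.default_rng(42)
X = rng.random((30, 30))
X = (X + X.T) / 2

stages, errors = multi_stage_nmf(X, rank=4, n_stages=3)
# One possible node embedding: concatenate the per-stage factors.
embedding = np.hstack([W for W, _ in stages])
print(embedding.shape, [round(e, 3) for e in errors])
```

Because each stage is scaled by a least-squares step size against the true residual, the Frobenius error recorded in `errors` cannot increase from one stage to the next, which mirrors the abstract's claim that several low-rank stages approximate the proximity matrix better than a single one.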

Cited By

  • (2023) Heterogeneous Network Embedding: A Survey. Computer Modeling in Engineering & Sciences 137:1 (83-130). DOI: 10.32604/cmes.2023.024781. Online publication date: 2023

Information & Contributors

Information

Published In

ACM Transactions on Knowledge Discovery from Data, Volume 15, Issue 1
February 2021
361 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/3441647
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 December 2020
Accepted: 01 July 2020
Revised: 01 June 2020
Received: 01 December 2019
Published in TKDD Volume 15, Issue 1

Author Tags

  1. Network embedding
  2. data mining
  3. non-negative matrix factorization

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • National Natural Science Foundation of China
