[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/2736277.2741093acmotherconferencesArticle/Chapter ViewAbstractPublication PagesthewebconfConference Proceedingsconference-collections
research-article

LINE: Large-scale Information Network Embedding

Published: 18 May 2015 Publication History

Abstract

This paper studies the problem of embedding very large information networks into low-dimensional vector spaces, which is useful in many tasks such as visualization, node classification, and link prediction. Most existing graph embedding methods do not scale for real world information networks which usually contain millions of nodes. In this paper, we propose a novel network embedding method called the ``LINE,'' which is suitable for arbitrary types of information networks: undirected, directed, and/or weighted. The method optimizes a carefully designed objective function that preserves both the local and global network structures. An edge-sampling algorithm is proposed that addresses the limitation of the classical stochastic gradient descent and improves both the effectiveness and the efficiency of the inference. Empirical experiments prove the effectiveness of the LINE on a variety of real-world information networks, including language networks, social networks, and citation networks. The algorithm is very efficient, which is able to learn the embedding of a network with millions of vertices and billions of edges in a few hours on a typical single machine. The source code of the LINE is available online\footnote{\url{https://github.com/tangjianpku/LINE}}.

References

[1]
A. Ahmed, N. Shervashidze, S. Narayanamurthy, V. Josifovski, and A. J. Smola. Distributed large-scale natural graph factorization. In Proceedings of the 22nd international conference on World Wide Web, pages 37--48. International World Wide Web Conferences Steering Committee, 2013.
[2]
M. Belkin and P. Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. In NIPS, volume 14, pages 585--591, 2001.
[3]
S. Bhagat, G. Cormode, and S. Muthukrishnan. Node classification in social networks. In Social Network Data Analytics, pages 115--148. Springer, 2011.
[4]
T. F. Cox and M. A. Cox. Multidimensional scaling. CRC Press, 2000.
[5]
J. R. Firth. A synopsis of linguistic theory, 1930--1955. In J. R. Firth (Ed.), Studies in linguistic analysis, pages 1--32.
[6]
M. S. Granovetter. The strength of weak ties. American journal of sociology, pages 1360--1380, 1973.
[7]
Q. Le and T. Mikolov. Distributed representations of sentences and documents. In Proceedings of The 31st International Conference on Machine Learning, pages 1188--1196, 2014.
[8]
O. Levy and Y. Goldberg. Neural word embedding as implicit matrix factorization. In Advances in Neural Information Processing Systems, pages 2177--2185, 2014.
[9]
A. Q. Li, A. Ahmed, S. Ravi, and A. J. Smola. Reducing the sampling complexity of topic models. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 891--900. ACM, 2014.
[10]
D. Liben-Nowell and J. Kleinberg. The link-prediction problem for social networks. Journal of the American society for information science and technology, 58(7):1019--1031, 2007.
[11]
C. D. Manning, P. Raghavan, and H. Schutze. Introduction to information retrieval, volume 1. Cambridge university press Cambridge, 2008.
[12]
T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013.
[13]
T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, pages 3111--3119, 2013.
[14]
S. A. Myers, A. Sharma, P. Gupta, and J. Lin. Information network or social network?: the structure of the twitter follow graph. In Proceedings of the companion publication of the 23rd international conference on World wide web companion, pages 493--498. International World Wide Web Conferences Steering Committee, 2014.
[15]
L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: Bringing order to the web. 1999.
[16]
B. Perozzi, R. Al-Rfou, and S. Skiena. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 701--710. ACM, 2014.
[17]
B. Recht, C. Re, S. Wright, and F. Niu. Hogwild: A lock-free approach to parallelizing stochastic gradient descent. In Advances in Neural Information Processing Systems, pages 693--701, 2011.
[18]
S. T. Roweis and L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500):2323--2326, 2000.
[19]
J. Tang, J. Zhang, L. Yao, J. Li, L. Zhang, and Z. Su. Arnetminer: extraction and mining of academic social networks. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 990--998. ACM, 2008.
[20]
J. B. Tenenbaum, V. De Silva, and J. C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500):2319--2323, 2000.
[21]
L. Van der Maaten and G. Hinton. Visualizing data using t-sne. Journal of Machine Learning Research, 9(2579-2605):85, 2008.
[22]
S. Yan, D. Xu, B. Zhang, H.-J. Zhang, Q. Yang, and S. Lin. Graph embedding and extensions: a general framework for dimensionality reduction. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 29(1):40--51, 2007.
[23]
X. Yu, X. Ren, Y. Sun, Q. Gu, B. Sturt, U. Khandelwal, B. Norick, and J. Han. Personalized entity recommendation: A heterogeneous information network approach. In Proceedings of the 7th ACM international conference on Web search and data mining, pages 283--292. ACM, 2014.

Cited By

View all
  • (2025)Heterogeneous Spatio-Temporal Graph Contrastive Learning for Point-of-Interest RecommendationTsinghua Science and Technology10.26599/TST.2023.901014830:1(186-197)Online publication date: Feb-2025
  • (2025)Context Correlation Discrepancy Analysis for Graph Anomaly DetectionIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.348837537:1(174-187)Online publication date: Jan-2025
  • (2025)Network alignmentPhysics Reports10.1016/j.physrep.2024.11.0061107(1-45)Online publication date: Mar-2025
  • Show More Cited By

Index Terms

  1. LINE: Large-scale Information Network Embedding

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Other conferences
    WWW '15: Proceedings of the 24th International Conference on World Wide Web
    May 2015
    1460 pages
    ISBN:9781450334693

    Sponsors

    • IW3C2: International World Wide Web Conference Committee

    In-Cooperation

    Publisher

    International World Wide Web Conferences Steering Committee

    Republic and Canton of Geneva, Switzerland

    Publication History

    Published: 18 May 2015

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. dimension reduction
    2. feature learning
    3. information network embedding
    4. scalability

    Qualifiers

    • Research-article

    Funding Sources

    • National Science Foundation
    • National Natural Science Foundation of China

    Conference

    WWW '15
    Sponsor:
    • IW3C2

    Acceptance Rates

    WWW '15 Paper Acceptance Rate 131 of 929 submissions, 14%;
    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)735
    • Downloads (Last 6 weeks)79
    Reflects downloads up to 10 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2025)Heterogeneous Spatio-Temporal Graph Contrastive Learning for Point-of-Interest RecommendationTsinghua Science and Technology10.26599/TST.2023.901014830:1(186-197)Online publication date: Feb-2025
    • (2025)Context Correlation Discrepancy Analysis for Graph Anomaly DetectionIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.348837537:1(174-187)Online publication date: Jan-2025
    • (2025)Network alignmentPhysics Reports10.1016/j.physrep.2024.11.0061107(1-45)Online publication date: Mar-2025
    • (2025)Dynamic multi-scale feature augmentation for inductive network representation learningPattern Recognition10.1016/j.patcog.2024.111250161(111250)Online publication date: May-2025
    • (2025)Heterogeneous graph representation learning via mutual information estimation for fraud detectionJournal of Network and Computer Applications10.1016/j.jnca.2024.104046234(104046)Online publication date: Feb-2025
    • (2025)LeaDCDInformation Sciences: an International Journal10.1016/j.ins.2024.121341686:COnline publication date: 1-Jan-2025
    • (2025)Dynamic clustering-based consensus model for large-scale group decision-making considering overlapping communitiesInformation Fusion10.1016/j.inffus.2024.102743115(102743)Online publication date: Mar-2025
    • (2025)Variational spatial–temporal graph attention network for state monitoring and forecastingExpert Systems with Applications10.1016/j.eswa.2024.125718264(125718)Online publication date: Mar-2025
    • (2025)A transfer learning based patent transfer prediction model: Discussion about the condition of China and U.S.Expert Systems with Applications10.1016/j.eswa.2024.125335259(125335)Online publication date: Jan-2025
    • (2025)Software bug prediction using graph neural networks and graph-based text representationsExpert Systems with Applications10.1016/j.eswa.2024.125290259(125290)Online publication date: Jan-2025
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media