Abstract
The detection of central nodes in a network is a fundamental task in network science and graph data analysis. During the past decades, numerous centrality measures have been presented to characterize what is a central node. However, few studies address this issue from a statistical inference perspective. In this paper, we formulate the central node identification issue as a weighted kernel density estimation problem on graphs. Such a formulation provides a generic framework for recognizing central nodes. On one hand, some existing centrality evaluation metrics can be unified under this framework through the manipulation of kernel functions. On the other hand, more effective methods for node centrality assessment can be developed based on proper weighting coefficient specification. Experimental results on 20 simulated networks and 53 real networks show that our method outperforms both six prior state-of-the-art centrality measures and two recently proposed centrality evaluation methods. To the best of our knowledge, this is the first piece of work that addresses the central node identification issue via weighted kernel density estimation.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Anderson RM, May RM (1992) Infectious diseases of humans: dynamics and control. Oxford University Press, New York
Batagelj V, Mrvar A (2006) Pajek datasets. http://vlado.fmf.uni-lj.si/pub/networks/data/
Bavelas A (1948) A mathematical model for group structures. Hum Organ 7(3):16–30
Bian R, Koh YS, Dobbie G, Divoli A (2019) Identifying top-k nodes in social networks: a survey. ACM Computing Surveys 52(1)
Boldi P, Vigna S (2014) Axioms for centrality. Internet Math 10(3–4):222–262
Bonacich P (1987) Power and centrality: a family of measures. Am J Sociol 92(5):1170–1182
Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. Comput Netw ISDN Syst 30:107–117
Callon M (1990) Techno-economic networks and irreversibility. Sociol Rev 38(1):132–161
Castellano C, Pastor-Satorras R (2010) Thresholds for epidemic spreading in networks. Phys Rev Lett 105(21):218701
Chen D, Lü L, Shang M-S, Zhang Y-C, Zhou T (2012) Identifying influential nodes in complex networks. Phys A Stat Mech Appl 391(4):1777–1787
Das K, Samanta S, Pal M (2018) Study on centrality measures in social networks: a survey. Soc Netw Anal Min 8:13
De Meo P, Levene M, Messina F, Provetti A (2020) A general centrality framework-based on node navigability. IEEE Trans Knowl Data Eng 32(11):2088–2100
Deng Z, Chung F-L, Wang S (2008) FRSDE: fast reduced set density estimator using minimal enclosing ball approximation. Pattern Recogn 41(4):1363–1372
Erdös P, Rényi A (1959) On random graphs I. Publ Mathematicae 6:290–297
Eubank S, Guclu H, Kumar VA, Marathe MV, Srinivasan A, Toroczkai Z, Wang N (2004) Modelling disease outbreaks in realistic urban social networks. Nature 429(6988):180–184
Fan T, Lü L, Shi D, Zhou T (2021) Characterizing cycle structure in complex networks. Commun Phys 4:272
Frank O (2002) Using centrality modeling in network surveys. Soc Netw 24(4):385–394
Freeman LC (1978) Centrality in social networks conceptual clarification. Soc Netw 1(3):215–239
Girolami M, He C (2003) Probability density estimation from optimally condensed data samples. IEEE Trans Pattern Anal Mach Intell 25(10):1253–1264
Goh KI, Kahng B, Kim D (2001) Universal behavior of load distribution in scale-free networks. Phys Rev Lett 87:278701
Gramacki A (2018) Nonparametric kernel density estimation and its computational aspects, vol 37. Springer, Gewerbestrasse
Gurney K (1997) An introduction to neural networks. UCL press, New York
He C, Girolami M (2004) Novelty detection employing an \({L}_2\) optimal non-parametric density estimator. Pattern Recogn Lett 25(12):1389–1397
Hirsch JE (2005) An index to quantify an individual’s scientific research output. Proc Natl Acad Sci USA 102(46):16569–16572
Jackson MO, Watts A (2002) The evolution of social and economic networks. J Econ Theory 106(2):265–295
Kendall MG (1938) A new measure of rank correlation. Biometrika 30(1/2):81–93
Kennedy D, Selverston AI, Remler MP (1969) Analysis of restricted neural networks. Science 164(3887):1488–1496
Kim TT, Poor HV (2011) Strategic protection against data injection attacks on power grids. IEEE Trans Smart Grid 2(2):326–333
Kiss IZ, Miller JC, Simon PL (2017) Mathematics of epidemics on networks. Springer, Gewerbestrasse
Liu X, Hong Z, Liu J, Lin Y, Rodríguez-Patón A, Zou Q, Zeng X (2020) Computational methods for identifying the critical nodes in biological networks. Brief Bioinform 21(2):486–497
Liu Y, Ruppert D (2021) Density estimation on a network. Comput Stat Data Anal 156:107128
Liu Y, Chen W, He Z (2021a) Essential protein recognition via community significance. IEEE Trans Comput Biol Bioinform 18(6):2788–2794
Liu Y, Wei X, Chen W, Hu L, He Z (2021b) A graph-traversal approach to identify influential nodes in a network. Patterns 2(9):100321
Liu Y, Liang H, Zou Q, He Z (2022) Significance-based essential protein discovery. IEEE Trans Comput Biol Bioinform 19(1):633–642
Lordan O, Sallan JM, Simo P (2014) Study of the topology and robustness of airline route networks from the complex network approach: a survey and research agenda. J Transp Geogr 37:112–120
Lordan O, Sallan JM, Escorihuela N, Gonzalez-Prieto D (2016) Robustness of airline route networks. Phys A Stat Mech Appl 445:18–26
Lü L, Chen D, Ren X, Zhang Q, Zhang Y, Zhou T (2016a) Vital nodes identification in complex networks. Phys Rep 650:1–63
Lü L, Zhou T, Zhang Q-M, Stanley HE (2016b) The H-index of a network node and its relation to degree and coreness. Nat Commun 7:10168
Miller JC, Ting T (2019) EoN (epidemics on networks): a fast, flexible python package for simulation, analytic approximation, and analysis of epidemics on networks. J Open Source Softw 4(44):1731
Morone F, Makse HA (2015) Influence maximization in complex networks through optimal percolation. Nature 524(7563):65–68
Newman ME (2002) Spread of epidemic disease on networks. Phys Rev E 66:016128
Okabe A, Satoh T, Sugihara K (2009) A kernel density estimation method for networks, its computational method and a GIS-based tool. Int J Geogr Inf Sci 23(1):7–32
Parzen E (1962) On estimation of a probability density function and mode. Ann Math Stat 33(3):1065–1076
Rossi RA, Ahmed NK (2015) The network data repository with interactive graph analytics and visualization. In: AAAI . https://networkrepository.com
Sabidussi G (1966) The centrality index of a graph. Psychometrika 31(4):581–603
Sain SR (1994) Adaptive kernel density estimation. Rice University, Houston
Siegel S (1956) Nonparametric statistics for the behavioral sciences. McGraw-hill, New York
Stephenson K, Zelen M (1989) Rethinking centrality: methods and examples. Soc Netw 11(1):1–37
Terrell GR, Scott DW (1992) Variable kernel density estimation. Ann Stat 20:1236–1265
Wang T-C, Phoa FKH (2016) Focus statistics for testing network centrality on uncorrelated random graphs. Netw Sci 4(4):460–473
Wang S, Wang J, Chung F.I (2014) Kernel density estimation, kernel methods, and fast learning in large data sets. IEEE Trans Cybern 44(1):1–20
Waniek M, Michalak TP, Rahwan T, Wooldridge M (2017) On the construction of covert networks. In: Proceedings of the 16th conference on autonomous agents and multiagent systems. pp 1341–1349
Xie Z, Yan J (2008) Kernel density estimation of traffic accidents in a network space. Comput Environ Urban Syst 32(5):396–406
Yang Y, Nishikawa T, Motter AE (2017) Small vulnerable sets determine large network cascades in power grids. Science 358(6365):eaan3184
Zhang J, Xu X, Li P, Zhang K, Small M (2011) Node importance for dynamical process on networks: a multiscale characterization. Chaos Interdiscip J Nonlinear Sci 21:016107
Zhang H, Zhong S, Deng Y, Cheong KH (2021) LFIC: identifying influential nodes in complex networks by local fuzzy information centrality. IEEE Trans Fuzzy Syst 30(8):3284–3296
Acknowledgements
This work has been supported by the Natural Science Foundation of China under Grant No. 61972066 and the Dalian Young Science and Technology Talent Support Program No. 2023RQ056. Thanks to Prof. Pasquale De Meo from University of Messina for providing the source codes of Potential Gain. Thanks to Dr. Marcin Waniek from New York University Abu Dhabi for providing the source codes of the “captain network”.
Author information
Authors and Affiliations
Corresponding author
Additional information
Responsible editor: Evangelos Papalexakis.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, Y., Feng, X., Lou, J. et al. Central node identification via weighted kernel density estimation. Data Min Knowl Disc 38, 1417–1439 (2024). https://doi.org/10.1007/s10618-024-01003-4
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10618-024-01003-4