[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ Skip to main content
Log in

UACD: A Local Approach for Identifying the Most Influential Spreaders in Twitter in a Distributed Environment

  • Original Article
  • Published:
Social Network Analysis and Mining Aims and scope Submit manuscript

Abstract

Efficient spreading of important information through social media can be highly beneficial, while quick spreading of false content is alarming. Finding the users who are the most influential at information spreading can help develop efficient strategies. However, with the increasing growth of gigantic social networks, existing methods either lack accuracy or have high latency, sometimes being infeasible within limited memory. In this study, we find that rich user-specific information can guide us toward designing more effective methods. We propose UACD, a novel method for identifying the most influential spreaders on the Twitter social network by combining both user-specific and topological information. We provide a distributed implementation of our proposed algorithm on the Amazon EC2 and compare our ranking result with the state-of-the-art methods. Results suggest that UACD is scalable and can process a very large network while being on average \(\mathbf {12.5}\%\) more accurate and \(\mathbf {175}{\times }\) faster.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
£29.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (United Kingdom)

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6

Similar content being viewed by others

References

  • Agichtein E, Castillo C, Donato D, Gionis A, Mishne G (2008) Finding high-quality content in social media. In: Proceedings of the 2008 international conference on web search and data mining. ACM, pp 183–194

  • Ahajjam S, Badir H (2018) Identification of influential spreaders in complex networks using hybridrank algorithm. Sci Rep 8(1):11932

    Google Scholar 

  • Akoglu H (2018) User’s guide to correlation coefficients. Turk J Emerg Med 18(3):91–93

    Google Scholar 

  • Alvarez-Hamelin JI, Dall’Asta L, Barrat A, Vespignani A (2005) k-core decomposition: a tool for the visualization of large scale networks. arxiv:cs/0504107

  • Alvarez-Hamelin JI, Dall’Asta L, Barrat A, Vespignani A (2006) Large scale networks fingerprinting and visualization using the k-core decomposition. In: Advances in neural information processing systems, pp 41–50

  • Amazon EC (2006) Amazon. See https://aws.amazon.com/ec2/. Accessed 15 June 2018

  • Anderson RM, May RM, Anderson B (1992) Infectious diseases of humans: dynamics and control, vol 28. Wiley Online Library

  • Bakshy E, Rosenn I, Marlow C, Adamic L (2012) The role of social networks in information diffusion. In: Proceedings of the 21st international conference on World Wide Web. ACM, pp 519–528

  • Batagelj V, Zaveršnik M (2011) Fast algorithms for determining (generalized) core groups in social networks. Adv Data Anal Classif 5(2):129–145

    MathSciNet  MATH  Google Scholar 

  • Bayer JB, Ellison NB, Schoenebeck SY, Falk EB (2016) Sharing the small moments: ephemeral social interaction on snapchat. Inf Commun Soc 19(7):956–977

    Google Scholar 

  • Benevenuto F, Magno G, Rodrigues T, Almeida V (2010) Detecting spammers on twitter. In: Collaboration, electronic messaging, anti-abuse and spam conference (CEAS), vol 6, pp 12

  • Bhatia V, Rani R (2017) A parallel fuzzy clustering algorithm for large graphs using Pregel. Expert Syst Appl 78:135–144

    Google Scholar 

  • Borgatti SP (1995) Centrality and aids. Connections 18(1):112–114

    Google Scholar 

  • Brandes U (2001) A faster algorithm for betweenness centrality. J Math Sociol 25(2):163–177

    MATH  Google Scholar 

  • Burgess JE (2011) Youtube. Oxford Bibliographies Online

  • Carmi S, Havlin S, Kirkpatrick S, Shavitt Y, Shir E (2007) A model of internet topology using k-shell decomposition. Proc Natl Acad Sci 104(27):11150–11154

    Google Scholar 

  • Castillo C, Mendoza M, Poblete B (2011) Information credibility on twitter. In: Proceedings of the 20th international conference on World wide web, pp 675–684

  • Cataldi M, Di Caro L, Schifanella C (2010) Emerging topic detection on twitter based on temporal and social terms evaluation. In: Proceedings of the tenth international workshop on multimedia data mining. ACM, p 4

  • Chan HK, Wang X, Lacka E, Zhang M (2016) A mixed-method approach to extracting the value of social media data. Prod Oper Manag 25(3):568–583

    Google Scholar 

  • Chen W, Lakshmanan LVS, Castillo C (2013) Information and influence propagation in social networks. Synth Lect Data Manag 5(4):1–177

    Google Scholar 

  • Chen X, Vorvoreanu M, Madhavan K (2014) Mining social media data for understanding students’ learning experiences. IEEE Trans Learn Technol 7(3):246–259

    Google Scholar 

  • Cohen R, Havlin S, Avraham D (2003) Efficient immunization strategies for computer networks and populations. Phys Rev Lett 91:247901

    Google Scholar 

  • Data Science Bootcamp. Understand Jaccard index, Jaccard similarity in minutes. Online; Accessed 17 Oct 2020

  • Dean J, Ghemawat S (2008) Mapreduce: simplified data processing on large clusters. Commun ACM 51(1):107–113

    Google Scholar 

  • Disney A (2020) Social network analysis 101: centrality measures explained. Online; Accessed 17 Oct 2020

  • Doerr B, Fouz M, Friedrich T (2012) Why rumors spread so quickly in social networks. Commun ACM 55(6):70–75

    Google Scholar 

  • Dorogovtsev SN, Goltsev AV, Mendes JFF (2006) K-core organization of complex networks. Phys Rev Lett 96(4):040601

    Google Scholar 

  • Driss OB, Mellouli S, Trabelsi Z (2019) From citizens to government policy-makers: social media data analysis. Gov Inf Q 36(3):560–570

    Google Scholar 

  • Farahat A, LoFaro T, Miller JC, Rae G, Ward LA (2006) Authority rankings from hits, pagerank, and salsa: existence, uniqueness, and effect of initialization. SIAM J Sci Comput 27(4):1181–1201

    MathSciNet  MATH  Google Scholar 

  • Freeman LC (1977) A set of measures of centrality based on betweenness. Sociometry 40:35–41

    Google Scholar 

  • Fu Y-H, Huang C-Y, Sun C-T (2015a) Identifying super-spreader nodes in complex networks. Math Probl Eng 2015: 0-0

  • Fu K, Nune R, Tao JX (2015b) Social media data analysis for traffic incident detection and management. Technical report

  • Garton L, Haythornthwaite C, Wellman B (1997) Studying online social networks. J Comput Mediat Commun 3(1):JCMC313

    Google Scholar 

  • Go A, Bhayani R, Huang L (2009) Twitter sentiment classification using distant supervision. CS224N project report, Stanford, vol 1, no 12

  • Guille A, Hacid H, Favre C, Zighed DA (2013) Information diffusion in online social networks: a survey. ACM SIGMOD Rec 42(2):17–28

    Google Scholar 

  • Hagberg A, Swart P, Schult D (2019) NetworkX: a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. Online; Accessed 07 Dec 2019

  • Han M, Daudjee K, Ammar K, Özsu MT, Wang X, Jin T (2014) An experimental comparison of Pregel-like graph processing systems. Proc VLDB Endow 7(12):1047–1058

    Google Scholar 

  • Heesterbeek JAP (2000) Mathematical epidemiology of infectious diseases: model building, analysis and interpretation, vol 5. Wiley, New York

    MATH  Google Scholar 

  • Hodas NO, Lerman K (2014) The simple rules of social contagion. Sci Rep 4:4343

    Google Scholar 

  • Hopcroft J, Lou T, Tang J (2011) Who will follow you back?: reciprocal relationship prediction. In: Proceedings of the 20th ACM international conference on information and knowledge management. ACM, pp 1137–1146

  • Hu Y, Manikonda L, Kambhampati S (2014) What we Instagram: a first analysis of Instagram photo content and user types. In: Proceedings of the international AAAI conference on web and social media, vol 8

  • Iyer KV et al (2009) All-pairs shortest-paths problem for unweighted graphs in o (n2 log n) time. Int J Comput Inf Eng 3(2):320–326

    Google Scholar 

  • Järvelin K, Kekäläinen J (2002) Cumulated gain-based evaluation of IR techniques. ACM Trans Inf Syst (TOIS) 20(4):422–446

    Google Scholar 

  • Keeling MJ, Rohani P (2008) Modeling infectious diseases in humans and animals. Princeton University Press, Princeton

    MATH  Google Scholar 

  • Khaouid W, Barsky M, Srinivasan V, Thomo A (2015) K-core decomposition of large networks on a single pc. Proc VLDB Endow 9(1):13–23

    Google Scholar 

  • Kitsak M, Gallos LK, Havlin S, Liljeros F, Muchnik L, Stanley HE, Makse HA (2010) Identification of influential spreaders in complex networks. Nat Phys 6(11):888–893

    Google Scholar 

  • Klemm K, Serrano MÁ, Eguíluz VM, Miguel MS (2012) A measure of individual role in collective dynamics. Sci Rep 2:292

    Google Scholar 

  • Kwak H, Lee C, Park H, Moon S (2010) What is Twitter, a social network or a news media? In: WWW ’10: proceedings of the 19th international conference on world wide web. ACM, New York, pp 591–600

  • Leskovec J, Adamic LA, Huberman BA (2007) The dynamics of viral marketing. ACM Trans Web 1(1):5-es

  • Li Q, Zhou T, Lü L, Chen D (2014) Identifying influential spreaders by weighted leaderrank. Physica A 404:47–55

    MathSciNet  MATH  Google Scholar 

  • Liu Y, Wu Y-FB (2018) Early detection of fake news on social media through propagation path classification with recurrent and convolutional networks. In: Thirty-second AAAI conference on artificial intelligence

  • Liu Y, Tang M, Zhou T, Do Y (2015) Improving the accuracy of the k-shell method by removing redundant links: from a perspective of spreading dynamics. Sci Rep 5:13172

    Google Scholar 

  • Liu J-G, Lin J-H, Guo Q, Zhou T (2016) Locating influential nodes via dynamics-sensitive centrality. Sci Rep 6:21380

    Google Scholar 

  • Lou T, Tang J, Hopcroft J, Fang Z, Ding X (2013) Learning to predict reciprocity and triadic closure in social networks. ACM Trans Knowl Discov from Data (TKDD) 7(2):5

    Google Scholar 

  • Lü L, Zhang Y-C, Yeung CH, Zhou T (2011) Leaders in social networks, the delicious case. PLoS ONE 6(6):e21202

    Google Scholar 

  • Lü L, Medo M, Yeung CH, Zhang Y-C, Zhang Z-K, Zhou T (2012) Recommender systems. Phys Rep 519(1):1–49

    Google Scholar 

  • Mahajan V (2010) Innovation diffusion. In: Wiley international encyclopedia of marketing (Part 1. Marketing Strategy). Wiley Online Library

  • Makice K (2009) Twitter API: up and running: learn how to build applications with the Twitter API. O’Reilly Media, Inc., New York

    Google Scholar 

  • Malewicz G, Austern MH, Bik AJC, Dehnert JC, Horn I, Leiser N, Czajkowski G (2010) Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD international conference on management of data. ACM, pp 135–146

  • Martella C (2012) Apache giraph: distributed graph processing in the cloud

  • Martella C, Shaposhnik R, Logothetis D, Harenberg S (2015) Practical graph analytics with Apache Giraph. Springer, Berlin

    Google Scholar 

  • Martin N (2020) How social media has changed how we consume news. Online; Accessed 15 June 2020

  • Massie ML, Chun BN, Culler DE (2004) The ganglia distributed monitoring system: design, implementation, and experience. Parallel Comput 30(7):817–840

    Google Scholar 

  • Miller JC, Ting T (2020) Eon (epidemics on networks): a fast, flexible python package for simulation, analytic approximation, and analysis of epidemics on networks. arXiv:2001.02436

  • Montresor A, De Pellegrini F, Miorandi D (2013) Distributed k-core decomposition. IEEE Trans Parallel Distrib Syst 24(2):288–300

    Google Scholar 

  • Nadkarni A, Hofmann SG (2012) Why do people use Facebook? Personal Individ Differ 52(3):243–249

    Google Scholar 

  • Newman MEJ (2005) A measure of betweenness centrality based on random walks. Soc Netw 27(1):39–54

    Google Scholar 

  • Newman MEJ (2010) Networks: an introduction. Oxford University Press, Oxford

    MATH  Google Scholar 

  • Newman MEJ, Watts DJ, Strogatz SH (2002) Random graph models of social networks. Proc Natl Acad Sci 99(suppl 1):2566–2572

    MATH  Google Scholar 

  • Noether GE (1981) Why Kendall tau? Teach Stat 3(2):41–43

    Google Scholar 

  • Okamoto K, Chen W, Li X-Y (2008) Ranking of closeness centrality for large-scale social networks. In: International workshop on frontiers in algorithmics. Springer, pp 186–195

  • Opsahl T, Agneessens F, Skvoretz J (2010) Node centrality in weighted networks: generalizing degree and shortest paths. Soc Netw 32(3):245–251

    Google Scholar 

  • Page L, Brin S, Motwani R, Winograd T (1999) The pagerank citation ranking: bringing order to the web. Technical report 1999-66, Stanford InfoLab, November 1999. Previous number = SIDL-WP-1999-0120

  • Pal A, Counts S (2011) Identifying topical authorities in microblogs. In: Proceedings of the fourth ACM international conference on web search and data mining. ACM, pp 45–54

  • Pastor-Satorras R, Vespignani A (2001) Epidemic spreading in scale-free networks. Phys Rev Lett 86(14):3200

    Google Scholar 

  • Pastor-Satorras R, Vespignani A (2002) Immunization of complex networks. Phys Rev E 65(3):036104

    Google Scholar 

  • Romero DM, Galuba W, Asur S, Huberman BA (2011) Influence and passivity in social media. In: Joint European conference on machine learning and knowledge discovery in databases. Springer, pp 18–33

  • Sabidussi G (1966) The centrality index of a graph. Psychometrika 31(4):581–603

    MathSciNet  MATH  Google Scholar 

  • Seidman SB (1983) Network structure and minimum degree. Soc Netw 5(3):269–287

    MathSciNet  Google Scholar 

  • Shang S, Hwang K (1995) Distributed hardwired barrier synchronization for scalable multiprocessor clusters. IEEE Trans Parallel Distrib Syst 6(6):591–605

    Google Scholar 

  • Shvachko K, Kuang H, Radia S, Chansler R (2010) The Hadoop distributed file system. In: 2010 IEEE 26th symposium on mass storage systems and technologies (MSST). IEEE, pp 1–10

  • Smith D, Moore L, et al (2004) The SIR model for spread of disease: the differential equation model. Loci. (originally Convergence.) https://www.maa.org/press/periodicals/loci/joma/the-sir-model-for-spread-of-disease-the-differential-equation-model

  • Thomo A, Liu F (2017) Computation of k-core decomposition on giraph. arXiv:1705.03603

  • Twitter Developer Team. Twitter API. Online; Accessed 1 Mar 2021

  • Wang Z, Zhao Y, Xi J, Changjiang D (2016) Fast ranking influential nodes in complex networks using a k-shell iteration factor. Physica A 461:171–181

    Google Scholar 

  • Weng J, Lim E-P, Jiang J, He Q (2010) Twitterrank: finding topic-sensitive influential twitterers. In: Proceedings of the third ACM international conference on web search and data mining. ACM, pp 261–270

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Muhammad Abdullah Adnan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A Technical Background

Appendix A Technical Background

In recent years, the number of users of social network sites have been increased by a large scale. People now can publish their statuses and get connected with other users which we can refer to as social relationship (Garton et al. 1997). As we have already discussed in the earlier section, we can formally represent an online social network using a graph, where each user is represented by a node of the graph and the social relationships among them are represented by edges (directed or undirected) (Newman et al. 2002). In this graph, the nodes play an important role to disseminate information. This knowledge of node spreadability is very significant while it comes to develop an efficient method of accelerating the spreading in the case of information diffusion (Mahajan 2010) as well as decelerate it in the case of diseases (Heesterbeek 2000; Keeling and Rohani 2008). Therefore, in recent years, the microscopic study of spreadability for each node has caught attention of the researchers. Moreover, it can be very useful in case of finding out the initial spreaders of a contagious disease (Anderson et al. 1992) or any information (Fu et al. 2015a). In this section, we define some terminologies related to our work and then we provide a brief description of the existing techniques of finding the most influential spreaders in a network. Finally, we discuss the SIR model (Smith and Moore 2004) which we use to evaluate the merit of our proposed technique.

1.1 Centrality measurement

In graph theory, centrality is a term to describe the importance of an individual vertex within a graph or a network. Centrality measures (Disney 2020) were initially developed for social network analysis since it helps answering the question, “Which vertices are the most influential in a graph?” The popular centrality measures for such analysis can be divided into two types: local centrality measures and global centrality measures, briefly described below.

1.1.1 Local centrality measures

To compute the centrality of a node, local measures generally use the information of a node and its neighborhood only. The number of neighbors (degree of a node) plays the main role in such local measures and they are more suitable for networks modeled as undirected graphs. The two most popular local centrality measures are described below:

(i) Degree centrality Among all the centrality measures, Degree centrality (Disney 2020) is the simplest one. It assumes the nodes with maximum number of neighbors to be the most influential in the network.

If, G(VE) is a graph with the set of vertices, V and the set of edges, E, then the degree of a vertex v can be denoted by \(d_v\), and is defined as the total number of edges incident to it (i.e., the number of neighbors of v). Now, if the number of vertices, |V| is n, then the degree centrality of node v is given as follows:

$$\begin{aligned} DC(v) = \dfrac{d_v}{(n-1)} \end{aligned}$$
(13)

Here, \((n-1)\) is used to normalize the value of degree centrality within 0 and 1. The most important reason for using degree centrality for finding the most influential spreaders is its simplicity and low computational complexity. However, this measure typically fails to identify the most influential spreaders accurately. Despite this, there are several cases where degree centrality can provide surprisingly good performance. For example, if the spreading rate is very small, degree centrality is reported to perform better in finding the spreadability of nodes than other well-known centrality measures (Klemm et al. 2012; Liu et al. 2016).

(ii) K-core decomposition In case of degree centrality, the number of neighbors is solely responsible in determining the influence of any node in the network. Later Kitsak et al. (2010) determined that the location of the nodes in the network can impact more significantly in their spreadability. They identified that nodes located at the center of the network possess higher probability to be the most significant spreaders in a network than the nodes located at the perimeter of the network. To sum up, they suggested that the core number of a node should be considered as a more suitable measure in order to identifying the most influential spreaders of the network, and this core number can be determined by the k-core decomposition (Dorogovtsev et al. 2006; Alvarez-Hamelin et al. 2005) of the network.

At the very beginning of the k-core decomposition method, all the nodes with degree 1 are removed. This process is recursively continued until no node with degree 1 is remaining in the network. These removed nodes are grouped together to form 1-core. After that, in a similar way, all the nodes with residual degree 2 are removed recursively until no node with degree 2 is remaining in the network. All these removed node are then grouped together to form 2-core. This process is continued until all the nodes are assigned to some core group. In this conventional k-core decomposition method, nodes having the largest core number are considered to be located at the center of the network and they are assumed to have the most spreadability. Figure 7 presents a simple network and the core number for the nodes.

Fig. 7
figure 7

A simple network with core numbers (k) of the nodes

The drawback of k-core decomposition method is that it has a tendency to assign the same core number (k value) to multiple nodes in case of large networks. Therefore, we may end up having a huge number of nodes as the most influential spreaders, which may not be a desired outcome in many cases. However, the simplicity and lower computational complexity make this measure very useful. We suggest that incorporating user-specific information while computing the core decomposition helps to find the influential spreaders more accurately. Based on this, we propose User Attributed Core Decomposition (UACD) method and show that our proposed method significantly improves the accuracy without incurring noticeable computational overhead.

1.1.2 Global centrality measures

Global measures consider the whole network topology while computing the centrality of the nodes. Some of the most popular global centrality measures are briefly described below:

(i) Closeness centrality Closeness centrality (Disney 2020) is a measurement which identifies the closeness of a node from all the other nodes in the network. If the network can be represented by a connected graph, then the normalized version of closeness centrality of any node u of the graph is calculated as the average of all the shortest paths between u and all other nodes of the network.

Let \(l_{ij}\) be the length of shortest path between any two nodes (\(v_i\), \(v_j\)), and n be the number of nodes in the network, then the average shortest distance of node \(v_i\) from all other nodes, i.e., closeness centrality can be defined as (Sabidussi 1966),

$$\begin{aligned} L_{v_i} = \dfrac{1}{n-1} \sum _{i \ne j}l_{ij} \end{aligned}$$
(14)

The closeness centrality of node \(v_i\) can be thought as the inverse of the average shortest distance, \(L_i\) and defined as follows:

$$\begin{aligned} CC(v_i) = \dfrac{1}{L_{v_i}} = \dfrac{n-1}{\sum _{i \ne j}l_{ij}} \end{aligned}$$
(15)

However, this equation will not be suitable to use for a disconnected graph where some nodes may be unreachable from a node \(v_i\). For graphs with multiple connected components, Wasserman and Faust (Opsahl et al. 2010) revise the definition of closeness centrality. The closeness centrality of node \(v_i\) is now measured as the ratio of “the fraction of the nodes in the network which are reachable from \(v_i\), ” to the “average shortest distance of \(v_i\) from the reachable nodes.” Let \(n_i\) be the number of reachable nodes in the network from node \(v_i\), then the modified formula of measuring closeness centrality is given as follows:

$$\begin{aligned} CC(v_i) = \dfrac{n_i-1}{n-1} \dfrac{n_i-1}{\sum _{i \ne j, v_j \text { is reachable from } v_i} (l_{ij})} \end{aligned}$$
(16)

(ii) Betweenness centrality Betweenness centrality (Disney 2020) determines how many times a node falls along the shortest path between two different nodes (i.e., acts as a bridge between those two nodes). Freeman (1977) introduces this measure for quantifying the control of a user on the communication between other users in a social network.

Let \(v_s\), \(v_f\), and \(v_i\) are three different nodes in a network represented by G(VE). Now we define \(n_{sf}^i = 1\) if node \(v_i\) lies on the shortest path between \(v_s\) and \(v_f\), and 0 otherwise. The betweenness centrality of node \(v_i\) is defined as:

$$\begin{aligned} BC(v_i) = \sum _{(s, f) \in V} n_{sf}^i \end{aligned}$$
(17)

However, there can be more than one shortest path between \(v_s\) and \(v_f\) and that may count a node for centrality measure more than once. For this reason, if total number of shortest paths between \(v_s\) and \(v_f\) is \(g_{sf}\), the equation for finding betweenness centrality of node \(v_i\) is updated as follows:

$$\begin{aligned} BC(v_i) = \sum _{(s, f) \in V} \dfrac{n_{sf}^i}{g_{sf}} \end{aligned}$$
(18)

(iii) Eigenvector centrality Eigenvector centrality (Disney 2020) computes the centrality of a node in a network based on the centrality of its neighbors. The weights of the neighbors are assigned in such a way that a high scoring neighbor contributes more to the centrality of the node. Assume that, a graph G(VE) is represented by an adjacency matrix \(A = \{a_{ij}\}\), where \(\{a_{ij}\} = 1\) if nodes \(v_i\) and \(v_j\) are neighbors and 0 otherwise.

If \(\lambda\) is the eigenvalue of graph G, the eigenvector centrality for node \(v_i\) is the \(i^{th}\) element of the vector \(\mathbf {x}\) defined by the equation:

$$\begin{aligned} A\mathbf {x} = \lambda \mathbf {x} \end{aligned}$$
(19)

There can be multiple eigenvalues, i.e., multiple values of \(\lambda\) and for which multiple solutions can be found. However, for the largest eigenvalue of the adjacency matrix A, according to the Perron–Frobenius theorem, there exists a unique solution of x which contains all positive entries (Newman 2010).

1.2 Susceptible-infected-recovered (SIR) model

The SIR model (Smith and Moore 2004) is one of the widely used models in epidemiology. In an SIR model, a node in the network can be at one of the three states: Susceptible, Infected, and Recovered. For better understanding, we explain these states using “people” instead of “nodes.”

  • S (Susceptible): The group of people who have not been infected with the disease yet. Additionally, they are not immune to the disease, and therefore, they are under the threat of being infected in the future.

  • I (Infected): The group of people who have already been infected with the disease. Moreover, they can transmit the disease to the susceptible neighbors with a probability of \(\beta\).

  • R (Recovered): The group of people who have either recovered from the disease or dead. The recovered people are immune to the disease and no longer can transmit the disease to the susceptible neighbors.

The SIR model is capable of finding the spreadability of a node accurately by considering the node as the only infected node at the beginning, running the simulation multiple times, and considering the average. If the network contains n nodes, one needs to simulate the entire network n times (with multiple runs) to find the spreadability of all the nodes. Although this is the most accurate method of finding the most influential spreaders, it is not realizable for a large network due to its high computational cost.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Adnan, T.M.T., Islam, M.S., Papon, T.I. et al. UACD: A Local Approach for Identifying the Most Influential Spreaders in Twitter in a Distributed Environment. Soc. Netw. Anal. Min. 12, 37 (2022). https://doi.org/10.1007/s13278-022-00862-3

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s13278-022-00862-3

Keywords

Navigation