Identifying Influential Nodes in Social Networks: Exploiting Self-Voting Mechanism

Abstract

The influence maximization (IM) problem is defined as identifying a group of influential nodes in a network such that these nodes can affect as many nodes as possible. Due to its great significance in viral marketing, disease control, social recommendation, and so on, considerable efforts have been devoted to the development of methods to solve the IM problem. In the literature, VoteRank and its improved algorithms have been proposed to select influential nodes based on voting approaches. However, in the voting process of these algorithms, a node cannot vote for itself. We argue that this voting schema runs counter to many real scenarios. To address this issue, we designed the VoteRank* algorithm, in which we first introduce the self-voting mechanism into the voting process. In addition, we also take into consideration the diversities of nodes. More explicitly, we measure the voting ability of nodes and the amount of a node voting for its neighbors based on the H-index of nodes. The effectiveness of the proposed algorithm is experimentally verified on 12 benchmark networks. The results demonstrate that VoteRank* is superior to the baseline methods in most cases.

Introduction

In the past two decades, the rapid development of information technology has brought huge convenience to individuals. More and more people participate in various social networks, such as Facebook, Twitter, and WeChat, on which a large volume of information is exchanged every day.^1,2 The boom of social networks has facilitated research on network analysis and mining. Among the fundamental tasks of network analysis and mining, influence maximization (IM)^3,4 has attracted considerable attention from researchers due to its immense potential in a wide range of applications such as viral marketing,⁵ disease analysis,⁶ rumor control,⁷ and social computing.⁸ The IM problem is to select a group of influential nodes as seed nodes that disseminate information on a network so that their influence spreads as broadly as possible.^9,10 Accordingly, the key issue in solving the IM problem is how to identify the influential nodes.^11,12

To address the issue, a large number of algorithms have been proposed in the literature. Kempe et al.¹¹ proved that obtaining the optimal set of seed nodes in the IM problem is NP-hard (nondeterministic polynomial-time hard) and proposed a greedy algorithm to solve the problem. However, their greedy algorithm is extremely inefficient in large-scale networks. Afterward, Leskovec et al.¹³ developed the Cost-Effective Lazy Forward selection (CELF) algorithm to reduce the computational cost of the greedy algorithm using a lazy forward strategy. By the same token, Goyal et al.¹⁴ proposed the CELF++ algorithm based on the submodular property of the objective function. In comparison with CELF, CELF++ further improves time efficiency since it reduces the unnecessary checking of candidates. Although these algorithms are able to find nodes with high propagation quality, they are very time-consuming owing to the requirement of Monte Carlo simulation.

To reduce this cost, heuristic algorithms have been designed to solve the IM problem. Degree centrality considers the degree of a node when determining its importance.¹⁵ Eigenvector centrality holds the view that a node is important if its neighbors are important.¹⁶ H-index is often used to evaluate the influence and reputation of journals or scholars. Lü et al.¹⁷ first extended H-index to networks to evaluate the influence of nodes. By mixing the degree and core number of a node as well as the diversity of its neighbors, Sheikhahmadi and Nematbakhsh² proposed the mixed core, degree and entropy (MCDE) algorithm to calculate the importance of the node. The cluster rank¹⁸ measures the influence of a node according to its local clustering coefficient. The idea behind cluster rank is that neighbors with high clustering coefficients can shorten the range of information dissemination.

Inspired by the cluster rank algorithm, Zareie et al.⁴¹ presented an Extended Cluster Coefficient Ranking Measure (ECRM), in which they further take into account the common hierarchy between a node and its neighbors. Wang et al.¹⁹ proposed an improved K-shell²⁰ method, that is, IKS, by considering the information entropy of nodes. This method chooses influential nodes from not only the higher shells but also the lower shells, since nodes in a lower shell may sometimes be more influential than nodes in a higher shell.² Zhang et al.²¹ came up with a vote-based method, called VoteRank, to find the influential spreaders in a network. The VoteRank method selects the spreaders according to the voting scores that nodes obtain from their neighbors. The WVoteRank approach²² extended VoteRank by considering both node degree and edge weight in the voting process.

WVoteRank can be applied to both unweighted and weighted networks. Moreover, Kumar and Panda²³ improved the WVoteRank approach by extending the scope of the neighborhood from one-hop to two-hop. The EnRenew algorithm²⁴ leverages the information entropy of nodes when identifying influential spreaders and thereby outperforms VoteRank in terms of the final affected scale. In our previous work,²⁵ we developed the VoteRank⁺⁺ approach, which solves some issues in VoteRank by exploiting the diversities of nodes in the voting process.

Among the above algorithms, VoteRank²¹ and its variants^22–25 addressed the IM problem efficiently and effectively to some extent. However, in these algorithms, the case of self-voting is not considered in the voting process. In other words, when choosing influential nodes, a node cannot vote for itself in VoteRank and its variants. We argue that, when electing the influential nodes, self-voting is significant for the following two reasons: (1) in many real-world scenarios, one can certainly vote for oneself; (2) self-voting helps to find the influential nodes more accurately. In VoteRank, even the most influential node cannot vote for itself; only its neighbors can choose it as the influential node.

In this regard, we design an improved VoteRank algorithm, named VoteRank*, to find influential nodes by exploiting the case of self-voting. Besides, we also take into account the diversities of nodes in VoteRank*. To be specific, we compute the voting ability of a node and the proportion of its vote that it gives to others based on the H-index values of nodes. To verify the performance of the VoteRank* algorithm, we conduct extensive experiments on 12 networks. Experimental results show that VoteRank* can enhance the spreading speed and ability compared with the state-of-the-art approaches.

The remainder of this article is organized as follows. We present the proposed method in The Proposed Method: VoteRank* section. In the Experiments section, we show and analyze the experimental results. Finally, the Conclusions section concludes this work.

The Proposed Method: VoteRank*

In this section, we describe our proposed VoteRank* algorithm. VoteRank* improves VoteRank²¹ and its successors^22–26 by mainly introducing the self-voting mechanism into the voting process when selecting influential nodes. In addition, VoteRank* also considers the difference of nodes in their voting ability and the different voting proportion between nodes. The VoteRank* algorithm is composed of the following steps:

(1) Initialize the voting ability of each node;

(2) Compute the voting proportion that each node votes for itself and its neighbors;

(3) Calculate the voting score of a node obtained from itself and its neighbors;

(4) Choose the node that gets the highest voting score as an influential node;

(5) Suppress the voting ability of the selected influential node and its one-hop and two-hop neighbors;

(6) Repeat steps 3 to 5 until enough influential nodes are selected.

The detailed procedure of VoteRank* is shown in Algorithm 1. In what follows, the key components of the proposed algorithm are explained.

Initializing voting ability

To vote for the influential spreaders, we need to initialize a value for the voting ability of each node. In VoteRank, each node is given an initial voting ability of 1. However, this initialization is unreasonable. Since nodes in a network can differ in many aspects such as position, function, and role, their voting ability should also be diverse. To be specific, in this article we initialize the voting ability of node v according to the following equation:

$va_{v} = \log \left(1 + \frac{h_{v}}{h_{max}}\right),$ (1)

where $h_{v}$ is the H-index value of node v and $h_{max}$ is the maximal H-index value of all nodes in the network. To calculate the H-index value of node v, we take into account not only the degree of v but also the degrees of its neighbors. Consequently, compared with degree, H-index can measure the significance of a node more accurately.
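As a rough illustration of Equation (1), the sketch below computes the H-index of each node from the degrees of its neighbors and converts it into an initial voting ability. The adjacency-dictionary representation, the function names `h_index` and `initial_voting_ability`, and the toy graph are assumptions made for this example. The natural logarithm is also an assumption, since the text does not state the base of the logarithm.

```python
import math


def h_index(graph, v):
    """Largest h such that v has at least h neighbors with degree >= h."""
    degrees = sorted((len(graph[u]) for u in graph[v]), reverse=True)
    h = 0
    for rank, degree in enumerate(degrees, start=1):
        if degree >= rank:
            h = rank
        else:
            break
    return h


def initial_voting_ability(graph):
    """Equation (1): va_v = log(1 + h_v / h_max), natural log assumed."""
    h = {v: h_index(graph, v) for v in graph}
    h_max = max(h.values())
    if h_max == 0:  # graph without edges: nobody has any voting ability
        return {v: 0.0 for v in graph}
    return {v: math.log(1 + h[v] / h_max) for v in graph}


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
va = initial_voting_ability(graph)
```

In this toy graph the nodes a, b, and c share the maximal H-index, so they receive the largest initial voting ability, whereas the pendant node d receives a smaller one.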

Computing voting scores

In the voting process of VoteRank-like algorithms, a node can only vote for its neighbors. Even the most important node cannot vote for itself. Actually, on many real voting occasions, candidates can vote for themselves. Therefore, the voting mechanisms in those algorithms do not conform to common sense and are not conducive to the identification of influential spreaders. To overcome this drawback, the VoteRank* algorithm introduces a self-voting mechanism in the voting process for the first time. That is, in VoteRank*, a node can vote for itself as well as for its neighbors. Based on self-voting and H-index, we define the voting proportion that node u votes for neighbor v, denoted as $p_{u v}$ , in the following equation:

$p_{uv} = \frac{h_{v}}{\sum_{w \in Γ_{u}} h_{w}},$ (2)

where $Γ_{u}$ is the set of nodes including u and all of its one-hop neighbors. Here, H-index values of nodes are used to measure the importance of nodes. Given two nodes $x, y \in Γ_{u}$, according to Equation (2), if $h_{x} > h_{y}$, u gives a larger share of its vote to x than to y.
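The following sketch shows one way Equation (2) could be evaluated for a single voter u. The function name `voting_proportions`, the adjacency dictionary, and the H-index values in `h` are hypothetical and chosen only for illustration.

```python
def voting_proportions(graph, h, u):
    """Equation (2): share of u's vote given to each v in Γ_u (u plus its neighbors)."""
    closed_neighborhood = set(graph[u]) | {u}
    total = sum(h[w] for w in closed_neighborhood)
    if total == 0:  # every H-index is zero: u has nothing to distribute
        return {v: 0.0 for v in closed_neighborhood}
    return {v: h[v] / total for v in closed_neighborhood}


# Hypothetical toy graph and H-index values.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
h = {"a": 2, "b": 2, "c": 2, "d": 1}

p_d = voting_proportions(graph, h, "d")  # d votes for itself and for a
p_a = voting_proportions(graph, h, "a")  # a votes for itself, b, c, and d
```

Because the denominator runs over the whole set Γ_u, the proportions of each voter sum to 1, and the voter itself always receives a share through self-voting.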

In the voting process, a node gets votes from itself and its neighbors. By aggregating those votes, we can compute the voting score of the node. Formally, given node v, its voting score is calculated as follows:

In Equation (3), $p_{uv} \cdot va_{u}$ is the amount of vote that v gets from u. This equation takes into consideration both the sum of votes that v obtains and the number of v's neighbors. As mentioned in the study by Sun et al.,²² both of them contribute to the voting score of node v.

Once all nodes have obtained their voting scores, the node with the largest score is regarded as the influential spreader identified in this iteration (line 10 in Algorithm 1).
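The vote-aggregation part of this step can be sketched as follows. The code only accumulates the term $p_{uv} \cdot va_{u}$ over all voters in Γ_v; the way Equation (3) combines this sum with the number of neighbors is not reproduced here, so the result is the vote sum rather than the full voting score. The function name `received_votes`, the toy graph, and the values of `h` and `va` are assumptions for illustration.

```python
def received_votes(graph, h, va):
    """For each node v, sum p_uv * va_u over every voter u in Γ_v.

    Note: this is only the vote-sum term; the neighbor-count term of the
    voting score is deliberately left out of this sketch.
    """
    votes = {v: 0.0 for v in graph}
    for u in graph:
        closed_neighborhood = set(graph[u]) | {u}
        total_h = sum(h[w] for w in closed_neighborhood)
        if total_h == 0:
            continue
        for v in closed_neighborhood:
            votes[v] += (h[v] / total_h) * va[u]  # p_uv * va_u
    return votes


# Hypothetical toy graph, H-index values, and voting abilities.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
h = {"a": 2, "b": 2, "c": 2, "d": 1}
va = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}

votes = received_votes(graph, h, va)
top_node = max(votes, key=votes.get)  # node with the largest vote sum
```

Since every voter distributes exactly its own voting ability, the votes received over all nodes add up to the total voting ability in the network, which is a convenient sanity check.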

Updating voting ability and voting scores

VoteRank* iteratively chooses the influential spreaders. After a spreader is selected, VoteRank* starts a new iteration to choose the next spreader if not enough nodes have been selected (lines $11 \sim 26$ in Algorithm 1). Suppose v is the node selected in the last iteration; it will no longer participate in the subsequent voting rounds. Meanwhile, to ensure that the selected spreaders are broadly distributed in the network, we reduce the selection probability of nodes that are close to v. To this end, we weaken the voting ability of these nodes. In VoteRank,²¹ only the nodes that voted for v are suppressed. However, this suppression range is not wide enough. Although VoteRank⁺⁺²⁵ restrains the voting ability of both one-hop and two-hop neighbors, its suppression strategy relies on a parameter. In this regard, we propose a parameter-free strategy to discount the voting ability of multi-hop neighbors. Suppose u is an l-hop neighbor of the influential node v; the voting ability of u is updated as follows:

$va_{u} = (\log l) \cdot va_{u}$ (4)

If u is a one-hop neighbor of v, the updated $va_{u}$ is 0 since $l = 1$. That is to say, u will no longer vote for others. This is reasonable because u has voted for the most influential node in the last iteration. As per Equation (4), we can discount the voting ability of multi-hop neighbors of the selected node. However, in the implementation of VoteRank* in this study, we only suppress the one-hop and two-hop neighbors (lines $15 \sim 18$ in Algorithm 1).
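A minimal sketch of the suppression rule in Equation (4), restricted to one-hop and two-hop neighbors as described above. The helper names `hop_distances` and `suppress`, the path graph, and the natural logarithm are assumptions (the base of the logarithm is not stated in the text). Setting the voting ability of the selected node itself to zero is one reading of the statement that it no longer participates in voting.

```python
import math
from collections import deque


def hop_distances(graph, source, max_hops):
    """Breadth-first search that records hop counts up to max_hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        if dist[x] == max_hops:
            continue  # do not expand beyond the requested radius
        for y in graph[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def suppress(graph, va, selected, max_hops=2):
    """Return a copy of va after discounting the neighbors of `selected`."""
    updated = dict(va)
    for u, hops in hop_distances(graph, selected, max_hops).items():
        if hops == 0:
            updated[u] = 0.0  # assumption: the selected node stops voting
        else:
            updated[u] = math.log(hops) * va[u]  # Equation (4), natural log assumed
    return updated


# Hypothetical path graph a - b - c - d with unit voting abilities.
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
va = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}
new_va = suppress(graph, va, "a")
```

With the natural logarithm, a one-hop neighbor is fully silenced (log 1 = 0), a two-hop neighbor keeps a fraction log 2 of its voting ability, and nodes farther away are left untouched.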

Afterward, VoteRank* updates the voting scores of unselected nodes to pick the next spreader. In fact, in this iteration, only a subset of nodes needs to update their voting scores. Given a node u, if neither its own voting ability nor that of its neighbors is discounted in this iteration, its voting score remains the same. Consequently, in VoteRank*, we only need to recompute the voting scores of nodes within three hops of the spreader selected in the last iteration (lines $20 \sim 25$ in Algorithm 1). In this way, VoteRank* avoids unnecessary time cost.
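The three-hop bound follows from two facts stated above: voting ability changes only within two hops of the selected spreader, and the score of a node depends on the voting ability of itself and its one-hop neighbors. A small sketch of collecting the nodes whose scores need to be recomputed is given below; the function name `nodes_within_hops` and the path graph are hypothetical.

```python
def nodes_within_hops(graph, source, max_hops):
    """Nodes reachable from `source` in at most max_hops steps, excluding source."""
    seen = {source}
    frontier = {source}
    for _ in range(max_hops):
        # Expand one hop outward and keep only nodes not visited before.
        frontier = {y for x in frontier for y in graph[x]} - seen
        seen |= frontier
    return seen - {source}


# Hypothetical path graph a - b - c - d - e.
graph = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
selected = {"a"}  # spreaders chosen so far
to_rescore = nodes_within_hops(graph, "a", 3) - selected
```

Here node e lies four hops away from the selected spreader, so its voting score is unchanged and does not need to be recomputed.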

Complexity analysis

Now, we analyze the time complexity of Algorithm 1. Let N be the number of nodes and $⟨k⟩$ be the average degree of nodes in the network. Lines $2 \sim 3$ initialize the voting ability of all nodes. Assuming that the H-index values of all nodes have already been obtained, the time complexity of lines $2 \sim 3$ is $O(N)$. The complexity of lines $4 \sim 9$, which calculate all nodes' voting scores, is $O(N{⟨k⟩}^{2})$. In line 10, VoteRank* takes $O(N)$ to pick the first spreader. As a consequence, the time complexity of the above lines is $O(N + N{⟨k⟩}^{2})$.

Next, VoteRank* iteratively identifies the remaining $s - 1$ influential nodes. Lines $15 \sim 16$ and lines $17 \sim 18$ suppress the voting ability of the one-hop neighbors and two-hop neighbors, respectively, of the just-selected spreader, say v. The corresponding computational complexities are $O(⟨k⟩)$ and $O({⟨k⟩}^{2})$. In addition, we store the one-hop, two-hop, and three-hop neighbors of v in Q. Lines $20 \sim 25$ update the voting scores of nodes in Q. The time complexity is $O(q{⟨k⟩}^{2})$, where $q = |Q|$. Line 26 has the complexity of $O(N)$. As a result, the complexity from line 12 to line 26 is $O(⟨k⟩ + {⟨k⟩}^{2} + q{⟨k⟩}^{2} + N) = O(q{⟨k⟩}^{2} + N)$. In conclusion, the time complexity of Algorithm 1 is $O(N{⟨k⟩}^{2} + sq{⟨k⟩}^{2} + sN) \approx O(N{⟨k⟩}^{2} + N)$ since $s \ll N$ and $q \ll N$.

Differences between VoteRank and VoteRank*

In this subsection, we highlight the differences between VoteRank and VoteRank*, which are listed as follows:

(1) When initializing the voting ability of nodes, VoteRank assigns the same value to each node, whereas VoteRank* uses the H-index value of each node.

(2) In VoteRank*, the neighbor set of a node includes its one-hop neighbors and itself; in VoteRank, however, the neighbor set of a node only contains its one-hop neighbors.

(3) In the voting process of VoteRank, a node gives the same vote to each of its neighbors, and the voting score of a node is the sum of the votes it gets from its neighbors. In VoteRank*, a node gives different amounts of vote to its neighbors, and the voting score of a node is computed based on the weighted sum of the votes it gets from itself and its neighbors and on the number of its neighbors.

(4) After selecting a spreader, VoteRank weakens the voting ability of the one-hop neighbors of the spreader, whereas VoteRank* weakens the voting ability of its one-hop and two-hop neighbors.

Experiments

Data sets

In this article, we evaluate the performance of VoteRank* via extensive experiments on 12 networks drawn from multiple fields. These 12 networks are as follows: (1) an email network among members of a university (Email),²⁷ (2) a collaboration network of researchers working on Network Science (NetSci),²⁸ (3) a friendship network extracted from hamsterster.com (Hamster),²⁹ (4) a social network extracted from Facebook (Facebook),³⁰ (5) the metabolic network of Caenorhabditis elegans (CE),³¹ (6) the collaboration network between a group of jazz musicians (Jazz),³² (7) a network of scientists studying Condensed Matter Physics (Condmat),³³ (8) a structure network of the internet at the router level (Router),³⁴ (9) a trust network encrypted based on the Pretty Good Privacy algorithm in July 2001 (PGP),³⁵ (10) the protein-to-protein interaction network of budding yeast (Yeast),³⁶ (11) a network describing the U.S. air transportation system in 2010 (USAir),³⁷ and (12) a power grid network of the United States (Power).³⁸

Table 1 displays the topological characteristics of the 12 networks. In this article, we regard all networks as undirected ones.

Table 1.

The basic structural characteristics of the 12 networks

Network	N	M	⟨k⟩	k_max	C
Email	1133	5451	9.62	71	0.22
NetSci	379	914	4.82	34	0.74
Hamster	2426	16,631	13.71	273	0.54
Facebook	63,731	817,090	25.64	1098	0.61
CE	453	2025	8.94	237	0.65
Jazz	198	2742	27.69	100	0.62
Condmat	23,133	93,497	8.08	281	0.63
Router	5022	6258	2.49	106	0.01
PGP	10,680	24,316	4.55	205	0.27
Yeast	2224	6609	5.94	64	0.14
USAir	1574	17,215	21.87	314	0.50
Power	4941	6594	2.67	19	0.08

Here, N and M are the numbers of nodes and edges, respectively. $⟨k⟩$ and $k_{max}$ denote the average degree and maximal degree of nodes, respectively. C represents the clustering coefficient.

CE, Caenorhabditis elegans.
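Because all networks are treated as undirected, the average degree in Table 1 can be cross-checked from N and M through the standard identity ⟨k⟩ = 2M/N. The short sketch below does this for two rows of the table; the function name `average_degree` is hypothetical.

```python
def average_degree(n_nodes, n_edges):
    """Average degree of an undirected graph: each edge contributes to two nodes."""
    return 2 * n_edges / n_nodes


# (N, M) pairs taken from Table 1.
email_k = round(average_degree(1133, 5451), 2)
power_k = round(average_degree(4941, 6594), 2)
```

Both values agree with the ⟨k⟩ column reported for the Email and Power networks.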

Performance metrics

Susceptible–Infected–Recovered model

So far, many researchers have adopted the Susceptible–Infected–Recovered (SIR) model³⁹ to investigate the effectiveness of IM algorithms. The SIR model is a well-known epidemic spreading model, initially used to simulate the dynamics of disease spreading.³⁹ In the literature on the IM problem, it has been widely adopted to evaluate the performance of IM algorithms.^19,21,40,41

In the SIR model, each node has one of three states: Susceptible (S), Infected (I), and Recovered (R). At the initial time, the states of all seed nodes are I, whereas all other nodes are in state S. In each propagation iteration, each infected node spreads the disease to its susceptible neighbors with probability $μ$; the newly infected neighbors enter state I. Then, the infected nodes become recovered with probability $ξ$; the recovered nodes will no longer be infected. When the number of infected nodes becomes stable, the propagation process is stopped. Here, $β = \frac{μ}{ξ}$ denotes the infection rate, which is crucial to the spreading process.^21,24 To better simulate the propagation process, we set $μ = 1.5 μ_{c}$ as in the study by Guo et al.,²⁴ where $μ_{c}$ is the spreading threshold of SIR.⁴²
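The propagation process described above can be sketched as a small simulation. Several details are assumptions made for this example rather than the setup used in the experiments: the update is synchronous, nodes infected in a step cannot recover in that same step, the run stops when no infected node remains, and the names `sir_step` and `run_sir` are hypothetical.

```python
import random


def sir_step(graph, state, mu, xi, rng):
    """One synchronous SIR iteration; returns the new state dictionary."""
    infected = [v for v in graph if state[v] == "I"]
    new_state = dict(state)
    for v in infected:
        for w in graph[v]:
            # Each infected node tries to infect each susceptible neighbor.
            if state[w] == "S" and rng.random() < mu:
                new_state[w] = "I"
    for v in infected:
        # Nodes that were infected at the start of the step may recover.
        if rng.random() < xi:
            new_state[v] = "R"
    return new_state


def run_sir(graph, seeds, mu, xi, rng, max_steps=10_000):
    """Simulate until no infected node remains (or max_steps is reached)."""
    state = {v: ("I" if v in seeds else "S") for v in graph}
    history = [state]
    for _ in range(max_steps):
        if "I" not in state.values():
            break
        state = sir_step(graph, state, mu, xi, rng)
        history.append(state)
    return history


# Hypothetical path graph a - b - c with a single seed node.
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
history = run_sir(graph, {"a"}, mu=1.0, xi=1.0, rng=random.Random(0))
```

With μ = ξ = 1 the outcome is deterministic: the infection advances one hop per step and every node ends in state R. For realistic probabilities, results would be averaged over many independent runs.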

Based on the SIR model, the infected scale $F(t)$ at time t and the final infected scale $F(t_{c})$ can be used to describe the spreading ability of the influential nodes.^19,21,24 The infected scale $F(t)$ is defined as follows:

$F(t) = \frac{N_{I(t)} + N_{R(t)}}{N},$ (5)

where N is the number of nodes, and $N_{I (t)}$ and $N_{R (t)}$ indicate the numbers of nodes in states I and R at time t, respectively. The larger $F (t)$ is, the stronger the spreading ability of the seed nodes at time t is.

The final infected scale $F(t_{c})$ denotes the infected scale when the steady state is reached. The formal definition of $F(t_{c})$ is as follows:

$F(t_{c}) = \frac{N_{R(t_{c})}}{N},$ (6)

where $N_{R (t_{c})}$ is the number of nodes in state R when the spreading process is stable.
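Equations (5) and (6) amount to counting node states. A minimal sketch, assuming node states are stored in a dictionary that maps each node to "S", "I", or "R" (the representation and function names are hypothetical):

```python
def infected_scale(state):
    """Equation (5): fraction of nodes that are infected or recovered."""
    affected = sum(1 for s in state.values() if s in ("I", "R"))
    return affected / len(state)


def final_infected_scale(state):
    """Equation (6): fraction of recovered nodes once spreading has stopped."""
    recovered = sum(1 for s in state.values() if s == "R")
    return recovered / len(state)


# Hypothetical snapshots of a four-node network.
state_t = {"a": "R", "b": "I", "c": "S", "d": "S"}
state_tc = {"a": "R", "b": "R", "c": "S", "d": "S"}

f_t = infected_scale(state_t)
f_tc = final_infected_scale(state_tc)
```

Once no node is in state I, the two quantities coincide, which is why the final infected scale only needs the count of recovered nodes.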

Average shortest path length

In addition to the SIR model, the average shortest path length $L_{s}$ between the seed nodes is also used to evaluate the performance of an IM algorithm.²¹ Since the identified influential nodes are expected to be widely distributed in a network, this measure can gauge the distribution of seed nodes. The computation of $L_{s}$ is as follows:

$L_{s} = \frac{1}{|S|(|S| - 1)} \sum_{u, v \in S, u \neq v} l_{uv},$ (7)

where S is the set of seed nodes, and $l_{uv}$ represents the length of the shortest path between u and v. If u cannot reach v, $l_{uv} = δ + 1$, where $δ$ is the diameter of the giant connected component.
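A minimal sketch of this measure, assuming $L_{s}$ is the mean of $l_{uv}$ over all distinct pairs of seed nodes in an undirected, unweighted graph. The function names are hypothetical, and the brute-force diameter computation is chosen for clarity rather than efficiency.

```python
from collections import deque
from itertools import combinations


def bfs_distances(graph, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in graph[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def giant_component_diameter(graph):
    """Diameter of the largest connected component (brute force over its nodes)."""
    seen, giant = set(), set()
    for v in graph:
        if v not in seen:
            component = set(bfs_distances(graph, v))
            seen |= component
            if len(component) > len(giant):
                giant = component
    return max(max(bfs_distances(graph, v).values()) for v in giant)


def average_seed_distance(graph, seeds):
    """Mean shortest path length over seed pairs; unreachable pairs count δ + 1."""
    pairs = list(combinations(seeds, 2))
    if not pairs:
        raise ValueError("at least two seed nodes are required")
    delta = giant_component_diameter(graph)
    total = sum(bfs_distances(graph, u).get(v, delta + 1) for u, v in pairs)
    return total / len(pairs)


# Hypothetical graph: path a - b - c - d plus an isolated node e.
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}, "e": set()}
l_s = average_seed_distance(graph, ["a", "d", "e"])
```

In this example the giant component has diameter 3, so the two pairs involving the isolated node e each contribute a distance of 4, while the pair (a, d) contributes 3. Averaging over unordered pairs gives the same value as averaging over ordered pairs because distances in an undirected graph are symmetric.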

Results and analysis

In this subsection, we analyze the performance of VoteRank* by comparing it with VoteRank,²¹ Degree centrality,¹⁵ K-shell,²⁰ H-index,¹⁷ IKS,¹⁹ ECRM,⁴¹ MCDE,² EnRenew,²⁴ and VoteRank⁺⁺.²⁵

To propagate information according to an IM algorithm, we need to select s seed nodes, that is, the initial influential spreaders. For convenience, we define the ratio of seed nodes $ρ = \frac{s}{N}$ , where N is the number of nodes in a network.

Figure 1 shows the values of infected scale $F(t)$ obtained by different algorithms along with spread time t under the SIR model. We can see from the results that, with the same number of seed nodes, the proposed VoteRank* approach reaches the largest infected scales in most cases. Specifically, the seed nodes identified by VoteRank* can influence more nodes than those selected by the baselines on the Email, Hamster, Power, USAir, and Condmat networks at the steady stage. By contrast, the performance of some baselines, such as IKS and VoteRank, fluctuates dramatically across networks.

FIG. 1.

The infected scale $F(t)$ against time t on 12 networks. The results are the average of 1000 independent runs with β = 1.5 and ρ = 0.03 under the SIR model. CE, Caenorhabditis elegans; SIR, Susceptible–Infected–Recovered.

Meanwhile, from the point of view of propagation speed, spreading a message to the same number of nodes takes less time with the seed nodes identified by VoteRank* than with those picked out by the compared methods on most networks. In conclusion, the VoteRank* algorithm can find influential nodes that have stronger information spreading ability than those found by the other methods. In addition, by comparing VoteRank* with VoteRank, EnRenew, and VoteRank⁺⁺, all of which are VoteRank-like algorithms, we observe that VoteRank* outperforms the others on 7 of the 12 networks. This result indicates that the self-voting mechanism proposed in this article can facilitate the identification of influential spreaders.

Next, we analyze the final infected scales of these algorithms with different ratios of seed nodes $ρ$. The experimental results are presented in Figure 2. It is evident from the figure that VoteRank* performs better than the baseline methods on the whole. Specifically, the performance of VoteRank* always ranks among the top methods except on Facebook with larger values of $ρ$. By contrast, the baseline methods suffer from performance fluctuation across different networks. In addition, when the value of $ρ$ is small, the infected scales reached by different algorithms do not differ much since there are only a small number of seed nodes. However, the influence ranges reached by VoteRank* broaden rapidly with the increase of $ρ$ on most networks. Besides, Figure 2 demonstrates that the seed nodes identified by VoteRank* can affect larger numbers of nodes than those selected by VoteRank, EnRenew, and VoteRank⁺⁺ in most cases. This result further supports the effectiveness of the VoteRank* method.

FIG. 2.

The final infection scale $F (t_{c})$ with different ratio of seed nodes ρ. The results are the average of 1000 independent runs with β = 1.5 under the SIR model.

Kitsak et al.²⁰ showed that the distances between spreaders play a vital role in maximizing influence. The wider the distribution of seed nodes, the broader the spread of information. Therefore, we further employ the average shortest path length $L_{s}$ of seed nodes to measure their distribution. The results in Figure 3 show that VoteRank* is superior to the baselines since it always ranks among the top methods. In other words, VoteRank* can select seed nodes that are more widely distributed in the network, especially in large-scale networks such as Condmat and Facebook. In addition, we find that the corresponding lines of VoteRank* and VoteRank⁺⁺ almost coincide. Apart from self-voting, both methods consider the diversities of nodes and suppress the one-hop and two-hop neighbors. As a consequence, most of the influential nodes identified by them may be the same. Nevertheless, VoteRank* outperforms VoteRank and EnRenew in most of the cases. This further confirms that the VoteRank* algorithm can identify nodes that are broadly distributed.

FIG. 3.

The average shortest path length L_s between spreaders selected by different methods on 12 networks.

Ablation study

In this subsection, we design an ablation experiment to further demonstrate the effectiveness of the proposed VoteRank*. The most important contribution of VoteRank* is introducing the self-voting mechanism into the voting process. Therefore, the ablation experiment analyzes the influence of self-voting on the performance of VoteRank*. The results are shown in Figure 4. In this figure, VoteRank*_selfvoting denotes the version of VoteRank* without self-voting. As can be seen from Figure 4, compared with VoteRank*_selfvoting, VoteRank* finds spreaders that reach larger infected scales on most networks. As a consequence, the self-voting mechanism plays a key role in the process of selecting spreaders.

FIG. 4.

Ablation study of self-voting on 12 networks. VoteRank*_selfvoting is the version of VoteRank* without self-voting. The results are the average of 1000 independent runs with β = 1.5 and ρ = 0.03 under the SIR model.

Conclusions

In this article, we proposed the VoteRank* algorithm to address the IM problem. Inspired by VoteRank and its variants, VoteRank* also leverages a voting approach to choose the influential spreaders. However, in VoteRank and its variants, a node cannot vote for itself. This is a serious defect since, in many real scenarios, one candidate can certainly vote for himself/herself. In VoteRank*, we first introduce the self-voting mechanism to address this defect. Besides, we also consider the diversity of nodes in the voting process of VoteRank*. To verify the performance of the VoteRank* algorithm, we carried out experiments on 12 real networks. The experimental results indicate that the VoteRank* algorithm outperforms the state-of-the-art baselines in most cases.

Footnotes

Authors' Contributions

P.L.: Conceptualization (supporting), software (lead), formal analysis (supporting), writing—original draft (lead), writing—review and editing (equal). L.L.: Conceptualization (lead), formal analysis (lead), writing—review and editing (equal). Y.W.: Software (supporting), writing—original draft (supporting). S.F.: Conceptualization (supporting), formal analysis (supporting).

Author Disclosure Statement

No competing financial interests exist.

Funding Information

This work was supported in part by the Science and Technology Program of Gansu Province (Nos. 21JR7RA458 and 21ZD8RA008) and the Supercomputing Center of Lanzhou University.

References

1. Bond, Fariss, Jones, et al. A 61-million-person experiment in social influence and political mobilization. Nature, 2012; 489(7415):295–298; doi: 10.1038/nature11421

2. Sheikhahmadi, Nematbakhsh. Identification of multi-spreader users in social networks for viral marketing. J Inform Sci, 2017; 43(3):412–423; doi: 10.1177/0165551516644171

3. He, Wang, Lei, et al. TIFIM: A two-stage iterative framework for influence maximization in social networks. Appl Math Comput, 2019; 354:338–352; doi: 10.1016/j.amc.2019.02.056

4. Kim, Kim S-K, Yu. Scalable and parallelizable processing of influence maximization for large-scale social networks. In: ICDE, vol. 2013. 2013; pp. 266–277; doi: 10.1109/ICDE.2013.6544831

5. Leskovec, Adamic, Huberman. The dynamics of viral marketing. ACM Trans Web, 2007; 1(1):1–39; doi: 10.1145/1232722.1232727

6. Zhu, Zhi, Guo, et al. Analysis of epidemic spreading process in adaptive networks. IEEE Trans Circuits Syst II Express Briefs, 2019; 66(7):1252–1256; doi: 10.1109/TCSII.2018.2877406

7. Wu, Pan. Scalable influence blocking maximization in social networks under competitive independent cascade models. Comput Netw, 2017; 123:38–50; doi: 10.1016/j.comnet.2017.05.004

8. Fan, Zeng, Sun, et al. Finding key players in complex networks through deep reinforcement learning. Nat Mach Intell, 2020; 2(6):317–324; doi: 10.1038/s42256-020-0177-2

9. Cheung, Luo, Sia, et al. Credibility of electronic word-of-mouth: Informational and normative determinants of on-line consumer recommendations. Int J Electron Commer, 2009; 13(4):9–38; doi: 10.2753/JEC1086-4415130402

10. Zareie, Sheikhahmadi, Khamforoosh. Influence maximization in social networks based on TOPSIS. Expert Syst Appl, 2018; 108:96–107; doi: 10.1016/j.eswa.2018.05.001

11. Kempe, Kleinberg, Tardos. Maximizing the spread of influence through a social network. In: KDD’03. Association for Computing Machinery, New York, NY, USA. 2003; pp. 137–146; doi: 10.1145/956750.956769

12. Zareie, Sheikhahmadi. A hierarchical approach for influential node ranking in complex social networks. Expert Syst Appl, 2018; 93:200–211; doi: 10.1016/j.eswa.2017.10.018

13. Leskovec, Krause, Guestrin, et al. Cost-effective outbreak detection in networks. In: KDD’07. ACM; 2007; pp. 420–429; doi: 10.1145/1281192.1281239

14. Goyal, Lu, Lakshmanan. CELF++: Optimizing the greedy algorithm for influence maximization in social networks. In: WWW’11. ACM Press; 2011; pp. 47–48; doi: 10.1145/1963192.1963217

15. Freeman LC. Centrality in social networks conceptual clarification. Soc Netw, 1978; 1(3):215–239; doi: 10.1016/0378-8733(78)90021-7

16. Bonacich, Lloyd. Eigenvector-like measures of centrality for asymmetric relations. Soc Netw, 2001; 23(3):191–201; doi: 10.1016/S0378-8733(01)00038-7

17. Lü, Zhou, Zhang Q-M, et al. The H-index of a network node and its relation to degree and coreness. Nat Commun, 2016; 7(1):10168; doi: 10.1038/ncomms10168

18. Chen, Gao, Lü, et al. Identifying influential nodes in large-scale directed networks: The role of clustering. PLoS One, 2013; 8(10):1–10; doi: 10.1371/journal.pone.0077455

19. Wang, Li, Guo, et al. Identifying influential spreaders in complex networks based on improved k-shell method. Phys A Stat Mech Appl, 2020; 554:124229; doi: 10.1016/j.physa.2020.124229

20. Kitsak, Gallos, Havlin, et al. Identification of influential spreaders in complex networks. Nat Phys, 2010; 6(11):888–893; doi: 10.1038/nphys1746

21. Zhang J-X, Chen D-B, Dong, et al. Identifying a set of influential spreaders in complex networks. Sci Rep, 2016; 6(1):27823; doi: 10.1038/srep27823

22. Sun H-L, Chen D-B, He J-L, et al. A voting approach to uncover multiple influential spreaders on weighted networks. Phys A Stat Mech Appl, 2019; 519:303–312; doi: 10.1016/j.physa.2018.12.001

23. Kumar, Panda. Identifying influential nodes in weighted complex networks using an improved WVoteRank approach. Appl Intell, 2022; 52(2):1838–1852; doi: 10.1007/s10489-021-02403-5

24. Guo, Yang, Chen, et al. Influential nodes identification in complex networks via information entropy. Entropy, 2020; 22(2):1–19; doi: 10.3390/e22020242

25. Liu, Li, Fang. Identifying influential nodes in social networks: A voting approach. Chaos Solit Fractals, 2021; 152:111309; doi: 10.1016/j.chaos.2021.111309

26. Kumar, Panda. Identifying influential nodes in Social Networks: Neighborhood Coreness based voting approach. Phys A Stat Mech Appl, 2020; 553:124215; doi: 10.1016/j.physa.2020.124215

27. Guimerá, Danon, Díaz-Guilera. Self-similar community structure in a network of human interactions. Phys Rev E, 2003; 68(6):065103; doi: 10.1103/PhysRevE.68.065103

28. Newman MEJ. Finding community structure in networks using the eigenvectors of matrices. Phys Rev E, 2006; 74(3):036104; doi: 10.1103/physreve.74.036104

29. Kunegis. KONECT—The Koblenz Network Collection. In: WWW’13. 2013; pp. 1343–1350; doi: 10.1145/2487788.2488173

30. Viswanath, Mislove, Cha, et al. On the evolution of user interaction in facebook. In: WOSN’09. 2009; pp. 37–42; doi: 10.1145/1592665.1592675

31. Jeong, Tombor, Albert, et al. The large-scale organization of metabolic networks. Nature, 2000; 407(6804):651–654; doi: 10.1038/35036627

32. Gleiser, Danon. Community structure in jazz. Adv Complex Syst, 2003; 6(4):565–574; doi: 10.1142/S0219525903001067

33. Leskovec, Kleinberg, Faloutsos. Graph evolution: Densification and shrinking diameters. ACM Trans Knowl Discov Data, 2007; 1(1):1–41; doi: 10.1145/1217299.1217301

34. Spring, Mahajan, Wetherall. Measuring ISP topologies with rocketfuel. IEEE/ACM Trans Netw, 2004; 12(1):2–16; doi: 10.1109/tnet.2003.822655

35. Boguñá, Pastor-Satorras, Díaz-Guilera, et al. Models of social networks based on social distance attachment. Phys Rev E, 2004; 70(5):056122; doi: 10.1103/PhysRevE.70.056122

36. Bu, Zhao, Cai, et al. Topological structure analysis of the protein–protein interaction network in budding yeast. Nucleic Acids Res, 2003; 31(9):2443–2450; doi: 10.1093/nar/gkg340

37. Colizza, Pastor-Satorras, Vespignani. Reaction-diffusion processes and metapopulation models in heterogeneous networks. Nat Phys, 2007; 3(4):276–282; doi: 10.1038/nphys560

38. Watts, Strogatz. Collective dynamics of small-world networks. Nature, 1998; 393(6684):440–442; doi: 10.1038/30918

39. Pastor-Satorras, Vespignani. Epidemic dynamics and endemic states in complex networks. Phys Rev E, 2001; 63(6):066117; doi: 10.1103/PhysRevE.63.066117

40. Buscarino, Fortuna, Frasca, et al. Disease spreading in populations of moving agents. EPL (Europhysics Letters), 2008; 82(3):38002; doi: 10.1209/0295-5075/82/38002

41. Zareie, Sheikhahmadi, Jalili, et al. Finding influential nodes in social networks based on neighborhood correlation coefficient. Knowl Based Syst, 2020; 194:105580; doi: 10.1016/j.knosys.2020.105580

42. Castellano, Pastor-Satorras. Thresholds for epidemic spreading in networks. Phys Rev Lett, 2010; 105(21):218701; doi: 10.1103/PhysRevLett.105.218701