Net positive influence maximization in signed social networks

Abstract

In the real world, a large number of social systems can be modeled as signed social networks including both positive and negative relationships. Influence maximization in signed social networks is an interesting and significant research direction, which has gained some attention. All of existing studies mainly focused on positive influence maximization (PIM) problem. The goal of the PIM problem is to select the seed set with maximum positive influence in signed social networks. However, the selected seed set with maximum positive influence may also has a large amount of negative influence, which will cause bad effects in the real applications. Therefore, maximizing purely positive influence is not the final and best goal in signed social networks. In this paper, we introduce the concept of net positive influence and propose the net positive influence maximization (NPIM) problem for signed social networks, to select the seed set with as much positive influence as possible and as less negative influence as possible. Additionally, we prove that the objective function of NPIM problem under polarity-related independent cascade model is non-monotone and non-submodular, which means the traditional greedy algorithm is not applicable to the NPIM problem. Thus, we propose an improved R-Greedy algorithm to solve the NPIM problem. Extensive experiments on two Epinions and Slashdot datasets indicate the differences between positive influence and net positive influence, and also demonstrate that our proposed solution performs better than the state-of-the-art methods in terms of promoting net positive influence diffusion in less running time.

Keywords

Influence maximization signed social networks net positive influence polarity

1 Introduction

In recent years, Facebook, Twitter and other social networks have achieved greatly. Large-scale and diverse data provided by these online social networks promote the social computing research, in which influence maximization (IM) is a hot research topic. IM is the problem of choosing a subset of influential users named as seed node set, such that the propagation of influence could be maximized if we adopt the seed node set to spread the influence or information in social networks.

Kempe et al. [17] firstly formulated the influence maximization as a discrete stochastic optimization problem. They established that the optimization problem is NP-hard under Independent Cascade (IC) model and Linear Threshold (LT) model, and presented a greedy algorithm to solve the problem. Then for solving the scalability shortage of the solution of Kempe et al., improved greedy algorithms, scalable heuristics and a martingale approach were proposed in [7–10 , 18]. Furthermore, researchers also extended influence maximization problems from different perspectives, such as time-critical influence maximization [6, 24], competitive or comparative influence maximization [2 , 28], topic-aware influence maximization [4], influence maximization on the heterogeneous networks [34], and so on.

All the above studies were conducted in unsigned social networks which ignored the polarities of relationships between users. Actually, however, the relationships own the property of the polarity, which can divided into positive relationships (e.g., friend or like) and negative relationships (e.g., enemy or dislike). The social networks which contain not only positive relationships but also negative relationships are named as signed social networks. In the studies of influence maximization, ignoring the relationship polarities between users and treating signed social networks as unsigned ones roughly lead to overestimation of positive influence which will cause bad effect in practical applications finally. Influence maximization in signed social networks is a critical problem that remains pretty much open.

Aiming at signed social networks, few studies [13 , 35] proposed positive influence maximization (PIM) problem and developed several solutions. The goal of the PIM problem is to select the seed node set with maximum positive influence in signed social networks. However, maximizing purely positive influence may be not the final and best goal in signed social networks. Because, the seed set with maximum positive influence selected by methods in [13 , 35] may also has a large amount of negative influence. When we adopt the selected seed set with big positive influence and big negative influence to promote a product, it will cause the situation that many users like the product but plenty of other users dislike this product at the same time. This situation is not the results we expect. Therefore, in signed social networks, how to select the seed node set with as much positive influence as possible and as less negative influence as possible may be a better goal. To this end, we propose net positive influence and the corresponding problem of net positive influence maximization (NPIM). Net positive influence considers positive influence and negative influence simultaneously. Maximizing net positive influence means maximizing positive influence and minimizing negative influence at the same time. So, our proposed net positive influence is helpful to obtain better performance in the practical applications. Specifically, we make the following contributions in this paper:

We introduce the concept of net positive influence and propose the net positive influence maximization problem in signed social networks.

We prove that the influence function of NPIM problem under the polarity-related independent cascade (IC-P) diffusion model [38] is non-monotone and non-submodular, which means that the classic greedy algorithm is inapplicable to this problem. In this paper, an improved R-Greedy algorithm named as NPIM R-Greedy is developed to solve the NPIM problem.

The experiments are carried out on Epinions and Slashdot datasets. The comparison experimental results with closely related our study verify the superiority of our method.

The rest of this paper is organized as follows: In Section 1, related studies are reviewed. In Section 2, problem statement is presented. In Section 3, the Polarity-related independent cascade diffusion model is introduced. In Section 4, the influence function of NPIM under IC-P model is proved that it is both non-submodular and non-monotonic and then the NPIM R-Greedy algorithm is presented. In Section 5, the results from experiments prove that our method is effective. Finally, the conclusion and research direction is put forward in Section 6.

2 Related work

Influence maximization as an algorithmic technique for viral marketing was first proposed by Domingos and Richardson [12]. Kempe et al. [17] used discrete optimization to express it. They proved that the problem belongs to NP-hard under IC and LH models and also presented a greedy-approximation solution. To address the inefficiency issue of the greedy method, Leskovec et al. [18] proposed a seed nodes seeking method named as “Cost-Effective Lazy Forward” (CELF), which fully makes use of the sub modularity property of the objective function. Chen et al. proposed two heuristic methods, DegreeDiscountIC [8] and prefix excluding maximum influence arborescence (PMIA) [7], which leverage local arborescence structures of each node to approximate the influence propagation. Jung et al. [15] proposed the IRIE algorithm integrating the advantages of influence estimation (IE) and influence ranking (IR) methods, which runs much faster than the PMIA method. Cheng et al. [9] presented a static greedy algorithm that can strictly guarantee the submodularity of influence function in the process of the seed selection. They also [10] proposed an iterative ranking framework (IMRank) to solve the influence maximization problem.

Budak et al. [3] attempted to address the influence limitation problem where a bad campaign starts to propagate from a certain user in the network and limiting campaigns are released to restrict the impact of misinformation. Similarly, under the competitive linear threshold (CLT) model, He et al. [14] studied competitive influence diffusion in social networks. To simulate multiple cascades over a network, Pathak et al. [28] proposed a generalized version of the linear threshold model which allows nodes to switch states between different cascades. Borodin et al. [2] developed several natural extensions of the linear threshold model for the multiple-technology case. Chen et al. [5] extended the independent cascade model for incorporating the emergence and propagation of negative opinions. Furthermore, Ou et al. [27] took into account the fact that products are not purely competitive, and studied influence maximization for complementary goods. Liu [24] and Chen et al. [6] considered the time factor of influence diffusion, and explored the time constrained influence maximization problem. Li et al. [22] studied the location-aware influence maximization problem and proposed the extend maximum influence arborescence (MIA) algorithm to find the node set with maximum influence within a specific geographical region. Song et al. [3] developed the problem of targeted influence maximization that considers both time and location factors of influence spread. Wang et al. [34] proposed the problem of influence maximization on the heterogeneous networks. Chen et al. [4] studied topic-aware influence maximization problem.

Zheng et al. [41] proposed a hybrid algorithm based on centrality weighted link strength based on the idea of central heuristic. Zhu et al. [43] proposed a node influence recognition method based on community structure and k-shell node method. Nguyen et al. [26] developed a network model with non-uniform propagation probability between nodes. Qin et al. [30] proved that IC model could not accurately simulate the information diffusion structure in real networks, and proposed a three-step cascade model (TSCM) to simulate the information diffusion process in online social networks. Yang et al. [31] introduced the thermal diffusion model into the research on the maximization of personalized influence and proposed the target thermal greed algorithm. Wang et al. [36] proposed a new multipath asynchronous threshold model that quantifies the effects and tracks their diffusion and aggregation. Aiming at the problem of small influence scope caused by selecting only local optimal nodes, Zhu et al. [42] proposed a maximization algorithm based on structure holes and degree discount, taking into account the propagation advantages of core nodes and structure hole nodes comprehensively.

Li et al. [20] attempted to seek grassroots as seeds instead of traditional elites, and constructed comprehensive experiments to demonstrate the effectiveness of their solution. Yang et al. [39] formulated the general continuous influence maximization problem, and developed a general coordinate descent algorithmic framework to solve it. Keikha et al. [16] leveraged deep learning techniques for IM problem, which considers both local and global structure during network feature. Li et al. [23] studied the cost-aware target viral marketing (CTVM) problem aiming to seek the most cost-effective users to influence the most relevant users. Peng et al. [29] explored the problem of adaptive influence maximization with myopic feedback under the classic independent cascade model. Wu et al. [37] studied the problem of online influence maximization in social networks, which capitalizes on an important property named network assortativity. Banerjee et al. [1] proposed a community-based solution approach for solving the budgeted influence maximization problem. Ding et al. [11] proposed the realistic independent cascade (RIC) model for IM problem, which considers the acceptance probability of candidate seed nodes in social networks.

Although influence maximization has attracted tremendous attentions, few existing studies focused on influence maximization in signed social networks. Li et al. [21] proposed the polarity-related influence maximization (PRIM) problem in signed social networks, which contains positive influence problem and negative influence problem. They extended the standard Independent Cascade (IC) model to the signed social networks, and proved that the influence function of the PRIM problem under the extended IC model is monotonic and sub-modular. Then they used a greedy algorithm to solve the PRIM problem which can approximate the optimum solution within a factor of 1 - 1/e. Following the mind of [21], Wang [35], Shen et al. [32] extended Linear Threshold model to signed social networks for positive influence maximization. Shi et al. [13] studied the problem of selecting the optimal individual subset to disseminate the positive influence when time is bounded. They proved that such a problem is NP-hard, and proposed a heuristic algorithm based on greedy strategy. Li et al. [19] proposed the Polarity-related Linear Influence Diffusion (PLID) model, which can quickly and accurately estimate polarity-related influence of user sets without Monte-Carlo simulations. Yin et al. [40] proposed a novel Signed-PageRank (SPR) method, to study the problem of influence maximization for advertisement recommendation in signed social networks.

From the above review about influence maximization problem, we can find that researchers have noticed the importance of relationships polarities to the influence maximization. Existing studies extended influence maximization problem from unsigned social networks to signed networks, to seek the node set with maximum positive influence or negative influence. Different from all of them, this paper proposed the concept of net positive influence, and aims to find the seed set owing the biggest net positive influence. Compared with the existing studies, our research is closer to reality and is able to perform better in some application scenarios.

3 Problem statement

In signed social networks, existing influence maximization researches only considered how to seek the seed set owing maximum positive influence, however, ignored the fact that the seed set with lager positive influence may also have lager negative influence. Aiming at this drawback, we first propose the concept of net positive influence which is positive influence subtracting negative influence. Based on this concept, we introduce the Net Positive Influence Maximization (NPIM) problem, to seek the seed set with as much positive influence as possible and as less negative influence as possible. In some real application scenarios, net positive influence maximization will be more valuable than positive influence maximization. Next, we give the details of the NPIM problem definition.

We denote f⁺(·) as the positive influence function. Given a node set S, f⁺(S) returns the expected number of nodes activated to be positive state by S, which is considered as the positive influence of S. Similarly, we denote f^-(·) as the the negative influence, and f^-(S) returns the expected number of nodes activated to be negative state by S which is considered as the negative influence. Then, we define f^±(·) as the net positive influence, i.e., for the seed set S, f^±(S) = f⁺(S) - f^-(S). How to calculate the net positive influence of seed sets is a key problem. In this paper, we adopt Polarity-related Independent Cascade (named IC-P) diffusion model [21] to estimate the values of f⁺(S) and f^-(S), which is an extension of Independent Cascade model in signed social networks.

The definition of the NPIM problem is as following: a signed social network G, a information diffusion model as well as a non-negative number k are given, the goal is to seek a set S containing k seed nodes such that the expect net positive influence f^±(S) is maximized. This problem can be formalized as, $S^{\pm} = arg max_{S \subseteq V, | S | = k} f^{\pm} (S),$ (1)

In our NPIM problem, we design all nodes in the seed set as positive state or to be 100% influence positively. This design is decided by the specific application scenarios of the proposed NPIM problem. Here, we take a typical application(i.e. viral marketing) of the NPIM problem as example to explain the above assumptions. In viral marketing, the application of NPIM is primarily intended to identify the seed set owing the greatest net positive influence to promote some products over signed social networks. Under this scenario, how to design the initial seed set generally has two options. In the first option, the node set only contains positive nodes. In the second option, the node set contains not only positive nodes, but also negative nodes. If the second option is chosen, it implies that one company will look for some persons and pay them to post negative news related to the products. Obversely, this is not reasonable. Therefore, the latter option is not suitable to be applied in viral marketing. In the NPIM problem defined by us, although all the seed nodes are set to be positive, the negative relationships that exist in the signed social networks will produce negative information or influence [21].

4 Modeling Information diffusion in signed social network

In this paper, we adopt polarity related independent cascade model (IC-P) [21] to estimate the polarity influence of candidate seed sets. Here, we first introduce how to model a signed social network, and then present the details of IC-P model.

In this paper, we model a signed social network as a directed, weighted and signed graph G =(V, E, A, P). V is the set of nodes which correspond to users in the social network (“user” and “node” are alternative in this paper). E is the set of edges which correspond to relationships between users. A is a nonnegative adjacency matrix whose element A_u,v represents the diffusion probability of edge (u, v) in the graph. If and only if the edge (u, v) ∈ E, the value of A_u,v will be a positive value, otherwise zero. P is another matrix where each element P_u,v represents the sign of edge (u, v) in the graph. The value of P_u,v has three options {1, - 1, 0} which are corresponding to positive relationship, negative relationship and no relationship.

The IC-P model is an extension of the IC model in signed social networks that fuses the polarity of relations among users. In the IC-P model, the states of every node have three options: positive state, negative state and inactive state. For a user u, positive state means that this user likes or supports the disseminating information and s/he will continue to spread the information with his/her positive opinions. Negative state means that user u dislikes or opposes the disseminating information and s/he will continue to spread the information with his/her negative opinions. Inactive state means that u does not adopt the disseminating information, and also will not continue to spread the information. We denote S(u) as the state of node u, whose values 1,-1 and 0 represent the positive state, negative state, and inactive state of u, respectively.

As Li et al. introduced in [21], “in the IC-P model, the propagation process begins with an initial node set S where the state of each node can be positive or negative, and all of other nodes not belonging to S are inactive in the network. Here, we first define $N_{active}^{t} (v)$ as node v’s neighbor set where each element becomes positive or negative state at time t. The diffusion regulations of the IC-P model are as following: for a node u, if it is activated by its neighbors in time t - 1, the state of node u will change to be positive or negative state in time t. The new activated user u only has a single chance to activate each inactive neighbor v of him/her in time step t. In time t, each node $u \in N_{active}^{t} (v)$ attempts to activate v in any order with a probability of A_v,u. In the IC-P model, the activation of node v can only occur in one time step. Once node v is activated by one node belonging to $N_{active}^{t} (v)$ , this node will not be activated by other nodes in $N_{active}^{t} (v)$ any more. The propagation process will continue until all new activation nodes are exhausted.”

For a newly activated node v, its status S(v) depends on the polarity state of the node u that activated v and the relationship polarity between two users, meaning S(v) = S(u) × P_u,v. Therefore, if the polarity of node u is positive, and the relationship between u and v is positive, then the polarity of node v will be positive. if the polarity of node u is positive, and the relationship between u and v is negative, then the polarity of node v will be negative. If the polarity of the node u is negative, then the relationship between u and v is negative, and the polarity of the node v will be positive. If the polarity of the node u is negative, and the relationship between u and v is positive, then the polarity of the node v will be negative. We can see that, the social principles “the friend’s friend is friend, the friend’s enemy is enemy, the enemy’s enemy is friend and the enemy’s friend is enemy” has been integrated into the IC-P model.

5 Property and Solution for NPIM Problem

5.1 Property of NPIM problem

Theorem 1. In the NPIM problem, for any arbitrary instance of the IC-P model, the net positive influence function f^±(·) is not only non-submodule, but also non-monotonous.

For proving the Theorem 1, we first present what are monotonicity and submodularity. Given influence function f^±(·) and node set S, T, if f^±(S) ≤ f^±(T) whenever S ⊆ T, then f^±(·) is monotone. f^±(·) owns the submodularity if it satisfies a natural “diminishing returns” property: f^±(S ∪ {v}) - f^±(S) ≥ f^±(T ∪ {v}) - f^±(T), for all nodes v and all pairs of sets S ⊆ T, i.e., the marginal gain added from one node to S set should be the same as or higher than the marginal gain added from the same node to superset S.

To prove the monotonicity and submodularity of non-polarity influence function, Kempe et al. [15] developed a live-edge process that is equivalent to the propagation process. In this paper, we adopt a similar method to prove the Theorem 1.

In the live-edge process, an event will happen that a new activated node u will attempt to activate the neighbor node v with the probability of A_u,v, which seems to toss a double-side coin with A_u,v. In the view of this process, it’s nothing about that the coin tossing begins from the start of the whole process or the activation of time v. A coin tossing stands for the successful activation, the edges of such an event will become a candidate-live that if the starting node of the directional edge is activated, the close nodes may also be activated. If a node has more than one candidate-live edge, then we will randomly select one of them as the live edge, meanwhile the other candidate-live edges will be blocked.

Once the result of the coin tossing have been determined, we will select the live edge for each node and initialize all nodes in the seed set S as live, which will also allow us to clarify all positive nodes set at the end of the cascading process:

If there is only one path can reach the x formed by all live edges from one node in S, and the polarity of this path will be positive, node x will change to be positive. We define the path as follows, the live path is path(n₁, n_k) =(n₁, n₂, ⋯ , n_k) with the range from n₁ to n_k, and the polarity of the path(n₁, n_k) is $\prod_{i = 1}^{k - 1} P (n_{i}, n_{i + 1})$ .

For a node v, Li et al. [15] have proved that the probability of v activated to be positive by the diffusion process is the same as the probability of v determined to be positive in the live-edge process, under the IC-P diffusion model. Because, we choose the IC-P model to solve the NPIM problem. So, here we can adopt directly the live-edge process to prove the Theorem 1.

Proof of Theorem 1 Under the live-edge process of IC-P model, each edge will obtain an exact result (live or block) after the coin tossing or the live edge selection appears. Considering the probability space where each sampling point may specify a set of possible results for all edges, we use X to represent the result sets for those edges. Since the value of X is fixed, the value of $f_{X}^{\pm} (\cdot)$ is also determined and we can express it in a mathematical way, as described below. We denote R⁺(u, X) as the collection of all nodes that can be reached from u through one path consisting entirely of live edges, and the polarity of this path is positive. Similarly, R^-(u, X) is denoted as the collection of all nodes that can be reached from u through one path consisting entirely of live edges, and the polarity of this path is negative. $f_{X}^{\pm} (S) = \cup_{u \in S} {R^{+} (u, X) - R^{-} (u, X)}$ .

For a new node u adding to set S, some new live edges with positive or negative polarity will appear. Both positive influence and negative influence of seed node set will increase. There exists a situation where the number of new live edges with positive polarity is bigger than that with negative polarity. In this situation, $f_{X}^{\pm} (S \cup {v}) > f_{X}^{\pm} (S)$ , so $f_{X}^{\pm} (\cdot)$ is not monotone.

Next, we attempt to analyze the submodularity. We denote S and T as two nodes sets, and set S belongs to T, i.e. S ⊆ T. Here, we compare $f_{X}^{\pm} (S \cup {v}) - f_{X}^{\pm} (S)$ with $f_{X}^{\pm} (T \cup {v}) - f_{X}^{\pm} (T)$ .

$f_{X}^{\pm} (S \cup {v}) - f_{X}^{\pm} (S) = f_{X}^{+} (S \cup {v}) - f_{X}^{-} (S \cup {v}) - (f_{X}^{+} (S) - f_{X}^{-} (S)) = f_{X}^{+} (S \cup {v}) - f_{X}^{+} (S) - (f_{X}^{-} (S \cup {v}) - f_{X}^{-} (S))$ .

similarly,

$f_{X}^{\pm} (T \cup {v}) - f_{X}^{\pm} (T) = f_{X}^{+} (T \cup {v}) - f_{X}^{-} (T \cup {v}) - (f_{X}^{+} (T) - f_{X}^{-} (T)) = f_{X}^{+} (T \cup {v}) - f_{X}^{+} (T) - (f_{X}^{-} (T \cup {v}) - f_{X}^{-} (T))$ .

then,

$f_{X}^{\pm} (S \cup {v}) - f_{X}^{\pm} (S) - [f_{X}^{\pm} (T \cup {v}) - f_{X}^{\pm} (T)] = [f_{X}^{+} (S \cup {v}) - f_{X}^{+} (S) - (f_{X}^{+} (T \cup {v}) - f_{X}^{+} (T))] - [f_{X}^{-} (S \cup {v}) - f_{X}^{-} (S) - (f_{X}^{-} (T \cup {v}) - f_{X}^{-} (T))]$

Because, the influence function f⁺(·) and f^-(·) have been proved to be submodulairty [21]. So the value of the former part of the above equation is greater than 0, and the latter of the former part of the above equation is also greater than 0. There may exist a situation where the latter one is smaller than former one. In the situation, $f_{X}^{\pm} (S \cup {v}) - f_{X}^{\pm} (S) \leq f_{X}^{\pm} (T \cup {v}) - f_{X}^{\pm} (T)$ . Therefore, the function $f_{X}^{\pm} (\cdot)$ is not submodulairty.

5.2 Restricted Greedy Solution for NPIM problem

Due to the non-monotonicity and non-submodularity of the net positive influence function f^±(·) under IC-P model, the traditional greedy algorithm [17] becomes inapplicable. Lu and Lakshmanan [25] developed a U-Greedy algorithm for solving profit maximization problems which are non-monotonicity and non-submodularity. Furthermore, to solve the problem of Influence Maximization with Novelty Decay, Feng et al. [13] proposed a restricted greedy (R-Greedy) algorithm which improves the efficiency of the U-Greedy algorithm. Here, we resort to R-Greedy algorithm to solve the NPIM problem in this paper. The improved R-Greedy is named as NPIM R-Greedy algorithm, whose details are presented in Algorithm 1.

Algorithm 1 NPIM R-Greedy algorithm
1: Input: Given a signed social network G and the size of node set K;
2: Output: the node set S with maximum net positive influence;
3: For v = 1 to \|V\| do
4: calculate f^±(v) and insert (v, f^±(v)) into Q₀;
5: End for
6: For k = 1 to K do
7: maxMargin← - ∞;
8: For u ∈ V \ S_k-1, f^±(u) ≥ maxMargin do
9: If u ∈ Q_k-1 and $\inf_{k - 1}^{u} + f^{\pm} ({s + k - 1}) - f^{\pm} (S_{k - 1}) < maxMargin$ then
10: Continue;
11: Else
12: Calculate $\inf_{k}^{u}$ and insert (u, $\inf_{k}^{u}$ ) into Q_k;
13: If $\inf_{k}^{u} - f^{\pm} (S_{k - 1}) > maxMargin$ then
14: $maxMargin \leftarrow \inf_{k}^{u} - f^{\pm} (S_{k - 1})$ ;
15: s_k ← u;
16: End if
17: End if
18: End for
19: S_k ← S_k-1 ∪ {s_k};
20: f^±(S_k) ← f^±(S_k-1 + maxMargin);
21: End For
22: return S

The core idea of the NPIM R-Greedy algorithm is to select the first K nodes with maximal marginal influence, and then choose the seed nodes that have the maximum influence propagation. Algorithm 1 illustrates the NPIM R-Greedy algorithm, where S_k represents the collection of selected seeds. s_k represents the single seed node selected in the Round k. We add node u to the selected seed set S_k-1, the net positive influence of the new obtained set is $\inf_{k}^{u}$ . NPIM R-Greedy algorithm is able to optimize the seed nodes through dynamic pruning technology that deletes the nodes whose influence is smaller than maxMargin present in Line 8. For a node that has been checked in the round (k - 1), the upper bound of its marginal net positive influence is $\inf_{k - 1}^{u} + f^{\pm} ({s + k - 1}) - f^{\pm} (S_{k - 1})$ . If the maximum margin is larger than the upper bound, it also means that the node will be ignored (Line 9).

6 Experiments

6.1 Experimental setup

Datasets. Epinions and Slashdot are two famous online signed social networks. We choose these two networks as our experiment dataset sources. Specifically, we download Epinions and Slashdot datasets from large standard network datasets collection (http://snap.stanford.edu/). In both two signed networks, positive polarity or negative polarity is tagged on each relationship between users directly. Two signed social networks are modeled as two signed graphs.

Epinions. It is a popular web site for product review. Users in Epinions can label trust (positive) or distrust (negative) to other users. This dataset contains 11567 users and 93204 relationships with polarities.

Slashdot. This is a popular technical news site. Users in Slashdot can mark other users as friend (positive) or foe (negative). This dataset contains 10966 users and 44356 relationships with polarities.

Generating influence probabilities. Because the interactive information used to calculate the influence diffusion probability A_u,v of each link (u, v) in the graph G is not available, so we make use of two popular models [8, 21] to generate diffusion probabilities for edges in the graph in our experiments.

Weighted Cascade (WC) model. This model mainly makes use of the in-degree of node to generate diffusion probabilities. For an edge (u, v), the diffusion probability of the edge A_u,v is equal to 1/d(v), and d(v) is the in-degree of v.

Uniformly (UN) model. All edges in the graph are uniformly assigned a same probability value, which is set to be 0.01 in this paper.

Comparison methods. We compare our method (NPIM R-Greedy) with IC-P Greedy, IC Greedy, Positive Out-Degree and Out-Degree methods. We list the methods compared in our experiments as following:

NPIM R-Greedy. This is our method presented in Section 5.

IC-P Greedy. This method makes use of the greedy algorithm with the lazy-forward optimization to solve the PIM problem under the IC-P model [21].

IC Greedy. This method does not consider the relationship polarities and treat signed social networks as unsigned ones. In these unsigned social networks, we make use of the original greedy algorithm with the CELF optimization [17] to obtain the seed node set of size k owning biggest non-polar influence under the standard IC model.

Positive Out-Degree. This method selects top k nodes based on positive out degrees, which is also used in [21].

Out-Degree. This is a popular baseline method that selects top k nodes based on out degrees, which is also used in [7 , 21].

In order to get the net positive influence of the seed node sets chosen by the above compared algorithms, we simulate up to 20000 times for each selected seed node set using the IC-P diffusion model in the signed graph. Finally, we calculate the average value of 20000 simulations for each seed node set. We set the value of the selected number k as 50 in all algorithms. Then, we compare the net positive effects of all selected seed node sets in the range from 1 to 50. These simulations are performed on a desktop computer, configured with 8G memory and Intel four-core processors, Quad-Core Intel Core(TM)i7 870 (2.93GHz).

6.2 Experiment Result

Net positive influence spread. Figure 1 presents the performance concerning the NPIM problem, applying five different algorithms (NPIM R-Greedy, IC-P Greedy, IC Greedy, Positive Out-Degree and Out-Degree) with two types of diffusion probability (WC model, UN model) on the Epinions dataset. From the results shown in Fig. 3, we can find that the performance of our proposed method is optimal, while IC-P Greedy performs better than IC Greedy and Out-Degree. The bad performance of IC Greedy fully indicates that the necessary of influence maximization research taking into account polarities of relationships between users. Compared to IC-P Greedy, IC Greedy, Positive Out-Degree, and Out-Degree methods, our method is 1.0%, 26.5%, 15.1% and 35.7% better with WC model, and is 25.9%, 66.3%, 9.1% and 24.2% better with UN model, when the seed set size k is 50.

Fig. 1

Net positive influence spread performance of different methods on Epinions dataset for PIM problem.

Figure 2 presents the performance concerning the NPIM problem, applying five different algorithms with two types of diffusion probability on the Slashdot dataset. Similar with the results on the Epinions, our proposed method shows the best performance. Compared to IC-P Greedy, IC Greedy, Positive Out-Degree, and Out-Degree methods, our method is 8.3%, 49.6%, 14.1% and 43.7% better with WC model, and is 5.5%, 17.9%, 3.9% and 26.3% better with UN model, when the size of node set is 50.

Fig. 2

Net positive influence spread performance of different methods on Slashdot dataset for PIM problem.

Positive influence spread. To explain the necessity of the NPIM study, we also compare five methods in terms of positive influence spread. Figures 3 and 4 show the positive influence of five compared methods with two kinds of diffusion probability on Epinions and Slashdot datasets, respectively. From Figs. 3 and 4, we can see that the performance of our NPIM R-Greedy has great fluctuations. It achieves best performance on Epinions dataset with WC model, but performs worst on Slashdot dataset with UN model.

Fig. 3

Positive influence spread performance of different methods on Epinions dataset.

Fig. 4

Positive influence spread performance of different methods on Slashdot dataset.

Combining all experiment results shown by Figs. 1–4, we can obtain two viewpoints: (1) The relationship polarities in signed social network should be considered in the influence maximization problems. Ignoring relationship polarities inevitably leads to wrong estimation of influence. (2) Net positive influence is different from positive influence, the method obtaining the seed set with maximum positive influence can not take the biggest net positive influence. Based on the above viewpoints, we can conclude that our proposed NPIM study is necessity and indispensable.

Running time. Table 1 presents the running time of five comparison methods with two types of different propagation probabilities on the Epinions and Slashdot datasets. From Table 1, we find that Out-Degree and Positive Out-Degree need a similar amount of running time, which are the fastest among five methods. However, their performance are worse than NPIM R-Greedy in terms of net positive influence diffusion. About the other three methods, our NPIM R-Greedy is faster than IC-P Greedy and IC Greedy methods. The reason of why NPIM R-Greedy needs less time is that our method prunes low-influential nodes in a dynamic way. The number of calculate nodes in every round is much smaller than the size of the network.

Table 1

Running time of different methods on two datasets (Sec).

	Epinions Dataset		Slashdot Dataset
	WC model	UN model	WC model	UN model
NPIM R-Greedy	44791.577	8631.462	9871.951	6670.116
IC-P Greedy	153756.963	40569.846	50414.312	76600.241
IC Greedy	157001.991	29200.425	23200.413	22300.228
Positive Out-Degree	2.976	2.976	1.792	1.792
Out-Degree	2.381	2.381	1.146	1.146

From above comparison results, we find that our NPIM R-Greedy can obtain better performance than IC-P Greedy and IC Greedy methods in terms of net positive influence diffusion in much less running time. Therefore, we can conclude that our proposed NPIM R-Greedy is the most suitable solution for the NPIM problem. As the NPIM problem is closer to practical applications than PIM or IM problem, so this paper provides a better solution for viral marketing and opinions promotion.

7 Conclusions and future work

Influence maximization in signed social networks is a critical problem, which is further studied in this paper. First, we introduced the concept of net positive influence which is equal to positive influence subtracting negative influence, and then we proposed the net positive influence maximization problem. Next, we proved that the objective function of the NPIM problem under the IC-P diffusion model is non-monotone and non-submodular. Thus, an improved R-Greedy method is developed to solve the NPIM problem. Finally, extensive experiments on two large signed social networks demonstrate the superiority of our solution in terms of promoting net positive influence diffusion.

Several challenges and future directions remain. The influence diffusion is a temporal dynamic process. How to combine time factor and polarity property together for solving temporal polarity-related influence maximization problem will be an interesting problem. Another future direction is to conduct more new diffusion models for signed social networks, such as the extensions of the epidemic model and the game theory model.

Footnotes

Acknowledgments

This work is funded by the National Natural Science Foundation of China (No. 61702138 and No. 61602128 and No. 61672322 and No. 61672185), and the Shandong Province Natural Science Foundation of China (No. ZR2016FQ13), and the China Postdoctoral Science Foundation (No. 2019M662360, No.2017M621275 and No. 2018T110301), and Young Scholars Program of Shandong University, Weihai (No. 1050501318006), and Science and Technology Development Plan of Weihai City (No. 1050413421912).

References

Suman Banerjee, Mamata Jenamani and Dilip Kumar Pratihar, Combim: A community-based solution approach for the budgeted influence maximization problem, Expert Systems with Applications 125 (2019), 1–13.

Allan Borodin, Yuval Filmus and Joel Oren, Threshold models for competitive influence in social networks. In International Workshop on Internet and Network Economics. pages 539–550. Springer, 2010.

Ceren Budak, Divyakant Agrawal and Amr El Abbadi, Limiting thespread of misinformation in social networks, In Proceedings of the 20th international conference on World wide web, pages665–674. ACM, 2011.

Shuo Chen, Ju Fan, et al. Online topic-aware influence maximization, Proceedings of the VLDB Endowment 8(6) (2015), 666–677.

Wei Chen, Alex Collins, Rachel Cummings, Te Ke, Zhenming Liu, DavidRincon, Xiaorui Sun, Yajun Wang, Wei Wei and Yifei Yuan, Influence maximization in social networks when negative opinions may emerge and propagate, In SDM, volume 11, pages 379–390. SIAM, 2011.

Wei Chen, Wei Lu and Ning Zhang, Time-critical influence maximization in social networks with time-delayed diffusion process, In 26th AAAI Conference on Artificial Intelligence, pages592–598, 2012.

Wei Chen, Chi Wang and Yajun Wang, Scalable influence maximization for prevalent viral marketing in large-scale social networks, In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1029–1038. ACM, 2010.

Wei Chen, Yajun Wang, and Siyu Yang, Efficient influence maximization in social networks, In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 199–208. ACM, 2009.

Suqi Cheng, Huawei Shen, and et al. Staticgreedy: solving the scalability-accuracy dilemma in influence maximization, In Proceedings of the 22nd ACM international conference on Information & Knowledge Management, pages 509–518. ACM, 2013.

10.

Suqi Cheng, Huawei Shen, Junming Huang, Wei Chen and Xueqi Cheng, Imrank: influence maximization via finding self-consistent ranking, In Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval, pages 475–484. ACM, 2014.

11.

Jingyi Ding, Wenjing Sun, Jianshe Wu and Yuwei Guo, Influence maximization based on the realistic independent cascade model, Knowledge-Based Systems 191 (2020), 105265.

12.

Pedro Domingos and Matt Richardson, Mining the network value of customers, In Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, pages 57–66, 2001.

13.

Shanshan Feng, Xuefeng Chen, Gao Cong, Yifeng Zeng, Yeow Meng Chee and Yanping Xiang, Influence maximization with novelty decay in social networks, In Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014.

14.

Xinran He, Guojie Song, Wei Chen and Qingye Jiang, Influence blocking maximization in social networks under the competitive linear threshold model, In SDM, pages 463–474. SIAM, 2012.

15.

Kyomin Jung, Wooram Heo and Wei Chen, Irie: Scalable and robust influence maximization in social networks, . pages 918–923, 2012.

16.

Mohammad Mehdi Keikha, Maseud Rahgozar, Masoud Asadpour and Mohammad Faghih Abdollahi, Influence maximization across heterogeneous interconnected networks based on deep learning, Expert Systems with Applications 140 (2020), 112905.

17.

David Kempe, Jon Kleinberg, and Éva Tardos, Maximizing the spread of influence through a social network, In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 137–146. ACM, 2003.

18.

Jure Leskovec, Andreas Krause et al. Cost-effective outbreak detection in networks, In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 420–429. ACM, 2007.

19.

Dong Li and Jiming Liu, Modeling influence diffusion over signed social networks, IEEE Transactions on Knowledge and Data Engineering, 2019.

20.

Dong Li, Wei Wang and Jiming Liu, Grassroots vs elites: Which ones are better candidates for influence maximization in social networks? Neurocomputing 358 (2019), 321–331.

21.

Dong Li, Zhi-Ming Xu, Nilanjan Chakraborty, et al. Polarity related influence maximization in signed social networks, PloS one 9(7) (2014), e102199.

22.

Guoliang Li, Shuo Chen, Jianhua Feng, Kian-lee Tan and Wen-syan Li, Efficient location-aware influence maximization, Proceedings of the 2014 ACM SIGMOD international conference on Management of data, pages 87–98, 2014.

23.

Xiang Li, J David Smith, Thang N Dinh and My T Thai, Tiptop:(almost) exact solutions for influence maximization in billion-scale networks, IEEE/ACM Transactions on Networking 27(2) (2019), 649–661.

24.

Bo Liu, Gao Cong et al. Time constrained influence maximization in social networks, In 2012 IEEE 12th International Conference on Data Mining, pages 439–448. IEEE, 2012.

25.

Wei Lu and Laks VS Lakshmanan, Profit maximization over social networks, 2012 IEEE 12th International Conference on Data Mining, pages 479–488. IEEE, 2012.

26.

Duy-Linh Nguyen, Tri-Hai Nguyen, Trong-Hop Do and Myungsik Yoo, Probability-based multi-hop diffusion method for influence maximization in social networks, Wireless Personal Communications 93(4) (2017), 903–916.

27.

Han-Ching Ou, Chung-Kuang Chou and Ming-Syan Chen, Influence maximization for complementary goods: Why parties fail to cooperate? Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pages 1713–1722, 2016.

28.

Nishith Pathak, Arindam Banerjee and Jaideep Srivastava, A generalized linear threshold model for multiple cascades, 2010 IEEE International Conference on Data Mining, pages 965–970. IEEE, 2010.

29.

Binghui Peng and Wei Chen, Adaptive influence maximization with myopic feedback, Advances in Neural Information Processing Systems, pages 5574–5583, 2019.

30.

Yadong Qin, Jun Ma and Shuai Gao, Efficient influence maximization under tscm: a suitable diffusion model in online social networks, Soft Computing 21(4) (2017), 827–838.

31.

Yadong Qin, Jun Ma and Shuai Gao, Efficient influence maximization under tscm: a suitable diffusion model in online social networks, Soft Computing 21(4) (2017), 827–838.

32.

Chengguang Shen, Ryo Nishide, Ian Piumarta, Hideyuki Takada and Wenxin Liang, Influence maximization in signed social networks, In International Conference on Web Information Systems Engineering, pages 399–414. Springer, 2015.

33.

Chonggang Song, Wynne Hsu and Mong Li Lee, Targeted influence maximization in social networks, Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pages 1683–1692, 2016.

34.

Guan Wang, Qingbo Hu and Philip S Yu, Influence and similarity on heterogeneous networks. pages 1462–1466, 2012.

35.

Huanhuan Wang, Qun Yang, Lei Fang and Weihua Lei, Maximizing positive influence in signed social networks, In International Conference on Cloud Computing and Security, pages 356–367. Springer, 2015.

36.

Wenjun Wang and W Nick Street, Modeling and maximizing influence diffusion in social networks for viral marketing, Applied Network Science 3(1):6 (2018).

37.

Qingyun Wu, Zhige Li, Huazheng Wang, Wei Chen and Hongning Wang, Factorization bandits for online influence maximization, In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 636–646, 2019.

38.

Shuang-Hong Yang, Alexander J Smola, Bo Long, Hongyuan Zha and Yi Chang, Friend or frenemy? predicting signed ties in social networks, In Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval, pages 555–564, 2012.

39.

Yu Yang, Xiangbo Mao, Jian Pei and Xiaofei He, Continuous influence maximization, ACM Transactions on Knowledge Discovery from Data (TKDD) 14(3) (2020), 1–38.

40.

Xiaoyan Yin, Xiao Hu, Yanjiao Chen, Xu Yuan and Baochun Li, Signed-pagerank: An efficient influence maximization framework for signed social networks, IEEE Transactions on Knowledge and Data Engineering, 2019.

41.

Zhi-yun Zheng, Yuan Fu, Lun Li, Dun Li and Zhen-fei Wang, Influence algorithm based on center weighted link in social networks, Computer Engineering and Design (1): (2017), 21.

42.

Jinghua Zhu, Yong Liu and Xuming Yin, A new structure-hole-based algorithm for influence maximization in large online social networks, IEEE Access 5 (2017), 23405–23412.

43.

X Zhu, X Zhao, and M Liu, Analysis of influential node based on community structure, Application Research of Computers 34(9) (2017), 2582–2585.