Abstract
In the real world, a large number of social systems can be modeled as signed social networks including both positive and negative relationships. Influence maximization in signed social networks is an interesting and significant research direction, which has gained some attention. All of existing studies mainly focused on positive influence maximization (PIM) problem. The goal of the PIM problem is to select the seed set with maximum positive influence in signed social networks. However, the selected seed set with maximum positive influence may also has a large amount of negative influence, which will cause bad effects in the real applications. Therefore, maximizing purely positive influence is not the final and best goal in signed social networks. In this paper, we introduce the concept of net positive influence and propose the net positive influence maximization (NPIM) problem for signed social networks, to select the seed set with as much positive influence as possible and as less negative influence as possible. Additionally, we prove that the objective function of NPIM problem under polarity-related independent cascade model is non-monotone and non-submodular, which means the traditional greedy algorithm is not applicable to the NPIM problem. Thus, we propose an improved R-Greedy algorithm to solve the NPIM problem. Extensive experiments on two Epinions and Slashdot datasets indicate the differences between positive influence and net positive influence, and also demonstrate that our proposed solution performs better than the state-of-the-art methods in terms of promoting net positive influence diffusion in less running time.
Introduction
In recent years, Facebook, Twitter and other social networks have achieved greatly. Large-scale and diverse data provided by these online social networks promote the social computing research, in which influence maximization (IM) is a hot research topic. IM is the problem of choosing a subset of influential users named as seed node set, such that the propagation of influence could be maximized if we adopt the seed node set to spread the influence or information in social networks.
Kempe et al. [17] firstly formulated the influence maximization as a discrete stochastic optimization problem. They established that the optimization problem is NP-hard under Independent Cascade (IC) model and Linear Threshold (LT) model, and presented a greedy algorithm to solve the problem. Then for solving the scalability shortage of the solution of Kempe et al., improved greedy algorithms, scalable heuristics and a martingale approach were proposed in [7–10, 18]. Furthermore, researchers also extended influence maximization problems from different perspectives, such as time-critical influence maximization [6, 24], competitive or comparative influence maximization [2, 28], topic-aware influence maximization [4], influence maximization on the heterogeneous networks [34], and so on.
All the above studies were conducted in unsigned social networks which ignored the polarities of relationships between users. Actually, however, the relationships own the property of the polarity, which can divided into positive relationships (e.g., friend or like) and negative relationships (e.g., enemy or dislike). The social networks which contain not only positive relationships but also negative relationships are named as signed social networks. In the studies of influence maximization, ignoring the relationship polarities between users and treating signed social networks as unsigned ones roughly lead to overestimation of positive influence which will cause bad effect in practical applications finally. Influence maximization in signed social networks is a critical problem that remains pretty much open.
Aiming at signed social networks, few studies [13, 35] proposed positive influence maximization (PIM) problem and developed several solutions. The goal of the PIM problem is to select the seed node set with maximum positive influence in signed social networks. However, maximizing purely positive influence may be not the final and best goal in signed social networks. Because, the seed set with maximum positive influence selected by methods in [13, 35] may also has a large amount of negative influence. When we adopt the selected seed set with big positive influence and big negative influence to promote a product, it will cause the situation that many users like the product but plenty of other users dislike this product at the same time. This situation is not the results we expect. Therefore, in signed social networks, how to select the seed node set with as much positive influence as possible and as less negative influence as possible may be a better goal. To this end, we propose net positive influence and the corresponding problem of net positive influence maximization (NPIM). Net positive influence considers positive influence and negative influence simultaneously. Maximizing net positive influence means maximizing positive influence and minimizing negative influence at the same time. So, our proposed net positive influence is helpful to obtain better performance in the practical applications. Specifically, we make the following contributions in this paper: We introduce the concept of net positive influence and propose the net positive influence maximization problem in signed social networks. We prove that the influence function of NPIM problem under the polarity-related independent cascade (IC-P) diffusion model [38] is non-monotone and non-submodular, which means that the classic greedy algorithm is inapplicable to this problem. In this paper, an improved R-Greedy algorithm named as NPIM R-Greedy is developed to solve the NPIM problem. The experiments are carried out on Epinions and Slashdot datasets. The comparison experimental results with closely related our study verify the superiority of our method.
The rest of this paper is organized as follows: In Section 1, related studies are reviewed. In Section 2, problem statement is presented. In Section 3, the Polarity-related independent cascade diffusion model is introduced. In Section 4, the influence function of NPIM under IC-P model is proved that it is both non-submodular and non-monotonic and then the NPIM R-Greedy algorithm is presented. In Section 5, the results from experiments prove that our method is effective. Finally, the conclusion and research direction is put forward in Section 6.
Related work
Influence maximization as an algorithmic technique for viral marketing was first proposed by Domingos and Richardson [12]. Kempe et al. [17] used discrete optimization to express it. They proved that the problem belongs to NP-hard under IC and LH models and also presented a greedy-approximation solution. To address the inefficiency issue of the greedy method, Leskovec et al. [18] proposed a seed nodes seeking method named as “Cost-Effective Lazy Forward” (CELF), which fully makes use of the sub modularity property of the objective function. Chen et al. proposed two heuristic methods, DegreeDiscountIC [8] and prefix excluding maximum influence arborescence (PMIA) [7], which leverage local arborescence structures of each node to approximate the influence propagation. Jung et al. [15] proposed the IRIE algorithm integrating the advantages of influence estimation (IE) and influence ranking (IR) methods, which runs much faster than the PMIA method. Cheng et al. [9] presented a static greedy algorithm that can strictly guarantee the submodularity of influence function in the process of the seed selection. They also [10] proposed an iterative ranking framework (IMRank) to solve the influence maximization problem.
Budak et al. [3] attempted to address the influence limitation problem where a bad campaign starts to propagate from a certain user in the network and limiting campaigns are released to restrict the impact of misinformation. Similarly, under the competitive linear threshold (CLT) model, He et al. [14] studied competitive influence diffusion in social networks. To simulate multiple cascades over a network, Pathak et al. [28] proposed a generalized version of the linear threshold model which allows nodes to switch states between different cascades. Borodin et al. [2] developed several natural extensions of the linear threshold model for the multiple-technology case. Chen et al. [5] extended the independent cascade model for incorporating the emergence and propagation of negative opinions. Furthermore, Ou et al. [27] took into account the fact that products are not purely competitive, and studied influence maximization for complementary goods. Liu [24] and Chen et al. [6] considered the time factor of influence diffusion, and explored the time constrained influence maximization problem. Li et al. [22] studied the location-aware influence maximization problem and proposed the extend maximum influence arborescence (MIA) algorithm to find the node set with maximum influence within a specific geographical region. Song et al. [3] developed the problem of targeted influence maximization that considers both time and location factors of influence spread. Wang et al. [34] proposed the problem of influence maximization on the heterogeneous networks. Chen et al. [4] studied topic-aware influence maximization problem.
Zheng et al. [41] proposed a hybrid algorithm based on centrality weighted link strength based on the idea of central heuristic. Zhu et al. [43] proposed a node influence recognition method based on community structure and k-shell node method. Nguyen et al. [26] developed a network model with non-uniform propagation probability between nodes. Qin et al. [30] proved that IC model could not accurately simulate the information diffusion structure in real networks, and proposed a three-step cascade model (TSCM) to simulate the information diffusion process in online social networks. Yang et al. [31] introduced the thermal diffusion model into the research on the maximization of personalized influence and proposed the target thermal greed algorithm. Wang et al. [36] proposed a new multipath asynchronous threshold model that quantifies the effects and tracks their diffusion and aggregation. Aiming at the problem of small influence scope caused by selecting only local optimal nodes, Zhu et al. [42] proposed a maximization algorithm based on structure holes and degree discount, taking into account the propagation advantages of core nodes and structure hole nodes comprehensively.
Li et al. [20] attempted to seek grassroots as seeds instead of traditional elites, and constructed comprehensive experiments to demonstrate the effectiveness of their solution. Yang et al. [39] formulated the general continuous influence maximization problem, and developed a general coordinate descent algorithmic framework to solve it. Keikha et al. [16] leveraged deep learning techniques for IM problem, which considers both local and global structure during network feature. Li et al. [23] studied the cost-aware target viral marketing (CTVM) problem aiming to seek the most cost-effective users to influence the most relevant users. Peng et al. [29] explored the problem of adaptive influence maximization with myopic feedback under the classic independent cascade model. Wu et al. [37] studied the problem of online influence maximization in social networks, which capitalizes on an important property named network assortativity. Banerjee et al. [1] proposed a community-based solution approach for solving the budgeted influence maximization problem. Ding et al. [11] proposed the realistic independent cascade (RIC) model for IM problem, which considers the acceptance probability of candidate seed nodes in social networks.
Although influence maximization has attracted tremendous attentions, few existing studies focused on influence maximization in signed social networks. Li et al. [21] proposed the polarity-related influence maximization (PRIM) problem in signed social networks, which contains positive influence problem and negative influence problem. They extended the standard Independent Cascade (IC) model to the signed social networks, and proved that the influence function of the PRIM problem under the extended IC model is monotonic and sub-modular. Then they used a greedy algorithm to solve the PRIM problem which can approximate the optimum solution within a factor of 1 - 1/e. Following the mind of [21], Wang [35], Shen et al. [32] extended Linear Threshold model to signed social networks for positive influence maximization. Shi et al. [13] studied the problem of selecting the optimal individual subset to disseminate the positive influence when time is bounded. They proved that such a problem is NP-hard, and proposed a heuristic algorithm based on greedy strategy. Li et al. [19] proposed the Polarity-related Linear Influence Diffusion (PLID) model, which can quickly and accurately estimate polarity-related influence of user sets without Monte-Carlo simulations. Yin et al. [40] proposed a novel Signed-PageRank (SPR) method, to study the problem of influence maximization for advertisement recommendation in signed social networks.
From the above review about influence maximization problem, we can find that researchers have noticed the importance of relationships polarities to the influence maximization. Existing studies extended influence maximization problem from unsigned social networks to signed networks, to seek the node set with maximum positive influence or negative influence. Different from all of them, this paper proposed the concept of net positive influence, and aims to find the seed set owing the biggest net positive influence. Compared with the existing studies, our research is closer to reality and is able to perform better in some application scenarios.
Problem statement
In signed social networks, existing influence maximization researches only considered how to seek the seed set owing maximum positive influence, however, ignored the fact that the seed set with lager positive influence may also have lager negative influence. Aiming at this drawback, we first propose the concept of net positive influence which is positive influence subtracting negative influence. Based on this concept, we introduce the Net Positive Influence Maximization (NPIM) problem, to seek the seed set with as much positive influence as possible and as less negative influence as possible. In some real application scenarios, net positive influence maximization will be more valuable than positive influence maximization. Next, we give the details of the NPIM problem definition.
We denote f+(·) as the positive influence function. Given a node set S, f+(S) returns the expected number of nodes activated to be positive state by S, which is considered as the positive influence of S. Similarly, we denote f-(·) as the the negative influence, and f-(S) returns the expected number of nodes activated to be negative state by S which is considered as the negative influence. Then, we define f±(·) as the net positive influence, i.e., for the seed set S, f±(S) = f+(S) - f-(S). How to calculate the net positive influence of seed sets is a key problem. In this paper, we adopt Polarity-related Independent Cascade (named IC-P) diffusion model [21] to estimate the values of f+(S) and f-(S), which is an extension of Independent Cascade model in signed social networks.
The definition of the NPIM problem is as following: a signed social network G, a information diffusion model as well as a non-negative number k are given, the goal is to seek a set S containing k seed nodes such that the expect net positive influence f±(S) is maximized. This problem can be formalized as,
In our NPIM problem, we design all nodes in the seed set as positive state or to be 100% influence positively. This design is decided by the specific application scenarios of the proposed NPIM problem. Here, we take a typical application(i.e. viral marketing) of the NPIM problem as example to explain the above assumptions. In viral marketing, the application of NPIM is primarily intended to identify the seed set owing the greatest net positive influence to promote some products over signed social networks. Under this scenario, how to design the initial seed set generally has two options. In the first option, the node set only contains positive nodes. In the second option, the node set contains not only positive nodes, but also negative nodes. If the second option is chosen, it implies that one company will look for some persons and pay them to post negative news related to the products. Obversely, this is not reasonable. Therefore, the latter option is not suitable to be applied in viral marketing. In the NPIM problem defined by us, although all the seed nodes are set to be positive, the negative relationships that exist in the signed social networks will produce negative information or influence [21].
In this paper, we adopt polarity related independent cascade model (IC-P) [21] to estimate the polarity influence of candidate seed sets. Here, we first introduce how to model a signed social network, and then present the details of IC-P model.
In this paper, we model a signed social network as a directed, weighted and signed graph G =(V, E, A, P). V is the set of nodes which correspond to users in the social network (“user” and “node” are alternative in this paper). E is the set of edges which correspond to relationships between users. A is a nonnegative adjacency matrix whose element Au,v represents the diffusion probability of edge (u, v) in the graph. If and only if the edge (u, v) ∈ E, the value of Au,v will be a positive value, otherwise zero. P is another matrix where each element Pu,v represents the sign of edge (u, v) in the graph. The value of Pu,v has three options {1, - 1, 0} which are corresponding to positive relationship, negative relationship and no relationship.
The IC-P model is an extension of the IC model in signed social networks that fuses the polarity of relations among users. In the IC-P model, the states of every node have three options: positive state, negative state and inactive state. For a user u, positive state means that this user likes or supports the disseminating information and s/he will continue to spread the information with his/her positive opinions. Negative state means that user u dislikes or opposes the disseminating information and s/he will continue to spread the information with his/her negative opinions. Inactive state means that u does not adopt the disseminating information, and also will not continue to spread the information. We denote S(u) as the state of node u, whose values 1,-1 and 0 represent the positive state, negative state, and inactive state of u, respectively.
As Li et al. introduced in [21], “in the IC-P model, the propagation process begins with an initial node set S where the state of each node can be positive or negative, and all of other nodes not belonging to S are inactive in the network. Here, we first define
For a newly activated node v, its status S(v) depends on the polarity state of the node u that activated v and the relationship polarity between two users, meaning S(v) = S(u) × Pu,v. Therefore, if the polarity of node u is positive, and the relationship between u and v is positive, then the polarity of node v will be positive. if the polarity of node u is positive, and the relationship between u and v is negative, then the polarity of node v will be negative. If the polarity of the node u is negative, then the relationship between u and v is negative, and the polarity of the node v will be positive. If the polarity of the node u is negative, and the relationship between u and v is positive, then the polarity of the node v will be negative. We can see that, the social principles “the friend’s friend is friend, the friend’s enemy is enemy, the enemy’s enemy is friend and the enemy’s friend is enemy” has been integrated into the IC-P model.
Property and Solution for NPIM Problem
Property of NPIM problem
For proving the Theorem 1, we first present what are monotonicity and submodularity. Given influence function f±(·) and node set S, T, if f±(S) ≤ f±(T) whenever S ⊆ T, then f±(·) is monotone. f±(·) owns the submodularity if it satisfies a natural “diminishing returns” property: f±(S ∪ {v}) - f±(S) ≥ f±(T ∪ {v}) - f±(T), for all nodes v and all pairs of sets S ⊆ T, i.e., the marginal gain added from one node to S set should be the same as or higher than the marginal gain added from the same node to superset S.
To prove the monotonicity and submodularity of non-polarity influence function, Kempe et al. [15] developed a live-edge process that is equivalent to the propagation process. In this paper, we adopt a similar method to prove the Theorem 1.
In the live-edge process, an event will happen that a new activated node u will attempt to activate the neighbor node v with the probability of Au,v, which seems to toss a double-side coin with Au,v. In the view of this process, it’s nothing about that the coin tossing begins from the start of the whole process or the activation of time v. A coin tossing stands for the successful activation, the edges of such an event will become a candidate-live that if the starting node of the directional edge is activated, the close nodes may also be activated. If a node has more than one candidate-live edge, then we will randomly select one of them as the live edge, meanwhile the other candidate-live edges will be blocked.
Once the result of the coin tossing have been determined, we will select the live edge for each node and initialize all nodes in the seed set S as live, which will also allow us to clarify all positive nodes set at the end of the cascading process:
If there is only one path can reach the x formed by all live edges from one node in S, and the polarity of this path will be positive, node x will change to be positive. We define the path as follows, the live path is path(n1, n
k
) =(n1, n2, ⋯ , n
k
) with the range from n1 to n
k
, and the polarity of the path(n1, n
k
) is
For a node v, Li et al. [15] have proved that the probability of v activated to be positive by the diffusion process is the same as the probability of v determined to be positive in the live-edge process, under the IC-P diffusion model. Because, we choose the IC-P model to solve the NPIM problem. So, here we can adopt directly the live-edge process to prove the Theorem 1.
For a new node u adding to set S, some new live edges with positive or negative polarity will appear. Both positive influence and negative influence of seed node set will increase. There exists a situation where the number of new live edges with positive polarity is bigger than that with negative polarity. In this situation,
Next, we attempt to analyze the submodularity. We denote S and T as two nodes sets, and set S belongs to T, i.e. S ⊆ T. Here, we compare
similarly,
then,
Because, the influence function f+(·) and f-(·) have been proved to be submodulairty [21]. So the value of the former part of the above equation is greater than 0, and the latter of the former part of the above equation is also greater than 0. There may exist a situation where the latter one is smaller than former one. In the situation,
Restricted Greedy Solution for NPIM problem
Due to the non-monotonicity and non-submodularity of the net positive influence function f±(·) under IC-P model, the traditional greedy algorithm [17] becomes inapplicable. Lu and Lakshmanan [25] developed a U-Greedy algorithm for solving profit maximization problems which are non-monotonicity and non-submodularity. Furthermore, to solve the problem of Influence Maximization with Novelty Decay, Feng et al. [13] proposed a restricted greedy (R-Greedy) algorithm which improves the efficiency of the U-Greedy algorithm. Here, we resort to R-Greedy algorithm to solve the NPIM problem in this paper. The improved R-Greedy is named as NPIM R-Greedy algorithm, whose details are presented in Algorithm 1.
The core idea of the NPIM R-Greedy algorithm is to select the first K nodes with maximal marginal influence, and then choose the seed nodes that have the maximum influence propagation. Algorithm 1 illustrates the NPIM R-Greedy algorithm, where S
k
represents the collection of selected seeds. s
k
represents the single seed node selected in the Round k. We add node u to the selected seed set Sk-1, the net positive influence of the new obtained set is
Experiments
Experimental setup
In order to get the net positive influence of the seed node sets chosen by the above compared algorithms, we simulate up to 20000 times for each selected seed node set using the IC-P diffusion model in the signed graph. Finally, we calculate the average value of 20000 simulations for each seed node set. We set the value of the selected number k as 50 in all algorithms. Then, we compare the net positive effects of all selected seed node sets in the range from 1 to 50. These simulations are performed on a desktop computer, configured with 8G memory and Intel four-core processors, Quad-Core Intel Core(TM)i7 870 (2.93GHz).
Experiment Result

Net positive influence spread performance of different methods on Epinions dataset for PIM problem.
Figure 2 presents the performance concerning the NPIM problem, applying five different algorithms with two types of diffusion probability on the Slashdot dataset. Similar with the results on the Epinions, our proposed method shows the best performance. Compared to IC-P Greedy, IC Greedy, Positive Out-Degree, and Out-Degree methods, our method is 8.3%, 49.6%, 14.1% and 43.7% better with WC model, and is 5.5%, 17.9%, 3.9% and 26.3% better with UN model, when the size of node set is 50.

Net positive influence spread performance of different methods on Slashdot dataset for PIM problem.

Positive influence spread performance of different methods on Epinions dataset.

Positive influence spread performance of different methods on Slashdot dataset.
Combining all experiment results shown by Figs. 1–4, we can obtain two viewpoints: (1) The relationship polarities in signed social network should be considered in the influence maximization problems. Ignoring relationship polarities inevitably leads to wrong estimation of influence. (2) Net positive influence is different from positive influence, the method obtaining the seed set with maximum positive influence can not take the biggest net positive influence. Based on the above viewpoints, we can conclude that our proposed NPIM study is necessity and indispensable.
Running time of different methods on two datasets (Sec).
From above comparison results, we find that our NPIM R-Greedy can obtain better performance than IC-P Greedy and IC Greedy methods in terms of net positive influence diffusion in much less running time. Therefore, we can conclude that our proposed NPIM R-Greedy is the most suitable solution for the NPIM problem. As the NPIM problem is closer to practical applications than PIM or IM problem, so this paper provides a better solution for viral marketing and opinions promotion.
Influence maximization in signed social networks is a critical problem, which is further studied in this paper. First, we introduced the concept of net positive influence which is equal to positive influence subtracting negative influence, and then we proposed the net positive influence maximization problem. Next, we proved that the objective function of the NPIM problem under the IC-P diffusion model is non-monotone and non-submodular. Thus, an improved R-Greedy method is developed to solve the NPIM problem. Finally, extensive experiments on two large signed social networks demonstrate the superiority of our solution in terms of promoting net positive influence diffusion.
Several challenges and future directions remain. The influence diffusion is a temporal dynamic process. How to combine time factor and polarity property together for solving temporal polarity-related influence maximization problem will be an interesting problem. Another future direction is to conduct more new diffusion models for signed social networks, such as the extensions of the epidemic model and the game theory model.
Footnotes
Acknowledgments
This work is funded by the National Natural Science Foundation of China (No. 61702138 and No. 61602128 and No. 61672322 and No. 61672185), and the Shandong Province Natural Science Foundation of China (No. ZR2016FQ13), and the China Postdoctoral Science Foundation (No. 2019M662360, No.2017M621275 and No. 2018T110301), and Young Scholars Program of Shandong University, Weihai (No. 1050501318006), and Science and Technology Development Plan of Weihai City (No. 1050413421912).
