Abstract
The analysis of communities and their evolutionary behaviors in dynamic social networks is a challenging topic. Although a lot of work on this topic is emerging, there are few methods which can reveal and track meaningful communities over time efficiently. In this paper, an approach of social community evolution based on gravitational relationship and community structure in a dynamic social network is proposed. Our proposed method that concentrates on both community structure and micro-blog content can be used to reveal and track the process of community evolution. First, we set up the undirected dynamic network in the original snapshot. Second, communities are initialized by constructing the basic nodes, and the other nodes can be iteratively searched and divided into corresponding communities. Third, we analyze community evolution and propose DCVIE algorithm to extract community evolution behavior. Finally, The experimental results show that the proposed method performs well on detecting communities as well as tracking communities evolution in dynamic social networks.
Introduction
Nowadays, Social media is still so popular because of its unprecedented convenience to permit human interaction with each other. With widely-available large-scale networks in social media, social activities of people are variable with the habits and hobbies that are slowly changing. This gives arise to the evolution of social network, such as there always exist many new registered users and disappearing users. Kumar et al. [18] exposed a surprising segmentation of these networks to simulate the evolution of social network. Our fundamental task is to study dynamic properties of social networks. The existing dynamic social network research mainly has two aspects from the perspective of analyzing object. One is studying of the type of users in social network [1, 18, 21] and the other is studying of the change of users and relations between users in social network [13, 14]. In the former one, users in social network are classified into three groups: passive, linkers and inviters. In the latter one, there are four adjusting policies specific to the four dynamic events: node adding, node deleting, link adding, link deleting.
Comparing with the existing methods, we have considered the evolution of users’ interest in a community. For example, there is a community
In this paper, we analyze the community evolution through the gravitational relationship and community structure. Inspiring by the dynamic community evolution algorithm (GCEA) [5] that based on gravitational relationship refactoring, we refactor the gravitational relationship by considering the community structure and the interest cohesion. Our method can analyze the community evolution from two aspects: the visible community evolution and the invisible community evolution. The visible community evolution contains four dynamic events: node adding, node deleting, link adding, link deleting, and the invisible community evolution is the changing of the users’ interest.
The community evolution scenarios raise new challenges to traditional community detection algorithms. To deal with the visible community evolution and the invisible community evolution, a higher rationality and a dynamic adaptability are necessary and challenging. Therefore, we propose the DCVIE (the dynamic community visible and invisible evolution algorithm) to analyze the evolution of community. Our main contributions in this paper can be summarized as follows:
The social network in the original snapshot is described as a tuple ( Communities are initialized by constructing the basic nodes. The other nodes can be iteratively searched and divided into corresponding communities. We analyze the community evolution and propose the DCVIE algorithm to track the visible and invisible evolution of the community. We analyze the performance of our method via experiments based on two datasets.
The rest of this paper is organized as follows. In Section 2, we survey the related work. We introduce the dynamic social network and its gravitational relationship in Section 3. In Section 4, we propose the approach, including a community detection algorithm, a dynamic community evolution algorithm and an algorithm for extracting community evolution behavior (DCVIE). We report the experimental results in Section 5. Finally, Section 6 concludes this paper and gives a brief account of future work.
Evolution of a particular social community can be represented as a sequence of events following each other in the successive timeframes within the temporal social network. In other words, the evolution is described by identifying group transformations from time
Four categories of community evolution algorithms
Four categories of community evolution algorithms
First, for the dynamic clustering methods, these methods [10, 20, 24, 26] are mainly based on the EC (Evolutionary Clustering) proposed by Chakrabarti et al. [3], seeing the time snapshot as clustering sample unit, integrated modeling the node distribution in time snapshot
Second, for the objective function optimization methods, these methods are guided by the optimization of community density function or modularity to estimate the community evolution by the structure changes. Blondel et al. [19] combined regional optimizing with hierarchical clustering to apply the FN [9] to dynamic community detection. Nguyen et al. [13, 14] designed four adjusting policies specific to the four dynamic events (node adding, node deleting, link adding, and link deleting). They proposed the Adaptive Finding Overlapping Community [13] and Quick Community Adaption [14] methods, which improved the community density function and modularity respectively. Guo et al. [2] and Zhou et al. [23] proposed the Evolutionary Community Structure Discovery and the multi-objective optimization method for dynamic community detection, respectively. The former’s goal was weighted dynamic community detection, and the latter revised the community assignments of new or changed vertexes during network update.
Third, for the representative node detection methods, Zhang et al. [30] proposed the Bulk Synchronous Parallel (BSP) method, which established the similarity measurement using the number of common neighbors and their connections. Duan et al. [4] proposed the compactness model to evaluate the regional community tightness. Chen et al. [31] proposed the representative-based community evolution tracing method, which made an assessment of the detected communities to determine which was stable. Takaffoli et al. [11] provided an evolutionary analysis method to identify the nodes, which adopts the steady-state and impact models to recognize the stable and representative nodes. Ma and Huang [6] proposed the Community Update and Tracking (CUT) model, which detected the seed community (the clique community).
Forth, for dynamic probability modeling methods, Sarkar and Moore [16] adopted the kernel method to construct the probability relation between nodes, and then utilized the probability relation between nodes and communities in each time segment to construct the Bayesian model between the nodes and latent communities. Sun et al. [7] proposed the Dirichlet Process Mixture (DPM) model which sampled the nodes in each time segment. They constructed the Bayesian model between nodes in the adjacent time segment. Thus, the latent communities can be detected by the LDA method. Xin et al. [22, 29] proposed the ARTs models to adaptively adjust the community structure as the the communitiesâ semantic cohesion changes.
The traditional community evolution methods as discussed above focus on the same point, that is community structure. Most of these methods follow the same framework. First, they detect community by a static community detection algorithm in each timestep. Then, optimal match between two consecutive timesteps is implemented to generate the evolution events. However, in many cases data is not expected to evolve dramatically. Thus it is time-consuming to separately consider the network data for each timestep. Moreover, it is very time-consuming to use static community detect algorithm to find community in each timestep. On the other hand, for social networks, we can analyze the network from content. Because the evolution of the community lies not only in its structure, but also in the nature of the community.
In this article, we use the principle of gravity in physics. First of all, we build the initial social network and use the gravity between users to detect community. Then, based on the evolving sub network, the community is evolved by reanalysis of its gravity. Finally, we will use algorithm DCVIE to track the evolution behavior of the community. In this article, we divide the community evolution behavior into visible and invisible evolution. The visible community evolution contains seven events, which are “Continue” event, “Shrinking” event, “Growing” event, “Splitting” event, “Merging” event, “Dissolve” event and “Form” event. For the “Continue” event, “Shrinking” event and “Growing” event, the changes of community are not large in the consecutive time. In these three visual evolution events, the community may have an invisible evolution event. And we will analyze the invisible evolution event by gravitational relationship.
The dynamic social network
To be able to describe evolution of social communities, we need to introduce the general concept of static social network. For a static network, it consists of nodes and edges, where a node indicates an individual who participates in the network and an edge between two nodes explains a link existing. A dynamic network is composed of a series of continuous static networks. In this article, the social network in the original snapshot is described as a tuple (
Gravitational relationship
In this paper, we analyze the evolution of the community through the gravitational relationship and community structure. Inspired by the dynamic community evolution algorithm (GCEA) [5] that based on gravitational relationship refactoring in dynamic networks, we refactor the gravitational relationship by considering the community structure and the interest cohesion. In the universe, large numbers of heavenly bodies form some stable structures called galaxies under the action of the gravity among them. In [5], Yin et al. tried to use the gravity to measure the close degree between any two nodes in the complex networks. They constructed a gravitational relationship matrix to detect the communities of a static network, and further divided the nodes into communities according to the close degree. In Physics, the formula of universal gravitation is as follow:
Where
If two nodes are connected, the
Inspired by the above formula, we refactor the gravitational relationship by considering the community structure and the interest cohesion.
In our method,
Where the
In order to better measure the gravity between two nodes, we use interest similarity to measure the distance between them. The greater the interest similarity of the two nodes, the smaller the distance between them. Thus, we measure the relationship between distance and interest similarity by Eq. (6).
In this article, we set G to 1, because G is the gravitational constant. Due to the degree of a node can partly reflect the local influence of the node, Yin et al. [5] used the size of degree to represent the mass of the node. We use the idea in [5] to let
Based on the above analysis, we can establish a weighted social network which takes the users as the nodes and the users’ relationships as the edge. The gravity between every two users in the social network can also be obtained.
In physics, two spheres attracts each other by gravity. A node with larger
The flow chart of the method we proposed.
Community detection aims at grouping nodes in accordance with the relationships among them to form strongly linked sub network from the entire social network. Different from the existing community detection algorithms, we use the Bnodes to attract other nodes to detect the community.
The greater the mass of a sphere, the more attractive it is in physics. Thus, we think of nodes with larger
After getting the initial community based on the Bnodes, we will detect community for other nodes by gravity. A node (not a basic node) would be drawn into a community directly if it only attracted by the community. However, if a node (not a basic node) is attracted by two or more communities simultaneously, it would belong to the community with the greatest attraction to it. Here, we define gravity of a node and a community as the average value of gravity between the node and other nodes in the community.
When we go through all nodes which not belong to the Bnodes, we get the communities of the social network. The method of community detection as shown in Algorithm 4.1. At step 1, we find all the basic nodes in the social network. At steps 2–8, the nodes that belong to Bnodes will be partitioned into the corresponding community. At steps 3–4, if the two basic nodes are connected, we put them in the same community. At steps 5–6, if a basic node is not connected to other basic nodes, we create a community for it. At steps 9–10, for nodes that do not belong to Bnodes, they belong to the community with the greatest gravity.
UTF8gkai Community detection[1]
Evolution of a social community can be described by identified community transformations from time
In a dynamic network, the reason for the occurrence of events mentioned above is that nodes and edges change over time. Next, we first analyze the evolution of the community, and then track the evolution of the community behavior. In general, a social network
UTF8gkai Dynamic community evolution[1]
After analyzing the evolution of the community, we will use algorithm DCVIE to track the evolution behavior of the community. In this article, we divide the community evolution behavior into visible and invisible evolution. The visible evolution of community contains seven events mentioned above, which are “Continue” event, “Shrinkin” event, “Growing” event, “Splitting” event, “Merging” event, “Dissolve” event and “Form” event. For the “Continue” event, “Shrinking” event and “Growing” event, the changes of community are not large in the consecutive time. Thus, we will analyze the invisible evolution of community. Because of the changes of nodes and edges, as well as the user’ interest changes in the consecutive time, the gravity of nodes in the community will change. Thus, we consider the average gravity change in the community as the invisible evolution of the community. The formula for average gravity change is shown in Eq. (10).
In order to track the evolution of the community, we define a community relationship map between two adjacent time snapshots according to the partial evolutionary graph proposed by [25]. First, we show the changes of social network through the ownership relationship between the community and the users. We only study the users belonging to
The ownership relationship between the communities and the users.
The community relationship map between two adjacent time snapshots.
The condition of community evolution behavior
Table 2 shows the evolution of the community. Parameters
UTF8gkai DCVIE[1]
In this section, we use two dataset that are labeled Dataset1 and Dataset2 to study the performance of our method. The Dataset1 is obtained from Sina micro-blog platform by intelligent reptiles. We obtain the Dataset2 from the Micro-blog User Portrait Contest in 2016. The detailed information of the two datasets is shown in Table 3. Before reporting the experimental results, we first describe the methods comparison and the experimental setup.
The two datasets for the experiments
The two datasets for the experiments
The number of communities at different snapshots. (
The average community gravity of each snapshot via iLCD, LDM-CET, GCEA and our method.
The average modularity of each snapshot via iLCD, LDM-CET, GCEA and our method.
All experiments are implemented in Python and performed on a PC with an Intel i7 4.0 GHz CPU and 8 GB memory. We conduct our experiments on two datasets and contrast the results with other proposed algorithms, such as the iLCD [17], LDM-CET [25] and GCEA [5]. The iLCD approach [17] updates the community structure at each timestep by analyzing the nodes and edges that have changed. It updates the communities mainly based on two metrics, i.e., the estimation of the mean number of second neighbors and the estimation of the mean number of robust second neighbors. The evolutionary behaviors of communities are obtained by matching the node-community memberships at two consecutive timesteps. The main idea behind LDM-CET [25] method is to detect dynamic communities by exploring the local views of nodes. Moreover, based on the dynamic communities that have been discovered, the global community structure can be derived by updating the historical community structure and the evolutionary behaviors of communities can also be tracked. Yin et al. [5] divide the evolutionary processes between adjacent snapshots into three phases and make use of the GCEA method to judge the evolutionary events of communities from the different variations of the core nodes chains, then adjust the communities via redistributing the core nodes chains of communities and refactor the gravitational relationship in the network. We first analyze the community quality of each timestamp from community gravity and community structure, and then analyze the accuracy, recall, and F values of each method to extract community evolution behavior.


Visual and invisible evolution ratio in Continue, Shrinking and Growing events.
Our method not only considers community structure, but detects communities by analyzing micro-blog content. Therefore, we will analyze the community quality from two aspects: the community gravity and community structure. The users in a common community have the highly similar interest in general. Thus, we contrast our method with other methods using an evaluating indicator, e.g., average community gravity. The average community gravity of a social network can be calculated by Eq. (11).
Where
Where
Figure 4 shows the number of communities in the dynamic evolution of the two networks. we do not show the iLCD, because iLCD detects a large number of overlapping communities for each timestep and the number of them is large. Compared with our approach and LDM-CET, the number of communities by using GCEA is relatively stable. The number of communities is different in two adjacent timestamps, so the communities are evolving. We calculate the average community gravity and the average modularity of communities in different snapshots on the four methods, and the comparative result showed on Figs 5 and 6, respectively. The qualities of the obtained communities on the four methods all keep a high level. As shown in Fig. 5, our approach has a higher average community gravity in comparison with the other three methods. The GCEA method has a low average community gravity, which can be clearly seen in the Dataset2. The higher the average community gravity is, the higher the similarity between users in the community. The average community gravity in Dataset2 is higher than in Dataset1, because Dataset2 is pre-processed. From Fig. 6, we can see that the four methods have similar modularity values. Obviously, the modularity in the Dataset2 is higher than in the Dataset1. Compared with the other three methods, the modularity of the iLCD method is relatively low, which we can see in the Dataset1 and in the Dataset2.
To evaluate different methods on tracking the evolutionary behaviors of communities, we extract the evolution behaviors according to the definitions in Table 2.
Table 4 shows the recall of the critical evolution events tracked by different methods. From the results, it can be observed that in most cases our method performs best, followed by GCEA. iLCD performs worst, because the number of communities detected by iLCD for each timestep is too large and many of them are small communities. The F-score values resulting from different methods are shown in Figs 7 and 8. It can be seen that our method and GCEA performs best on Dataset1, the differences are very slight. For the iLCD method, the F-score value in the Dataset2 is slightly higher than in the Dataset1. For our method, although F-score value in Dataset2 is slightly lower than that of Dataset1, our method still has higher F-score value compared to the other three methods. These results show that our method can lead to good results even if the network changes dramatically between two consecutive timesteps.
To study the evolution of users’ interest in the community, we discuss the invisible evolution of community in Continue, Shrinking and Growing events. As shown in Fig. 9, we can see that the community has an invisible evolution behavior at each timestep. At timestep 7–8, most communities have invisible evolution behavior. This shows that a lot of users’ interest has changed over this period of time. However, at timestep 9–10, only a small fraction of communities have invisible evolution behavior. This shows that only a few users have changed their interest over this period of time.
The recall of the community evolution events tracked by different methods.
The recall of the community evolution events tracked by different methods.
In this paper, an approach of social community evolution based on gravitational relationship and community structure in a dynamic social network is proposed to deal with Micro-blog community evolution problem.
The innovation of this paper is that, we analyze community gravity relationship and community structure at each snapshot, and extract community evolution behavior between two snapshots. First, we describe the social network in the original snapshot as a tuple (
In the future, it is interesting to study how to better analyze and utilize the impact of changes in users’ interest on the community.
Footnotes
Acknowledgments
This work is supported by the National Nature Science Foundation (Grant Nos 61472329 and 61532009), the grants from Scientific Research Fund of Sichuan Provincial Education Department (No. 17ZB0411) and the Innovation Fund of Postgraduate, Xihua University.
