Rumor detection model fused with static spatiotemporal information

Abstract

Spotting rumors on social media and intervening early has always been a daunting challenge. In recent years, deep neural networks have begun to detect rumors by exploring how they propagate. Existing static graph models focus either on the spatial structure information of rumor propagation or on time-series propagation information, but do not effectively combine the two. This paper proposes the Static Spatiotemporal Model (SSM), which first extracts textual semantic information and constructs undirected and directed propagation trees. It then obtains the spatial structure information of rumor propagation through a Graph Convolutional Network and extracts time-series propagation information through a Recurrent Neural Network. The extracted spatiotemporal information is enhanced with skip connections of different source node information. Finally, SSM uses a weighted connection ensemble for rumor classification. Experiments on datasets such as Weibo and Twitter show that the proposed method outperforms several state-of-the-art static graph models. To better apply SSM to early detection and to characterize what "early" means, this paper presents a new data collection index for early detection, which targets events that spread faster and have greater influence. The experimental results under the new index further verify the superiority of SSM, as it can extract sufficient information in early detection or in events with fewer participants.

Keywords

Rumor detection; deep learning; SSM; spatiotemporal information; early detection; data collection index

1 Introduction

Many complex systems in nature and society can be described as complex networks [1]. Social networks are one rapidly developing instance, and their user base has grown enormously. More and more people access information, express opinions, and gain attention on social platforms. However, the low cost, large scale, and ease of producing content also let rumors spread unchecked, causing great harm to countries, societies, and individuals. Therefore, given the potential panic and threat brought by rumors, better-performing methods and models are urgently needed to detect rumors effectively on social media.

Rumor detection was first conducted through fact-checking sites. However, this manual approach is time-consuming and labor-intensive, has low coverage, and is increasingly unsuitable for rumor detection on social networks. As a result, researchers began to explore automatic detection methods, most of which were initially based on traditional machine learning. Such methods include decision trees [2, 3], support vector machines [2–6], random forests [3, 7], logistic regression [7], naive Bayes [7], and traditional natural language processing [8]. They rely on manually extracted features, such as user features, text content, metadata features, and dissemination structure features, so they are time-consuming and labor-intensive and cannot extract deep features.

Recent automatic detection methods increasingly employ deep learning, which does not rely on manually extracted features. Deep learning models are end-to-end and can extract deeper features to detect rumors with high accuracy. Some deep learning models focus on extracting deep semantic information from text content, such as the Convolutional Neural Network (CNN) [9, 10], the BERT pre-trained model [11], the Long Short-Term Memory (LSTM) network [10, 12], and the Gated Recurrent Unit (GRU) [13]. Rumors are usually deliberately written to imitate real news and mislead users, so auxiliary information needs to be explored to improve detection [14, 15]. Researchers have therefore increasingly used models that explore the propagation information of rumors. Models such as RvNN [16] and GRU [13, 17] can extract the time-series propagation information of rumors, while the Graph Convolutional Network (GCN) [18, 19] can extract the spatial structure information [20]. Temporal information focuses on the timing chain of deep propagation, while spatial information, which we call spatial structure information, focuses on the neighbors surrounding the source post. Depth propagation along the temporal relationship chain and breadth diffusion within the community spatial structure are the two main characteristics of rumor propagation [18]. However, current static graph models have not effectively combined this spatiotemporal information for rumor detection.

At the same time, current early-detection work does not characterize "early" uniformly, and different standards are used to collect early data, such as the earliest stage [21], a time window [18, 22], or a sample ratio or number [16, 23].

To simultaneously extract the spatial structure information and the time-series propagation information of rumor propagation, this paper proposes a new static graph hybrid model, the Static Spatiotemporal Model (SSM). The main contributions of this paper are as follows.

A rumor detection model, SSM, is proposed. It can effectively integrate static spatiotemporal information and text information for rumor detection. Experimental results show that the model outperforms several state-of-the-art static graph models.

Beyond experimental verification, we use data analysis to examine, from the perspectives of the spatial and frequency domains, why two graph convolutional layers rather than more are appropriate for extracting spatial structure information.

A new data collection index is proposed for early detection. It can better apply SSM to early detection, characterize early concepts, and detect events that spread faster and have more significant influence in a targeted manner. The experimental results on the new indicators further verify the superiority of SSM.

Two models are proposed based on SSM, Undirected SSM (UDSSM) and Directed SSM (DSSM), which combine the SSM model with the direction of spatial diffusion information. Experimental results on different datasets show that UDSSM has better detection performance and lower time complexity, and can extract more sufficient information in early detection or in events with fewer users.

2 Related Work

This paper mainly introduces related work from two aspects: manual fact-checking and automatic detection methods.

2.1 Manual fact-checking

The initial rumor detection was done through fact-checking sites, mainly manual fact-checking. Manual fact-checking is the traditional way of fact-checking, either expert-based or crowdsourcing-based. Expert-based fact-checking is easy to manage and results in high accuracy, but it is expensive and limited in scale as the number of checks increases. Crowdsourcing-based fact-checking is easy to scale up but has low credibility and accuracy, as the cognitive biases and conflicting insights of the verifiers become a new hindrance. No matter which method is adopted, manual fact-checking is time-consuming and labor-intensive with low coverage. It has become increasingly unsuitable for the requirements of rumor detection in terms of scale, real-time, and accuracy.

2.2 Automatic detection method

2.2.1 Traditional Machine Learning

To make up for the lack of fact-checking, people began to explore automatic detection methods. People applied various machine learning algorithms to rumor detection and made many achievements. Traditional machine learning models train models by manually extracting user profiles, text content, tweet metadata, and dissemination structural features. Takahashi et al. [8] developed a system for detecting rumors from Twitter based on natural language processing techniques to extract text content features. Yang et al. [5] trained a support vector machine classifier for rumor detection based on attributes such as text content, user accounts, propagation features, and the newly proposed metadata features such as client information and location information. They proved the effectiveness of the newly proposed features. Kwon et al. [3] extracted three elements of time, structure, and language to train three classification models of the decision tree, random forest, and support vector machine. They proved the effectiveness and superiority of the selected features in detecting rumors. However, the methods mentioned above rely on feature engineering, i.e., hand-selecting elements, which is time-consuming and labor-intensive and cannot effectively extract deeper features.

2.2.2 Deep Learning

With the development of social networks, the number of users and tweets has increased, and the shallow features of rumors have become increasingly blurred. Methods based on traditional machine learning cannot adapt to larger volumes of data, so researchers have begun to explore more effective classification approaches and increasingly favor deep learning. Deep learning does not rely on manual feature extraction, works end-to-end, and extracts deeper features. Its accuracy and recall are generally better than those of traditional machine learning algorithms.

Ma et al. [24] were the first to use a deep learning model (RNN) for rumor detection on Weibo and showed better performance than traditional machine learning methods. Chen et al. [25] adopted a Convolutional Neural Network (CNN) for rumor short-text classification in SemEval-2017 Task 8 and ranked first in subtask B. Jin et al. [26] incorporated an attention mechanism into an RNN to detect rumors based on multimodal features. However, rumor text closely imitates genuine content, so it is difficult for these methods to achieve higher accuracy by relying on the semantic features of the text alone. More research has therefore begun to explore deep learning methods that learn deep structural features.

Ma et al. [16] were the first to deeply fuse structural and content semantic features based on a tree-like Recursive Neural Network (RvNN) for rumor detection. They propose two model variants, top-down and bottom-up, which achieve better results in both rumor classification and early detection. Chen et al. [23] incorporated a deep attention mechanism into an RNN to detect rumors, which could learn continuous hidden representations by capturing long-term correlations and contextual changes in post sequences. However, models such as RNN, LSTM, and GRU are more advantageous in extracting the time series propagation features along the chain. Still, they have shortcomings in obtaining the spatial diffusion features of the community structure. Bian et al. [18] used GCN to detect rumors for the first time. They aggregated the spatial structure information of rumor propagation from top-down and bottom-up directions and improved the accuracy again. Lotfi et al. [19] extracted the reply tree and user graph for each conversation using GCN to obtain information about users and how they interacted. They finally combined the outputs of the above two modules to detect rumors. Likewise, GCN cannot extract the time-series information of rumor propagation very well. Song et al. [27] proposed a rumor detection model based on a continuous-time dynamic diffusion network, which can effectively integrate spatiotemporal information and textual information, but the time complexity of the dynamic graph model is much higher.

The RNN is better at obtaining time-series propagation information. In comparison, the GCN is better at getting spatial structure information. But the current rumor detection model based on static propagation graphs does not effectively integrate these two kinds of information. Based on this idea, this paper proposes the Static Spatiotemporal Model (SSM) to more efficiently integrate the spatiotemporal information of rumor propagation, while avoiding the high time complexity of the dynamic graph model.

3 Problem Formulation

Let D = {c₁, c₂, …, c_i, …, c_n} be the rumor detection dataset, where $c_{i} = \{s_{i}, r_{1}^{i}, r_{2}^{i}, \dots, r_{j}^{i}, \dots, r_{m_{i} - 1}^{i}, E_{i}\}$ denotes the i-th rumor event, n is the total number of events in the dataset, s_i is the source tweet of the event, $r_{j}^{i}$ is the j-th retweet of the event, and m_i − 1 is the total number of retweets of the event.

$E_{i} = \{e_{sr}^{i} \mid s, r = 0, 1, \dots, m_{i} - 1\}$ describes the set of retweet relationships among the tweets of the event, where $e_{sr}^{i}$ denotes the retweet or comment relationship from tweet s to tweet r in event c_i, i.e. a directed edge, and index 0 denotes the source tweet. The adjacency matrix $A_{i} \in \{0, 1\}^{m_{i} \times m_{i}}$ is defined as follows.

$a_{sr}^{i} = \begin{cases} 1, & \text{if } e_{sr}^{i} \in E_{i} \\ 0, & \text{otherwise} \end{cases}$ (1)

$X^{i} = [x_{0}^{i T}, x_{1}^{i T}, \dots, x_{m_{i} - 1}^{i T}]$ represents the feature matrix of c_i, with each row representing the feature vector of a node.
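As a minimal illustration of Eq. (1), the sketch below builds the adjacency matrix of one event from its directed edge set. The function name `build_adjacency` and the four-node example event are assumptions made for illustration and are not part of the original model code.

```python
import numpy as np

def build_adjacency(num_nodes, edges):
    """Return A with a_sr = 1 if the directed edge (s, r) is in the edge set, else 0."""
    adj = np.zeros((num_nodes, num_nodes), dtype=np.int64)
    for s, r in edges:
        adj[s, r] = 1
    return adj

# Hypothetical event: node 0 is the source tweet, nodes 1 and 2 respond to it,
# and node 3 responds to node 1.
edges = [(0, 1), (0, 2), (1, 3)]
A = build_adjacency(4, edges)
```

Row s of `A` marks the tweets that directly respond to tweet s, so row 0 lists the direct responses to the source tweet.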

The purpose of rumor detection is to train a classifier F : D → Y, where Y is the set of labels y_i for the dataset. For Weibo, y_i ∈ Y = {F, T}, where F and T represent false and true rumors, respectively. For Twitter, y_i ∈ Y = {N, F, T, U}, where N, F, T, and U represent non-rumors, false rumors, true rumors, and unverified rumors, respectively.

4 SSM Rumor Detection Model

As shown in Fig. 1, SSM contains three steps: Data Preprocessing, Information Extraction and Representation, and Information Aggregation and Classification. First, SSM preprocesses the original data, extracts the textual semantic information of each post to construct word vectors, and constructs the undirected and directed propagation trees. Then the spatial structure information of the local spread of rumors is extracted through a two-layer graph convolutional network. At the same time, a recurrent neural network extracts the time-series propagation information of the stance changes of posts. SSM applies skip connections of source node information in different ways to avoid the loss and masking of crucial details, and finally aggregates all the information with a weighted connection ensemble for rumor classification. The following subsections introduce the four modules of SSM: Propagation Graph Construction, the GCN module, the GRU module, and Aggregation Classification.

Fig. 1

SSM framework.

4.1 Propagation graph construction

SSM follows the propagation tree constructed by Ma et al. [16] and Bian et al. [18]. In the propagation tree, nodes represent user tweets, and edges represent comments, replies, and retweets between tweets. SSM first encodes the word vectors, extracts the semantic information of the text, and constructs different word vectors for each module to form the feature matrix of the nodes. Then the undirected and directed propagation graphs are built. SSM establishes adjacency lists in the top-down and bottom-up directions, which indicate the paths between nodes. The parent-child relationships established by recursively querying the connections of the nodes also indicate the paths between nodes. SSM constructs the undirected propagation graph by combining the top-down and bottom-up adjacency lists into a symmetric adjacency matrix.
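The construction described above can be sketched as follows, assuming each event is given as a list of parent indices. The function name `propagation_graphs` and the `parents` input format are hypothetical choices for this illustration, not the paper's data format.

```python
import numpy as np

def propagation_graphs(parents):
    """parents[k] is the parent index of node k; the source tweet has parent None."""
    n = len(parents)
    top_down = {k: [] for k in range(n)}   # parent -> children
    bottom_up = {k: [] for k in range(n)}  # child -> parent
    for child, parent in enumerate(parents):
        if parent is not None:
            top_down[parent].append(child)
            bottom_up[child].append(parent)
    # Merging both directions yields a symmetric (undirected) adjacency matrix.
    undirected = np.zeros((n, n), dtype=np.int64)
    for adjacency_list in (top_down, bottom_up):
        for src, dsts in adjacency_list.items():
            for dst in dsts:
                undirected[src, dst] = 1
    return top_down, bottom_up, undirected

# Hypothetical event: node 0 is the source, 1 and 2 respond to 0, 3 responds to 1.
parents = [None, 0, 0, 1]
td, bu, A_und = propagation_graphs(parents)
```

The two adjacency lists correspond to the directed graphs, and their union gives the undirected graph.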

4.2 GCN Module

SSM uses a two-layer graph convolutional network to extract the spatial structure information of rumor propagation, and uses skip connections of the upper-layer source node information to enhance the critical text information in each graph convolutional layer. This section uses data analysis to examine why two graph convolutional layers rather than more are appropriate and why source node information is used for the skip connections. First, we introduce the Graph Convolutional Layer (GCL).

4.2.1 Graph Convolutional Layer (GCL)

As shown in Fig. 2, the forward pass of each GCL is as follows: $\begin{matrix} X_{k} = σ ({\tilde{A}}_{sym} X_{k - 1} W_{k - 1}), \\ {\tilde{A}}_{sym} = {\tilde{D}}^{- 1 / 2} \tilde{A} {\tilde{D}}^{- 1 / 2}, \\ \tilde{A} = A + I, \\ {\tilde{D}}_{ii} = \sum_{j} {\tilde{A}}_{ij}, \end{matrix}$ (2) where X_k is the output feature matrix of the k-th GCL; X₀ is the initial input feature matrix; $\tilde{A}$ is the adjacency matrix with self-connections, i.e. the adjacency matrix plus an identity matrix of the same size; ${\tilde{A}}_{sym}$ is its symmetrically normalized form; W_k-1 is the weight matrix of the k-th GCL; $\tilde{D}$ is the diagonal degree matrix of $\tilde{A}$; and σ (·) is the sigmoid activation function. A GCN is a neural network model that stacks multiple layers with the GCL as the main building block.
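A minimal NumPy sketch of Eq. (2) is given below. It is an illustration under stated assumptions rather than the authors' PyTorch implementation: the helper names, the toy graph, and the random features and weights are all hypothetical.

```python
import numpy as np

def normalize_adjacency(adj):
    """Compute D~^{-1/2} (A + I) D~^{-1/2}."""
    a_tilde = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gcl_forward(a_sym, x_prev, weight):
    """One graph convolutional layer: X_k = sigma(A_sym X_{k-1} W_{k-1})."""
    return sigmoid(a_sym @ x_prev @ weight)

# Toy undirected graph with edges 0-1, 0-2, 1-3 (assumed for illustration).
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)
rng = np.random.default_rng(0)
X0 = rng.normal(size=(4, 5))   # 4 nodes, 5 input features
W0 = rng.normal(size=(5, 3))   # maps 5 features to 3 hidden units
A_sym = normalize_adjacency(A)
X1 = gcl_forward(A_sym, X0, W0)
```

Each row of `X1` mixes the features of a node with those of its first-order neighbors, weighted by the normalized adjacency entries.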

Fig. 2

GCL affine transformation.

4.2.2 Rationality Analysis of the Number of GCLs

Recent literature generally uses a two-layer GCN to obtain spatial structure information, and a two-layer GCN has been reported to be more effective than deeper ones [19]. Both Li et al. [28] and Xu et al. [29] pointed out that the GCN model cannot be stacked as deeply as CNN models in vision tasks, because the effectiveness of a multi-layer GCN drops dramatically. We analyze the rationality of using a two-layer GCN in the rumor detection task from the perspectives of the spatial and frequency domains, respectively.

1) Spatial Domain

From the spatial perspective of graph signal processing, the adjacency matrix is equivalent to a graph shift operator. The transformation from X_k-1 to X_k requires only the first-order neighbors of each node to participate in the calculation. That is, each additional GCN layer aggregates the information of neighbors one order further away. For example, in Fig. 3, V₁, V₂, V₃, V₄, V₅ are node IDs and x₁, x₂, x₃, x₄, x₅ are the node features. For node V₁, in (a) → (b), only the information of nodes V₂ and V₃ is aggregated. In (b) → (c), node V₁ additionally aggregates the information of node V₄, and the same is true for the other nodes.
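The growth of the receptive field with the number of layers can be checked with a short sketch. The five-node graph below is an assumed stand-in chosen to match the description of V₁ (index 0 here); the exact edges of Fig. 3 are not recoverable from the text, and the function names are hypothetical.

```python
import numpy as np

def undirected_adjacency(num_nodes, edges):
    adj = np.zeros((num_nodes, num_nodes), dtype=np.int64)
    for u, v in edges:
        adj[u, v] = adj[v, u] = 1
    return adj

def receptive_field(adj, node, num_layers):
    """Indices of nodes whose features can reach `node` after `num_layers` GCLs."""
    hop = adj + np.eye(adj.shape[0], dtype=np.int64)   # A + I: self plus 1-hop neighbors
    reach = np.linalg.matrix_power(hop, num_layers)    # nonzero entries = within k hops
    return [int(j) for j in np.flatnonzero(reach[node])]

# Assumed edges: V1-V2, V1-V3, V2-V4, V4-V5 (0-indexed below).
adj = undirected_adjacency(5, [(0, 1), (0, 2), (1, 3), (3, 4)])
one_layer = receptive_field(adj, 0, 1)   # V1, V2, V3
two_layers = receptive_field(adj, 0, 2)  # additionally V4
```

With one layer the first node sees only itself and its direct neighbors; with two layers it also reaches the second-order neighbor, which mirrors the (a) → (b) → (c) progression described above.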

Fig. 3

2-layer graph convolution network neighbor information aggregation.

This paper analyzes several commonly used rumor datasets (Weibo, Twitter, BuzzFeed, PolitiFact, etc.) with the Neo4j graph database. We focus on the order of the maximum propagation width [30] in the propagation graph of each dataset. In Tables 1–3, 'Radius' means the order of the maximum propagation width, and each cell gives the number of events in the dataset with that order and label. Label '0' or 'False' represents false rumors or fake news, '1' or 'True' represents true rumors or real news, 'Non-rumor' represents non-rumors, and 'Unverified' represents unverified information.

Table 1

Weibo Dataset [18]

Radius	1	2	3	4	5	6	7	9	10	11	13
label(0)	2227	108	10	3	2	0	0	0	0	0	1
label(1)	1890	344	34	24	11	5	1	1	1	1	0
Total Count	4117	452	44	27	13	5	1	1	1	1	1

Table 2

Dataset of Twitter15 and Twitter16 [16]

Radius	Twitter15				Twitter16
	1	2	3	4	1	2	4
label(False)	360	1	1	0	199	1	1
label(True)	364	1	1	0	203	3	0
label(Non-rumor)	369	4	0	0	201	4	0
label(Unverified)	365	2	1	1	195	1	0
Total Count	1458	8	3	1	798	9	1

Table 3

Dataset of BuzzFeed and PolitiFact

Radius	BuzzFeed Dataset		PolitiFact Dataset
	1	2	1	2
label(0)	1	119	2	118
label(1)	0	62	3	117
Total Count	1	181	5	235

Table 4

Dataset Statistics

Statistic	Weibo	Twitter15	Twitter16	Twitter
Posts	3805656	74918	40176	51245
Events	4664	1470	808	1137
True Rumors	2351	366	206	572
False Rumors	2313	362	200	562
Non-Rumors	0	373	197	0
Unverified	0	369	205	0
Avg. Posts/Event	816	51	50	45
Max Posts/Event	59318	813	813	752
Min Posts/Event	10	1	1	2
Avg. Depth/Event	5.19	4.38	4.68	4.11
Avg. Size/Event	540	36	32	32

Table 5

Rumor detection results on the Twitter dataset (Binary Categories)

Method	Class	Acc.	Prec.	Rec.	F₁
RvNN ¹	T	0.896	0.878	0.883	0.872
	F		0.883	0.881	0.872
MVAE	T	0.904	0.911	0.897	0.899
	F		0.887	0.902	0.890
VAE-GCN	T	0.904	0.904	0.909	0.906
	F		0.906	0.899	0.902
AE-GCN	T	0.913	0.911	0.920	0.915
	F		0.919	0.904	0.912
GCN	T	0.916	0.916	0.919	0.916
	F		0.919	0.913	0.915
Bi-GCN	T	0.922	0.928	0.919	0.922
	F		0.920	0.927	0.922
EBGCN	T	0.922	0.909	0.937	0.922
	F		0.936	0.906	0.920
RDEA	T	0.924	0.927	0.921	0.923
	F		0.921	0.928	0.924
DSSM	T	0.938	0.951	0.925	0.935
	F		0.898	0.926	0.909
UDSSM	T	0.940	0.926	0.925	0.922
	F		0.922	0.932	0.923

Remark: ¹This paper uses the source node information skip connection to enhance the feature representation ability of the RvNN model, and the same is true for the following experiments.

Tables 1–3 and Fig. 4 show that, although the datasets differ in propagation depth [30] and width [30], the maximum propagation width of events in existing real datasets is concentrated mainly at orders 1–2, and far more nodes participate at orders 1–2 than at other orders. A two-layer GCN can therefore aggregate enough spatial structure information while maintaining the local characteristics of each node. As aggregation extends to more orders, however, each node gradually converges to the information of the whole graph. The spatial structure covered by each node converges, and the characteristics of the nodes, especially the source (root) node, are masked by other nodes.

Fig. 4

Propagation graphs for several real datasets (red nodes represent source tweets, light blue nodes represent retweets).

2) Frequency Domain

From the frequency domain perspective of graph signal processing, the GCN model (here, the undirected GCN) can be viewed as a low-pass filter [31] applied to graph signals. The low-frequency signal is retained, the high-frequency signal is filtered out, and the graph signal becomes smoother. That is, the signal characteristics of the nodes become more similar. As shown in Fig. 5, the difference between colors represents the characteristic difference between nodes. As the number of GCN layers increases, the GCN scales the low-frequency band more strongly, forming a more effective low-pass filter. This stacked filtering makes the nodes harder and harder to distinguish, and their representation vectors tend to become the same. Similarly, the source node information is masked or weakened by other node information, making the downstream classification task more difficult. This is the over-smoothing problem often encountered by multi-layer GCNs.
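The smoothing effect can be illustrated numerically with the propagation step alone. The sketch below is a simplified assumption: it applies only the normalized adjacency of Eq. (2), with no weights or activation, to a one-hot signal on the source node of a hypothetical five-node graph, and measures the spread of the signal after rescaling each node by the square root of its self-connected degree.

```python
import numpy as np

def normalize_adjacency(adj):
    a_tilde = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def smoothing_spread(adj, signal, num_layers):
    """Spread (max - min) of the degree-rescaled signal after 0..num_layers steps."""
    a_sym = normalize_adjacency(adj)
    scale = np.sqrt(adj.sum(axis=1) + 1.0)   # sqrt of the self-connected degree
    x = signal.astype(float)
    spreads = []
    for _ in range(num_layers + 1):
        rescaled = x / scale
        spreads.append(float(rescaled.max() - rescaled.min()))
        x = a_sym @ x                        # one linear propagation step
    return spreads

# Assumed graph with edges 0-1, 0-2, 1-3, 3-4; node 0 plays the source node.
adj = np.zeros((5, 5))
for u, v in [(0, 1), (0, 2), (1, 3), (3, 4)]:
    adj[u, v] = adj[v, u] = 1.0
signal = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
spreads = smoothing_spread(adj, signal, 50)
```

The spread never increases from one step to the next and shrinks towards zero, so after many propagation steps the nodes carry almost no distinguishing signal, including the source node whose signal was initially unique.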

Fig. 5

Frequency domain filtering process [32].

In this paper, the GCN module aggregates local information to the source node, and the source node is then classified, which is equivalent to classifying the graph. In summary, although the tendency of nodes within the same graph to become similar has less impact on classification between graphs, the source node's information is masked by that of other nodes in the same graph as more neighbor information is aggregated. Moreover, across different propagation graphs, the text information of nodes other than the source node is also similar to varying degrees, and aggregating too much of this similar information is not conducive to the graph classification task.

4.2.3 Skip Connections and Spatial Information Extraction

Source tweets have been shown to contain rich information for detecting rumors, but, as discussed in Section 4.2.2, source node information can be masked or weakened by other node information, which affects the downstream classification task. Hence, this paper adopts skip connections between the source node feature and the output feature of each GCN layer to enhance the source node information in the hidden layers and reduce this effect, as shown in Fig. 6.

Fig. 6

GCN module forward propagation.

From the above, the two-layer GCN forward propagation formula is obtained as follows. $\begin{matrix} H_{1} = σ ({\tilde{A}}_{sym} XW), \\ {\tilde{H}}_{1} = concat (H_{1}, X^{root}), \\ {\tilde{H}}_{1} = relu ({\tilde{H}}_{1}), \\ H_{2} = σ ({\tilde{A}}_{sym} {\tilde{H}}_{1} W_{1}), \\ {\tilde{H}}_{2} = concat (H_{2}, H_{1}^{root}), \\ X_{GCN} = MEAN ({\tilde{H}}_{2}), \end{matrix}$ (3) where relu (·) is the activation function and MEAN (·) is the average pooling operation.
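The forward pass of Eq. (3) can be sketched in NumPy as follows. This is an illustrative reading, not the authors' code: it assumes that X^root and H₁^root are the source node's rows repeated for every node before concatenation, takes σ as the sigmoid stated for Eq. (2), and uses hypothetical function names, shapes, and random weights.

```python
import numpy as np

def normalize_adjacency(adj):
    a_tilde = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(x, 0.0)

def gcn_module(adj, x, w0, w1, root=0):
    """Two GCLs with source-node skip connections, then mean pooling over nodes."""
    a_sym = normalize_adjacency(adj)
    m = x.shape[0]
    h1 = sigmoid(a_sym @ x @ w0)
    x_root = np.tile(x[root], (m, 1))                    # assumed form of X^root
    h1_tilde = relu(np.concatenate([h1, x_root], axis=1))
    h2 = sigmoid(a_sym @ h1_tilde @ w1)
    h1_root = np.tile(h1[root], (m, 1))                  # assumed form of H_1^root
    h2_tilde = np.concatenate([h2, h1_root], axis=1)
    return h2_tilde.mean(axis=0)                         # X_GCN

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0], [1, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)
X = rng.normal(size=(4, 5))        # 4 nodes, 5 input features
W0 = rng.normal(size=(5, 3))       # first layer: 5 -> 3
W1 = rng.normal(size=(3 + 5, 2))   # second layer input is concat(H1, X^root)
x_gcn = gcn_module(A, X, W0, W1)
```

Because the source node's first-layer representation is appended to every row before pooling, it survives the averaging unchanged, which is the effect the skip connection is meant to achieve.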

4.3 GRU Module

In this paper, the directional time-series propagation information of the stance changes of posts is extracted through a recurrent neural network. The source node feature is passed through a fully connected layer and then skip-connected to the final time-series information representation to avoid the loss and masking of crucial information. This section presents the factual basis for extracting time-series information from stance changes in retweeted posts.

4.3.1 Gated Recurrent Unit (GRU)

The GRU is a very effective variant of the LSTM network. It has a simpler structure than the LSTM network and can also solve the long-term dependency problem of the RNN. The GRU reduces the three gate functions of the LSTM to two, the update gate and the reset gate. With fewer parameters, the GRU not only trains faster than the LSTM but also reduces the risk of overfitting. As shown in Fig. 7, the GRU module is a neural network model formed by stacking multiple layers with GRU units as the main building block. The forward propagation of each gated recurrent unit is as follows. $\begin{matrix} r_{t} = σ (W_{rx} x_{t} + W_{rh} h_{t - 1} + b_{r}), \\ z_{t} = σ (W_{zx} x_{t} + W_{zh} h_{t - 1} + b_{z}), \\ {\tilde{h}}_{t} = tanh (W_{\tilde{h} x} x_{t} + W_{\tilde{h} h} (r_{t} ⊙ h_{t - 1}) + b_{\tilde{h}}), \\ h_{t} = (1 - z_{t}) ⊙ h_{t - 1} + z_{t} ⊙ {\tilde{h}}_{t}, \end{matrix}$ (4) where x_t is the input at moment t, h_t is the hidden state at moment t, and the hidden state at the initial moment is 0; W_·x and W_·h are the weight matrices of the GRU unit; r_t, z_t, and ${\tilde{h}}_{t}$ are the reset gate, the update gate, and the candidate hidden state, respectively; and σ (·) is the sigmoid activation function. In forward propagation, r_t controls how much of the previous memory state is retained, and z_t controls how much information from the hidden layer at the previous moment is forgotten.
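For reference, one step of Eq. (4) can be written directly in NumPy. The parameter dictionary, its key names, and the random initialization below are assumptions made for this sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(input_dim, hidden_dim, seed=0):
    """Hypothetical parameter container: W_{g}x, W_{g}h, b_{g} for g in {r, z, h}."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("r", "z", "h"):
        params[f"W_{gate}x"] = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
        params[f"W_{gate}h"] = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
        params[f"b_{gate}"] = np.zeros(hidden_dim)
    return params

def gru_cell(x_t, h_prev, p):
    r_t = sigmoid(p["W_rx"] @ x_t + p["W_rh"] @ h_prev + p["b_r"])              # reset gate
    z_t = sigmoid(p["W_zx"] @ x_t + p["W_zh"] @ h_prev + p["b_z"])              # update gate
    h_cand = np.tanh(p["W_hx"] @ x_t + p["W_hh"] @ (r_t * h_prev) + p["b_h"])   # candidate
    return (1.0 - z_t) * h_prev + z_t * h_cand

params = init_params(input_dim=4, hidden_dim=3)
h = np.zeros(3)                      # the initial hidden state is 0
for x_t in np.ones((5, 4)):          # a toy sequence of five identical inputs
    h = gru_cell(x_t, h, params)
```

The final line of `gru_cell` shows the role of the update gate: values of z_t near 0 keep the previous hidden state, while values near 1 replace it with the candidate state.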

Fig. 7

GRU module forward propagation.

4.3.2 Stance Analysis

There is a stance link between retweeted or replied tweets in the process of rumor spreading, and stance information has been proven to be a powerful indicator for detecting rumors and other misinformation [33, 34]. Ye et al. [35] and Wu et al. [36] mentioned that rumors are more likely to be questioned than facts and that controversial claims cause suspicion, surprise, etc. Ma et al. [16] found that false rumors often lead to doubts or denials in the process of spreading, while facts often lead to support or affirmation. Even when facts are questioned during the spread, the questioning tweets in turn attract doubts or denials. As shown in Fig. 8, different colors represent different stances.

Based on this fact, the GRU module is applied to extract the stance changes of tweets. The time-series propagation information of tweet stance changes can help explain the characteristics of rumor propagation from another perspective.

Fig. 8

Changes in stance in retweets of or replies to tweets as rumors and facts spread.

4.3.3 Skip Connections and Time-series Information Extraction

The GRU module adopted in this paper starts from the leaf nodes of the propagation graph and aggregates the time-series propagation information upwards. The input of each unit is the text feature representation of the node itself, and its incoming hidden state is the sum of the hidden states of all child nodes of the node. The initial information of the source node passes through a fully connected layer and is then concatenated with the aggregated hidden state. Forward propagation is shown in Eq. (5). $\begin{matrix} {\hat{h}}_{k} = \sum_{i \in S (k)} h_{i}, \\ r_{k} = σ (W_{rx} x_{k} + W_{rh} {\hat{h}}_{k}), \\ z_{k} = σ (W_{zx} x_{k} + W_{zh} {\hat{h}}_{k}), \\ {\tilde{h}}_{k} = tanh (W_{\tilde{h} x} x_{k} + W_{\tilde{h} h} (r_{k} ⊙ {\hat{h}}_{k})), \\ h_{k} = (1 - z_{k}) ⊙ {\hat{h}}_{k} + z_{k} ⊙ {\tilde{h}}_{k}, \\ x_{root} = FC (X^{root}), \\ X_{GRU} = concat (h_{root}, x_{root}), \end{matrix}$ (5) where ${\hat{h}}_{k}$ is the sum of the hidden states of all the child nodes of node k, S (k) is the set of child nodes of node k, and the remaining symbols follow Eq. (4). The final information representation is shown in Fig. 9.
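The bottom-up aggregation of Eq. (5) can be sketched as a post-order traversal of the propagation tree. The code below is a hedged illustration: the `children` dictionary, the parameter names, and the recursive traversal are assumptions made for readability, and very deep trees would need an iterative traversal instead of recursion.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(input_dim, hidden_dim, fc_dim, seed=0):
    """Hypothetical parameters: gate weights as in Eq. (5) plus an FC layer for the root."""
    rng = np.random.default_rng(seed)
    p = {}
    for gate in ("r", "z", "h"):
        p[f"W_{gate}x"] = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
        p[f"W_{gate}h"] = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
    p["W_fc"] = rng.normal(scale=0.1, size=(fc_dim, input_dim))
    p["b_fc"] = np.zeros(fc_dim)
    return p

def tree_gru(features, children, p, root=0):
    """Return X_GRU = concat(h_root, FC(x_root)) for one propagation tree."""
    hidden_dim = p["W_rh"].shape[0]

    def visit(k):
        # h_hat_k: sum of the hidden states of all children (zero vector for leaves).
        h_hat = np.zeros(hidden_dim)
        for child in children.get(k, []):
            h_hat = h_hat + visit(child)
        x_k = features[k]
        r_k = sigmoid(p["W_rx"] @ x_k + p["W_rh"] @ h_hat)
        z_k = sigmoid(p["W_zx"] @ x_k + p["W_zh"] @ h_hat)
        h_cand = np.tanh(p["W_hx"] @ x_k + p["W_hh"] @ (r_k * h_hat))
        return (1.0 - z_k) * h_hat + z_k * h_cand

    h_root = visit(root)
    x_root = p["W_fc"] @ features[root] + p["b_fc"]      # skip connection of the source
    return np.concatenate([h_root, x_root])

params = init_params(input_dim=4, hidden_dim=3, fc_dim=2)
features = np.random.default_rng(1).normal(size=(4, 4))
children = {0: [1, 2], 1: [3]}       # hypothetical tree rooted at the source tweet 0
x_gru = tree_gru(features, children, params)
```

Leaves start from a zero hidden state, and each parent is processed only after all of its children, so information flows from the deepest replies towards the source tweet.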

Fig. 9

GRU module forward propagation.

4.4 Aggregate Classification Module

The aggregation classification module combines the spatiotemporal information through a weighted connection ensemble, as follows. $\begin{matrix} X = concat (α X_{GCN}, (1 - α) X_{GRU}), \\ y = Softmax (FC (X)), \end{matrix}$ (6) where α is the ensemble weight, FC (·) is the fully connected layer, and y ∈ R^1×class is the vector of predicted probabilities over all classes for the event. SSM takes the class with the highest probability as the predicted class, and the model parameters are trained iteratively by minimizing the cross-entropy between the predictions and the ground-truth label distribution.
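Eq. (6) and the cross-entropy objective can be illustrated with the following sketch. The function names, the vector sizes, and the value of α are assumptions chosen for the example; the paper does not state them in this section.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max()      # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def classify(x_gcn, x_gru, alpha, w_fc, b_fc):
    """Weighted concatenation of both representations, then FC and softmax."""
    x = np.concatenate([alpha * x_gcn, (1.0 - alpha) * x_gru])
    return softmax(w_fc @ x + b_fc)

def cross_entropy(probs, label):
    return float(-np.log(probs[label]))

rng = np.random.default_rng(0)
x_gcn = rng.normal(size=5)               # hypothetical GCN module output
x_gru = rng.normal(size=5)               # hypothetical GRU module output
w_fc = rng.normal(size=(4, 10))          # four classes, as for Twitter15/16
b_fc = np.zeros(4)
probs = classify(x_gcn, x_gru, alpha=0.6, w_fc=w_fc, b_fc=b_fc)
predicted_class = int(np.argmax(probs))
loss = cross_entropy(probs, label=2)
```

Setting α closer to 1 gives the spatial representation more influence on the logits, and α closer to 0 favors the time-series representation.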

5 Experiments and Results

This paper compares the performance of the SSM model with several state-of-the-art static graph baseline models on Weibo, Twitter, and other datasets, respectively. A new data collection metric for early detection is adopted on the Weibo dataset, and we test the performance of SSM under the new metric.

5.1 Datasets

This paper uses the Weibo, Twitter15, and Twitter16 datasets provided by Ma et al. [24, 37]. We select the true and false rumors in the Twitter15 and Twitter16 datasets as the facts and rumors of a new Twitter dataset, and we verify the performance of SSM against the baseline models on the Weibo, Twitter, Twitter15, and Twitter16 datasets from two perspectives. The experiments on the Twitter15 and Twitter16 datasets are four-class tasks, while the experiment on the new Twitter dataset constructed in this paper is a two-class task. In particular, this paper does not directly test the performance of SSM on the Weibo dataset. Instead, it re-collects the events in the Weibo dataset and the nodes in each event according to the new index to obtain the Weibo_burst dataset, and evaluates the early detection performance of the model on it; see Section 5.4 for details. The statistics of the four datasets are shown in Table 4. Avg. stands for average, Depth represents the depth of the tree (the path distance from the root node to a leaf node), and Size means the width of the tree (the maximum number of nodes in any order of the tree).
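The Depth and Size statistics defined above can be computed from a parent list as in the sketch below. It assumes Depth is the largest root-to-leaf distance and Size is the largest number of nodes at any single order; the function name and input format are hypothetical.

```python
from collections import Counter

def tree_depth_and_size(parents):
    """parents[k] is the parent index of node k (None for the root)."""
    def level(k):
        steps = 0
        while parents[k] is not None:    # walk up to the root, counting edges
            k = parents[k]
            steps += 1
        return steps

    nodes_per_level = Counter(level(k) for k in range(len(parents)))
    depth = max(nodes_per_level)             # deepest order reached by any node
    size = max(nodes_per_level.values())     # widest order of the tree
    return depth, size

# Hypothetical event: two responses to the source, three responses to node 1.
parents = [None, 0, 0, 1, 1, 1]
depth, size = tree_depth_and_size(parents)
```

Walking up from every node costs time proportional to the tree depth per node, which is acceptable for a sketch; a breadth-first traversal would be preferable for very large events.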

5.2 Experimental Setup

The models compared in this paper are as follows.

1) RvNN [16]: A rumor detection method based on a tree-structured recursive neural network, which learns the time-series propagation information of rumors. In this paper, a skip connection of the source node information is added to enhance its feature representation.

2) MVAE [38]: A rumor detection model that combines a Multimodal Variational Autoencoder with a rumor classifier.

3) VAE-GCN [39]: A GCN-based variational graph autoencoder fake news detection model, using GCN as the encoder and Variational Graph Autoencoder (VGAE) as the decoder.

4) AE-GCN [39]: A GCN-based graph autoencoder fake news detection model, using GCN as the encoder and Graph Autoencoder (GAE) as the decoder.

5) GCN [40]: An undirected GCN model, an advanced graph representation learning method that can effectively extract the spatial structure features of rumor propagation.

6) Bi-GCN [18]: A rumor detection model based on bidirectional GCN, which extracts the spatial structure information of rumor propagation through two modes of top-down and bottom-up.

7) EBGCN [41]: A rumor detection model based on Edge-enhanced Bayesian Graph Convolutional Network, which adaptively rethinks the reliability of latent relations by adopting a Bayesian approach.

8) RDEA [42]: A rumor detection model based on social media with Event Augmentations, which integrates three augmentation strategies by modifying both reply attributes and event structure to extract meaningful rumor propagation patterns and to learn intrinsic representations of user engagement.

9) DSSM: The static hybrid model proposed in this paper that fuses directed GCN and GRU. Its GCN module follows the bi-directional graph convolution model of Bi-GCN. DSSM is mainly used to verify the heterogeneity and importance of time-series information.

10) UDSSM: The static hybrid model proposed in this paper that fuses undirected GCN and GRU.

The experiments are implemented in PyTorch. During training, SSM uses the cross-entropy loss function and backpropagation with gradient descent to optimize parameters, and uses early stopping to prevent overfitting. Five-fold cross-validation is adopted; the maximum number of training epochs is 200; the early-stopping patience (the number of consecutive epochs in which the loss fails to fall below the current best value) is ten; and the batch size is 16. A weighted connection ensemble method is adopted to account for the differences in representation and classification ability between the spatial and temporal information.
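The early-stopping rule described above can be illustrated with a minimal sketch. It is not the authors' training code: the function name `run_with_early_stopping` and the idea of feeding it a precomputed sequence of per-epoch loss values are assumptions made only so that the example is self-contained. The limits of 200 epochs and a patience of ten are the values stated in the text.

```python
from typing import Iterable, Tuple

MAX_EPOCHS = 200  # maximum number of training epochs stated in the text
PATIENCE = 10     # early-stop period stated in the text


def run_with_early_stopping(losses: Iterable[float],
                            max_epochs: int = MAX_EPOCHS,
                            patience: int = PATIENCE) -> Tuple[int, float]:
    """Consume one loss value per epoch and stop early when it stalls.

    Hypothetical helper: `losses` stands in for the loss that a real
    training loop would compute at the end of each epoch.
    Returns (number of epochs actually run, best loss seen).
    """
    best_loss = float("inf")
    stale_epochs = 0  # consecutive epochs without a new best loss
    epochs_run = 0
    for loss in losses:
        if epochs_run >= max_epochs:
            break
        epochs_run += 1
        if loss < best_loss:
            best_loss = loss
            stale_epochs = 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                break  # no improvement for `patience` epochs in a row
    return epochs_run, best_loss
```

In a real five-fold run, a loop of this kind would be restarted for each fold, so the number of epochs actually trained can differ from fold to fold.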

5.3 Overall Performance

Tables 5-7 show the performance comparison between our models and the baseline models on the four datasets. The baseline models used in this paper are all deep learning methods, since the recent literature consistently concludes that deep learning methods outperform traditional machine learning methods.

Table 6

Rumor detection results on Twitter15 dataset (Four Categories)

Method	Acc.	N	F	T	U
		F ₁	F ₁	F ₁	F ₁
RvNN	0.762	0.753	0.687	0.828	0.690
MVAE	0.786	0.766	0.799	0.834	0.738
VAE-GCN	0.783	0.721	0.792	0.866	0.754
AE-GCN	0.780	0.706	0.801	0.854	0.755
GCN	0.797	0.749	0.802	0.863	0.769
Bi-GCN	0.827	0.815	0.795	0.866	0.812
EBGCN	0.814	0.768	0.821	0.874	0.787
RDEA	0.827	0.800	0.833	0.885	0.775
DSSM	0.840	0.759	0.785	0.877	0.750
UDSSM	0.853	0.818	0.843	0.866	0.766

Table 7

Rumor detection results on Twitter16 dataset (Four Categories)

Method	Acc.	N	F	T	U
		F ₁	F ₁	F ₁	F ₁
RvNN	0.833	0.713	0.753	0.824	0.775
MVAE	0.829	0.737	0.780	0.816	0.748
VAE-GCN	0.826	0.716	0.848	0.927	0.733
AE-GCN	0.788	0.666	0.804	0.920	0.759
GCN	0.822	0.723	0.818	0.924	0.796
Bi-GCN	0.850	0.745	0.912	0.942	0.815
EBGCN	0.842	0.758	0.852	0.930	0.812
RDEA	0.862	0.764	0.866	0.930	0.864
DSSM	0.859	0.775	0.764	0.906	0.832
UDSSM	0.876	0.733	0.778	0.891	0.845

Models that include a GCN generally perform better, which shows the effectiveness of the spatial structure features of rumor propagation in rumor detection; GCN is well suited to extracting such features.

DSSM achieves higher accuracy than Bi-GCN (0.840 vs. 0.827 on Twitter15 and 0.859 vs. 0.850 on Twitter16), indicating that time-series information is heterogeneous to, and complements, spatial structure information.

On most performance indicators across the four datasets, SSM is better than RvNN, GCN, and Bi-GCN. This indicates that a model fusing spatiotemporal information provides a better feature representation for downstream classification tasks than models that extract only time-series information or only spatial structure information.

Although SSM detects rumors more accurately, its training time is higher because of the integrated time-series module. As shown in Table 8, it is about six times slower than Bi-GCN on the three datasets. This cost is common to models that fuse time-series components; for example, the dynamic graph model proposed by Song et al. [27] is about 16 times slower than Bi-GCN. Although AE-GCN and VAE-GCN have a low training time per epoch, they run for too many epochs under early stopping, because their loss keeps decreasing without improving accuracy. SSM, by contrast, generally completes the training of each fold earlier under early stopping. Section 5.4 proposes a data collection scheme for applying SSM to early detection, which alleviates the problem of excessive training time.

Table 8

Model training time complexity (min/epoch)

Method	Twitter	Twitter15	Twitter16
RvNN	1.22	1.72	0.93
MVAE	0.10	0.10	0.08
VAE-GCN	0.10	0.10	0.08
AE-GCN	0.10	0.10	0.08
GCN	0.10	0.10	0.08
Bi-GCN	0.15	0.20	0.13
EBGCN	0.19	0.22	0.17
RDEA ²	0.12+0.05	0.17+0.07	0.17+0.05
DSSM	0.90	1.33	0.70
UDSSM	0.85	1.23	0.68

Remark: ² The former time is the contrastive self-supervised learning time, and the latter is the training time of each epoch of the model. The same holds for the following experiments.
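The "about six times slower" figure can be checked directly from Table 8. The short sketch below only divides the per-epoch times copied from the table; the dictionary `TABLE_8` and the helper `slowdown` are names introduced here for illustration.

```python
from typing import List

# Minutes per epoch copied from Table 8.
# Column order: Twitter, Twitter15, Twitter16.
TABLE_8 = {
    "Bi-GCN": (0.15, 0.20, 0.13),
    "DSSM": (0.90, 1.33, 0.70),
    "UDSSM": (0.85, 1.23, 0.68),
}


def slowdown(model: str, baseline: str = "Bi-GCN") -> List[float]:
    """Per-dataset ratio of `model` time to `baseline` time (hypothetical helper)."""
    return [round(m / b, 2) for m, b in zip(TABLE_8[model], TABLE_8[baseline])]


for name in ("DSSM", "UDSSM"):
    print(name, slowdown(name))
```

For both DSSM and UDSSM the ratios fall between five and seven on all three datasets, which is consistent with the rounded figure quoted in the text.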

5.4 Early rumor detection

5.4.1 Existing problems

Definitions of early detection currently vary, and so do data collection standards. Zhou et al. [21] defined "early" as the earliest stage of tweet posting, when rumors have not yet started to spread. Bian et al. [18] and Kwon et al. [22] used time windows to collect early data for rumor detection, an approach common to many studies. Chen et al. [23] and Ma et al. [16] collected data for early detection according to the proportion or number of samples.

These indicators are insufficient to characterize early detection, as the Weibo dataset illustrates. Fig. 10 shows a propagation graph at the 10,000-node level. The abscissa is the index n of each successive batch of 30 users, and the ordinate is the time taken in minutes. If data is selected by a time window of, for example, two hours, more than 2000 nodes are collected. From the perspective of rumor spread, the rumor has already had a tremendous impact by then, and early detection has lost much of its significance. If the time window is instead 10 minutes, only slightly more than 300 nodes are selected, which seems very suitable. However, for the 500-node-level propagation graph in Fig. 11, the same window collects only about 30 nodes, too few to extract sufficient propagation information, so the 10-minute time window is also inappropriate. Collecting data for early detection by a ratio or a number of nodes is likewise of little use, because the eventual scale of a rumor event cannot be known in advance, so there is no principled way to choose the ratio or the number.

Fig. 10

10000-nodes level propagation graph.

Fig. 11

500-nodes level propagation graph.

This paper adopts a new method to select early detection data, called outbreak rumor detection.

5.4.2 New data collection metrics of outbreak rumor detection

Two metrics are first defined: the N-users-time window T_N and the threshold baseline B. T_N is the time taken for each successive group of N users to join (by reposting, commenting, etc.) the spread of a rumor. Different types of datasets, and propagation graphs at different node levels, have distinct user growth trends, so threshold baselines can be used to select events at different levels. For example, in the Weibo dataset, measured by a 30-users-time window, the period in which a 10,000-node-level propagation graph grows faster than its overall trend lies around the 1-minute baseline, whereas for a 1,000-node-level propagation graph it lies around 20 minutes.

Therefore, when the number of user nodes participating in the spread of a rumor increases rapidly, such that the average of M consecutive N-users-time windows T_N is lower than B, the current observation point enters the outbreak period, and all nodes before the observation point are selected for early detection. This has three advantages. First, data is selected according to its growth trend, which allows finer-grained intervention in the early stage, when rumors spread faster. Second, rumors that spread fast and have a significant impact can be found. Third, propagation nodes can be obtained in a controllable manner, as the number of nodes for most collected events is around a fixed value. A smaller number of nodes reduces the training complexity of the model proposed in this paper and makes up for its long training time. The new indicator is therefore suitable for early detection.
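The collection rule can be sketched in a few lines of Python. This is one possible reading of the definition, not the authors' implementation: the function name `outbreak_cutoff`, the use of non-overlapping windows of N users, the convention that the source post is published at minute 0, and the use of "less than or equal to" (following the column headers of Table 9) are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def outbreak_cutoff(arrival_minutes: Sequence[float],
                    n_users: int,
                    m_windows: int,
                    baseline: float) -> Optional[int]:
    """Return how many of the earliest nodes to keep, or None if no outbreak.

    arrival_minutes: time (in minutes after the source post, assumed to be
        published at minute 0) at which each user joined the propagation.
    n_users:   N, the number of users per N-users-time window T_N.
    m_windows: M, the number of consecutive windows that are averaged.
    baseline:  B, the threshold in minutes.
    """
    times = sorted(arrival_minutes)
    durations: List[float] = []  # successive T_N values
    window_start = 0.0
    for end in range(n_users, len(times) + 1, n_users):
        window_end = times[end - 1]  # arrival time of the window's last user
        durations.append(window_end - window_start)
        window_start = window_end
        if len(durations) >= m_windows:
            recent = durations[-m_windows:]
            if sum(recent) / m_windows <= baseline:
                return end  # keep all nodes up to this observation point
    return None
```

Under this reading the smallest possible collection is M × N nodes, for example 90 nodes for N = 30 and M = 3, and events that never reach the baseline are not collected at all.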

The settings of M, T_N, and B are determined by the data type, the real-time detection capability of the device, the degree of impact of the events of interest, and the number of nodes needed to aggregate information. In Table 9, Graph Size is the scale of the event (the total number of participating user nodes), Num. is the number of events (samples) of that scale collected under a given index, and Mis. is the missed detection rate of events of that scale under that index. Different settings of M, T_N, and B affect the collection of nodes in events of different scales. In general, with the indicators in Table 9, events above the 500-node level are screened well: the missed detection rate is low, and the differences between indicators are slight. The number of nodes collected differs between indicators and is mostly determined by the values of M and T_N.

Table 9
The settings of M, T_N, and B

Graph Size	Total	T₃₀ ≤ 20 (M = 3)		T₃₀ ≤ 30 (M = 3)		T₃₀ ≤ 30 (M = 5)		T₂₀ ≤ 20 (M = 3)		T₂₀ ≤ 30 (M = 3)		T₂₀ ≤ 20 (M = 5)	
		Num.	Mis.	Num.	Mis.	Num.	Mis.	Num.	Mis.	Num.	Mis.	Num.	Mis.
[0, 500)	3290	1125	65.80%	1348	59%	845	74.30%	1703	48.20%	1922	41.60%	1246	62.10%
[500, 1000)	652	588	9.80%	613	6.00%	593	9.00%	621	4.80%	631	3.20%	612	6.10%
[1000, 5000)	603	582	3.50%	589	2.30%	583	3.30%	592	1.80%	595	1.30%	588	2.50%
[5000, + ∞)	118	117	0.90%	117	0.90%	117	0.90%	117	0.90%	117	0.90%	117	0.90%
[500, + ∞)	1373	1287	6.30%	1319	3.90%	1293	5.80%	1330	3.10%	1343	2.20%	1317	4.10%
Nodes:Samples		90:2082		90:2335		150:1884		60:2640		60:2888		100:2239	

Remark: ¹ The left side of ':' is the number of nodes collected from an event, and the right side is the number of events collected with that number of nodes.
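The Mis. column of Table 9 appears to be the share of events of a given scale that the index fails to collect, that is, (Total - Num.) / Total. The text does not state this formula, so it is an assumption; the sketch below merely checks it against three rows of the first indicator, T₃₀ ≤ 20 (M = 3). The names `missed_rate` and `ROWS` are introduced here for illustration.

```python
def missed_rate(total: int, collected: int) -> float:
    """Missed detection rate in percent (assumed formula, not given in the text)."""
    return 100.0 * (total - collected) / total


# (Total, Num., Mis. in percent) copied from Table 9, column T30 <= 20 (M = 3).
ROWS = {
    "[500, 1000)": (652, 588, 9.8),
    "[1000, 5000)": (603, 582, 3.5),
    "[500, +inf)": (1373, 1287, 6.3),
}

for size, (total, collected, reported) in ROWS.items():
    print(size, round(missed_rate(total, collected), 2), "reported:", reported)
```

The computed values agree with the reported ones to within rounding, which supports this reading of the column.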

5.4.3 Experimental Results

This paper selects the indicator T₃₀ ≤ 30 (M = 3) to perform early rumor detection on the Weibo dataset, and the newly collected dataset is named Weibo_burst. Basic statistics of the collected data are given in Table 10; the ratio of the two classes of samples is about 1:1.8. The comparison with each baseline model is shown in Table 11.

Table 10
Statistics of Weibo dataset under new data collection metrics

Statistic	Posts	Events	True Rumors	False Rumors	Avg.Posts/Event	Max Posts/Event	Min Posts/Event
Weibo_burst	282000	2667	951	1716	106	2190	90

Table 11

Early detection results on Weibo_burst dataset

Method	Class	Acc.	Prec.	Rec.	F ₁	min/epoch
RvNN ¹	F	0.901	0.923	0.927	0.922	5.37
MVAE	F	0.919	0.932	0.944	0.935	0.18
VAE-GCN	F	0.869	0.923	0.875	0.895	0.18
AE-GCN	F	0.875	0.913	0.898	0.901	0.18
GCN	F	0.863	0.894	0.898	0.891	0.18
Bi-GCN	F	0.893	0.899	0.931	0.911	0.60
EBGCN	F	0.880	0.868	0.764	0.810	0.47
RDEA	F	0.907	0.875	0.848	0.860	0.5+0.1
DSSM	F	0.933	0.948	0.951	0.947	4.15
UDSSM	F	0.943	0.954	0.957	0.954	3.97

As shown in Table 11, UDSSM outperforms several state-of-the-art static graph models on all classification metrics. We also tested UDSSM on the original Weibo dataset, but its improvement over the other models there is small, less than on the Weibo_burst and Twitter datasets. This shows that UDSSM can extract more sufficient information than other models in early detection or in events with fewer nodes. Moreover, under the new index, the training time of UDSSM is reduced from 75.13 minutes to 3.97 minutes, while the accuracy does not drop much, indicating that UDSSM has a significant advantage in early detection.

6 Conclusion

This paper proposes SSM, a static graph mixture model that effectively fuses spatiotemporal information. It simultaneously extracts the spatial structure information and the time-series information of rumor propagation, and combines both with textual semantic information to provide a better feature representation for downstream classification tasks. Results on two classification tasks over four datasets show that the proposed model outperforms several state-of-the-art static graph models overall. This paper also analyzes the rationality of using a two-layer GCN in the GCN module from the spatial and frequency domains, combined with data analysis, which differs from the purely experimental verification in other literature. In addition, a new data collection index for early detection, called outbreak rumor detection, is proposed. It describes early detection in a fine-grained manner, allows timely intervention in the evolution of rumor propagation, and screens events that spread quickly and have a more significant impact. The proposed data collection index also compensates for the model's main shortcoming by reducing its training time.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant No. 61803384.

References

1. Liu S.X., et al., Extended resource allocation index for link prediction of complex network, Physica A: Statistical Mechanics and its Applications (2017), 174–183.

2. Zhao, Resnick, Mei, Enquiring minds: Early detection of rumors in social media from enquiry posts, Proceedings of the 24th international conference on world wide web (2015), 1395–1405.

3. Kwon, Cha, Jung, Chen, Wang, Prominent features of rumor propagation in online social media, 2013 IEEE 13th International Conference on Data Mining (2013), 1103–1108.

4. Cai, Wu, Lv, Rumors detection in chinese via crowd responses, 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014) (2014), 912–917.

5. Yang, Liu, Yu, Yang, Automatic detection of rumor on sina weibo, Proceedings of the ACM SIGKDD Workshop on Mining Data Semantics (2012), 1–7.

6. …, Yang, Zhu K.Q., False rumors detection on sina weibo by propagation structures, 2015 IEEE 31st International Conference on Data Engineering (2015), 651–662.

7. Yang, Wang, Zhang, Emerging rumor identification for social media with hot topic detection, 2015 12th Web Information System and Application Conference (WISA) (2015), 53–58.

8. Takahashi, Igata, Rumor detection on twitter, The 6th International Conference on Soft Computing and Intelligent Systems, and The 13th International Symposium on Advanced Intelligence Systems (2012), 452–457.

9. Chen, Sui, Hu, Gong, Attention-residual network with CNN for rumor detection, Proceedings of the 28th ACM International Conference on Information and Knowledge Management (2019), 1121–1130.

10. …, Zhang, He, Chen, Microblog rumor detection based on comment sentiment and CNN-LSTM, Artificial Intelligence in China: Springer (2020), 148–156.

11. Miao, Rao, Jiang, Syntax and Sentiment Enhanced BERT for Earliest Rumor Detection, CCF International Conference on Natural Language Processing and Chinese Computing (2021), 570–582.

12. Singh J.P., Kumar, Rana N.P., Dwivedi Y.K., Attention-based LSTM network for rumor veracity estimation of tweets, Information Systems Frontiers (2020), 1–16.

13. Wang, Guo, Wang, Li, Tang, Rumor events detection from chinese microblogs via sentiments enhancement, IEEE Access (2019), 103000–103018.

14. Shu, Wang, Liu, Beyond news contents: The role of social context for fake news detection, Proceedings of the twelfth ACM International Conference on Web Search and Data Mining (2019), 312–320.

15. Shi H.R., et al., Collusive Anomalies Detection Based on Collaborative Markov Random Field, Intelligent Data Analysis (2021).

16. Ma, Gao, Wong K.-F., Rumor detection on twitter with tree-structured recursive neural networks, Association for Computational Linguistics (2018).

17. …, Cai, Chen, A rumor events detection method based on deep bidirectional GRU neural network, 2018 IEEE 3rd International Conference on Image, Vision and Computing (ICIVC) (2018), 755–759.

18. Bian, et al., Rumor detection on social media with bi-directional graph convolutional networks, Proceedings of the AAAI Conference on Artificial Intelligence 34(1) (2020), 549–556.

19. Lotfi, Mirzarezaee, Hosseinzadeh, Seydi, Detection of rumor conversations in Twitter using graph convolutional networks, Applied Intelligence 51(7) (2021), 4774–4787.

20. Peng Y.B., et al., Graph convolutional networks-based robustness optimization for scale-free internet of things, Intelligent Data Analysis (2021).

21. Zhou, Jain, Phoha V.V., Zafarani, Fake news early detection: A theory-driven model, Digital Threats: Research and Practice 1(2) (2020), 1–25.

22. Kwon, Cha, Jung, Rumor detection over varying time windows, PloS One 12(1) (2017), e0168344.

23. Chen, Li, Yin, Zhang, Call attention to rumors: Deep attention based recurrent neural networks for early rumor detection, Pacific-Asia conference on knowledge discovery and data mining (2018), 40–52.

24. Ma, et al., Detecting rumors from microblogs with recurrent neural networks, Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI 2016) (2016), 3818–3824.

25. Chen Y.-C., Liu Z.-Y., Kao H.-Y., Ikm at semeval-2017 task 8: Convolutional neural networks for stance detection and rumor verification, Proceedings of the 11th international workshop on semantic evaluation (SemEval-2017) (2017), 465–469.

26. Jin, Cao, Guo, Zhang, Luo, Multimodal fusion with recurrent neural networks for rumor detection on microblogs, Proceedings of the 25th ACM international conference on Multimedia (2017), 795–816.

27. Song, Shu, Wu, Temporally evolving graph neural network for fake news detection, Information Processing & Management 58(6) (2021), 102712.

28. …, Han, Wu X.-M., Deeper insights into graph convolutional networks for semi-supervised learning, Thirty-Second AAAI conference on artificial intelligence (2018).

29. …, Li, Tian, Sonobe, Kawarabayashi K.-I., Jegelka, Representation learning on graphs with jumping knowledge networks, International Conference on Machine Learning (2018), 5453–5462.

30. Vosoughi, Roy, Aral, The spread of true and false news online, Science 359(6380) (2018), 1146–1151.

31. …, Maehara, Revisiting graph neural networks: All we have is low-pass filters, arXiv preprint (2019).

32. Tremblay, Goncalves, Borgnat, Design of graph filters and filterbanks, Cooperative and Graph Signal Processing: Elsevier (2018), 299–324.

33. …, Gao, Wong K.-F., Detect rumor and stance jointly by neural multi-task learning, Companion proceedings of the web conference 2018 (2018), 585–593.

34. Kochkina, Liakata, Zubiaga, All-in-one: Multi-task learning for rumour verification, arXiv preprint (2018).

35. …, et al., An End-to-End Rumor Detection Model Based on Feature Aggregation, Complexity 2021 (2021), 1–16.

36. …, Li, Hu, Liu, Gleaning wisdom from the past: Early detection of emerging rumors in social media, Proceedings of the 2017 SIAM International Conference on Data Mining (2017), 9–107.

37. Ma, Gao, Wong K.-F., Detect rumors in microblog posts using propagation structure via kernel learning, Association for Computational Linguistics (2017).

38. Khattar, Goud J.S., Gupta, Varma, Mvae: Multimodal variational autoencoder for fake news detection, The World Wide Web Conference (2019), 2915–2921.

39. Lin, Zhang, Fu, A graph convolutional encoder and decoder model for rumor detection, 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA) (2020), 300–306.

40. Kipf T.N., Welling, Semi-supervised classification with graph convolutional networks, arXiv preprint (2016).

41. Wei, Hu, Zhou, Yue, Hu, Towards propagation uncertainty: Edge-enhanced bayesian graph convolutional networks for rumor detection, arXiv preprint (2021).

42. …, Li, Zhou, Yang, Rumor detection on social media with event augmentations, Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (2021), 2020–2024.
