Abstract
To enhance interval prediction of infectious diseases, an improved model is proposed that integrates neighborhood fuzzy information granulation (NNIG) with a spatial-temporal graph neural network (STGNN). The NNIG model efficiently extracts the most representative features from time series data and identifies the support upper and lower bounds. It transfers the time series data from the numerical level to the granular level and feeds the processed data into the STGNN for interval prediction. Finally, experiments are conducted on COVID-19 data for evaluation. The results demonstrate that the NNIG model outperforms the baseline models and offers a valuable approach for policy-making.
Keywords
Introduction
Infectious diseases are illnesses that result from the invasion of pathogenic microorganisms or their toxic products into a host organism. These microorganisms can be transmitted from an infected person, animal, or contaminated object to a healthy host, leading to the development of symptoms and disease. Infectious diseases pose a substantial global burden, exerting a profound effect on both public health systems and economies, with a disproportionate impact on vulnerable populations [31]. One instance of a major disease outbreak was the COVID-19 outbreak in China in December 2019 [17], which spread rapidly both within the country and internationally. As of February 17, 2020, the epidemic had extended its reach to all provinces in mainland China and 27 additional countries and regions, with over 70,000 confirmed cases [10]. The impact of infectious diseases on human populations is greatly influenced by the rate and extent of their dissemination across geographical regions [23]. Moreover, advanced transportation infrastructure and closer human interactions, coupled with travel as a powerful catalyst in disease emergence, present formidable challenges in the control and prevention of infectious diseases [34]. Therefore, establishing an efficient monitoring and early warning system grounded in current population flow is crucial. Such a system aims to reduce the mortality rate and economic losses resulting from major infectious diseases, while enhancing the public's capacity for prompt outbreak response.
Over the past few years, a wide array of models has been proposed for predicting the trend of infectious diseases, including compartment models, machine learning models and deep learning models. Compartment models include the SIR model [7] and the SEIR model [1]. In response to the evolving situation of the COVID-19 pandemic, improved compartment models have been developed to enhance prediction capabilities by accounting for factors such as vaccination [30] and the presence of symptoms [11]. He et al. [15] utilized the SEIR model to simulate the development of the COVID-19 pandemic in Hubei Province, China. In this model, multiple parameters were configured and incorporated into the SEIR equations. However, the chosen parameters could influence the prediction results, and the parameters changed over time in response to both temporal dynamics and government interventions. Moreover, compartment models, which rely on constant parameters to describe the evolution of a single population, are not sufficient to capture the global spread of a pandemic. Traditional time series models, such as ARIMA and SARIMA, are more appropriate for countries where the epidemic situation experiences less variability [33], but these models are limited in their analysis because they overlook the spatial correlation of diseases. Machine learning and deep learning models have since gained popularity in the study of point prediction. For example, Kuo and Fu [18] constructed a COVID-19 prediction model based on multiple machine learning techniques, which only focused on mobility data and omitted the consideration of population flow data between places. Yeung et al. [37] assessed the predictive performance of several machine learning models not explicitly tailored for time series data in forecasting the growth of confirmed infections. However, these models did not consider population mobility.
In recent years, deep learning models have emerged as powerful tools for capturing the complex correlations within time series data. Models based on recurrent neural networks and their extensions, in particular, have seen widespread use in a variety of applications [8, 28]. Zeroual et al. [40] presented a comparative study of five deep learning methods for predicting the number of new and recovered COVID-19 cases, which evaluated the performance of these techniques and shed light on their efficacy for epidemic forecasting. Verma et al. [32] designed a deep learning model to capture the intricate patterns of COVID-19 outbreak dynamics; the results of the recurrent neural network are noteworthy, but the model ignores the impact of human mobility on the spread of infectious diseases. Existing models, such as compartment models and machine learning models, likewise have not incorporated this crucial factor. There are currently many deep learning models for learning spatiotemporal features, such as graph neural networks and Convolutional LSTM (ConvLSTM). Gao et al. [13] used a graph attention neural network to capture spatial-temporal trends of disease dynamics. Li et al. [19] constructed a ConvLSTM network to learn the spatial-temporal characteristics of the nonlinear dynamic system governing floating offshore platform motion. Gatta et al. [14] combined LSTM with graph convolutional neural networks to infer the parameters of SIR and SIRD models. Deng et al. [9] developed a framework that combined graph structures and time series features to improve the accuracy of forecasting influenza-like illness. However, these papers primarily focus on numerical predictions with limited parameters. Despite the significant progress achieved in this field, most existing studies concentrate on the numerical level: they provide numerical values of expected outcomes, yet fail to provide a reliable forecast interval for decision-making purposes.
As a result, the control and management of infectious diseases are hindered by the lack of consideration for spatial mobility in current models. To effectively address the spread of infectious diseases, it is imperative to incorporate human mobility into the modeling process and to address uncertainties in the spread of emerging infectious diseases. This paper therefore introduces a spatial-temporal graph neural network to develop infectious disease interval prediction.
In recent years, there has been a trend of combining machine learning methods with uncertainty estimation techniques to address uncertainties effectively. Fuzzy information granulation (FIG) has emerged as a valuable tool for tackling the aforementioned issues. FIG can transform a numerical time series into a granulated time series (GTS), and predictions can then be made using the GTS. This approach significantly reduces the dimensionality of the data while lowering computational costs. Moreover, the outputs of GTS prediction models are also information granules, which remain meaningful entities. For example, Yang et al. [36] put forward a forecasting model for monthly runoff based on FIG and a back propagation neural network framework. Pang et al. [25] developed an interval prediction model for the usable duration of batteries according to FIG and linguistic description. Li et al. [20] combined FIG with a multi-objective optimization approach for wind speed interval prediction. In these FIG studies, the median of the sequence is employed as the representative feature, and the membership function is then utilized to determine the support upper and lower bounds. However, FIG theory has limitations in measuring the information content of each value in time series data, making it challenging to identify the value with the most significant information as the kernel. With the development of artificial intelligence and neural network models, there has been a growing trend of incorporating FIG into these models. For instance, Yin et al. [38] utilized a support vector machine (SVM) to model information granules extracted from the original time series data. Pan et al. [24] employed a least squares support vector machine (LSSVM) to make battery health trend predictions through the use of FIG. Li et al. [21] combined FIG with LSTM networks and attention mechanisms to achieve interval prediction of energy sources.
Currently, few studies integrate spatial-temporal neural networks with FIG to process spatial-temporal disease data and construct an interval prediction model. Additionally, FIG theory ignores the similarity of time series data in the fuzzy clustering process by treating the data points as independent individuals. Therefore, the distance between time points needs to be considered as a similarity measure.
In summary, to address the aforementioned limitations, this study uses the idea of clustering, that is, the time series data is regarded as an independent individual. Additionally, the concept of neighborhood rough set is utilized to identify the most informative feature as the kernel. Furthermore, fuzzy information granulation is introduced as the input layer of the spatial-temporal neural network to extract time series data information. This granular-level information can then be utilized for interval predictions within the spatial-temporal graph neural network.
The contributions of this paper can be summarized as follows:
(1) With the goal of enhancing the interpretability of the representative characteristic in the time series, the neighborhood fuzzy information granulation method is proposed, which improves fuzzy information granulation by incorporating the neighborhood concept.
(2) This study provides an interval prediction of infectious diseases by integrating fuzzy information granulation with spatial-temporal graph neural network theory. The original numerical-level data are transformed into granular-level data through neighborhood fuzzy information granulation, and the granular-level data are then used as input to a spatial-temporal graph neural network for interval estimation.
(3) The effectiveness and superiority of the proposed model are evaluated through multiple numerical and interval evaluation metrics. The results demonstrate that our model outperforms the baseline models, thereby validating its effectiveness in forecasting and analyzing disease trends.
The remainder of this paper is organized as follows. Section 2 provides an overview of the foundational theories of fuzzy information granulation and graph neural networks. Section 3 elaborates how the idea of neighborhood is used to obtain the most representative feature, the support upper bound and the support lower bound of a time series. Section 4 integrates the neighborhood fuzzy information granulation approach with a graph neural network framework that considers the spatial and temporal aspects of disease transmission to predict the spread of infectious diseases. Section 5 demonstrates the validity of the proposed method on COVID-19 datasets and explains the experimental results in detail. Finally, Section 6 outlines future work.
Background theory
This section discusses some basic theory related to the fuzzy information granulation and graph neural network.
Fuzzy information granulation
FIG, introduced by Zadeh [39], involves the segmentation of time series data into distinct informational granules. By breaking down complex problems into simpler, meaningful components, FIG organizes the original time series data into semantically significant informational granules. The granulation process transforms the time series prediction problem from the numerical level to the granular level, thereby facilitating a more comprehensive understanding of the issue.
FIG is the central theme of this study. The process of transforming the original numerical time series into granules through this method facilitates the requirements for interval prediction in data processing. Furthermore, FIG possesses the capability of revealing the inherent characteristics of the time series and reduces the computational demands of the model. The method involves a two-step process that starts with the discretization of the time series, followed by the granulation of time windows.
The discretization of time series refers to dividing a raw time series into several uniform and disjoint segments, known as time windows. This process constitutes the foundation of fuzzy information granulation, as it enables the representation of complex and continuous data. Given a time series $X = \{x_1, x_2, \cdots, x_n\}$, $n$ is the number of data samples contained in $X$. This study utilizes a constant-length segmentation approach to partition the time series into fixed-width windows, generating subsequences that consist of an identical count of sample data points: $x = \{x_1, x_2, \cdots, x_k\}$.
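The constant-length segmentation described above can be sketched in a few lines of Python. The function name `split_into_windows` and the decision to drop an incomplete trailing window are assumptions made for illustration; the paper does not state how a remainder is handled.

```python
from typing import List, Sequence


def split_into_windows(series: Sequence[float], k: int) -> List[List[float]]:
    """Split `series` into disjoint windows of exactly k samples each."""
    if k <= 0:
        raise ValueError("window length k must be positive")
    n_windows = len(series) // k  # assumption: an incomplete tail is discarded
    return [list(series[i * k:(i + 1) * k]) for i in range(n_windows)]


# Seven samples with k = 3 give two full windows; the last sample is dropped.
windows = split_into_windows([1, 2, 3, 4, 5, 6, 7], k=3)
```

Each returned window then serves as one unit for granulation.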
During the process of FIG for time series $x = \{x_1, x_2, \cdots, x_k\}$, each subset of the temporal window is subjected to processing to generate fuzzy granules. These fuzzy granules can be viewed as representations of unique information pertaining to the corresponding time window. Zadeh [39] provided a general expression of the fuzzy information granulation process:
In Equation (1), x is an element in the universe of discourse U (referred to in this paper as infectious disease time series data), and λ denotes the probability that the element x belongs to G. G is a fuzzy information set of U, which is defined by a membership function. The membership function is a fundamental characteristic of G and is used to determine the degree to which each element of U belongs to G. We introduce the triangular membership function A (x ; l, m, u) as follows:
In Equation (2), l is the support lower bound, u is the support upper bound, and m is a sound numerical representative of the time series.
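For readers who want to experiment with the membership function, the sketch below implements the standard textbook form of a triangular membership function with parameters l, m and u. Since Equation (2) is not reproduced in this text, the piecewise definition and the handling of the boundary points are assumptions based on the common definition rather than the paper's exact formula.

```python
def triangular_membership(x: float, l: float, m: float, u: float) -> float:
    """Standard triangular membership A(x; l, m, u), assuming l <= m <= u."""
    if x == m:
        return 1.0  # the kernel has full membership, also when l == m or m == u
    if l < x < m:
        return (x - l) / (m - l)  # rising edge on the left of the kernel
    if m < x < u:
        return (u - x) / (u - m)  # falling edge on the right of the kernel
    return 0.0  # on or outside the support bounds


# Illustrative values only: l = 10, m = 20, u = 40.
values = [triangular_membership(x, 10.0, 20.0, 40.0) for x in (10.0, 15.0, 20.0, 30.0, 45.0)]
```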
In order to form appropriate fuzzy information granules, two conditions need to be met: (1) the fuzzy granules should be able to reasonably represent the original information and accurately capture the underlying information in the time series; (2) the fuzzy granules should have a certain specificity. To achieve a balance between these two conditions and ensure accurate and appropriate encapsulation by FIG, a function $Q_A$ is devised with the aim of attaining an optimal solution.
Graph Neural Networks (GNNs) can manipulate and analyze data that are structured in the form of graphs. They are used in various applications, such as social network analysis, molecule property prediction, and recommendation systems, and can be classified into several categories. We opt for the Spatial-Temporal Graph Neural Network (STGNN) model for processing graph-structured data because it can model dynamic node inputs while simultaneously capturing the spatial-temporal dependencies of the graph and predicting future node values or labels [35]. In our study, the edge weights are determined by the population flow data, and the infection trend of each region is evaluated by considering the spatial-temporal dependence.
In our study, we aim to leverage the Diffusion Convolutional Recurrent Neural Network (DCRNN) [22] model to analyze the complex spatial-temporal data associated with the spread of infectious diseases. This model combines diffusion convolution with recurrent neural network and adopts the encoder-decoder architecture to capture the time dependence.
A common approach for illustrating the propagation of an epidemic is to model it as a weighted directed graph $G = (\nu, \varepsilon, W)$, where $\nu = (v_1, v_2, \cdots, v_N)$ is the set of nodes (geographical locations) and $N = |\nu|$ is the number of nodes. $\varepsilon$ represents the set of edges between the nodes. $W$ is the weighted adjacency matrix representing node proximity. These relationships are quantified by the edge weights, which reflect the magnitude of the interactions between the nodes. We denote the epidemiological data
DCRNN models a diffusion process within a graph. The model operates on the assumption that information can be propagated between adjacent nodes with a particular transition probability. Through multiple iterations of the diffusion process, the information distribution within the graph ultimately reaches an equilibrium state. The diffusion convolution formula is
To introduce time dependence, the framework utilizes a variant of RNN: GRU. Diffusion convolution is used instead of GRU matrix multiplication. The formulas are shown below.
In Equations (5)–(6), r(t), u(t) are reset gate and update gate at time t, respectively. H(t) denotes the output at time t. ★G denotes the diffusion convolution defined in Equation (4).
In addition, the gate adjusts the state of the GRUs and hidden cells using the following formula:
Θr★G, Θu★G, Θc★G are learnable parameters of the diffusion convolution filters.
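For reference, the gated update described above is commonly written as follows in the original DCRNN formulation [22]. This block is a reconstruction from that formulation, so the bias terms $b_r$, $b_u$, $b_c$, the candidate state $C^{(t)}$ and the concatenation notation $[\cdot,\cdot]$ are assumptions that may differ from the notation and numbering of the equations referred to in this section.

```latex
\begin{aligned}
r^{(t)} &= \sigma\left(\Theta_r \star_G \left[X^{(t)}, H^{(t-1)}\right] + b_r\right) \\
u^{(t)} &= \sigma\left(\Theta_u \star_G \left[X^{(t)}, H^{(t-1)}\right] + b_u\right) \\
C^{(t)} &= \tanh\left(\Theta_c \star_G \left[X^{(t)}, \left(r^{(t)} \odot H^{(t-1)}\right)\right] + b_c\right) \\
H^{(t)} &= u^{(t)} \odot H^{(t-1)} + \left(1 - u^{(t)}\right) \odot C^{(t)}
\end{aligned}
```

Here $X^{(t)}$ is the input at time t, $\sigma$ is the logistic sigmoid and $\odot$ is the element-wise product.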
The DCRNN model is utilized to address the complex spatial-temporal dependence present in spatial-temporal data. The network is trained with backpropagation through time to generate the desired future time series. As a result, it is suitable for a range of spatiotemporal prediction tasks and is a natural choice for our study.
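To make the diffusion step concrete, the following NumPy sketch applies a K-step bidirectional diffusion convolution to a graph signal, in the form commonly given for DCRNN [22]: a weighted sum of forward random-walk steps on W and reverse steps on its transpose. The function names and the `theta` layout (one row per diffusion step, one column per direction) are assumptions for illustration, and the filter weights would be learned in practice rather than fixed.

```python
import numpy as np


def row_normalize(A: np.ndarray) -> np.ndarray:
    """Turn a non-negative weight matrix into a random-walk transition matrix."""
    deg = A.sum(axis=1, keepdims=True)
    return A / np.where(deg > 0, deg, 1.0)  # rows without edges stay all-zero


def diffusion_convolution(X, W, theta) -> np.ndarray:
    """K-step bidirectional diffusion of a graph signal.

    X: (N,) or (N, P) node signal, W: (N, N) weighted adjacency matrix,
    theta: (K, 2) weights for the forward and reverse diffusion directions.
    """
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    theta = np.asarray(theta, dtype=float)
    P_fwd = row_normalize(W)    # out-degree normalized transitions
    P_bwd = row_normalize(W.T)  # in-degree normalized transitions
    fwd, bwd = X.copy(), X.copy()  # step k = 0 is the signal itself
    out = np.zeros_like(X)
    for k in range(theta.shape[0]):
        out += theta[k, 0] * fwd + theta[k, 1] * bwd
        fwd, bwd = P_fwd @ fwd, P_bwd @ bwd  # advance one diffusion step
    return out


# Two regions connected in both directions; only forward diffusion is weighted.
W = np.array([[0.0, 1.0], [1.0, 0.0]])
y = diffusion_convolution([1.0, 0.0], W, theta=[[1.0, 0.0], [1.0, 0.0]])
```

In the example, the signal starting at the first region reaches the second region after one diffusion step, so both entries of `y` are non-zero.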
In this section, the proposed model is described in detail, and the kernel, support upper bound and support lower bound of a time series are identified and calculated by this model.
Kernel recognition
In this study, fuzzy information granulation is primarily employed for interval construction, and the pivotal aspect of information granulation lies in the selection of effective membership functions, such as triangular, trapezoidal, or Gaussian membership functions [4, 26].
In conventional methods, given a time series $x = (x_1, x_2, \cdots, x_t)$, the triangular fuzzy function is utilized to generate a fuzzy set (l, m, u), where l and u represent the support lower bound and the support upper bound of the sequence, respectively, and m represents the median value of the sequence, which serves as the kernel of the triangular membership function.
To extract more information from time series data, the data points are commonly treated as entirely independent entities. In our study, the "kernel" is defined as the value containing the maximum information, that is, the optimal feature. However, because information is difficult to measure, traditional methods cannot easily establish that the median is the time point containing the most information. To overcome this limitation, we adopt the idea of calculating distances between features from neighborhood rough set theory to identify the kernel of the time series. We define the data distance between time points in a time series and treat it as a measure of information similarity. Specifically, as the distance between data points in a time series decreases, the information they contain becomes more similar. Conversely, as the distance increases, the information disparity in the sequence gradually expands. For example, suppose there are two time series t and t′ with t1 = 2, t2 = 3 and t1′ = 2, t2′ = 30. Clearly, the information disparity in t′ is more significant than in t. First, we arrange the values of the time points in ascending order to obtain the minimum value xmin and the maximum value xmax. The distance between xmin and xmax can be regarded as the maximum disparity in information within the time series. We then introduce the concepts of the non-neighborhood multiset of xmin and the non-neighborhood multiset of xmax. These multisets encompass the data points outside the neighborhood of xmin and xmax, respectively, and therefore capture the significant disparities in information between xmin and xmax. We consolidate a complete non-neighborhood multiset by merging the non-neighborhood multisets of xmin and xmax. This serves as the comprehensive non-neighborhood multiset representing the overall information of the time series.
To further identify representative features of the time series, we calculate the average avg of the whole non-neighborhood multiset. avg is considered the potential kernel with the most information content in the time series, because it integrates information from all data points. Finally, to determine the time point containing the most information, we compute the distance between avg and each value in the time series, using the absolute value of the distance to measure their information similarity: distance = {|x1 - avg|, ⋯ , |xk-1 - avg|, |xk - avg|}. The value with the smallest distance min (distance) has the least disparity in information from avg, which makes it the representative feature of the time series and hence the kernel. We define this method as neighborhood fuzzy information granulation (NNIG). The non-neighborhood multisets of xmin and xmax are constructed in the following steps.
First, arrange the values within the time window in ascending order to obtain $(x_1, x_2, \cdots, x_k)$.
(1) Find the non-neighborhood multisets of $x_1$ and $x_k$ in $x = (x_1, x_2, \ldots, x_k)$ sorted in ascending order. We first search for the non-neighborhood multiset of $x_1$. First, sort in ascending order and calculate the distance
(2) Compute the average avg of all values in the non-neighborhood multisets of $x_1$ and $x_k$,
(3) Determine which value is closest to the non-neighborhood multiset of the maximum value and the non-neighborhood multiset of the minimum value. We calculate the distance between each value in $x = (x_1, x_2, \ldots, x_k)$ and avg; the subscript of the minimum distance corresponds to the kernel $m = \arg\min(|avg - x_1|, \cdots, |avg - x_k|)$ of the sequence $x$.
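The three steps above can be sketched in Python as follows. Because the formula in step (1) is incomplete in this text, the gap rule used here (the largest jump between consecutive sorted distances, with values farther than that jump forming the non-neighborhood) is inferred from the seven-point worked example given later in this section and should be read as an assumption. The function names, the fallback for degenerate windows, and the choice to count each distinct value once when averaging (which matches the arithmetic of that example) are likewise assumptions.

```python
from typing import List, Sequence, Tuple


def non_neighborhood(others: Sequence[float], anchor: float) -> List[float]:
    """Values of `others` lying farther from `anchor` than the largest distance jump."""
    ranked = sorted(others, key=lambda v: abs(v - anchor))
    dists = [abs(v - anchor) for v in ranked]
    if len(dists) < 2:
        return list(ranked)  # assumption: too few points to define a gap
    gap = max(b - a for a, b in zip(dists, dists[1:]))
    return [v for v, d in zip(ranked, dists) if d > gap]


def nnig_kernel(window: Sequence[float]) -> Tuple[float, float]:
    """Return (kernel, avg) for one time window."""
    vals = sorted(window)
    far_from_min = non_neighborhood(vals[1:], vals[0])
    far_from_max = non_neighborhood(vals[:-1], vals[-1])
    # Assumption: each distinct value is counted once, as in the worked example.
    merged = set(far_from_min) | set(far_from_max)
    if not merged:
        merged = set(vals)  # assumption: fall back to the whole window
    avg = sum(merged) / len(merged)
    kernel = min(vals, key=lambda v: abs(v - avg))  # value closest to avg
    return kernel, avg


kernel, avg = nnig_kernel([18, 30, 20, 45, 37, 21, 40])
```

For the seven-point series, the non-neighborhood of the minimum 18 is {30, 37, 40, 45} and that of the maximum 45 is {18, 20, 21, 30}, so the value closest to their average is selected as the kernel.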
Support upper bound and support lower bound
Based on the kernel computed by the proposed model, the time series is partitioned into two sides, with the left-side data determining the support lower bound l and the right-side data determining the support upper bound u. The above process is shown in Fig. 1.

Fig. 1. Left and right-side data.
To address the rationality condition (1) within the reasonable granulation criterion, the corresponding membership degree A (x) is computed. The data on the left side can be obtained based on Equation (9).
For the data on the right sides, we have
For the condition (2) of appropriate fuzzy information granulation criteria, define sp (specificity):
In this study, the distance |m - l| between m and l is chosen as the specific function f2 of characteristic degree. So, sp for l is
For m > l, we get Equation (13)
Analogously, the function about u can be represented by means of Equation (14)
And u > m,
To satisfy conditions (1) and (2), the function is constructed as follows:
For the support upper bound u,
The Q (l) derivative is calculated as follows:
The Q (u) derivative is calculated as follows:
The support upper bound and support lower bound can be obtained by taking the maximum value of each derivative as follows:
To facilitate understanding, we take an example: the data at seven time points are x = {18, 30, 20, 45, 37, 21, 40}, and the kernel, support upper bound and support lower bound are calculated based on the model proposed above. The sequence is first sorted in ascending order as 18, 20, 21, 30, 37, 40, 45. The minimum value is 18 and the maximum is 45, so we search for the non-neighborhood multisets of these two values. For the minimum value, we calculate the distance between each value and the minimum: △x = (2, 3, 12, 19, 22, 27). We then calculate the differences between consecutive distances, △x′ = (1, 9, 7, 3, 5), and obtain the gap value as their maximum: gap = 9. In Δx, the values greater than gap are 12, 19, 22 and 27, so the non-neighborhood multiset of 18 is {30, 37, 40, 45}. Next, we calculate the distance between each value and 45: △x = (27, 25, 24, 15, 8, 5) and △x′ = (2, 1, 9, 7, 3). The gap value for 45 is
The mean over the merged non-neighborhood multisets is avg = (18 + 20 + 21 + 30 + 37 + 40 + 45)/7 = 30.14, and the minimum distance from the values of x to avg is $\min(\Delta x_{avg}) = 0.14$. So, the kernel of x is 30. The support lower bound l can be obtained by,
The flowchart depicting the computation of the support upper bound, kernel, and support lower bound is demonstrated in Fig. 2.

Fig. 2. Overview of the NNIG method.
In this section, we endeavor to explicate the interval prediction of time series through the implementation of DCRNN based spatial-temporal graph neural network, incorporating the principles of neighborhood information granulation. By constructing the nearest neighbor information granules, we aim to uncover the granular nature of the underlying time series trends. The process is comprehensively illustrated in Fig. 3.

Fig. 3. Research framework.
The time series data are partitioned into sub-time series segments with a fixed time window. The kernel, support upper bound and support lower bound of each sub-time series are extracted with the granulation method described above. We then construct the kernel, support upper bound and support lower bound multisets and feed them into the DCRNN.
Development of spatial-temporal interval prediction networks
The DCRNN model has the capability of processing spatiotemporal data. We aim to improve the accuracy of model interval predictions through the perspective of combining fuzzy information granulation with STGNN. To accomplish this, three STGNNs are established to learn three separate fuzzy information granulation datasets, each normalized prior to being input into the network.
In summary, interval prediction is conducted for a comprehensive assessment of the spread trend of infectious diseases, as shown in Fig. 3.
Experiments
Dataset and preprocessing
To validate the efficacy of the proposed method for interval prediction, this study employs the daily confirmed cases in the COVID-19 data¹. For the spatial data, to accurately represent population mobility, this study uses Facebook social connectedness [2] to measure mobility between two regions.
To improve the accuracy and reliability of the prediction results, data preprocessing is performed, including data cleaning and data normalization. In the first step, abnormal data in the dataset are eliminated through data cleaning process and missing data are filled through linear interpolation. In the second step, considering that variations in the input values have significant impacts on the parameter optimization efficiency during model training process, data normalization is conducted to reduce model prediction errors and improve model training efficiency.
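The two preprocessing steps can be illustrated with a short NumPy sketch. Linear interpolation of missing values follows the description above; the paper does not state which normalization is applied, so min-max scaling to [0, 1] is used here purely as an assumption, and the function names are hypothetical.

```python
import numpy as np


def fill_missing_linear(series) -> np.ndarray:
    """Fill NaN entries by linear interpolation between observed neighbors."""
    y = np.asarray(series, dtype=float).copy()
    missing = np.isnan(y)
    idx = np.arange(len(y))
    # np.interp needs at least one observed point; an all-NaN series is not handled.
    y[missing] = np.interp(idx[missing], idx[~missing], y[~missing])
    return y


def min_max_normalize(series) -> np.ndarray:
    """Scale a series to [0, 1]; a constant series maps to all zeros."""
    y = np.asarray(series, dtype=float)
    lo, hi = y.min(), y.max()
    if hi == lo:
        return np.zeros_like(y)
    return (y - lo) / (hi - lo)


filled = fill_missing_linear([10.0, float("nan"), 30.0, 50.0])
scaled = min_max_normalize(filled)
```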
Based on the raw data, we filter out the COVID-19 epidemic time series data of countries that contain population mobility data. After data preprocessing, we obtain 841 days of continuous time series data of 42 countries (regions) and population mobility data of each region, constructing a COVID-19 spatial-temporal dataset.
We assume a fuzzification time window k = 5 and, through the method of neighborhood fuzzy granulation, obtain the support lower bound set, kernel set, and support upper bound set. Next, we set the input sequence length $N_X = 3$ and the output sequence length $N_Y = 1$, meaning we use three days of historical data to predict the data for the fourth day. Consequently, we obtain three input matrices and output matrices, denoted as X and Y respectively, which are then fed into the spatiotemporal prediction model. The specific steps are shown in Fig. 4. We construct three spatial-temporal graph neural networks separately, which receive input from the upper-bound granule set, lower-bound granule set, and kernel granule set, respectively. The shape of the input matrices for each neural network is $[T, N, L_{in}]$ and the shape of the output matrices is $[T, N, L_{out}]$. T is the number of matrices,

Fig. 4. Granulation process.
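As a rough sketch of how one granule set (for example, the kernel set) could be arranged into input and output matrices of shape [T, N, L_in] and [T, N, L_out], the snippet below slides a window over a granule array of shape (time steps, regions). The stride of one step, the reading of T as the number of samples, and the name `make_samples` are assumptions, since the text does not spell these details out.

```python
import numpy as np


def make_samples(granules, n_in: int = 3, n_out: int = 1):
    """Build supervised samples from a (steps, N) array of granule values.

    Returns X with shape (T, N, n_in) and Y with shape (T, N, n_out).
    """
    granules = np.asarray(granules, dtype=float)
    T = granules.shape[0] - n_in - n_out + 1  # assumption: stride of one step
    if T <= 0:
        raise ValueError("series too short for the requested window lengths")
    X = np.stack([granules[t:t + n_in].T for t in range(T)])
    Y = np.stack([granules[t + n_in:t + n_in + n_out].T for t in range(T)])
    return X, Y


# Six granule steps for two regions, using three steps to predict the next one.
X, Y = make_samples(np.arange(12).reshape(6, 2), n_in=3, n_out=1)
```

The same routine would be applied separately to the lower-bound, kernel and upper-bound granule sets, giving one (X, Y) pair per network.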
In this study, we first undertake numerical prediction, evaluating the performance of the DCRNN model in comparison to other models. Subsequently, using the best-performing model, we conduct interval prediction. The performance of these predictions is assessed with several evaluation indices. For numerical prediction, the evaluation metrics are Mean Absolute Error (MAE) and Root Mean Square Error (RMSE).
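Both point-prediction metrics have standard definitions, shown here as a small self-contained sketch; the function names and sample values are illustrative only.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root Mean Square Error: penalizes large errors more strongly than MAE."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


errors = (mae([1, 2, 3], [1, 2, 5]), rmse([1, 2, 3], [1, 2, 5]))
```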
In this study, the interval forecasting performance is evaluated by prediction interval coverage probability (PICP) [27], prediction interval normalized root mean square width (PINRW) and prediction interval normalized average deviation (PINAD) [29]. PICP represents the probability that the actual data fall within the forecast interval: each target contributes 1 if the true value is contained within the forecast interval, and 0 otherwise. An ideal PICP value of 100% signifies complete coverage of all targets by the prediction interval. PINRW calculates the normalized root mean square width of the prediction interval, and PINAD quantifies the prediction error of interval prediction, with values close to 0 indicating minimal prediction error. The formulas for evaluating the interval prediction performance are shown below.
In the Equation (26), A is the difference (ymax - ymin) between the maximum and minimum value of the actual value.
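The three interval metrics can be computed as in the sketch below. The equations themselves are not reproduced in this text, so the code follows the forms commonly used in the interval forecasting literature: coverage as the mean of the 0/1 indicator, width as a root mean square normalized by A, and deviation as the average distance of uncovered targets from the nearest bound, also normalized by A. Treat these exact forms, and the function name, as assumptions.

```python
import numpy as np


def interval_metrics(y, lower, upper):
    """Return (PICP, PINRW, PINAD) for targets y and interval bounds."""
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    A = y.max() - y.min()  # range of the actual values, used for normalization
    covered = (y >= lower) & (y <= upper)
    picp = covered.mean()
    pinrw = np.sqrt(np.mean((upper - lower) ** 2)) / A
    # Distance to the violated bound; zero when the target lies inside the interval.
    deviation = np.where(y < lower, lower - y, np.where(y > upper, y - upper, 0.0))
    pinad = deviation.mean() / A
    return picp, pinrw, pinad


# Illustrative values: two of the four targets fall outside their intervals.
picp, pinrw, pinad = interval_metrics(
    y=[10, 20, 30, 40], lower=[8, 22, 25, 35], upper=[12, 26, 35, 38]
)
```

A narrower interval lowers PINRW but tends to lower PICP as well, which is why the metrics are reported together.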
In interval prediction, the comparison models used are primarily the FIG and the Min-Max models. These models serve as benchmark models for interval prediction.
FIG: In the experiment, we take the median value as the kernel and calculate the support upper bound and support lower bound by Equations (20) and (21) [25].
Min-Max: This method takes the maximum value in the time series as the upper bound, the minimum value as the lower bound, and the median as the kernel, providing a simple and direct interval prediction method.
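The Min-Max baseline is simple enough to state directly in code. The sketch below granulates a single window into (lower bound, kernel, upper bound); the function name is hypothetical, and the window is the seven-point series from the worked example in the method section.

```python
import statistics
from typing import Sequence, Tuple


def min_max_granule(window: Sequence[float]) -> Tuple[float, float, float]:
    """Min-Max baseline: minimum as lower bound, median as kernel, maximum as upper bound."""
    return min(window), statistics.median(window), max(window)


granule = min_max_granule([18, 30, 20, 45, 37, 21, 40])
```

Because the bounds are the window extremes, this baseline produces the widest interval that the window data can support.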
The experiments in this paper comprise three parts: a numerical prediction experiment, a performance experiment and an ablation experiment. To investigate the efficacy of spatiotemporal graph neural network models in predicting the trend of infectious diseases, the DCRNN model is compared to several advanced methods, including RNN, LSTM [16], GRU [6], XGBoost [5], Random Forest (RF) [3], Attention-LSTM (A-LSTM) and Attention-GRU (A-GRU). The performance experiment compares the proposed NNIG method with the baseline models in the prediction of COVID-19 data, and we analyze the influence of the prediction horizon and the historical sequence length on interval prediction. Finally, the ablation experiment verifies the effectiveness of the NNIG method by comparing the NNIG model with the baseline models under different fuzzy time windows. Accordingly, the fuzzy time window is set to T = 11, 9, 7 and 5 in the experiments.
Numerical prediction experiments
To explore the performance of multiple models in short-term and long-term predictions, we set the input and output sequence lengths to ($N_X = 3$, $N_Y = 1$), ($N_X = 10$, $N_Y = 5$) and ($N_X = 20$, $N_Y = 7$). The results are shown in Table 1. The numerical prediction results indicate that the DCRNN model outperformed its competitors, demonstrating its potential as a promising tool for the spatial-temporal numerical prediction problem. In conclusion, the DCRNN model can effectively predict the values and lays a solid foundation for the subsequent interval prediction.
COVID-19 Results of the numerical prediction
We configured the proposed model under the same settings as the baseline models. The fuzzy time window is set to 11, and the historical sequence length N_X is set to 7, 5, and 3. The prediction horizon N_Y is set to 3 and 1.
(1) Impact of prediction horizon
The interval forecasting performance for 3-day-ahead and 1-day-ahead prediction is shown in Fig. 5. A general declining trend is observed across the evaluation metrics as the prediction horizon increases. In terms of PICP, the NNIG model declines by 12.45% and the FIG model by 15.47%, while the Min-Max model declines only slightly, by 2.32%. This indicates that, as the prediction horizon increases, the coverage probability of the NNIG and FIG intervals drops more sharply than that of the Min-Max intervals. Both PINAD and PINRW also show a declining trend, suggesting an improvement in prediction accuracy.
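For reference, the three interval metrics can be sketched as follows. The formulas used here are the definitions commonly found in the interval prediction literature (coverage ratio for PICP, range-normalized root-mean-square width for PINRW, range-normalized average deviation outside the interval for PINAD). They are an assumption and may differ in detail from the definitions given earlier in the paper; the data are hypothetical.

```python
import math


def picp(y, lower, upper):
    """Fraction of true values that fall inside [lower, upper]."""
    hits = sum(1 for t, lo, hi in zip(y, lower, upper) if lo <= t <= hi)
    return hits / len(y)


def pinrw(y, lower, upper):
    """Root-mean-square interval width, normalized by the range of y."""
    value_range = max(y) - min(y)
    mean_sq = sum((hi - lo) ** 2 for lo, hi in zip(lower, upper)) / len(y)
    return math.sqrt(mean_sq) / value_range


def pinad(y, lower, upper):
    """Average distance of true values outside the interval, normalized by the range of y."""
    value_range = max(y) - min(y)
    deviation = sum(max(lo - t, 0.0, t - hi) for t, lo, hi in zip(y, lower, upper))
    return deviation / (len(y) * value_range)


# Hypothetical true values and predicted bounds.
y_true = [10, 20, 30, 40]
lower = [8, 22, 25, 35]
upper = [12, 26, 35, 38]
print(picp(y_true, lower, upper), pinrw(y_true, lower, upper), pinad(y_true, lower, upper))
```

Under these definitions, a narrower interval lowers PINRW but tends to lower PICP as well, which is the trade-off discussed in this subsection.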

Comparison of forecasting performance for prediction horizon in terms of PICP, PINAD, and PINRW.
We observe that PINAD and PINRW show a continuous downward trend as the prediction horizon increases. This means that, as the prediction horizon extends, the prediction interval becomes more compact, thus improving the accuracy of prediction. However, when the prediction interval shrinks and the time series shows a non-stationary trend, it is more difficult for the true value to fall into the prediction interval, resulting in a significant decline in the PICP value. To examine this behavior, we selected the forecast results of a certain country for detailed visualization and analysis, using the configuration N_X = 7, N_Y = 1. From the visualization in Fig. 6, we can observe that the NNIG model performs well in the interval prediction of non-stationary time series and captures sudden changes in the series more accurately. Although some true values fall outside the predicted interval, the model still predicts the corresponding trend effectively.

Comparison of experimental results in a certain country.
(2) Impact of historical sequence length
From Fig. 7, it is evident that the historical input sequence length has a relatively small impact on the interval prediction results. This could be attributed to the effectiveness of the three interval models in capturing temporal patterns and trends within the time series when dealing with longer input sequences, which reduces their sensitivity to the length of the input sequence. As the historical sequence length decreases, some models are affected in specific cases, but the metrics show no significant overall correlation with the sequence length. Across the evaluation metrics, the NNIG model demonstrates a slight advantage over the comparison models. It effectively addresses the complex relationships within time series data, enhancing predictive accuracy. This emphasizes the robust performance of the NNIG method across varying historical sequence lengths and supports its flexibility in practical applications.

Comparison of forecasting performance for historical sequence length in terms of PICP, PINAD, and PINRW.

Ablation experiment results.
In summary, by analyzing the interval prediction results for different historical sequence lengths and prediction horizons, we observed that as the prediction horizon increases, the prediction interval shrinks significantly, which leads to a decline in the PICP, PINAD, and PINRW metrics. In contrast, reducing the historical sequence length did not show a significant correlation with the metrics. In our experiments, the proposed NNIG model successfully handles the interval prediction of non-stationary time series and accurately captures the trend of the future series. Moreover, the NNIG model is superior to the comparison models on all three metrics, which not only significantly improves the accuracy of interval prediction but also enhances the interpretability of the kernel in the time series. The model can identify and select the most representative features in the time series, which shows that the proposed model is effective in spatial-temporal interval prediction.
To assess whether the intervals constructed by the three methods differ significantly, the lower bounds were subtracted from the upper bounds obtained by each method to form interval widths. First, we employed the Shapiro-Wilk test, a statistical method used to determine whether a sample follows a normal distribution. The results indicated that all obtained samples deviated from the normal distribution. We therefore proceeded with the Wilcoxon signed-rank test, a paired-sample non-parametric test, to assess the significance of the differences between the interval widths. The test revealed significant differences between the intervals obtained by NNIG and those obtained by the FIG and Min-Max methods. Between the intervals generated by the FIG and Min-Max methods, however, no significant differences were observed for certain countries. This suggests that, for some countries, the intervals derived from the FIG and Min-Max methods carry similar information.
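The two-step testing procedure can be sketched with SciPy as follows. The helper name `compare_interval_widths`, the significance level of 0.05, and the toy interval widths are assumptions for illustration; the paper does not state which software or threshold was used.

```python
from scipy.stats import shapiro, wilcoxon


def compare_interval_widths(widths_a, widths_b, alpha=0.05):
    """Check normality of each sample, then run a paired Wilcoxon signed-rank test.

    widths_a, widths_b: paired interval widths (upper bound minus lower bound)
    produced by two methods for the same time steps.
    """
    _, p_normal_a = shapiro(widths_a)   # Shapiro-Wilk normality test
    _, p_normal_b = shapiro(widths_b)
    _, p_diff = wilcoxon(widths_a, widths_b)  # paired, non-parametric
    return {
        "normal_a": bool(p_normal_a > alpha),
        "normal_b": bool(p_normal_b > alpha),
        "p_value": float(p_diff),
        "significant": bool(p_diff < alpha),
    }


# Hypothetical paired interval widths from two granulation methods.
widths_a = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0, 1.4, 1.2, 0.7]
widths_b = [2.05, 2.5, 1.8, 2.6, 2.95, 1.75, 2.4, 3.2, 2.3, 1.55]
result = compare_interval_widths(widths_a, widths_b)
print(result)
```

The Wilcoxon signed-rank test is used here regardless of the normality outcome, mirroring the paper's case in which all samples deviated from normality; with normally distributed differences, a paired t-test would be the usual alternative.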
To further verify the effectiveness of the NNIG model, we select the best fuzzy time window from T = 11, 9, 7, and 5.
The PICP results obtained by the NNIG model are superior to those obtained by the FIG and Min-Max models. The mean PICP of NNIG was 72.98%, compared with 59.86% for FIG and 58.15% for Min-Max, which performed worst. Within each model, the fuzzy time window value has relatively little influence on PICP, and no obvious trend or correlation is shown. This indicates that the effect of changing the fuzzy time window is relatively balanced across models in terms of PICP. The PINAD results show that the values of NNIG are all smaller than those of the comparison models, indicating better prediction performance. As the fuzzy time window decreases, PINRW and PINAD generally increase, indicating that when the time series is short, it is easier to find the most representative features. This finding further supports the superiority of the NNIG method in processing short time series data.
Based on the performance analysis and the ablation experiment results, the proposed NNIG method shows obvious advantages and potential in the interval prediction of emerging infectious diseases.
Conclusion
In this research, we present a novel approach for time series interval prediction of infectious diseases. This is achieved by developing a spatial-temporal graph neural network based on fuzzy information granulation. The proposed method contributes to the field of interval prediction by offering a new perspective and strategy for tackling the challenge of infectious disease prediction. To enhance the interpretability of fuzzy information granulation, we introduce a neighborhood-based optimization method for the granulation step. The final model, a spatial-temporal graph neural network based on neighborhood fuzzy information granulation, is demonstrated to be superior to the baseline models on the evaluation criteria through extensive experimentation and analysis. Notably, this work implements time series prediction at the granular level rather than the numerical level, which significantly reduces the dimensionality of the problem and provides a more comprehensive and accurate picture of infectious disease dynamics. As a preliminary effort, this work highlights the potential for further improvement by enriching the infectious disease transmission network map with additional multi-attribute information such as flight information and demographics. Evaluating the impact of this supplementary data on prediction outcomes is another key aspect for future investigation.
