Establish a trend fuzzy information granule based short-term forecasting with long-association and k-medoids clustering

Abstract

Existing short-term forecasting methods for time series face two challenges: capturing the associations in data and avoiding cumulative errors. To tackle these challenges, we turn to the fuzzy information granule based model. The rule used in this model is the fuzzy association rule (FAR), which is constructed from a premise granule to a consequent granule at consecutive time periods and therefore describes the short-association in data. However, in real time series, another association, namely the association between a premise granule and a consequent granule at non-consecutive time periods, frequently exists, especially in periodic and seasonal time series, and the existing FAR cannot express it. To describe it, the fuzzy long-association rule (FLAR) is proposed in this study. This kind of rule reflects the influence of an antecedent trend on a consequent trend, where these trends are described by fuzzy information granules at non-consecutive time periods. Thus, the FLAR can describe the long-association in data. Correspondingly, the existing FAR is called the fuzzy short-association rule (FSAR). Combining the existing FSAR with the FLAR, a novel short-term forecasting model is presented. This model forecasts at the granular level, which reduces the cumulative errors in short-term prediction. The prediction results of this model are calculated from the available FARs selected by the k-medoids clustering based rule selection algorithm, which makes them logical and accurate. Experiments comparing this model with existing models verify its better forecasting performance.

Keywords

Trend fuzzy information granule; fuzzy long-association rule; long-association; k-medoids clustering based rule selection algorithm; short-term forecasting

1 Introduction

Short-term forecasting of time series is a worthwhile research topic, as it is relevant to many research and application fields. Various intelligent models have been proposed for this topic, including the neural network model [1], the support vector regression model [2], and the autoregressive integrated moving average model [3].

The neural network (NN) model is an artificial intelligence model with strong abilities for self-learning and non-linear fitting [1]. Extensions of this model, including the artificial neural network model [4], the long short-term memory neural network model [5] and the convolutional recurrent neural network model [6], have been built for short-term forecasting. These models capture the correlations in data well and can be applied flexibly. However, their complexity and low interpretability limit their adoption.

The support vector regression (SVR) model is a prediction model developed from the support vector machine model [7]. Good prediction results can be obtained from this model because of its non-linear nature and single global minimum. However, the forecasting performance of SVR is heavily influenced by its parameters and kernel function, as analyzed in [8 –10].

The autoregressive integrated moving average (ARIMA) model consists of three components: autoregressive (AR), integration (I), and moving average (MA), where AR and MA calculate future values from lagged variables and past errors, respectively [11, 12]. The main advantage of this model is its simple form; however, its strict requirement of data stationarity prevents it from being applied in some fields.

Other forecasting models, such as exponential smoothing [13], fuzzy time series based model [14] and fuzzy cognitive map based model [15], have been analyzed in depth by many scholars.

The superiority of the above-mentioned models in one-step forecasting has been proven through experiments. However, because errors accumulate when one-step prediction is iterated successively, their performance in short-term forecasting deteriorates markedly. For this reason, the fuzzy information granule based model has become popular in the short-term forecasting of time series [16, 17].

The fuzzy information granule (FIG) is constructed on data with similar characteristics, which conforms to the human habit of treating similar objects as a whole. Thus, forecasting based on it is highly interpretable. Lu et al. constructed FIGs on a stock time series; these granules express the fluctuations and trend of the data [18, 19]. Li et al. granulated time series into a series of FIGs in short-term temperature prediction [20]. The linear fuzzy information granule was proposed by Yang et al. for Mackey–Glass forecasting [21]. In these models, the fuzzy information granules are of equal size, and the fuzzy association rules (FARs) established on them for describing the data association are between two consecutive equal-size time periods.

Lu et al. applied unequal-size FIGs in stock time series forecasting [22]. Wang et al. created an unequal-size granular time series for forecasting by considering the temporal information of data [23, 24]. Guo et al. combined FIGs of varying sizes with hidden Markov models to make predictions [25]. Additionally, a granular computing method for long-term forecasting was proposed in [26] with the aid of a clustering algorithm based on the dynamic time warping distance. The fuzzy information granules used in these models are of unequal size, and the mined fuzzy association rules reflect the associations from an antecedent trend to a consequent trend at two consecutive unequal-size time periods. It should be pointed out that all the mined FARs in both [18 –21] and [22 –26] are used to calculate the final short-term predictions.

Analysis of the aforementioned fuzzy information granule based forecasting models reveals two faults:

1) the fuzzy association rules are constructed from two granules at consecutive time periods, so they only capture the association of data from two consecutive time periods (called short-association);

2) all the constructed rules are used for forecasting, while the availability of each rule is neglected.

Besides the short-association, another kind of association exists in real time series: the association of data from two non-consecutive time periods (called long-association). An example is the long-association between a continuously rising temperature and a short-term falling temperature in a seasonal temperature time series. The existing FARs cannot reflect this association, so it is ignored by many forecasting models. Moreover, different association rules have different effects on future predictions.

To overcome faults 1) and 2), a new kind of association rule and a fuzzy inference system are put forward. The new rule is the fuzzy long-association rule (FLAR), constructed from two granules at non-consecutive time periods, where “long” refers to the time distance between the premise granule and the consequent granule. FLARs reflect the long-association between the antecedent trend and the consequent trend in the data. With such rules, time series analysis no longer neglects long-association, so the characteristics of the data are captured more accurately. Correspondingly, the existing rule is called the fuzzy short-association rule (FSAR).

Besides the FLAR, a fuzzy inference system is built on k-medoids clustering. This system selects the association rules used for forecasting by considering the clustering relationship between the premise granule and the current granule; the selected rules are called the available FARs. Logical and accurate short-term predictions are calculated through this system.

On the basis of fuzzy long-association rules, a forecasting model is constructed for short-term time series prediction. The fuzzy reasoning and forecasting of this model are carried out using two kinds of rules: the fuzzy short-association rule and the fuzzy long-association rule. Together, these rules characterize the distribution and association of data. The model calculates the short-term predictions from multiple historical granules rather than by iterating one-step predictions, which avoids the cumulative errors that occur in the existing models. Moreover, the predictions are calculated from the available FARs, which results in accurate and reasonable prediction results.

According to the above analysis, the main contributions of this paper can be summarized as follows:

The fuzzy long-association rule is constructed to describe the long-association in time series, which is ignored by other models;

A k-medoids clustering based rule selection algorithm is proposed, which helps to obtain logical predictions by selecting the available fuzzy association rules for forecasting;

A fuzzy long-association rule based short-term forecasting model is put forward, which contributes to obtaining accurate predictions by avoiding cumulative errors.

The rest of this paper is arranged as follows. Some preliminaries on the fuzzy information granule and k-medoids clustering are given in Section 2. On this basis, a new kind of association rule and its prediction model are introduced in Sections 3 and 4, respectively. This model uses the k-medoids clustering based algorithm to select the available rules for forecasting. To demonstrate the effectiveness of the fuzzy long-association rule and the corresponding forecasting model, five experiments are analyzed in Section 5. The conclusions drawn from these experiments are described in Section 6.

2 Preliminary data

After recalling the fuzzy information granule based model in time series forecasting [27 –29], some related works about fuzzy information granule and k-medoids cluster are introduced in this section, which are necessary in the discussion of this paper.

2.1 Linear fuzzy information granule

The information granule was proposed by Zadeh in 1979 [30]. Following this, various representations of it, such as fuzzy sets and rough sets, were proposed [31 –33]. An information granule is constructed under two requirements: semantic soundness and justifiable granularity. These two requirements imply that the information granule is highly interpretable and captures the information in data effectively. Thus, the information granule and many improved granules have been applied in knowledge representation and time series analysis [34 –36].

In this paper, a special information granule is adopted: the linear fuzzy information granule (LFIG), whose core is a linear function of time that matches the distribution and trend characteristics of the data [21]. Each LFIG LG(k, b, σ, T) carries a membership function such that the membership degree of value x at time t belonging to LG is expressed as: $f (x; kt + b) = \exp \left( - \frac{(x - (kt + b))^{2}}{2 σ^{2}} \right), t \in [1, T]$ (1) where μ(t) = kt + b is a time-dependent core line, k, b ∈ R are, respectively, the slope and intercept of the core line, σ measures the dispersion degree of x around μ(t), and T is the time span or the size of granule LG. Such an LFIG depicts a significant feature of a time series: the time-dependent change of data.

For a given time series X ={ x₁, x₂, ⋯ , x_n }, a series of successive subsequences are obtained by partitioning X under a given time granularity (T). Each subsequence is represented as an LFIG according to Equation (1). In this process, X is transformed to a granular time series, namely a series of LFIGs. The above process forms linear fuzzy information granulation. More detail about this granulation is introduced in the following steps:

Step 1: partition the given time series into a series of subsequences

For time series X, q subsequences are obtained as shown in Equation (2), where T is the time granularity given in advance and $q = [\frac{n}{T}]$ .

$\begin{matrix} \{x_{1}, x_{2}, \dots, x_{T}\}, \{x_{T + 1}, x_{T + 2}, \dots, x_{2 T}\}, \dots, \\ \{x_{(q - 1) T + 1}, x_{(q - 1) T + 2}, \dots, x_{n}\} \end{matrix}$ (2)

Step 2: build a linear fuzzy information granule on each subsequence

According to Equation (1), an LFIG is established by linear regression on each subsequence, and then X is transformed to a granular time series G ={ G₁, G₂, ⋯ , G_q }, where G_l = LG(k_l, b_l, σ_l, T) (l = 1, 2, ⋯ , q).

Through establishing granular time series G, the trend characteristics of time series X can be reflected accurately.
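The two granulation steps above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the function names `granulate` and `membership` are hypothetical, σ is assumed to be the standard deviation of the regression residuals (the text does not state how σ is estimated), and keeping a trailing window shorter than T when it has at least two points is likewise an assumption.

```python
import numpy as np


def granulate(x, T):
    """Turn a series into a list of LFIG parameter tuples (k, b, sigma, size)."""
    x = np.asarray(x, dtype=float)
    granules = []
    for start in range(0, len(x), T):
        segment = x[start:start + T]
        if len(segment) < 2:
            break  # assumption: a single trailing point cannot define a trend line
        t = np.arange(1, len(segment) + 1)   # local time index t = 1, ..., T
        k, b = np.polyfit(t, segment, 1)     # least-squares core line k*t + b
        residuals = segment - (k * t + b)
        sigma = float(np.std(residuals))     # assumption: sigma = residual std
        granules.append((float(k), float(b), sigma, len(segment)))
    return granules


def membership(x_value, t, k, b, sigma):
    """Membership degree of x_value at local time t, following Equation (1).

    Requires sigma > 0.
    """
    return float(np.exp(-((x_value - (k * t + b)) ** 2) / (2.0 * sigma ** 2)))
```

For instance, `granulate(series, 15)` would return one tuple per half-month window when the series is sampled daily, and each tuple can be read as LG(k, b, σ, T).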

2.2 The k-medoids cluster

K-medoids clustering is a partition-based clustering method that takes the object closest to the center of a cluster as the clustering center [37, 38]. It is a commonly used clustering method with high precision and robustness. In [39], Tavakkol et al. increased the accuracy of k-medoids clustering by considering the knowledge of uncertain data. Ushakov et al. put forward a high-quality clustering algorithm for huge-scale datasets by employing a nearest neighbor strategy to approximate the dissimilarity matrix [40]. Combining it with the minimum spanning tree, Huang et al. proposed an improved k-medoids algorithm to cluster different kinds of spectral lines and determine the spectral line of interest [41].

In the k-medoids algorithm, the similarity of two objects is usually measured by the Euclidean distance. Here, since the objects to be clustered are LFIGs, the similarity (the distance) between two LFIGs LG₁(k₁, b₁, σ₁, T) and LG₂(k₂, b₂, σ₂, T) is instead calculated through the l₁-type Hausdorff distance studied in [21, 43], described below: $D_{1} ({LG}_{1}, {LG}_{2}) = \begin{cases} \frac{1}{2} Δk T^{2} + \left(Δb + \frac{\sqrt{2 π}}{2} Δσ\right) T, & \text{if } t^{*} < 0, \\ - Δk {t^{*}}^{2} + 2 Δb t^{*} + T \left(- Δb + \frac{T}{2} Δk + \frac{\sqrt{2 π}}{2} Δσ\right), & \text{if } t^{*} \in [1, T], \\ \frac{1}{2} Δk T^{2} + \left(Δb + \frac{\sqrt{2 π}}{2} Δσ\right) T, & \text{if } t^{*} > T, \end{cases}$ (3) where $t^{*} = \frac{b_{2} - b_{1}}{k_{1} - k_{2}}$ indicates the intersection of the two center lines μ₁(t) = k₁t + b₁ and μ₂(t) = k₂t + b₂; Δb = abs(b₁ - b₂), Δσ = abs(σ₁ - σ₂), Δk = abs(k₁ - k₂).
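As a rough illustration of the clustering step, the following Python sketch runs an alternating (Voronoi-iteration) variant of k-medoids on a precomputed distance matrix. It is not the authors' implementation: the function name `k_medoids`, the random initialization, and the stopping rule are assumptions, and the matrix `D` is assumed to hold the pairwise distances between granules (for example, values of D₁ from Equation (3)).

```python
import numpy as np


def k_medoids(D, c, max_iter=100, seed=0):
    """Cluster n objects into c groups from an n x n distance matrix D.

    Returns (labels, medoids): labels[i] is the cluster index of object i,
    medoids[j] is the index of the object representing cluster j.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    rng = np.random.default_rng(seed)
    medoids = rng.choice(n, size=c, replace=False)  # assumed random start
    for _ in range(max_iter):
        # Assignment step: each object joins its nearest medoid.
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for j in range(c):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue  # keep the old medoid for an empty cluster
            # Update step: pick the member with the smallest total
            # distance to the other members of its cluster.
            costs = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[j] = members[np.argmin(costs)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return labels, medoids
```

Working on a distance matrix rather than on coordinates is a deliberate choice here: it lets any granule distance be plugged in without changing the clustering code.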

2.3 Evaluation indexes

Three evaluation indexes, mean absolute error (MAE), average forecasting error rate (AFER) and root mean squared error (RMSE), are used to evaluate the forecasting performance of each model. These indexes, which have been used in many studies, are defined as follows: $MAE = \frac{1}{n} \sum_{i = 1}^{n} | x_{i}^{*} - x_{i} |$, $AFER = \frac{1}{n} \sum_{i = 1}^{n} \frac{| x_{i}^{*} - x_{i} |}{x_{i}}$, $RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(x_{i}^{*} - x_{i})}^{2}}$ (4) where x_i represents the actual value at moment i and $x_{i}^{*}$ is the corresponding prediction. The smaller the values of MAE, AFER and RMSE, the better the forecasting performance.
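The three indexes in Equation (4) translate directly into a few lines of Python. The sketch below is illustrative only; the function names are hypothetical, and AFER is computed exactly as written in Equation (4), which assumes the actual values x_i are positive.

```python
import numpy as np


def mae(actual, predicted):
    """Mean absolute error."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(p - a)))


def afer(actual, predicted):
    """Average forecasting error rate; assumes all actual values are positive."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(p - a) / a))


def rmse(actual, predicted):
    """Root mean squared error."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((p - a) ** 2)))
```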

3 Fuzzy long-association rule

According to the analysis in the Introduction, it is of great significance to establish association rules that depict the characteristics of time series and support short-term forecasting [26, 42]. Thus, in this section, a new kind of fuzzy association rule is constructed in accordance with the features of time series.

Suppose the given time series is X ={ x₁, x₂, ⋯ , x_n }, and the granular time series established on it is G = {G(t) |t = … , 0, 1, ⋯}.

In the existing forecasting models, a fuzzy association rule in the form of G(t - 1) → G(t) is constructed to describe the association between the premise granule (G(t - 1)) and the consequent granule (G(t)) from consecutive time periods (i.e., the (t - 1)th time period and the tth time period); it reflects the influence of the premise observations on the consequent observations at consecutive time periods. However, this kind of association rule neglects a special association in time series, namely the association between two granules from non-consecutive time periods, for example granules G(t - 3) and G(t). Such association exists widely in real time series, especially in seasonal and periodic time series. To characterize this association, we propose a new kind of association rule, defined as follows:

Definition 1. Let {G(t) |t = 1, 2, ⋯ } be a granular time series. If granule G(t) is determined by granule G(t - p), and this relationship can be expressed as $G (t - p) \to G (t)$ then we call it a fuzzy association rule (FAR), where p is called the time lag of FAR and p ⩾ 1, p ∈ Z₊.

Depending on the value of the time lag p, the fuzzy association rule falls into one of the following two cases:

1) p = 1

In this case, the FAR in the form G(t - 1) → G(t) is called a fuzzy short-association rule (FSAR), whose antecedent and consequent are from two consecutive time periods. The FSAR describes the association between the premise granule and the consequent granule at consecutive time periods (short-association).

2) p > 1

In this case, the FAR in the form G(t - p) → G(t) is called a fuzzy long-association rule (FLAR), whose antecedent and consequent are from two non-consecutive time periods. The FLAR describes the association between the premise granule and the consequent granule at non-consecutive time periods (long-association).

Based on Definition 1, Fig. 1 expresses two fuzzy association rules: a fuzzy short-association rule G(t - 1) → G(t) and a fuzzy long-association rule G(t - 3) → G(t). By the comparison in Fig. 1, we can see that the FSAR reflects the influence of the historical observations at consecutive moments on the current observations, while the FLAR reflects the influence of the historical observations at non-consecutive moments on the current observations, where G(t - 3), G(t - 1) and G(t) are granules constructed on time series X.

Fig. 1

Two kinds of fuzzy association rules.

Table 1

The abbreviation of important concepts used in this paper

Notation	Meaning
FAR	Fuzzy association rule
FSAR	Fuzzy short-association rule
FLAR	Fuzzy long-association rule
p	The time lag of fuzzy association rule

Remark 1: The proposed fuzzy long-association rule takes the long-association of time series into consideration. Thus, the fault 1) analyzed in the Introduction section can be solved.

To facilitate the use of the above concepts, their abbreviations are listed in Table 1.

Yang et al. constructed fuzzy association rules G(t - 1) → G(t) for short-term forecasting, which are FSARs [20, 21]. Their forecasting only considers the short-association in time series, while both short-association and long-association are considered in our study. That is to say, in our proposed model both FSARs and FLARs are established to depict the associations used in time series analysis and forecasting.

4 Fuzzy long-association rule based short-term forecasting

After mining various fuzzy association rules in a time series, the next task is to make predictions from them. Since prediction is implemented on different time periods, selecting the appropriate association rules is important for accurate forecasting. Therefore, an association rule selection algorithm is presented in Section 4.1. Combined with this algorithm, a novel short-term forecasting model is put forward in Section 4.2. The details are introduced below.

Let the given time series be X ={ x₁, x₂, ⋯ , x_n }, the granular time series established on X be G = {G(t) |t = … , 0, 1, ⋯}.

4.1 K-medoids clustering based rule selection algorithm

For forecasting on different time periods, it is key to select the appropriate association rules, which is conducive to obtaining logical and accurate prediction results. The appropriate association rules are referred to as the available fuzzy association rules, defined in Definition 2.

Definition 2. Suppose the forecasting is to be realized at the tth time period, the (t - p)th granule is G(t - p). A fuzzy association rule G(k) → G(k + p) is called an available fuzzy association rule, if G(k) and G(t - p) belong to the same cluster, where G(k) and G(t - p) are clustered by the k-medoids clustering.

According to the value of p, the available FARs can be divided into two categories: the available FSARs and the available FLARs.

In Definition 2, that granules G(k) and G(t - p) belong to the same cluster means that they have similar trend characteristics. It is then natural to infer that the two granules G(k + p) and G(t) share the same trend features. Thus, we use G(k) → G(k + p) as an available FAR for inference and forecasting.

On the basis of Definition 2, an available fuzzy association rule selection algorithm is presented, introduced in Algorithm 1, where G(k) and G(t - p) are clustered by the k-medoids clustering. Note that the distance used in the k-medoids cluster is l₁-type Hausdorff distance (expressed in Equation (3)), which measures the similarity between two granules.

Algorithm 1 K-medoids clustering based available fuzzy association rule selection algorithm
Inputs: The FAR base R_p constructed according to Definition 1 and the current granule G(t - p) (p = 1, 2, …, M).
Outputs: The available FAR base $R_{p}^{*}$ .
1: Initialization: $R_{p}^{*} = \emptyset$ (p = 1, 2, …, M);
2: for p = 1 : M
3:   for each rule in the FAR base R_p: G(k) → G(k + p)
4:     if G(k) and G(t - p) belong to the same cluster, then
5:       Add G(k) → G(k + p) to the available FAR base $R_{p}^{*}$ ;
6:     else
7:       Keep G(k) → G(k + p) in FAR base R_p;
8:     end if
9:   end for
10: end for
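The selection loop of Algorithm 1 can be sketched in Python as below. This is a minimal sketch under stated assumptions rather than the authors' code: the function name `select_available_rules` is hypothetical, granules are indexed from 0, `labels` is assumed to hold the cluster label of every known granule G_0, …, G_{q-1}, and the period to be forecast is the next one (index q), so that the current granule for lag p is G_{q-p}.

```python
def select_available_rules(labels, p):
    """Return the available FARs with time lag p as index pairs (k, k + p).

    labels[i] is the cluster label of the known granule G_i (0-based).
    A rule G_k -> G_{k+p} is available when its premise G_k lies in the
    same cluster as the current granule G_{q-p}, where q = len(labels).
    """
    q = len(labels)
    if not 1 <= p <= q:
        raise ValueError("time lag p must satisfy 1 <= p <= number of granules")
    current = q - p  # index of the current granule G(t - p)
    # Only rules whose consequent is already observed (k + p <= q - 1) exist.
    return [(k, k + p) for k in range(q - p) if labels[k] == labels[current]]
```

An empty list corresponds to the case in which no available rule exists for that lag.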

Figure 2 expresses the construction of FAR bases. In this process, if granules $G_{i_{1}}, G_{i_{2}}, \dots, G_{i_{j}}$ belong to the same cluster, the FARs whose premise granule is one of $G_{i_{1}}, G_{i_{2}}, \dots, G_{i_{j}}$, namely $G_{i_{1}} \to G_{(i_{1} + p)}, G_{i_{2}} \to G_{(i_{2} + p)}, \dots, G_{i_{j}} \to G_{(i_{j} + p)}$ (5) form a FAR base (i = 1, 2, …, c).

Fig. 2

The framework for constructing fuzzy association rule bases.

On the basis of the FAR bases established in Fig. 2, Algorithm 1 selects the available FARs as follows: if the current granule G_k and $G_{i_{1}}, G_{i_{2}}, \dots, G_{i_{j}}$ belong to the same cluster, the FARs in Equation (5) are the available FARs.

By the selection algorithm of the available FARs, if p = 1, the available FSARs can be mined; if p > 1, the available FLARs can be mined.

Remark 2: The k-medoids clustering based selection algorithm of the available fuzzy association rule considers the availability of each rule in forecasting based on the clustering relationship between the premise granule of the FAR and the current granule. Therefore, fault 2) analyzed in the Introduction can be dealt with, and logical prediction results can be inferred.

4.2 Fuzzy long-association rule based short-term forecasting model

Based on the newly proposed fuzzy association rules and the available association rule selection algorithm, a novel short-term forecasting model is presented here, namely the fuzzy long-association rule based forecasting model.

In this model, two kinds of FARs are established for calculating predictions: fuzzy short-association rule and fuzzy long-association rule. These rules take into account not only the short-association used in the existing models, but also the long-association frequently presented in time series. Thus, they can accurately capture the correlation characteristics of the data.

After constructing FARs, a novel fuzzy inference system is proposed. This system calculates prediction results from the available FARs selected by the k-medoids clustering based rule selection algorithm, which makes for obtaining accurate and reasonable predictions.

The new proposed model consists of four steps, as introduced below and shown in Fig. 3.

Fig. 3

The framework of the fuzzy long-association rule based short-term forecasting model.

Step 1: transform the given time series to a series of granules

Step 2: mine FARs (FLARs and FSARs) on the obtained granular time series

Step 3: select the available FSARs and FLARs

Step 4: forecast from these available rules

Suppose the forecasting is carried out on a given time series X ={ x₁, x₂, …, x_n }, its process is described as the following steps:

Step 1: transform time series X to a granular time series

By the linear fuzzy information granulation described in Section 2.1, X is granulated to a granular time series G ={ G₁, G₂, …, G_q }, G_l = LG(k_l, b_l, σ_l, T) (l = 1, 2, …, q), where T is the time granularity given in advance and $q = [\frac{n}{T}]$ .

Step 2: mine the FARs on granular time series G

In the proposed forecasting model, two types of FARs, namely FLARs and FSARs, are established according to Definition 1:

An FSAR is in the form G_(i-1) → G_i, where G(t - 1) = G_(i-1), G(t) = G_i. Let R₁ be the set of all the established FSARs;

An FLAR is in the form G_(i-p) → G_i, where G(t - p) = G_(i-p), G(t) = G_i. Let R_p (p = 2, 3, …, M) be the set of all the established FLARs with time lag p.

Remark 3: The maximum time lag M determines the number of types of FLAR, and is given in advance.
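Mining the rule bases R₁, …, R_M in Step 2 amounts to pairing every granule with the granule p periods later. A small Python sketch, with the hypothetical function name `mine_far_bases` and 0-based granule indices as assumptions, could look like this:

```python
def mine_far_bases(q, M):
    """Return {p: [(k, k + p), ...]} for granules G_0..G_{q-1} and lags 1..M.

    R_1 holds the FSARs; R_p with p >= 2 holds the FLARs with time lag p.
    """
    if M < 1:
        raise ValueError("the maximum time lag M must be at least 1")
    return {p: [(k, k + p) for k in range(q - p)] for p in range(1, M + 1)}
```

Storing rules as index pairs keeps the rule base independent of how the granules themselves are represented.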

Step 3: select the available FARs

Two types of FARs are mined in Step 2, thus two types of available FARs are selected according to Algorithm 1, which are the available FSARs and the available FLARs.

Here, take the forecasting of values at the tth time period ($\{x_{T \times (t - 1) + 1}, x_{T \times (t - 1) + 2}, \dots, x_{T \times t}\}$) as an example; the historical granules constructed on the (t - 1)th and (t - p)th time periods, G_(t-1) and G_(t-p), are used to select the available FARs and to forecast.

According to granule G_(t-1), N1 available FSARs are selected from R₁ and collected in $R_{1}^{*}$ by Algorithm 1: $G_{i_{1}} \to G_{(i_{1} + 1)}, G_{i_{2}} \to G_{(i_{2} + 1)}, \dots, G_{i_{N 1}} \to G_{(i_{N 1} + 1)}$

According to granule G_(t-p), Np available FLARs are selected from R_p and collected in $R_{p}^{*}$ by Algorithm 1 (p = 2, 3, ⋯ , M): $G_{j_{1}} \to G_{(j_{1} + p)}, G_{j_{2}} \to G_{(j_{2} + p)}, \dots, G_{j_{Np}} \to G_{(j_{Np} + p)}$

Next, we forecast from these selected rules.

Step 4: forecast from the selected available FARs

From the two selected types of available FARs, two kinds of predictions of the value $x_{T \times (t - 1) + r}$ (r = 1, 2, ⋯ , T) are obtained, namely $X_{T \times (t - 1) + r}^{1}$ and $\{X_{T \times (t - 1) + r}^{p} \mid p = 2, 3, \dots, M\}$ . They are calculated by the following two sub-steps:

Step 4.1: forecast $X_{T \times (t - 1) + r}^{1}$ from the available FSARs in $R_{1}^{*}$

Based on the number of available FSARs N1, the prediction $X_{T \times (t - 1) + r}^{1}$ of $x_{T \times (t - 1) + r}$ is calculated according to the following two cases (r = 1, 2, ⋯ , T):

Case 1: N1 ≠ 0

With the selected available FSARs $G_{i_{h}} \to G_{(i_{h} + 1)}$, where $G_{(i_{h} + 1)} = LG(k_{(i_{h} + 1)}, b_{(i_{h} + 1)}, σ_{(i_{h} + 1)}, T)$ (h = 1, 2, ⋯ , N1), the prediction is $X_{T \times (t - 1) + r}^{1} = \frac{X_{T \times (t - 1) + r}^{1} (1) + X_{T \times (t - 1) + r}^{1} (2) + \dots + X_{T \times (t - 1) + r}^{1} (N 1)}{N 1}$ where $X_{T \times (t - 1) + r}^{1} (h) = k_{(i_{h} + 1)} \times r + b_{(i_{h} + 1)}, r \in [1, T]$ (h = 1, 2, …, N1), i.e., each available rule contributes the core line of its consequent granule.

Case 2: N1 = 0

This case means there is no available FSAR for forecasting, and the prediction is set as $X_{T \times (t - 1) + r}^{1} = k_{(t - 1)} \times r + b_{(t - 1)}$ calculated from the current data.

Step 4.2: forecast $X_{T \times (t - 1) + r}^{p}$ from the available FLARs in $R_{p}^{*} (p = 2, 3, \dots, M)$

Based on the number of available FLARs Np, the prediction $X_{T \times (t - 1) + r}^{p}$ of $x_{T \times (t - 1) + r}$ is calculated according to the following two cases (r = 1, 2, ⋯ , T):

Case 1: Np ≠ 0

With the selected available FLARs $G_{j_{h}} \to G_{(j_{h} + p)}$, where $G_{(j_{h} + p)} = LG(k_{(j_{h} + p)}, b_{(j_{h} + p)}, σ_{(j_{h} + p)}, T)$ (h = 1, 2, ⋯ , Np), the prediction is $X_{T \times (t - 1) + r}^{p} = \frac{X_{T \times (t - 1) + r}^{p} (1) + X_{T \times (t - 1) + r}^{p} (2) + \dots + X_{T \times (t - 1) + r}^{p} (Np)}{Np}$ where $X_{T \times (t - 1) + r}^{p} (h) = k_{(j_{h} + p)} \times r + b_{(j_{h} + p)}, r \in [1, T]$ (h = 1, 2, …, Np).

Case 2: Np = 0

This case means there is no available FLAR for forecasting, and the prediction is set as $X_{T \times (t - 1) + r}^{p} = 0$ (p = 2, 3, ⋯ , M).

Step 4.3: calculate the final prediction of x_T×(t-1)+r(r = 1, 2, ⋯ , T)

The final prediction of $x_{T \times (t - 1) + r}$ is calculated as the average of the predictions obtained from the available FSARs and FLARs: $x_{T \times (t - 1) + r}^{*} = \frac{X_{T \times (t - 1) + r}^{1} + \sum_{p = 2, p \notin W}^{M} X_{T \times (t - 1) + r}^{p}}{M - | W |}$ where $W = \{p \mid \text{no available FLAR in } R_{p}^{*}, \text{ i.e., } X_{T \times (t - 1) + r}^{p} = 0, 1 < p ⩽ M\}$ .

The final prediction here reflects the functions of short-association and long-association in time series.
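Steps 4.1 to 4.3 can be condensed into one small Python function. The sketch below is illustrative and rests on assumptions: the name `forecast_point` and its argument layout are hypothetical, each available rule is represented only by the core-line parameters (k, b) of its consequent granule, and lags without any available FLAR are simply left out of the average, so the number of averaged predictions is one FSAR-based prediction plus one per lag that has an available FLAR.

```python
def forecast_point(r, current_line, consequents):
    """Predict the r-th value (r = 1..T) of the period to be forecast.

    current_line: (k, b) of the current granule G_{t-1}; used as the
        fallback when no FSAR is available (Case 2 of Step 4.1).
    consequents: dict mapping each lag p to a list of (k, b) pairs, the
        core lines of the consequent granules of the available rules.
    """
    predictions = []

    # Step 4.1: prediction from the available FSARs (p = 1).
    fsar_lines = consequents.get(1, [])
    if fsar_lines:
        predictions.append(sum(k * r + b for k, b in fsar_lines) / len(fsar_lines))
    else:
        k, b = current_line
        predictions.append(k * r + b)

    # Step 4.2: one prediction per lag p >= 2 that has available FLARs.
    for p in sorted(consequents):
        lines = consequents[p]
        if p >= 2 and lines:
            predictions.append(sum(k * r + b for k, b in lines) / len(lines))

    # Step 4.3: average over the predictions that actually exist.
    return sum(predictions) / len(predictions)
```

For example, `forecast_point(12, (0.0, 0.0), {1: [(0.123, 28.935)]})` evaluates the single core line 0.123 × 12 + 28.935, since no FLAR prediction enters the average.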

5 Experimental analysis

In the fuzzy long-association rule based short-term forecasting model, the correlations in time series are described by FSARs and FLARs. These rules characterize the influence on the consequent trend from the antecedent trend at the consecutive time periods and non-consecutive time periods. When a forecasting is made on them, the prediction results reflect the function of both short-association and long-association in time series.

To verify the advantages of FARs (FSARs and FLARs) and the FLAR based forecasting model, the experimental studies are organized as follows: Section 5.1 describes the 5 datasets used in the experiments; Section 5.2 introduces 7 existing models whose forecasting performance is compared with that of the proposed model; Section 5.3 describes an example that illustrates the forecasting process of the proposed model; finally, 4 experimental analyses are given in Sections 5.4 and 5.5.

5.1 Datasets

The 5 time series used in experiments are given as follows. They have different trend characteristics:

JD.com stock price time series (JD) [44]: the highest price from March 13, 2001 to August 4, 2003;

Melbourne temperature time series (MEL-Temp) [21]: the daily maximum temperature from January 1, 1981 to May 30, 1987;

General Electric Company stock price time series (GE) [44]: the highest price from March 13, 2001 to August 4, 2003;

Northrop Grumman Corporation stock price time series (NOC) [44]: the highest price from August 27, 2014 to July 6, 2017;

Intel Corporation stock price time series (INTC) [44]: the highest price from January 4, 2012 to October 15, 2014.

5.2 Seven comparative models

For verifying the superiority of the proposed model, we compare it with the following 7 existing models:

Fuzzy association rule based forecasting model 1 (FAR-1) [21]: it implements forecasting with all the constructed fuzzy association rules;

Fuzzy association rule based forecasting model 2 (FAR-2) [20]: it implements forecasting with the available fuzzy association rules;

The FARs adopted in models FAR-1 and FAR-2 are fuzzy short-association rules, which ignore the long-associations in time series.

Fuzzy cognitive map based forecasting model (FCM-Model) [45]: it implements forecasting with fuzzy cognitive maps;

Fuzzy time series based forecasting model (FTS-Model) [46]: it implements forecasting with fuzzy time series;

The FCM-Model and FTS-Model transform time series to fuzzy cognitive map and fuzzy time series for forecasting respectively, which have high interpretability.

•Exponential smoothing (ES): the function of this model is as follows: $x_{t}^{*} = α x_{(t - 1)}^{*} + (1 - α) x_{(t - 1)}$ where $x_{(t - 1)}^{*}$ is the prediction of x_(t-1) and α is the smoothing factor (0 < α < 1);

•Autoregressive integrated moving average (ARIMA): model ARIMA(p, d, q) produces the time series with the mean μ by the following function: $φ (B) {(1 - B)}^{d} (x_{t} - μ) = θ (B) α_{t}$ where x_t and α_t are, respectively, the actual value and random error at time t, φ(B) and θ(B) are polynomials in B of degree p and q, B denotes the lag operator, and d indicates the degree of ordinary differencing;

•Nonlinear autoregressive (NAR) neural network: it has h delayed inputs and is expressed as $x_{t}^{*} = F(x_{t-1}, x_{t-2}, \dots, x_{t-h})$; the number of hidden layers is 10.
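As an illustration of the simplest of these baselines, the exponential smoothing recursion can be sketched in Python as follows. This is a minimal sketch, not the implementation used in the experiments: the function name `exponential_smoothing` and the choice of initialising the first prediction with the first observation are assumptions, since the text does not state how the recursion is started.

```python
from typing import List, Optional, Sequence


def exponential_smoothing(
    x: Sequence[float], alpha: float, x0_star: Optional[float] = None
) -> List[float]:
    """Return predictions x*_0, ..., x*_n for observations x_0, ..., x_{n-1}.

    Uses the recursion written in the ES item:
        x*_t = alpha * x*_{t-1} + (1 - alpha) * x_{t-1}
    The first prediction x*_0 defaults to x_0 (an assumption; the text
    does not specify the initial value).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if len(x) == 0:
        raise ValueError("x must contain at least one observation")
    preds = [x[0] if x0_star is None else x0_star]
    for t in range(1, len(x) + 1):
        # Blend the previous prediction with the previous observation.
        preds.append(alpha * preds[t - 1] + (1.0 - alpha) * x[t - 1])
    return preds


# Toy data for illustration only (not taken from the paper's datasets).
forecasts = exponential_smoothing([10.0, 20.0, 30.0], alpha=0.5)
```

The last element of `forecasts` is the one-step-ahead prediction for the first unobserved point. Multi-step forecasts from such a model are usually obtained by feeding predictions back in as inputs, which is where cumulative errors arise.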

5.3 Experiment on the JD time series

To illustrate the forecasting process of the proposed model, we study an example on the JD time series X = {x₁, x₂, …, x₄₆₅}. In this example, the ratio of training data to test data is 2:1, and the FARs take the forms G(t - 1) → G(t) and G(t - p) → G(t) (p = 2, 3, 4). The time granularity used in granulation is set to half a month, namely T = 15, and the number of clustering centers is 4.

Through Steps 1 and 2, a granular time series G = {G₁, G₂, …, G₃₁}, one kind of FSAR and three kinds of FLARs are constructed on the time series X. The forecasting of X is implemented from these FARs.
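The bookkeeping behind these rule forms can be sketched as follows. The sketch only enumerates granule index pairs (premise, consequent) for each time lag; it does not build the granules themselves. The function name `enumerate_rules` is hypothetical, and the choice of 30 known granules is an assumption made for illustration (the granules preceding G₃₁).

```python
from typing import Dict, List, Tuple

T = 15                       # time granularity (half a month)
N_POINTS = 465               # length of the JD time series
N_GRANULES = N_POINTS // T   # number of granules in G


def enumerate_rules(n_known: int, max_lag: int) -> Dict[int, List[Tuple[int, int]]]:
    """Map each time lag to the index pairs (t - lag, t) of candidate rules.

    Lag 1 gives the FSAR form G(t - 1) -> G(t); lags 2..max_lag give the
    FLAR forms G(t - p) -> G(t). Granule indices are 1-based.
    """
    rules: Dict[int, List[Tuple[int, int]]] = {}
    for lag in range(1, max_lag + 1):
        rules[lag] = [(t - lag, t) for t in range(lag + 1, n_known + 1)]
    return rules


# Assumption: granules G1..G30 are known when G31 is being forecast.
candidate_rules = enumerate_rules(n_known=30, max_lag=4)
```

Here `candidate_rules[1]` holds the FSAR candidates and `candidate_rules[2]` to `candidate_rules[4]` hold the three kinds of FLAR candidates; the rule selection algorithm then keeps only the available ones.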

For ease of understanding, we take the forecasting of x₄₆₂ (462 = 15×30 + 12) as an example; this point is forecast within granule G₃₁. Given the historical granules G₂₇, G₂₈, G₂₉, G₃₀, the available FSARs and FLARs mined by Step 3 (Algorithm 1) are shown in Table 2.

Table 2
The selected available FSARs and FLARs in forecasting granule G₃₁

FAR	Available FAR
FSAR: G(t - 1) → G(t)	G₁₉ → G₂₀, G₂₀ → G₂₁
FLAR: G(t - 2) → G(t)	G₁₉ → G₂₁
FLAR: G(t - 3) → G(t)	No available FLAR
FLAR: G(t - 4) → G(t)	No available FLAR

The prediction of x₄₆₂ is computed from the FARs in Table 2 through the following sub-steps:

Step 4.1: forecast from the available FSARs

In the FSAR G₁₉ → G₂₀, the premise granule G₁₉ = LG(0.065, 27.582, 0.285, 15) captures a slow increase in the data and the consequent granule G₂₀ = LG(0.123, 28.935, 0.180, 15) reflects a rapid growth; therefore, the rule G₁₉ → G₂₀ expresses the influence of the slow-increase trend on the rapid-growth trend.

From G₁₉ → G₂₀, since x₄₆₂ corresponds to the 12th moment of granule G₃₁, the prediction of x₄₆₂ is $X_{462}^{1}(1) = 0.123 \times 12 + 28.935 = 30.411$.

From G₂₀ → G₂₁, the prediction of x₄₆₂ is $X_{462}^{1}(2) = 0.010 \times 12 + 30.723 = 30.843$.

Thus, $X_{462}^{1} = \frac{30.411 + 30.843}{2} = 30.627$.

Steps 4.2-4.4: forecast from the available FLARs

From the FLAR G₁₉ → G₂₁, the prediction of x₄₆₂ is $X_{462}^{2} = 0.010 \times 12 + 30.723 = 30.843$. Since no FLAR is available for the forms G(t - 3) → G(t) and G(t - 4) → G(t), the predictions $X_{462}^{3} = 0$ and $X_{462}^{4} = 0$.

Step 4.5: calculate the final prediction of x₄₆₂ by averaging the nonzero predictions: $x_{462}^{*} = \frac{30.627 + 30.843}{2} = 30.735$.

Through the above steps, the predictions of the other data can be calculated in the same way.
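The arithmetic of Steps 4.1 to 4.5 can be reproduced with a short Python sketch. It is a sketch under stated assumptions rather than the authors' implementation: the helper names `locate` and `predict` are hypothetical, the slope and intercept of G₂₁ are read off the worked computation above, and the rule that only time lags with at least one available rule enter the final average is inferred from the numbers in Step 4.5.

```python
from statistics import mean
from typing import Dict, List, Tuple

T = 15  # time granularity

# (slope, intercept) of the consequent granules used in the worked example.
# G20 comes from LG(0.123, 28.935, 0.180, 15); the values for G21 are
# inferred from the computation 0.010 * 12 + 30.723.
CONSEQUENT: Dict[int, Tuple[float, float]] = {
    20: (0.123, 28.935),
    21: (0.010, 30.723),
}

# Available rules per time lag, as (premise, consequent) indices from Table 2.
AVAILABLE: Dict[int, List[Tuple[int, int]]] = {
    1: [(19, 20), (20, 21)],
    2: [(19, 21)],
    3: [],
    4: [],
}


def locate(i: int, granularity: int) -> Tuple[int, int]:
    """Map a 1-based sample index to (granule index, position in granule)."""
    q, r = divmod(i - 1, granularity)
    return q + 1, r + 1


def predict(i: int) -> float:
    """Average the per-lag predictions over lags that have available rules."""
    _, pos = locate(i, T)
    per_lag = []
    for rules in AVAILABLE.values():
        if not rules:
            continue  # assumption: lags without rules are left out
        values = [CONSEQUENT[c][0] * pos + CONSEQUENT[c][1] for _, c in rules]
        per_lag.append(mean(values))
    return mean(per_lag)


granule, position = locate(462, T)
x462_star = predict(462)
```

With these inputs, `locate(462, T)` places x₄₆₂ at the 12th moment of granule G₃₁, and `x462_star` agrees with the value 30.735 obtained in Step 4.5 up to floating-point rounding.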

Next, we compare the forecasting performance of the proposed model with that of the other existing models using three evaluation indexes (given in Equation (4)) and the absolute prediction error $d_{i} = |x_{i}^{*} - x_{i}|$, where $x_{i}$ is the actual value at the ith moment and $x_{i}^{*}$ is the corresponding prediction. The smaller $d_{i}$ is, the better the forecasting result.
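For readers who want to compute such quantities themselves, a minimal Python sketch is given below. Equation (4) is not reproduced in this section, so the formulas are assumptions: MAE and RMSE follow their standard definitions, and AFER is taken here as the mean of the relative absolute errors $|x_{i}^{*} - x_{i}| / |x_{i}|$, expressed as a fraction. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def abs_errors(pred: Sequence[float], actual: Sequence[float]) -> List[float]:
    """d_i = |x*_i - x_i| for each test point."""
    if len(pred) != len(actual) or len(pred) == 0:
        raise ValueError("pred and actual must be non-empty and of equal length")
    return [abs(p - a) for p, a in zip(pred, actual)]


def mae(pred: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute error."""
    d = abs_errors(pred, actual)
    return sum(d) / len(d)


def rmse(pred: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean squared error."""
    d = abs_errors(pred, actual)
    return math.sqrt(sum(e * e for e in d) / len(d))


def afer(pred: Sequence[float], actual: Sequence[float]) -> float:
    """Assumed form of AFER: mean relative absolute error, as a fraction."""
    d = abs_errors(pred, actual)
    return sum(e / abs(a) for e, a in zip(d, actual)) / len(d)


# Toy values for illustration only (not taken from the paper's datasets).
toy_pred = [11.0, 18.0]
toy_actual = [10.0, 20.0]
toy_scores = (mae(toy_pred, toy_actual), rmse(toy_pred, toy_actual), afer(toy_pred, toy_actual))
```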

Table 3 lists the index values of each model, with those of our model in the last row. To compare the forecasting accuracy of the models clearly, Fig. 4(a) is drawn from the results in Table 3. Table 3 and Fig. 4(a) show that the proposed model achieves the smallest value on every evaluation index.

Fig. 4

Comparisons of forecasting performance between the proposed model and 7 other existing models on the JD time series: (a) comparisons of MAE, AFER and RMSE values; (b) comparisons of d_i.

Table 3

Comparisons of MAE, AFER and RMSE values between the proposed model and 7 other existing models on the JD time series

Models	MAE	AFER	RMSE
FAR-1	12.730	0.307	12.960
FAR-2	15.931	0.386	16.116
FCM	19.543	0.472	19.756
FTS	13.380	0.326	13.481
ES	14.142	0.341	14.345
ARIMA	15.018	0.362	15.207
NAR	14.551	0.351	14.753
Proposed	10.639	0.256	10.917

In addition, the d_i values of each model are drawn as box plots in Fig. 4(b). The box of the proposed model is clearly the shortest, and the median of d_i for our model is the closest to 0.
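The two quantities read off the box plot, the median of the d_i values and the length of the box (the interquartile range), can be computed as in the following sketch. The quartile convention behind Fig. 4(b) is not stated, so the `"inclusive"` method used here is an assumption, and `box_summary` is a hypothetical helper name.

```python
from statistics import quantiles
from typing import Dict, Sequence


def box_summary(d: Sequence[float]) -> Dict[str, float]:
    """Median and box length (Q3 - Q1) of a sample of absolute errors d_i."""
    q1, q2, q3 = quantiles(d, n=4, method="inclusive")
    return {"median": q2, "box_length": q3 - q1}


# Toy errors for illustration only (not taken from the paper's results).
toy_summary = box_summary([1.0, 2.0, 3.0, 4.0, 5.0])
```

A model with a shorter box has less dispersed errors, and a median closer to 0 means that its typical error is smaller.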

The forecasting accuracy of these models is analyzed as follows:

•Contrast between the proposed model and FAR-1

Contrasting the second row with the last row of Table 3, the better forecasting performance of the proposed model indicates that constructing long-association rules helps to capture the characteristics of time series and that selecting the available association rules leads to better predictions.

•Contrast between the proposed model and FAR-2

Contrasting the third row with the last row verifies the benefit of the fuzzy long-association rule, which indicates that considering long-associations in time series analysis is reasonable.

•Contrast between the proposed model and other models

Contrasting the last row with the five rows above it (FCM, FTS, ES, ARIMA and NAR), whose models obtain short-term predictions by iterating one-step results, one can find that the proposed FLAR based model improves the forecasting performance by reducing cumulative errors.

From the above analyses, we conclude that the proposed model outperforms the other 7 models on the JD time series.

5.4 Experiment on the MEL time series

The experiment on the MEL time series follows the same procedure as that on the JD time series.

In the proposed model, the maximum time lag M influences forecasting: the larger M is, the more association rules are constructed. Thus, in this experiment we study the forecasting performance of the proposed model under different maximum time lags, i.e., M = 3, 4, 5.

Table 4 shows the evaluation index values of the proposed model under different M. From this table, one can see that the larger M is (the more FLARs are constructed), the smaller the index values and hence the better the forecasting performance. This indicates that the FLARs improve forecasting accuracy by accurately capturing the associations in time series.

Table 4
Comparisons of MAE, AFER and RMSE values of the proposed model under different maximum time lags M

Index	The maximum time lag
	M = 3	M = 4	M = 5
MAE	3.880	3.784	3.725
AFER	0.193	0.184	0.182
RMSE	5.026	4.993	4.879

The index values of the 7 other existing models are given in Table 5. For easy comparison of the forecasting performance of the proposed FLAR based model with that of the other models, Fig. 5(a) is drawn from the results in Tables 4 and 5. The d_i values of the existing models and of the proposed model are shown in Fig. 5(b). From Fig. 5, one can find that the index values, the box length and the median of d_i for our model are smaller than those of the other models.

Fig. 5

Comparisons of forecasting performance between the proposed model and 7 other existing models on the MEL time series: (a) comparisons of MAE, AFER and RMSE values; (b) comparisons of d_i.

Table 5

Comparisons of MAE, AFER and RMSE values of 7 other existing models on the MEL time series

Model	MAE	AFER	RMSE
FAR-1	4.974	0.271	6.033
FAR-2	3.959	0.189	5.252
FCM	5.663	0.305	6.922
FTS	4.811	0.258	5.850
ES	4.723	0.229	6.177
ARIMA	4.459	0.220	5.871
NAR	5.211	0.290	6.186

The results in Table 4, Table 5 and Fig. 5 show that the proposed model is superior to the other existing models in short-term forecasting. In particular, the comparison with FAR-1 and FAR-2, which use only FSARs, shows that the FLARs contribute to improving forecasting performance. Compared with the other models, the proposed model alleviates cumulative errors.

5.5 Experiment on the other time series

The experiments on three stock price time series (GE, NOC and INTC) are analyzed in this section; the maximum time lag and the number of clustering centers are set to 5 and 3, respectively.

The forecasting accuracy (MAE, AFER and RMSE values) of the proposed model and the 7 other models on these stock time series is presented in Table 6. To visualize the comparison between the proposed model and the existing models on these index values, Fig. 6(a), Fig. 7(a) and Fig. 8(a) are drawn. The d_i values of each model are shown in Fig. 6(b), Fig. 7(b) and Fig. 8(b).

Table 6
Comparisons of MAE, AFER and RMSE values between the proposed model and 7 other existing models on three other time series

Models	GE			NOC			INTC		
	MAE	AFER	RMSE	MAE	AFER	RMSE	MAE	AFER	RMSE
FAR-1	1.214	0.037	1.407	35.304	0.247	38.214	11.176	0.290	12.746
FAR-2	1.991	0.061	2.267	32.573	0.227	35.697	4.441	0.113	6.438
FCM	1.216	0.036	1.395	41.094	0.286	44.702	13.229	0.340	15.516
FTS	2.148	0.061	2.010	20.088	0.136	24.914	6.426	0.165	7.967
ES	1.371	0.042	1.574	36.422	0.255	39.290	4.328	0.108	6.451
ARIMA	2.544	0.077	2.703	60.838	0.432	62.597	12.502	0.328	13.877
NAR	1.474	0.045	1.691	33.342	0.231	39.397	5.704	0.150	7.386
Proposed	0.716	0.022	0.880	27.528	0.190	31.254	4.104	0.102	6.185

Fig. 6

Comparisons of forecasting performance between the proposed model and 7 other existing models on the GE time series: (a) comparisons of MAE, AFER and RMSE values; (b) comparisons of d_i.

Fig. 7

Comparisons of forecasting performance between the proposed model and 7 other existing models on the NOC time series: (a) comparisons of MAE, AFER and RMSE values; (b) comparisons of d_i.

Fig. 8

Comparisons of forecasting performance between the proposed model and 7 other existing models on the INTC time series: (a) comparisons of MAE, AFER and RMSE values; (b) comparisons of d_i.

According to the results in Table 6 and Figs. 6–8, we make the following observations:

•GE time series and INTC time series

In Fig. 6, the proposed model clearly obtains the smallest index values and the smallest median of d_i on the GE time series.

The same conclusion can be drawn from Fig. 8: the proposed model achieves the smallest index values, and thus the best forecasting performance, on the INTC time series.

•NOC time series

In Fig. 7, the FTS-Model achieves the smallest MAE, AFER and RMSE values on the NOC time series, and the median of d_i for the proposed model is larger than that of the FTS-Model. Nevertheless, according to Table 6, the proposed model ranks second on all three indexes, ahead of the six remaining models, which is acceptable in real applications.
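The ordering of the models on each stock series can be checked directly from Table 6. The sketch below transcribes only the MAE columns of that table and ranks the models in ascending order; the helper name `rank_of` is hypothetical, and the same check can be repeated for the AFER and RMSE columns.

```python
from typing import Dict

# MAE values transcribed from Table 6.
TABLE6_MAE: Dict[str, Dict[str, float]] = {
    "GE": {"FAR-1": 1.214, "FAR-2": 1.991, "FCM": 1.216, "FTS": 2.148,
           "ES": 1.371, "ARIMA": 2.544, "NAR": 1.474, "Proposed": 0.716},
    "NOC": {"FAR-1": 35.304, "FAR-2": 32.573, "FCM": 41.094, "FTS": 20.088,
            "ES": 36.422, "ARIMA": 60.838, "NAR": 33.342, "Proposed": 27.528},
    "INTC": {"FAR-1": 11.176, "FAR-2": 4.441, "FCM": 13.229, "FTS": 6.426,
             "ES": 4.328, "ARIMA": 12.502, "NAR": 5.704, "Proposed": 4.104},
}


def rank_of(model: str, scores: Dict[str, float]) -> int:
    """1-based rank of a model when scores are sorted in ascending order."""
    ordered = sorted(scores, key=scores.get)
    return ordered.index(model) + 1


proposed_ranks = {name: rank_of("Proposed", scores) for name, scores in TABLE6_MAE.items()}
best_on_noc = min(TABLE6_MAE["NOC"], key=TABLE6_MAE["NOC"].get)
```

On the MAE index, the proposed model ranks first on GE and INTC and second on NOC, where the FTS-Model has the smallest value.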

From the analysis in Section 5, we draw the following four conclusions:

1) Fuzzy long-association rules make up for the neglect of long-associations hidden in time series by the existing fuzzy association rules;

2) Fuzzy long-association rules improve the forecasting performance by accurately capturing the characteristics of time series;

3) The k-medoids clustering based rule selection algorithm is beneficial in obtaining reasonable prediction results by selecting the available FARs for forecasting;

4) Fuzzy long-association rule based short-term forecasting model is superior to 7 other models.

6 Conclusion

In this study, two kinds of fuzzy association rules are proposed: the fuzzy short-association rule (FSAR) and the fuzzy long-association rule (FLAR). The FSAR is constructed from two granules at consecutive time periods and is used to capture the short-association in data; the FLAR is constructed from two granules at non-consecutive time periods and is used to capture the long-association in data.

For forecasting with these association rules (FSARs and FLARs), a rule selection algorithm based on k-medoids clustering is proposed. Through this algorithm, the available FSARs and FLARs are selected to calculate logical and accurate predictions.

On the basis of the newly proposed FARs and the rule selection algorithm, a short-term forecasting model is put forward. This model differs from the existing short-term forecasting models in the form of its FARs and in the fuzzy inference system constructed for forecasting. Comparing it with other models, one can find that the newly proposed FARs and the novel fuzzy inference system help to improve forecasting performance.

In the construction of FLARs, the optimal time lag or the maximum time lag is given in advance. In the process of forecasting, different time lags result in different FARs and different forecasting effects. Therefore, the selection of the time lag is important in building a forecasting model. Intelligent methods such as the particle swarm optimization algorithm and the ant colony optimization algorithm can be applied to this task, and we will study this in the future. Moreover, since k-medoids clustering is used here to select the available FARs for forecasting, we will also analyze the function and influence of other clustering methods in this field.

Acknowledgments

We are grateful to the anonymous reviewers for their constructive comments on the manuscript. The authors thank AiMi Academic Services for the English language editing and review services.

This work is supported by the National Natural Science Foundation of China (No. 12201396) and the Fujian Natural Science Foundation Project (No. 2021J01001).

References

1. Hossen, Plathottam S.J., Angamuthu R.K. et al., Short-term load forecasting using deep neural networks (DNN), 2017 North American Power Symposium (NAPS), 2017.

2. Wang J.Z., Zhou Q.P., Jiang H.Y., Hou, Short-Term Wind Speed Forecasting Using Support Vector Regression Optimized by Cuckoo Optimization Algorithm, Mathematical Problems in Engineering (2015), pp. 1–13.

3. Akbari-Zadeh, Mohammad-Reza, Fard et al., A hybrid method based on wavelet, ANN and ARIMA model for short-term load forecasting, Journal of Experimental and Theoretical Artificial Intelligence 26(2) (2014), 167–182.

4. Blanco, Delgado and Pegalajar M.C., Extracting rules from a (fuzzy/crisp) recurrent neural network using a self-organizing map, International Journal of Intelligent Systems 15(7) (2015), 595–621.

5. Liu Y.W., Li D.J., Wan S.H. et al., A long short-term memory-based model for greenhouse climate prediction, International Journal of Intelligent Systems 2021(7).

6. […], GXA, SZ et al., Health indicator construction of machinery based on end-to-end trainable convolution recurrent neural networks, Journal of Manufacturing Systems 54 (2020), 1–11.

7. Cheng, Li D.Y., Wu H.G. et al., Chinese Text Classification Based on Character-Level CNN and SVM, International Journal of Intelligent Information and Database Systems 12(3) (2019), 212.

8. Mei Z.Y., Zhang L.H. et al., Real-time multistep prediction of public parking spaces based on Fourier transform-least squares support vector regression, Journal of Intelligent Transportation Systems (2019), pp. 1–13.

9. Alam M.S., Sultana and Hossain, Bayesian optimization algorithm based support vector regression analysis for estimation of shear capacity of FRP reinforced concrete members, Applied Soft Computing 105(2005) (2021), 107281.

10. Cheng C.H., Yang J.H., Xiao et al., A novel rainfall forecast model based on the integrated non-linear attribute selection method and support vector regression, Journal of Intelligent and Fuzzy Systems: Applications in Engineering and Technology 31(2) (2016), 915–925.

11. Kaytez, A hybrid approach based on autoregressive integrated moving average and least-square support vector machine for long-term forecasting of net electricity consumption, Energy 197 (2020), 117200.

12. Sun G.G., Song L.J., Yu H.F. et al., V2V Routing in a VANET Based on the Autoregressive Integrated Moving Average Model, IEEE Transactions on Vehicular Technology 68(1) (2019), 908–922.

13. Batselier and Vanhoucke, Improving project forecast accuracy by integrating earned value management with exponential smoothing and reference class forecasting, International Journal of Project Management 35(1) (2017), 28–43.

14. Guo H.Y., Pedrycz and Liu X.D., Fuzzy time series forecasting based on axiomatic fuzzy set theory, Neural Computing and Applications 31 (2019), 3921–3932.

15. Morozova M.E. and Shmat V.V., Medium-term forecasting of Russian economy using cognitive model, Studies on Russian Economic Development 28(3) (2017), 253–258.

16. Luo, Tan and Zheng Y.J., Long-term prediction of time series based on stepwise linear division algorithm and time-variant zonary fuzzy information granules, International Journal of Approximate Reasoning 108 (2019), 38–61.

17. Luo, Song and Zheng Y.J., A novel forecasting model for the long-term fluctuation of time series based on polar fuzzy information granules, Information Sciences 512 (2020), 760–779.

18. […], Pedrycz, Liu X.D., Yang J.H. and Li, The modeling of time series based on fuzzy information granules, Expert Systems with Applications 41 (2014), 3799–3808.

19. […], Zhang L.Y., Pedrycz, Yang J.H. and Liu X.D., The granular extension of Sugeno-type fuzzy models based on optimal allocation of information granularity and its application to forecasting of time series, Applied Soft Computing 42 (2016), 38–52.

20. […], Yang H.L., Yu F.S., Wang F.Y., Wang, A one-factor granular fuzzy logical relationship based multi-point ahead prediction model, 2019 IEEE International Conference on Intelligent Systems and Knowledge Engineering (ISKE 2019), (2019), 1223–1228.

21. Yang X.Y., Yu F.S. and Pedrycz, Long-term forecasting of time series based on linear fuzzy information granules and fuzzy inference system, International Journal of Approximate Reasoning 81 (2017), 1–27.

22. […], Chen X.Y., Pedrycz, Liu X.D. and Yang J.H., Using interval information granules to improve forecasting in fuzzy time series, International Journal of Approximate Reasoning 57 (2015), 1–18.

23. Wang W.N., Pedrycz and Liu X.D., Time series long-term forecasting model based on information granules and fuzzy clustering, Engineering Applications of Artificial Intelligence 41 (2015), 17–24.

24. Zhao Y.Y., Li T.T. and Luo, Spatial–temporal fuzzy information granules for time series forecasting, Soft Computing 25 (2020), 1963–1981.

25. Guo H.Y., Pedrycz and Liu X.D., Hidden Markov models based approaches to long-term prediction for granular time series, IEEE Transactions on Fuzzy Systems 26(5) (2018), 2807–2817.

26. […], Zhang L.Y., Pedrycz, Lu, The long-term prediction of time series: a granular computing-based design approach, IEEE Transactions on Systems, Man, and Cybernetics: Systems (2022), pp. 1–13.

27. Luo and Wang H.Y., Fuzzy forecasting for long-term time series based on time-variant fuzzy information granules, Applied Soft Computing 88(6) (2019), 106046.

28. Iqbal, Zhang C.Q., Arif et al., A new fuzzy time series forecasting method based on clustering and weighted average approach, Journal of Intelligent and Fuzzy Systems 38 (2020), 6089–6098.

29. Luo and Bridges S.M., Mining fuzzy association rules and fuzzy frequency episodes for intrusion detection, International Journal of Intelligent Systems 15(8) (2000), 687–703.

30. Zadeh L.A., Fuzzy sets and information granulation, In M. Gupta & R. Yager (Eds.), Advances in fuzzy set theory and applications, North-Holland Publishing Company (1979), pp. 3–18.

31. Pedrycz and Homenda, Building the fundamentals of granular computing: a principle of justifiable granularity, Applied Soft Computing 13(10) (2013), 4209–4218.

32. […] F.S., Dong K.Q., Chen, Jiang Y.K., Zeng W.Y., Clustering time series with granular dynamic time warping method, IEEE International Conference on Granular Computing, 2007.

33. Zhu X.B., Pedrycz and Li Z.W., Granular representation of data: a design of families of e-information granules, IEEE Transactions on Fuzzy Systems 26(4) (2018), 2107–2119.

34. Wang L.Z., Liu X.D. and Pedrycz, Effective intervals determined by information granules to improve forecasting in fuzzy time series, Expert Systems with Applications 40 (2013), 5673–5679.

35. Wang L.Z., Liu X.D., Pedrycz and Shao Y.Y., Determination of temporal information granules to improve forecasting in time series, Expert Systems with Applications 41 (2014), 3134–3142.

36. Wang, Li J.H., Wei et al., Optimal granule level selection: A granule description accuracy viewpoint, International Journal of Approximate Reasoning 116 (2020), 85–105.

37. […] D.H., Liu G.J., Guo M.Z. and Liu X.Y., An improved K-medoids algorithm based on step increasing and optimizing medoids, Expert Systems with Applications 92 (2018), 464–473.

38. Aditya, Sari B.N. and Padilah T.N., Comparison analysis of Euclidean and Gower distance measures on k-medoids cluster, Jurnal Teknologi dan Sistem Komputer 9(1) (2020), 1–7.

39. Tavakkol and Son Y.D., Fuzzy kernel K-medoids clustering algorithm for uncertain data objects, Pattern Analysis and Applications 24 (2021), 1287–1302.

40. Ushakov A.V. and Vasilyev, Near-optimal large-scale k-medoids clustering, Information Sciences 545(3) (2021), 344–362.

41. Huang Y.M., Wu, He Y.S., Lv, Chen S.B., The selection of arc spectral line of interest based on improved K-medoids algorithm, 2016 IEEE Workshop on Advanced Robotics and its Social Impacts (ARSO), 2016.

42. Han Z.Y., Pedrycz, Zhao and Wang, Hierarchical granular computing-based model and its reinforcement structural learning for construction of long-term prediction intervals, IEEE Transactions on Cybernetics 52(1) (2022), 666–676.

43. Duan L.Z., Yu F.S., Pedrycz, Wang and Yang X.Y., Time-series clustering based on linear fuzzy information granules, Applied Soft Computing Journal 73 (2018), 1053–1067.

44. These time series are from https://finance.yahoo.com/quote/

45. Homenda, Jastrzebska, Pedrycz, Modeling time series with fuzzy cognitive maps, 2014 IEEE International Conference on Fuzzy Systems, (2014), pp. 2055–2062.

46. Aladag C.H., Egrioglu, Yolcu et al., A high order seasonal fuzzy time series model and application to international tourism demand of Turkey, Journal of Intelligent and Fuzzy Systems 26 (2014), 295–302.
