Speed up dynamic time warping of multivariate time series

Abstract

Dynamic time warping has attracted wide attention in various fields for its high matching accuracy. In time series data mining, dynamic time warping is a robust similarity measure of multivariate time series. However, the high computational cost of dynamic time warping restricts its applications in large scale data sets. In this paper, we propose a novel approach to speed up dynamic time warping of multivariate time series. Multivariate time series are fitted with multidimensional piecewise lines; and then, important points are extracted as features to reduce the dimensions of multivariate time series; finally, the features are imported to dynamic time warping to measure the similarity of multivariate time series. Extensive empirical results indicate that the proposed method can effectively improve the efficiency of dynamic time warping for multivariate time series, and obtain satisfactory matching accuracy.

Keywords

Multivariate time series, dynamic time warping, computational complexity, speed up

1 Introduction

The past decades have witnessed the rapid development of sensor and storage technology. The scale of time series shows explosive growth in various applications, such as finance [1], multimedia [2], medicine [3], traffic [4], gesture recognition [5], etc.

A time series is a sequence of observations x_i (t), (i = 1, 2, ⋯ , m; t = 1, 2, ⋯ , n), where i indexes the measurements at every time point t [6]. If m = 1, x_i (t) is called a univariate time series; if m ≥ 2, it is known as a multivariate time series. Since time series from a data set usually have the same sampling rate, they can be simply denoted as X = x₁, x₂, ⋯ , x_n, where $x_{i} \in ℝ^{m \times 1}$ represents the ith observation of the m variables and n is the length of the time series.

The tasks of time series data mining include classification [7], clustering [8], similarity search [9], outlier detection [10], etc. These tasks share a common subroutine: similarity measure [11], which is one of the most fundamental topics in data mining.

Most of the similarity measures of time series are mainly applicable to univariate time series. However, depicting the state of objects usually needs multiple variables. Multivariate time series are more prevalent in most applications. For example, stock can be described with opening price, closing price, average price and trading volume [12]; electroencephalogram (EEG) is measured by 64 electrodes [13]; in the application of sign language recognition, Australian sign language is gathered from 22 sensors on the hands of a native Australian speaker [14].

However, existing similarity measures of multivariate time series cannot ease the conflict between matching accuracy and computational cost. We try to resolve this intractable problem from this perspective. The main contribution of this paper is a method that reduces the computational complexity of dynamic time warping for multivariate time series while obtaining satisfactory matching accuracy.

The rest of the paper is organized as follows. In Section 2, we give a review of the related work. In Section 3, a novel approach is proposed to speed up dynamic time warping of multivariate time series. In Section 4, extensive experiments are carried out on the real-world data sets. Finally, we summarize the work and present the future directions.

2 Related work

Existing similarity measures of multivariate time series mainly include Euclidean distance [15], dynamic time warping (DTW) [16], extended Frobenius norm (Eros) [17], etc.

2.1 Euclidean distance

Euclidean distance is the most popular similarity measure. It is parameter-free and has linear complexity. But Euclidean distance is sensitive to scaling and shifting of sequences in the time axis, and cannot measure the similarity of the sequences of different lengths [18]. These disadvantages make it only applicable to some special data sets.

2.2 Dynamic time warping

DTW was well-known in the speech-processing community, and it was introduced into the similarity measure of time series by Berndt and Clifford [16].

Compared with Euclidean distance, DTW searches an optimal match between two sequences, which enables it to accommodate scaling and shifting of sequences in the time axis. DTW can measure the similarity of the multivariate time series of different lengths and obtain high matching accuracy. Therefore, DTW is widely used in practice.

However, the computational cost of DTW is expensive, which seriously restricts its applications in large scale data sets [19].

2.3 Extended Frobenius norm

Eros is based on statistical theory. It views the observations of multivariate time series as the sample points of random variables, and takes the correlation coefficient matrix as the base of feature extraction. Then, linear space transformation is used to design the model of similarity measure.

Eros can preserve the correlation among different variables of multivariate time series. But it cannot describe the order of the observations in a sequence. For instance, two time series X, Y have the same observations but different order in the time dimension, as shown in Fig. 1.

Fig.1

Two time series of the same observations but different order.

Since the correlation coefficient matrices of the two sequences are identical, Eros views X, Y as the same series. Obviously, this judgement contradicts intuition. Therefore, Eros is unable to eliminate the risk of misjudgement. Besides, the computational cost of Eros is also high.
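The order-insensitivity described above can be checked with a small sketch. It does not implement Eros itself; it only illustrates the property the argument relies on, namely that a correlation coefficient computed over all observations does not change when the observations are reordered in time. The two-variable series x and the helper names pearson and correlation are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equally long value lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def correlation(series):
    """Correlation between variable 1 and variable 2 of a two-variable series."""
    first = [point[0] for point in series]
    second = [point[1] for point in series]
    return pearson(first, second)


# Hypothetical two-variable series and a copy with the time order reversed.
x = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
y = list(reversed(x))

# The sequences differ as time series, yet their correlation is the same
# (up to floating-point rounding).
print(x != y, math.isclose(correlation(x), correlation(y)))
```

Any measure built only on such statistics therefore cannot tell x and y apart, which is the risk of misjudgement noted for Eros.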

2.4 Other methods

Moreover, there are some symbol-based methods such as the longest common subsequence (LCSS) [20] and edit distance [21]. They convert time series into character sequences, and utilize character manipulation technology to measure the similarity of time series. However, these methods are applicable to univariate time series. When the objects are extended from univariate time series to multivariate time series, symbolizing multivariate time series while preserving the correlation of the variables is still a tough task.

2.5 Motivation of the proposed method

From the review of the related work, DTW can get high matching accuracy and deal with sequences of different lengths. Besides, it does not resort to symbolization and can describe the order of observations in a sequence.

However, the expensive computational cost becomes the only bottleneck restricting its applications in large-scale data sets. From this aspect, we want to propose a method to speed up dynamic time warping of multivariate time series.

3 Proposed method

Time series X, Y are regarded as similar if $D_{dtw} (feature (X), feature (Y)) \leq ɛ$ (1) where feature (·) denotes the feature extraction model and ɛ is the distance threshold. Supposing X, Y both contain m variables and their lengths are n and s respectively, the computational complexity of DTW is O(ns) [9].

DTW resorts to dynamic programming to search an optimal match between two sequences, as shown in Fig. 2(b). Compared with Euclidean distance, DTW obtains high matching accuracy at the cost of computational complexity. It is difficult to simplify the calculation process of DTW.

Fig.2

Different alignments of two similar time series.

Since multivariate time series are typically high-dimensional, measuring the similarity on the original sequences is computationally expensive. To speed up DTW, reducing the time dimension of multivariate time series may be an effective direction.

3.1 Feature extraction

Feature extraction is a prerequisite for similarity measure. Popular feature extraction methods of time series include discrete Fourier transform (DFT) [15, 22], discrete wavelet transform (DWT) [23], singular value decomposition (SVD) [24], piecewise aggregate approximation (PAA) [25] and piecewise linear approximation (PLA) [26].

Among the feature extraction methods mentioned above, only PAA and PLA can be used in DTW. We judge the similarity of multivariate time series according to their shape features. Compared with PAA, PLA can describe the shape features of time series more precisely. So we adopt PLA to extract the features of multivariate time series.

The most intuitive idea is to break a multivariate time series into multiple univariate time series, and extract the features of each univariate time series separately. Then the features of the univariate time series are combined into the features of the multivariate time series. But this approach ignores the correlation of the variables, which is indispensable in most applications.

For example, Fig. 3 records the heights of the left and right knees in the process of human walk. If we break the multivariate time series into two univariate time series, and represent them by PLA respectively, as shown in Fig. 3(b), the meaning of the segments is not clear. The reason is that the correlation of the two variables is not considered. So we segment multivariate time series in the related variable dimensions simultaneously, as illustrated in Fig. 3(a). Each segment in Fig. 3(a) corresponds to a walking movement. Based on this inspiration, we propose the multidimensional segmentation method.

Fig.3

Two ways of segmenting multivariate time series.

Supposing X_i = x₁, x₂, ⋯ , x_n is a section of the sequence in the ith variable dimension of a multivariate time series, and P_i = p₁, p₂, ⋯ , p_n is the fitting line of X_i, the fitting error between X_i and P_i is: $e_{i} = {(\sum_{j = 1}^{n} {| x_{j} - p_{j} |}^{2})}^{\frac{1}{2}}, \forall i = 1, \dots, m$ (2)

The total fitting error in all variable dimensions is evaluated by: $e_{seg} = \sum_{i = 1}^{m} e_{i}$ (3)
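As a worked instance of Eqs. (2) and (3), consider a hypothetical section with n = 3 points and m = 2 variables. The text does not specify how the fitting line is obtained; the example assumes it is the straight line joining the first and last points of the section, which here is the zero line in both dimensions.

```latex
% Hypothetical section: X_1 = (0, 1, 0), X_2 = (0, 2, 0)
% Assumed fitting lines (joining the end points): P_1 = P_2 = (0, 0, 0)
\begin{align*}
e_{1} &= \left( |0 - 0|^{2} + |1 - 0|^{2} + |0 - 0|^{2} \right)^{\frac{1}{2}} = 1 \\
e_{2} &= \left( |0 - 0|^{2} + |2 - 0|^{2} + |0 - 0|^{2} \right)^{\frac{1}{2}} = 2 \\
e_{seg} &= e_{1} + e_{2} = 3
\end{align*}
```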

Considering the linear complexity and online computation capability, we use the window-sliding strategy [27] to segment a multivariate time series in all variable dimensions. When e_seg is greater than a given threshold MaxSegError, we start a new segment. The details of segmenting a multivariate time series are described in Algorithm 1.

Algorithm 1 Algorithm to segment a multivariate time series

input: A multivariate time series X, a given threshold MaxSegError.

output: The set of multidimensional segments T.

Initialize e_seg = 0;
The first segment starts from the first point of X.
while the last point of X is not contained in a segment do
    if e_seg < MaxSegError
        Extend the width of the current segment to the next point of X;
        Update e_seg.
    else
        Add the current segment to T;
        Begin a new segment from the current point of X.
    end
end while
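Algorithm 1 can be sketched in Python as follows. This is one possible reading, not the authors' implementation, and it rests on three assumptions: the fitting line of a segment is the straight line joining its first and last points; a segment is closed just before its error would exceed MaxSegError (so that MaxSegError = 0 keeps every point of a non-collinear series); and consecutive segments share their boundary point. The names fitting_error and segment are hypothetical.

```python
import math


def fitting_error(points):
    """Total fitting error e_seg of Eq. (3) for one candidate segment.

    points is a list of m-dimensional observations. Assumption: the fitting
    line in each variable dimension joins the first and last observation.
    """
    n = len(points)
    if n < 3:
        return 0.0  # a line through one or two points fits exactly
    total = 0.0
    for i in range(len(points[0])):
        first, last = points[0][i], points[-1][i]
        squared = 0.0
        for j in range(n):
            fitted = first + (last - first) * j / (n - 1)
            squared += (points[j][i] - fitted) ** 2
        total += math.sqrt(squared)  # e_i of Eq. (2)
    return total


def segment(series, max_seg_error):
    """Sliding-window segmentation; returns inclusive (start, end) index pairs."""
    n = len(series)
    if n == 0:
        return []
    if n == 1:
        return [(0, 0)]
    segments = []
    start, end = 0, 1
    while end < n - 1:
        # Tentatively extend the current segment by one point.
        if fitting_error(series[start:end + 2]) <= max_seg_error:
            end += 1
        else:
            segments.append((start, end))
            start, end = end, end + 1  # new segment starts at the shared point
    segments.append((start, end))
    return segments


# Hypothetical two-variable series that rises and then falls.
rise_fall = list(zip([0, 1, 2, 3, 4, 3, 2, 1, 0], [0, 2, 4, 6, 8, 6, 4, 2, 0]))
print(segment(rise_fall, 0.1))
```

For the rise-and-fall example both variables change direction at index 4, so the sketch yields the two segments (0, 4) and (4, 8).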

An instance of multidimensional segmentation of a multivariate time series is described in Fig. 4(a). We can see the multivariate time series is divided into 5 segments in all variable dimensions simultaneously.

Fig.4

Feature extraction of multivariate time series.

After the multidimensional segmentation, the first and last points of every segment are chosen as the features of the multivariate time series, as illustrated in Fig. 4(b). In this way, the dimensions of a multivariate time series are greatly reduced. MaxSegError is a key parameter that balances the degree of dimension reduction against the description granularity of a multivariate time series.
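The selection of important points can be sketched as below. The sketch assumes segments are given as inclusive (start, end) index pairs in time order and that neighbouring segments may share a boundary point, which is then kept only once; the function name important_points and the sample data are hypothetical.

```python
def important_points(series, segments):
    """Return the first and last observation of every segment, in time order.

    series is a list of m-dimensional observations; segments is a list of
    inclusive (start, end) index pairs. A boundary shared by two consecutive
    segments is emitted once.
    """
    indices = []
    for start, end in segments:
        for index in (start, end):
            if not indices or indices[-1] != index:
                indices.append(index)
    return [series[index] for index in indices]


# Hypothetical two-variable series with two segments meeting at index 4.
series = list(zip([0, 1, 2, 3, 4, 3, 2, 1, 0], [0, 2, 4, 6, 8, 6, 4, 2, 0]))
features = important_points(series, [(0, 4), (4, 8)])
print(features)
```

Here nine observations are reduced to three feature points, each still carrying all variables, so the feature sequence can be passed to DTW unchanged.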

3.2 Similarity measure

The features of multivariate time series are imported to dynamic time warping to measure the similarity; we call this the important points - dynamic time warping method (IP-DTW). The proposed method is defined formally as follows: $D_{ip - dtw} (X^{'}, Y^{'}) = D_{base} (x_{1}^{'}, y_{1}^{'}) + \min \left\{ \begin{matrix} D_{ip - dtw} (X^{'}, Y^{'} [2 : -]) \\ D_{ip - dtw} (X^{'} [2 : -], Y^{'}) \\ D_{ip - dtw} (X^{'} [2 : -], Y^{'} [2 : -]) \end{matrix} \right.$ (4) where X′ and Y′ are the features of multivariate time series, and $D_{base} (x_{i}^{'}, y_{j}^{'})$ is the base distance:

$D_{base} (x_{i}^{'}, y_{j}^{'}) = \sum_{t = 1}^{m} | x_{ti}^{'} - y_{tj}^{'} |$ (5)

We construct a matrix in which the (ith, jth) element denotes the base distance between the two points $x_{i}^{'}$ and $y_{j}^{'}$ . An example is given in Fig. 5.

Fig.5

A warping path of two time series.

A warping path W is a contiguous set of the matrix elements that describes a mapping between X′ and Y′. The kth element of W is denoted as w_k = (i, j) _k. The warping path $W = w_{1}, w_{2}, \dots, w_{K}$ (6) is subject to three constraints: boundary conditions, continuity and monotonicity [9]. We are only interested in the path that minimizes the warping cost: $D_{ip - dtw} (X^{'}, Y^{'}) = \min {\sqrt{\sum_{k = 1}^{K} w_{k}}}$ (7)

This path can be found by calculating the following recurrence with dynamic programming [28]: $γ (i, j) = D_{base} (x_{i}^{'}, y_{j}^{'}) + \min \{ γ (i - 1, j - 1), γ (i - 1, j), γ (i, j - 1) \}$ (8) where γ (i, j) is the cumulative distance.
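The recurrence of Eq. (8) with the base distance of Eq. (5) can be sketched as a standard dynamic-programming table. The function names d_base and ip_dtw are hypothetical, and the sketch returns the cumulative distance γ(n′, s′) directly, without the square root that appears in Eq. (7).

```python
import math


def d_base(x, y):
    """Base distance of Eq. (5): sum of absolute differences over the m variables."""
    return sum(abs(a - b) for a, b in zip(x, y))


def ip_dtw(xs, ys):
    """Cumulative warping distance between two feature sequences via Eq. (8).

    xs and ys are lists of m-dimensional feature points; they may differ in length.
    """
    n, s = len(xs), len(ys)
    # gamma[i][j] is the cumulative distance for the first i points of xs
    # and the first j points of ys; row and column 0 are boundary cells.
    gamma = [[math.inf] * (s + 1) for _ in range(n + 1)]
    gamma[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, s + 1):
            cost = d_base(xs[i - 1], ys[j - 1])
            gamma[i][j] = cost + min(
                gamma[i - 1][j - 1], gamma[i - 1][j], gamma[i][j - 1]
            )
    return gamma[n][s]


# Hypothetical feature sequences: b repeats one point of a, so warping aligns them.
a = [(0, 0), (1, 1), (2, 2)]
b = [(0, 0), (1, 1), (1, 1), (2, 2)]
print(ip_dtw(a, b))
```

The two nested loops visit every cell of the n′ × s′ matrix once, which is where the quadratic cost of DTW comes from.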

3.3 Analysis of computational complexity

Supposing the lengths of X, Y are n and s, the computational complexity of DTW is O(ns). If X and Y are split into n′ and s′ segments respectively by our proposed method, the computational complexity of IP-DTW is O(n′s′). Then, we can use the ratio $r = \frac{n^{'} s^{'}}{ns}$ (9) to compare their computational cost.

As Fig. 4 shows, the characteristic sequence is usually shorter than the original sequence. Theoretically, the proposed method can therefore help to speed up dynamic time warping of multivariate time series.

Furthermore, the following experiments need to answer at least two questions: can IP-DTW obtain acceptable matching accuracy, and by how much is the computational complexity reduced?

4 Experimental evaluation

The experiments involve three parts. First, we describe the experimental design, including the experimental setting and the experiment procedures. Then we make a comprehensive comparison with competing methods to verify the effectiveness of the proposed method. Finally, we analyze its computational complexity in practical applications.

4.1 Experimental design

4.1.1 Experimental setting

Our experiments were implemented on a PC with an Intel Quad Core 3.4 GHz CPU and 4 GB RAM, running Matlab R2010a.

Experiments are conducted with 7 real-world data sets, including the tasks of sign language recognition, handwriting recognition, medicine, robotics, etc. The detailed information is listed in Table 1.

Table 1
Summary of the data sets

Data sets	Variables	Length	Av. length	Classes	Size
ASL	22	45∼136	57	95	2565
ECG	2	39∼152	90	2	200
EEG	64	256	256	2	22
Japanese Vowels	12	7∼29	16	9	640
Libras	2	45	45	15	360
Pen Digits	2	8	8	10	10992
LP1	6	15	15	4	88

Australian Sign Language (ASL) consists of 95 kinds of meanings, such as alive, maybe, when, etc. Each hand was described by 11 variables: 6 variables record the position of the hand, and 5 variables measure 5 fingers’ bend.

ECG contains the measurements of cardiac electrical activity. Each sample was labelled as normal or abnormal.

EEG was obtained from 64 electrodes placed on subjects’ scalps. The subjects were divided into two groups: alcoholic and control. We choose 22 samples from the first 2 subjects (co2a0000364 and co2c0000337) as our experimental data.

Japanese Vowels were collected from 9 male speakers uttering 2 Japanese vowels successively. Each utterance was described by 12 variables.

Libras contains 15 classes with 24 instances per class. Each class refers to a hand movement in Brazilian sign language.

Pen Digits (short for Pen-Based Recognition of Handwritten Digits Data Set) were collected from a WACOM PL-100V pressure sensitive tablet with an integrated LCD display and a cordless stylus.

LP1 contains torque and force measurements on a robot after failure detection.

4.1.2 Experiment procedures

We use leave-one-out cross-validation to evaluate the classification accuracy of different methods. We choose a sequence X from a data set containing n sequences. The procedure of the experiment can be described as follows:

Find the k nearest neighbors of X in the data set;

From its k nearest neighbors, count the number of the series falling into the same class with X, labelled as n₀.

Calculate the classification accuracy: $e = n_{0} / k$ (10)

The classification accuracy of the other sequences in the data set is calculated in the same way. Then, the average classification accuracy is obtained: $e^{*} = \frac{1}{n} \sum_{i = 1}^{n} e_{i}$ (11) which is used to compare the effectiveness of different methods.
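The procedure of Eqs. (10) and (11) can be sketched as follows. The distance argument stands for any of the compared similarity measures; the function name average_accuracy and the toy data are hypothetical, and ties between equally distant neighbors are broken by list order, which the text does not specify.

```python
def average_accuracy(distance, sequences, labels, k):
    """Leave-one-out average classification accuracy e* of Eq. (11).

    Each sequence in turn is the query X; its k nearest neighbors are searched
    among the remaining sequences, and e = n0 / k as in Eq. (10).
    """
    n = len(sequences)
    total = 0.0
    for query in range(n):
        others = [i for i in range(n) if i != query]
        others.sort(key=lambda i: distance(sequences[query], sequences[i]))
        neighbors = others[:k]
        n0 = sum(1 for i in neighbors if labels[i] == labels[query])
        total += n0 / k
    return total / n


def toy_distance(a, b):
    """Hypothetical stand-in for a real similarity measure such as DTW."""
    return abs(a - b)


# Hypothetical data set: two well-separated classes of three items each.
items = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
classes = ["a", "a", "a", "b", "b", "b"]
print(average_accuracy(toy_distance, items, classes, k=2))
```

With k = 2 every query finds only same-class neighbors, so the average accuracy is 1.0; with k = 3 each query is forced to include one neighbor from the other class and the average drops to 2/3.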

4.2 Comparison of matching effects

Euclidean distance can only measure multivariate time series of equal length. So we compare the effectiveness of our proposed method with Eros and DTW, without considering Euclidean distance.

The experimental results on ASL are shown in Table 2. In IP-DTW, MaxSegError = 0.05; Eros and DTW are parameter-free. For each k (k = 1, 5, 10), 2565 queries are conducted. Table 2 records the number of queries that reach each level of classification accuracy e. A dash marks a level that cannot occur for the given k, since e = n₀ / k can only take multiples of 1 / k.

Table 2
The experimental results on ASL

Accuracy	Eros (k=1)	Eros (k=5)	Eros (k=10)	DTW (k=1)	DTW (k=5)	DTW (k=10)	IP-DTW (k=1)	IP-DTW (k=5)	IP-DTW (k=10)
e = 0	1696	1051	726	501	163	84	290	71	28
e = 10%	-	-	655	-	-	266	-	-	128
e = 20%	-	672	363	-	362	363	-	221	182
e = 30%	-	-	204	-	-	176	-	-	157
e = 40%	-	279	125	-	474	167	-	305	149
e = 50%	-	-	92	-	-	194	-	-	155
e = 60%	-	152	57	-	334	212	-	268	171
e = 70%	-	-	81	-	-	209	-	-	250
e = 80%	-	130	67	-	386	240	-	320	221
e = 90%	-	-	69	-	-	238	-	-	252
e = 100%	869	281	126	2064	846	416	2275	1380	872

We can see that, in the region of low classification accuracy (such as e = 0 or e = 10%), IP-DTW has fewer queries than the other methods; in the region of high classification accuracy (especially e = 100%), IP-DTW has more queries.

Further, we calculate the average classification accuracy of different methods on ASL, as shown in Table 3. Obviously, IP-DTW obtains higher matching accuracy on ASL.

Table 3

Average classification accuracy on ASL

	Eros	DTW	IP-DTW
k=1	0.3388	0.8047	0.8869
k=5	0.2816	0.6305	0.7653
k=10	0.2448	0.5503	0.6965

In order to compare the matching accuracy more intuitively, we choose the third sequence of ASL (ASL_3) as the input sample, and find its most similar series with different methods. The results are shown in Fig. 6, which provides two-dimensional and three-dimensional graphics of the multivariate time series.

Fig.6

The results of finding the most similar series on ASL.

We can see that the most similar series found by Eros is obviously different from the input sample (ASL_3). The reason is that Eros is unable to describe the order of the observations in a sequence and it cannot avoid the risk of misjudgement. Since DTW and IP-DTW can search an optimal match between two sequences, they get better results.

Moreover, only the series (ASL_2) found by IP-DTW falls into the same class with ASL_3. This shows that the features of the input sample can effectively describe the original series.

Furthermore, we extend the experiments from ASL to the other data sets. The experimental results are illustrated in Fig. 7. The relevant parameter settings in IP-DTW are listed in Table 4.

Fig.7

Average classification accuracy on diverse data sets.

Table 4

Parameter settings in IP-DTW method

Data sets	MaxSegError
ECG	165
EEG	280
Japanese Vowels	1.6
Libras	0.046
Pen Digits	0
LP1	120

The average classification accuracy of Eros on most data sets is much lower than the results of the other methods. The results are almost consistent under different k values in general.

But on EEG, Eros achieves higher accuracy than the other two competing methods, because EEG contains few sequences with similar observations but different order, so the risk of misjudgement rarely arises.

IP-DTW achieves slightly higher average classification accuracy than DTW on most data sets. This provides experimental evidence that IP-DTW not only effectively reduces the dimensions of multivariate time series, but also obtains higher matching accuracy than DTW.

On Pen Digits, DTW and IP-DTW get exactly the same results. The reason is that the lengths of the series on Pen Digits are short, as shown in Table 1. The original series are their own features by setting MaxSegError = 0 in IP-DTW method, as illustrated in Table 4. In this case, IP-DTW degrades to DTW. Therefore, DTW can be seen as a special case of IP-DTW by setting MaxSegError = 0.

The experiments above show that IP-DTW can obtain acceptable matching accuracy. Next we analyze its computational complexity.

4.3 Comparison of computational complexity

In this section, we compare the computational complexity of DTW and IP-DTW according to Eq. (9).

Since leave-one-out cross-validation is used in our experiments, there is no need to calculate the ratio r of every pair of sequences. For a data set, the average lengths of the original sequences and the segmented sequences are calculated respectively, labelled as $\bar{L}$ and $\bar{L^{'}}$ . Then we use the square of the average compression ratio ${CR}^{2} = (\bar{L^{'}} / \bar{L})^{2}$ (12) to compare the computational complexity of DTW and IP-DTW.

The comparison results of computational cost on 7 real-world data sets are illustrated in Table 5. On most data sets except for Pen Digits, IP-DTW can effectively reduce the dimensions of multivariate time series. Compared with DTW, the computational cost of IP-DTW decreases a lot, even less than a quarter of that of DTW.

Table 5

Comparison of computational complexity with DTW

Data sets	Variables	Length (before segment)	Av. length (before segment)	Length (after segment)	Av. length (after segment)	CR²
ASL	22	45∼136	57	14∼53	26	0.46²
ECG	2	39∼152	90	5∼21	9	0.10²
EEG	64	256	256	46∼75	58	0.23²
Japanese Vowels	12	7∼29	16	2∼6	3	0.21²
Libras	2	45	45	2∼21	8	0.19²
Pen Digits	2	8	8	8	8	1²
LP1	6	15	15	2∼10	4	0.24²
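As a worked instance of Eq. (12), the ECG and ASL rows of Table 5 give the values below. The average lengths in the table are rounded, so recomputing CR from them for other rows may differ slightly from the listed value.

```latex
\begin{align*}
\text{ECG:}\quad {CR}^{2} &= (9 / 90)^{2} = 0.10^{2} = 0.01 \\
\text{ASL:}\quad {CR}^{2} &= (26 / 57)^{2} \approx 0.46^{2} \approx 0.21
\end{align*}
```

In other words, on ECG the warping matrix of IP-DTW has about 1% of the cells that DTW fills on the original sequences, and on ASL about 21%.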

Furthermore, we record the average computation time of IP-DTW and DTW respectively, and use the ratio to compare their computational complexity. The results are shown in Fig. 8, which are consistent with the theoretical analysis.

Fig.8

Computational complexity on diverse data sets.

The experimental results in Section 4.2 and Section 4.3 indicate that IP-DTW can speed up dynamic time warping of multivariate time series, and can achieve satisfactory matching accuracy.

5 Conclusions

In this paper, we proposed a novel approach to speed up dynamic time warping of multivariate time series. Theoretically, the computational complexity of the proposed method was analysed. Extensive experiments were also conducted on real-world data sets. The experimental results indicate that the proposed method can effectively improve the efficiency of dynamic time warping for multivariate time series, and obtain satisfactory matching accuracy.

Furthermore, the proposed method can be naturally extended from multivariate time series to univariate time series. Several aspects still need to be studied further, for example, how to quickly find the optimal threshold MaxSegError, how to exploit the proposed method in similarity search on massive data sets, etc.

Acknowledgments

Thanks to the donors of the UCR time series repository. We would like to sincerely thank Eamonn Keogh, Chotirat (Ann) Ratanamahatana and others for their inspiring work in this field. The research is supported by the National Natural Science Foundation of China under Grant No. 61502521 and 71771094.

References

1. Wan and Y.-W. Si. A formal approach to chart patterns classification in financial time series. Information Sciences 411 (2017), 151--175.

2. Kim. Conditional alignment random fields for multiple motion sequence alignment. IEEE Transactions on Pattern Analysis and Machine Intelligence 35(11) (2013), 2803--2809.

3. Gharehbaghi, Ask, and Babic. A pattern recognition framework for detecting dynamic changes on cyclic time series. Pattern Recognition 48(3) (2015), 696--708.

4. Yang, Bing, Lin, et al. Research on short-term traffic flow prediction method based on similarity search of time series. Mathematical Problems in Engineering 1 (2014), 1--8.

5. Zhou, Jiang, and Lin. A novel finger and hand pose estimation technique for real-time hand gesture recognition. Pattern Recognition 49(1) (2016), 102--114.

6. Bankó and Abonyi. Correlation based dynamic time warping of multivariate time series. Expert Systems with Applications 39(17) (2012), 12814--12823.

7. Górecki. Classification of time series using combination of DTW and LCSS dissimilarity measures. Communications in Statistics-Simulation and Computation 47(1) (2018), 263--276.

8. Izakian, Pedrycz, and Jamal. Fuzzy clustering of time series data using dynamic time warping distance. Engineering Applications of Artificial Intelligence 39 (2015), 235--244.

9. Keogh and C.A. Ratanamahatana. Exact indexing of dynamic time warping. Knowledge and Information Systems 7(3) (2005), 358--386.

10. Chrysanthou, Englezakis, Prodromou, et al. An online and real-time fault detection and localization mechanism for network-on-chip architectures. ACM Transactions on Architecture and Code Optimization 13(2) (2016), 22:1--22:26.

11. T.-C. Fu. A review on time series data mining. Engineering Applications of Artificial Intelligence 24(1) (2011), 164--181.

12. Garthoff, Golosnoy, and Schmid. Monitoring the mean of multivariate financial time series. Applied Stochastic Models in Business and Industry 30(3) (2014), 328--340.

13. Pogorelc and Gams. Detecting gait-related health problems of the elderly using multidimensional dynamic time warping approach with semantic attributes. Multimedia Tools and Applications 66(1) (2013), 95--114.

14. Alon, Athitsos, Yuan, et al. A unified framework for gesture recognition and spatiotemporal gesture segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 31(9) (2009), 1685--1699.

15. Agrawal, Faloutsos, and Swami. Efficient similarity search in sequence databases. In: Proceedings of the 4th international conference foundations of data organization and algorithms, Chicago, USA, 1993, pp. 69--84.

16. D. J. Berndt and Clifford. Using dynamic time warping to find patterns in time series. In: KDD workshop, Seattle, USA, 1994, pp. 359--370.

17. Yang and Shahabi. An efficient k nearest neighbor search for multivariate time series. Information and Computation 205(1) (2007), 65--98.

18. Aghabozorgi, A. S. Shirkhorshidi, and T. Y. Wah. Time-series clustering--A decade review. Information Systems 53 (2015), 16--38.

19. Rakthanmanon, Campana, Mueen, et al. Addressing big data time series: Mining trillions of time series subsequences under dynamic time warping. ACM Transactions on Knowledge Discovery from Data 7(3) (2013), 10:1--10:31.

20. Vlachos, Kollios, and Gunopulos. Discovering similar multidimensional trajectories. In: Proceedings of the 18th International Conference on Data Engineering, San Jose, USA, 2002, pp. 673--684.

21. Chen, M. T. Özsu, and Oria. Robust and fast similarity search for moving object trajectories. In: Proceedings of the 2005 ACM SIGMOD international conference on Management of data, Baltimore, USA, 2005, pp. 491--502.

22. M.-Y. Chen and B.-T. Chen. Online fuzzy time series analysis based on entropy discretization and a Fast Fourier Transform. Applied Soft Computing 14 (2014), 156--166.

23. K.-P. Chan and A. W.-C. Fu. Efficient time series matching by wavelets. In: Proceedings of the 15th International Conference on Data Engineering, Sydney, Australia, 1999, pp. 126--133.

24. Li, Khan, and Prabhakaran. Real-time classification of variable length multi-attribute motions. Knowledge and Information Systems 10(2) (2006), 163--183.

25. Keogh, Chakrabarti, Pazzani, et al. Dimensionality reduction for fast similarity search in large time series databases. Knowledge and Information Systems 3(3) (2001), 263--286.

26. Li, Guo, and Qiu. Similarity measure based on piecewise linear approximation and derivative dynamic time warping for time series mining. Expert Systems with Applications 38(12) (2011), 14732--14743.

27.