SAX-quantile based multiresolution approach for finding heatwave events in summer temperature time series

Abstract

Time series pattern discovery is of great importance in a large variety of environmental and engineering applications, from supporting predictive models to helping to understand hidden underlying processes. This work develops a multiresolution time series method for extracting patterns in weather records, particularly temperature data. The topic is important because, given a warming climate, morbidity and mortality are expected to rise as heatwave frequency and intensity increase. By analysing summer temperature quantiles at different levels of coarseness, it was found that compounding models can contain a complete description of severe weather events. This new multiresolution quantile approach is developed as an extension of the symbolic aggregate approximation of the temperature time series, in which quantiles are computed at every stretch of the piecewise partition. The process is iterated at different scales of the partition, and it was found to be a very useful approach for finding patterns related to both heatwave periods and intensities. The method is successfully tested using real weather records from Brazil (Recife) and the UK (London), and it was found that in both locations heatwave intensity and frequency are increasing at a substantial rate. In addition, it was found that the rate of increase in intensity of the heatwaves is far outstripping the rate of increase in mean summer temperature: by a factor of 2 in Recife and a factor of 6 in London. The approach will be of use to those looking at the impact of future climates on civil engineering, water resources, energy use, agriculture and health care, or those looking for sustained extreme events in any time series.

Keywords

Multiresolution, quantile methods, time series analysis, weather forecast, heatwave events

1. Introduction

Multiresolution time series analysis [34] decomposes the data into components at different frequencies. The approach is often used so that ordinary predictive models can work with smaller time frames for the higher frequency components and, automatically, with larger time steps for the lower frequencies. Multiresolution time series analysis also helps noise removal processes and data compression [30].

This paper proposes a time series decomposition in various time-steps, thereby generating new data over different scales from a common source. As the proposal focuses on summer temperature time series, the desire is to better explain the extremes of the historical series in comparison to only using mean conditioned models. The work proposes a multiresolution quantile (MRQ) approach, computing the quantiles for each level of resolution in the time series.

In order to represent the warmer temperatures, the analysis needs to focus on both upper and lower quantiles. On one side, quantile regressions at upper values capture how weather interacts with the near-maximum temperatures during the day. On the other side, analysing lower quantiles provides insights about when these values are relatively high during the night and the potential factors affecting these circumstances. This is key, as it is known that the lack of a diurnal cycle was one of the reasons for the high mortality during the Paris heatwave of 2003 [18,31]. Short distances between upper and lower quantiles are of key importance in establishing criteria regarding the existence of heatwave events, conditional on steadily higher minimum temperatures [14]. The analysis of previous heatwaves is of particular interest for those looking to create future weather time series when looking at ventilation and overheating issues in the built environment [10,12,17], drought periods in agriculture [36], the management of water resources [3,8] and energy systems [11,13].

Various methods for the multiresolution analysis of time series are discussed in the literature. The discrete wavelet transform (DWT) is a natural way to reach such a decomposition and is useful in indexing time series [7]. However, different alternatives have been proposed for those looking for improved efficacy. For example, those based on fuzzy inductive reasoning [6] decompose the time series into a trend series and another complementary series describing the deviation from this trend. Alternatives based on symbolic aggregate approximation (SAX) have also been successfully developed [5,25]. The main advantage of SAX based methods is that they provide a suitable avenue for pattern recognition models associated with time series [28].

There are a few works related to multiresolution time series with respect to climate or weather time series, including solar irradiance and climate reconstructions [27] and a hybrid approach to removing noise from a climate time series [19]. However, the work introduced in this paper is closer to the hierarchy-of-clusters approach proposed for use in multiresolution image analysis [22]; it is a variation of 1d-SAX [23] that extends the more common SAX methodology to deal with time series quantiles. MRQ starts with the typical piecewise aggregate approximation (PAA) to divide the series into segments of equal length [16]. After that, the quantiles for each segment of data are saved for further analysis instead of only the average values, and SAX is run on these quantities. Finally, the differences between upper and lower quantiles at each resolution level are computed by a lower-bounding distance measure [20]. Possible heatwave events can then be detected based on the persistence of minimum distances at various time-based resolution levels.

2. Indexing and mining time series for multiresolution purposes

Indexing time series is traditionally used as a way to efficiently store a large temporal database [15]. However, its use has been expanded to allow the extraction of patterns in time series. While common indexing and mining models are based on measures formed from the average, the method proposed here is quantile based: this is to allow for better treatment of the higher temperatures.

2.1. Multiresolution based on average values

Wavelets are mathematical functions that represent data or other functions in terms of the averages and differences to a prototype function [2]. An interesting wavelet property w.r.t. multiresolution approaches is that the first coefficients forming a wavelet expression contain an overall approximation of the time series under study, while additional coefficients take into account data characteristics in greater detail [37]. This endows the wavelets with the properties required for a suitable methodology for investigating time series at various resolutions. Haar’s discrete wavelet transform [7] is a widely used case in which the prototype function follows an orthonormal system, at discrete times, for the space of square-integrable functions on the unit interval $[0, 1]$. Haar’s DWT is the simplest case of wavelets but can be developed to efficiently capture any change in the time series average. Furthermore, different DWTs can be used as bases and combined in order to produce a better approximation of the original time series, with more bases giving a higher resolution (see Fig. 1).

Fig. 1.

Example of a combination of various wavelet resolution levels for a time series.

Haar’s DWTs are ultimately made up of step functions. Approaches based on piecewise aggregate approximation (PAA), which divides the time series dataset into equally spaced segments, provide similar results to Haar’s DWT in terms of time series decomposition. Consequently, DWT results and PAA based outcomes are equivalent when computing distances between time series [16]. The efficient results found by using PAA based methods, together with the ease with which they can be made part of further data mining processes, represent a key advantage for the work presented here.
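As an illustration of the PAA step described above, the following is a minimal sketch in Python. The function name `paa` and the toy readings are assumptions made for this example and are not taken from the original study.

```python
from statistics import mean


def paa(series, segment_length):
    """Piecewise aggregate approximation: the mean of each equal-length segment."""
    if segment_length <= 0 or len(series) % segment_length != 0:
        raise ValueError("series length must be a multiple of segment_length")
    return [
        mean(series[i:i + segment_length])
        for i in range(0, len(series), segment_length)
    ]


# Hypothetical 8-hourly temperatures: two days of three readings each.
temps = [20.0, 26.0, 23.0, 22.0, 30.0, 26.0]
daily_means = paa(temps, 3)  # one step-function level per day
```

Each returned value is the height of one step of the piecewise-constant approximation, which is why PAA behaves like a Haar decomposition truncated at a given resolution.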

2.2. Multiresolution based on quantiles

The symbolic aggregate approximation of time series (SAX) [20] represents the time series as a sequence of symbols. It was primarily developed to reduce the dimensionality of a numerical series into a short chain of characters. However, it has been found useful for data mining tasks such as: indexing [32], clustering [1,24], and classification [38]. SAX is based on three steps:

1. Divide a time series into segments of length L.

2. Compute the average of the time series on each segment.

3. Represent the average values as a symbol from an alphabet of size N.

The time series division is based on a previous PAA phase. SAX is based on the assumption that time series values follow a Gaussian distribution for each of the segments into which PAA divided the series. The conversion of the average values into a symbol makes use of ( $N - 1$ ) breakpoints that divide the area under the Gaussian distribution into N equi-probable areas and then the average value per segment is quantized according to the areas of this distribution [20]. As a result a “word” is composed containing as many letters as segments in the PAA.
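The three SAX steps and the Gaussian breakpoints can be sketched as follows. This is an illustrative sketch only: the z-normalisation of the series before segmenting is the usual SAX convention and is assumed here, and the name `sax_word` and the toy series are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax_word(series, segment_length, alphabet="abcd"):
    """Encode a series as one letter per segment using equi-probable Gaussian bins."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        raise ValueError("a constant series cannot be z-normalised")
    # Assumption: z-normalise first so the standard Gaussian breakpoints apply.
    z = [(x - mu) / sigma for x in series]
    # Steps 1 and 2: split into segments and average each one (PAA).
    segment_means = [
        mean(z[i:i + segment_length])
        for i in range(0, len(z) - segment_length + 1, segment_length)
    ]
    # Step 3: N - 1 breakpoints cut the Gaussian into N equi-probable areas.
    n_symbols = len(alphabet)
    gaussian = NormalDist()
    breakpoints = [gaussian.inv_cdf(k / n_symbols) for k in range(1, n_symbols)]
    return "".join(alphabet[bisect_right(breakpoints, m)] for m in segment_means)


# Hypothetical toy series with four clearly separated levels.
toy = [1.0] * 4 + [2.0] * 4 + [3.0] * 4 + [4.0] * 4
word = sax_word(toy, segment_length=4)
```

For an alphabet of size 4 the breakpoints are approximately -0.67, 0 and 0.67, so the four levels of the toy series fall into four different letters.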

This alphabetic approach is then useful in further analyses using methods such as hashing [35], variations of Markov models [4,21], and suffix tree approaches [29]. In addition, it is naturally associated with a sliding-window approach in which every time-frame is encoded by a letter. Figure 2 represents the output of a single use of the SAX process for a temperature time series from London (April to September, 1989).

Fig. 2.

Example of the SAX conversion process for a time series with length 549, w = 9 and resolution 4 (a, b, c, d). Temperature variations from the long term baseline.

The SAX variation known as 1d-SAX [23] extends the usual alphabetic symbols to a system able to contain information about the average and the trend of the series within a segment. A natural extension of 1d-SAX takes median values on each interval, and provides similar results to linear regression. The new MRQ method proposed generalizes this approach by computing upper and lower quantiles at every segment of the PAA partition. The PAA partition becomes a bi-level PAA-quantile based partition (PAA-Q), and we have termed the modified SAX approach SAX-Quantile, or SAX-Q for short. PAA-Q is defined in a similar way to the single PAA: for a time series of length n, the data is divided into w equal-sized frames and two vectors are computed, $Q_{u} = q_{u 1}, \dots, q_{u w}$ and $Q_{l} = q_{l 1}, \dots, q_{l w}$, corresponding to the upper and lower quantiles of each frame respectively. The time series dimension is thereby reduced from n to $2 \times w$. The vectors $Q_{u}$ and $Q_{l}$ become a suitable data representation taking into account the extremes of the time series instead of the average values.
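A minimal sketch of the PAA-Q representation is given below, assuming a linear-interpolation quantile estimator (the text does not state which estimator is used). The names `quantile` and `paa_q` and the toy series are hypothetical.

```python
def quantile(values, q):
    """Quantile of a non-empty list by linear interpolation, with 0 <= q <= 1."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])


def paa_q(series, w, alpha_l=0.05, alpha_u=0.95):
    """Return (Q_u, Q_l): upper and lower quantiles of each of w equal frames."""
    if w <= 0 or len(series) % w != 0:
        raise ValueError("series length must be a multiple of w")
    frame_length = len(series) // w
    frames = [
        series[i:i + frame_length]
        for i in range(0, len(series), frame_length)
    ]
    q_u = [quantile(frame, alpha_u) for frame in frames]
    q_l = [quantile(frame, alpha_l) for frame in frames]
    return q_u, q_l


# Hypothetical series of length n = 22 reduced to 2 x w = 4 values.
toy = [float(v) for v in range(0, 11)] + [float(v) for v in range(100, 111)]
q_u, q_l = paa_q(toy, w=2)
```

A short distance between `q_u[i]` and `q_l[i]` at high values is the signature the method looks for: a frame in which even the coolest readings stay warm.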

This multiresolution approach facilitates working at the different levels of coarseness of the PAA-Q used in SAX-Q to allow for posterior analysis [26]. MRQ focuses specifically on quantile information for discovering patterns, in our case of steadily high summer temperatures at upper and lower quantiles, i.e. heatwaves. We believe that having enough variety in the coarseness levels of the PAA-Q partition, and computing upper and lower quantiles at each resolution level, allows heatwave events to be identified successfully.

First, with larger PAA-Q segments, the series is filtered to allow the rapid identification of patterns associated with high temperatures even at lower quantiles. This is easily achieved by splitting the series using a binary SAX codification of values lower and higher than a certain threshold. For the set of segments of interest (those encoded as high temperatures) we approach a finer PAA-Q resolution. The overall MRQ process is:

1. Set PAA to the coarsest scale, with segments $k_{1}$ days in length

(a) Compute upper and lower quantiles at levels $\alpha_{u}$ and $\alpha_{l}$, respectively

(b) Apply SAX-Q, coding as ‘1’ those segments encoded with higher alphabet values for both lower and upper quantiles; ‘0’ otherwise

2. Target the set of PAA-Q segments encoded as ‘1’

(a) Define ‘runs’ as sub-series of consecutive PAA-Q stretches coded as ‘1’

(b) Set PAA-Q for the runs to a finer scale: $k_{2}$ days in length (with $k_{2} < k_{1}$)

(c) Go to 1. (a)
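The overall loop can be sketched as below. This is one possible reading of the process rather than the authors' implementation. It assumes a two-letter alphabet, so that the single Gaussian breakpoint sits at the mean of each quantile series; that the thresholds are computed over the whole series at each scale; and that segment lengths are given in samples rather than days. All function names are hypothetical.

```python
from itertools import groupby
from statistics import mean


def quantile(values, q):
    """Quantile of a non-empty list by linear interpolation, with 0 <= q <= 1."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])


def binary_sax_q(series, k, alpha_l, alpha_u):
    """Code each length-k segment as 1 when both quantiles are above average."""
    frames = [series[i:i + k] for i in range(0, len(series) - k + 1, k)]
    q_l = [quantile(frame, alpha_l) for frame in frames]
    q_u = [quantile(frame, alpha_u) for frame in frames]
    # Assumption: binary alphabet, so the breakpoint is the mean of each series.
    t_l, t_u = mean(q_l), mean(q_u)
    return [int(lo > t_l and up > t_u) for lo, up in zip(q_l, q_u)]


def runs_of_ones(codes, k):
    """Return (start, end) sample indices for each run of consecutive 1s."""
    runs, index = [], 0
    for value, group in groupby(codes):
        length = len(list(group))
        if value == 1:
            runs.append((index * k, (index + length) * k))
        index += length
    return runs


def mrq(series, scales, alpha_l=0.05, alpha_u=0.95):
    """Refine heatwave candidates over decreasing segment lengths."""
    candidates = [(0, len(series))]
    for k in scales:
        codes = binary_sax_q(series, k, alpha_l, alpha_u)
        # Keep a '1' only if its segment lies inside a run from the coarser scale.
        kept = [
            code if any(s <= i * k and (i + 1) * k <= e for s, e in candidates) else 0
            for i, code in enumerate(codes)
        ]
        candidates = runs_of_ones(kept, k)
    return candidates


# Hypothetical series: a cool stretch, a mild stretch, then a hot stretch.
series = [10.0] * 12 + [12.0] * 6 + [30.0] * 6
candidates = mrq(series, scales=[12, 6])
```

In this toy case the coarse scale flags the second half of the series, and the finer scale narrows the candidate to the final six samples.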

3. Patterns extraction from summer temperatures

Two meteorological databases in Brazil and UK have been analysed. Both of them contain 50 years of data taken at 8 hour intervals during the period 1961 to 2010.

3.1. Brazil summer database

Table 1
Average RMSE in the PAA approximation of the Recife 1961–2010 8 hourly temperature time series at different segment sizes

Segment length	540	180	60	30	9	3
Avg. RMSE	1.72	1.73	1.68	1.62	1.58	1.50

The Brazil weather database was collected at Curado weather station in Recife (Pernambuco State, Brazil). For each one of the 50 years analysed, the months of October to March (of the following year) were selected to represent the summer period. The average temperature for this period is 27.59°C with an associated standard error of 1.71. The choice of the number of segments for the first PAA basis partition takes into account a suitable length for every segment and the error related to the approximation (see Table 1). Thus, for a segment length of size 30 (10 days) the average root mean square error (RMSE) over the 50 years is only $1.62 / 1.58 = 1.02$ times higher than the average RMSE provided by working with a length of size 9 (3 days) and $1.62 / 1.50 = 1.08$ times higher than working with segments of size 3 (1 day). This is better than working with a coarser partition based on segments of size 180 (60 days), which have an average RMSE 1.09 and 1.15 times higher than partitions of 3 days and 1 day respectively.
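The kind of error reported in Table 1 can be computed along the following lines. This sketch assumes the RMSE is taken between each observation and the mean of the segment containing it, which the text does not state explicitly; the function name and the toy series are hypothetical.

```python
from math import sqrt
from statistics import mean


def paa_rmse(series, segment_length):
    """RMSE between a series and its piecewise-constant PAA approximation."""
    squared_errors = []
    for i in range(0, len(series), segment_length):
        segment = series[i:i + segment_length]
        level = mean(segment)  # height of the PAA step for this segment
        squared_errors.extend((x - level) ** 2 for x in segment)
    return sqrt(mean(squared_errors))


# Hypothetical toy series; compare a coarse and a fine partition.
toy = [1.0, 3.0, 2.0, 6.0]
coarse, fine = paa_rmse(toy, 4), paa_rmse(toy, 2)
ratio = coarse / fine  # how many times larger the coarse error is
```

Averaging such values over the 50 summers for each candidate segment length, and comparing the ratios, gives a simple basis for choosing the coarsest partition whose error is still acceptable.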

Figure 3 shows the performance of SAX-Q for a given PAA-Q partition and one selected summer. The summer of 1989–1990 is chosen for consistency with the choice of using the summer of 1989 for the UK analysis. Upper and lower quantiles, together with the median (quantile 50%) are shown in Fig. 3.

Fig. 3.

SAX-Q conversion process for Recife’s temperature time series with length 540, w = 18, resolution 4 (a, b, c, d), and quantiles 0.05 and 0.95. Temperature variations in summer 1989–1990 are given w.r.t. the baseline period of 1961–2010.

After obtaining the output of the MRQ process for Recife’s summer temperatures, it is possible to associate the warmest temperatures with the beginning and the end of the summer period. Figure 4 shows the histogram of the normalised probabilities regarding the frequencies of 2-level SAX-Q in the observed time series. This shows that at the beginning of the summer there is often a brief period of high temperatures, and that high temperatures also occur from the end of December until February.

Fig. 4.

Probability of warmest temperatures during Recife summers (1961–2010).

Figure 5(a) shows how the frequency of periods of high temperature increases in the latter years of the study period. This figure uses a colour scale in which light red represents no heatwaves that year, red is 1 period of high temperatures, and dark red is 2 or more periods of high temperatures. It can be seen how the ratio of years with heatwaves to years without is 10/13 in the first half of data. However, post year 1985, the situation is completely different, with two or more heatwaves in some years and the ratio of years with heatwaves to years without increasing to 14/7.

Fig. 5.

Change in frequency and intensity of heatwave candidates. Recife 1961–2005. (a) Frequency of warmer temperatures. (b) Trend of average temperatures associated with heatwave events.

Figure 5(b) complements the information given in Fig. 5(a) by showing how the mean temperature during each heatwave changes over the 50 years of the study. It is clear that there is an increasing trend in the mean, so not only is the number of heatwaves increasing, their magnitude is also increasing, with the mean heatwave temperature being 1.5°C higher in 2005 than at the start of the data. Applying linear regression suggests an increase in heatwave magnitude of 0.4°C per decade; this should be compared to the overall temperature trend during the summer period, which is less than 0.2°C per decade.
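A per-decade trend of this kind can be obtained from an ordinary least-squares fit of heatwave mean temperature against year. The sketch below uses made-up values chosen only to illustrate the calculation; they are not the Recife data, and the function name is hypothetical.

```python
from statistics import mean


def trend_per_decade(years, temperatures):
    """Least-squares slope of temperature against year, in degrees per decade."""
    year_mean, temp_mean = mean(years), mean(temperatures)
    covariance = sum(
        (y - year_mean) * (t - temp_mean) for y, t in zip(years, temperatures)
    )
    variance = sum((y - year_mean) ** 2 for y in years)
    return 10.0 * covariance / variance


# Hypothetical illustration only: one heatwave mean temperature per decade.
years = [1965, 1975, 1985, 1995, 2005]
heatwave_means = [28.0, 28.4, 28.8, 29.2, 29.6]
slope = trend_per_decade(years, heatwave_means)
```

Dividing the heatwave slope by the slope fitted to the whole summer series gives the kind of ratio quoted when comparing heatwave intensity with mean summer warming.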

3.2. UK summer database

The UK weather data was collected at the Heathrow weather station (London, UK) [33]. For each one of the 50 years analysed, the months of April to September were selected to represent the summer period. The average temperature for this period is 15.78°C with an associated standard error of 4.71. The temperature is clearly below Recife’s values. However, the proposed MRQ process automatically adapts itself to locally set conditions for screening heatwaves. The choice of the number of segments for the first PAA basis partition takes into account a suitable length for every segment and the error related to the approximation (see Table 2). Thus, for a segment length of size 61 (20 days) the average RMSE over the 50 years is $3.57 / 2.96 = 1.20$ times higher than the average RMSE provided by working with a length of size 9 (3 days) and 1.33 times higher than working with segments of size 3 (1 day). This is better than working with a coarser partition based on segments of size 183 (60 days), which have an average RMSE 1.36 and 1.52 times higher than partitions of 3 days and 1 day respectively.

Table 2
Average RMSE in PAA approximation of time series at different segment sizes. London 1961–2010

Segment length	549	183	61	9	3
Avg. RMSE	4.89	4.05	3.57	2.96	2.68

Figure 6 shows the performance of SAX-Q for a given PAA-Q partition and the summer of 1989. The year 1989 was chosen as it represents a moderately warm year and is used as a standard warm summer by the Chartered Institution of Building Services Engineers (CIBSE, UK) [9].

Fig. 6.

SAX-Q conversion process for London’s temperatures time series with length 549, w = 9, resolution 4 (a, b, c, d), and quantiles 0.05 and 0.95. Temperature variations in summer 1989 w.r.t. the baseline period 1961–2010.

After obtaining the output of the MRQ process for London it is found that the warmest temperatures typically occur during the last week of July and the first days of August. Figure 7 shows the histogram of the normalised probabilities of 2-level SAX-Q in the observed time series.

Fig. 7.

Probability of warmest temperatures during London summers (1961–2010).

Figure 8(a) shows how the heatwave frequency changes over the study period. In the first 25 years the ratio of years with heatwaves to years without is 18/7. However, post year 25 (1986), the ratio of years with heatwaves to years without increases to 21/4. This increase in heatwave frequency is even higher than that observed for the previous case-study of Recife.

Figure 8(b) shows how the mean temperature associated with each heatwave has increased, with the mean heatwave temperature being about 4°C higher towards the end of the series than near the beginning. Applying linear regression suggests an increase in heatwave magnitude of nearly 1°C per decade; this should be compared to the overall temperature trend during the summer period, which is less than 0.15°C per decade.

Fig. 8.

Change in frequency and intensity of heatwave candidates. London 1961–2010. (a) Frequency of warmer temperatures. (b) Trend of average temperatures associated with heatwave events.

It is interesting to note the difference in the values of the candidate heatwave temperatures identified by the process for Recife and London. While the candidate temperatures for heatwaves in Recife have a minimum value over 27°C, in London 18°C at night is enough to indicate the temperature is potentially part of a heatwave – a 9°C difference between the two cities. However, the average temperature of the selected dataset in Recife is around 29°C, compared with 24°C in London, only a 5°C difference.

The proposed method is blind to the temperatures themselves and suggests a possible heatwave based only on the variability and the size of the variation; it is therefore useful in that it is universal. The corollary, however, is that if given an unsuitable time series it might identify heatwaves where the maximum temperature reached is still below any common definition of a heatwave. A practical application would hence need a heuristic for rejecting unsuitable time series or locations, or one applied after identifying the heatwave; for example, that the peak temperature must be at least 28°C. However, this constant would depend greatly on the location: 28°C is a high temperature in London, for example, but not in Saudi Arabia.
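Such a post-hoc heuristic could be as simple as the filter sketched below. The representation of candidates as (start, end) index pairs, the function name and the toy values are assumptions for illustration; the 28°C threshold is the example given in the text and would need to be set per location.

```python
def filter_candidates(series, candidates, min_peak):
    """Keep only candidate periods whose peak temperature reaches min_peak."""
    return [
        (start, end)
        for start, end in candidates
        if max(series[start:end]) >= min_peak
    ]


# Hypothetical temperatures and two candidate periods given as index ranges.
temps = [20.0, 25.0, 29.0, 22.0, 24.0, 26.0]
candidates = [(0, 3), (3, 6)]
accepted = filter_candidates(temps, candidates, min_peak=28.0)
```

Only the first period is kept here because the second never reaches the threshold.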

4. Conclusions

The paper presents a new method based on a fundamental modification of the SAX approach for time series, called SAX-Q, which focuses on the symbolic approximation of the quantiles instead of using the mean. Its iterative use leads to an automatic detection process for periods of interest w.r.t. time series extremes.

Multi-resolution analysis appears to be an ideal tool to uncover relationships at different temporal scales in weather data that are normally hidden by background noise. A posterior analysis of a multi-resolution SAX-Q has been carried out on the summer temperature records in Brazil and the UK, and heatwaves have been identified in these two very different climates, as the methodology automatically adapts itself to local conditions. Fundamental questions about the timing, frequency and intensity of high temperatures have been answered for both locations. Most importantly, in both locations heatwave intensity and frequency are increasing at a substantial rate. In addition, it was found that the rate of increase in intensity of the heatwaves is far outstripping the rate of increase in mean summer temperature: by a factor of 2 in Recife and a factor of 6 in London.

An interesting extension to this work would come from the relaxation of the PAA-Q partition proposed, by instead using a dynamical basis for segments or change points for the initial SAX-Q partition of the time series. This might provide better insights about the length of the hottest periods lying in the data.

Acknowledgements

This research has been performed under the COLBE (The Creation of Localized Current and Future Weather for the Built Environment) project funded by the UK’s Engineering and Physical Sciences Research Council (EP/M021890/1).

References

1. Aghabozorgi and T.Y. Wah, Clustering of large time series datasets, Intelligent Data Analysis 18(5) (2014), 793–817.

2. Bakhtazad, Palazoglu and J.A. Romagnoli, Process data de-noising using wavelet transform, Intelligent Data Analysis 3(4) (1999), 267–285. doi:10.1016/S1088-467X(99)00023-2.

3. B.M. Brentan, Luvizotto Jr., Herrera, Izquierdo and Pérez-García, Hybrid regression model for near real-time urban water demand forecasting, Journal of Computational and Applied Mathematics 309(1) (2017), 532–541. doi:10.1016/j.cam.2016.02.009.

4. Cassisi, Prestifilippo, Cannata, Montalto, Patanè and Privitera, Probabilistic reasoning over seismic time series: Volcano monitoring by hidden Markov models at Mt. Etna, in: Pure and Applied Geophysics, 2016, pp. 1–22.

5. Castro and P.J. Azevedo, Multiresolution motif discovery in time series, in: SDM, SIAM, 2010, pp. 665–676.

6. F.E. Cellier and À. Nebot, Multi-resolution time-series prediction using fuzzy inductive reasoning, in: Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on, Vol. 2, IEEE, 2004, pp. 1621–1624.

7. K.-P. Chan and A.W.-C. Fu, Efficient time series matching by wavelets, in: Data Engineering, Proceedings, 15th International Conference on, IEEE, 1999, pp. 126–133.

8. Christierson, J.-P. Vidal and S.D. Wade, Using UKCP09 probabilistic climate information for UK water resource planning, Journal of Hydrology 424 (2012), 48–67. doi:10.1016/j.jhydrol.2011.12.020.

9. CIBSE, Design Summer Years for London – CIBSE TM49, 2014.

10. Coley, Kershaw and Eames, A comparison of structural and behavioural adaptations to future proofing buildings against higher temperatures, Building and Environment 55 (2012), 159–166. doi:10.1016/j.buildenv.2011.12.011.

11. R.R.B. de Aquino, H.T.V. Gouveia, M.M.S. Lira, A.A. Ferreira, O.N. Neto and M.A. Carvalho, Models based on neural networks and neuro-fuzzy systems for wind power prediction using wavelet transform as data preprocessing method, in: Engineering Applications of Neural Networks: EANN 2012, London, UK, Springer, Berlin/Heidelberg, 2012, pp. 272–281.

12. Eames, Kershaw and Coley, A comparison of future weather created from morphed observed weather and created by a weather generator, Building and Environment 56 (2012), 252–264. doi:10.1016/j.buildenv.2012.03.006.

13. J.O. Ebinger, Climate Impacts on Energy Systems: Key Issues for Energy Sector Adaptation, World Bank Publications, 2011.

14. Herrera, Eames, Ramallo-Gonzalez, Liu and Coley, Quantile regression ensemble for summer temperatures time series and its impact on built environment studies, in: Proceedings of International Environmental Modelling and Software Society (IEMSS), 2016.

15. Keogh, Indexing and mining time series data, in: Encyclopedia of GIS, Springer, 2008, pp. 493–497. doi:10.1007/978-0-387-35973-1_598.

16. Keogh, Chakrabarti, Pazzani and Mehrotra, Dimensionality reduction for fast similarity search in large time series databases, Knowledge and Information Systems 3(3) (2001), 263–286. doi:10.1007/PL00011669.

17. Kershaw, Eames and Coley, Comparison of multi-year and reference year building simulations, Building Services Engineering Research and Technology 31(4) (2010), 357–369. doi:10.1177/0143624410374689.

18. Laaidi, Zeghnoun, Dousset, Bretin, Vandentorren, Giraudet and Beaudeau, The impact of heat islands on mortality in Paris during the August 2003 heat wave, Environmental Health Perspectives 120(2) (2012), 254.

19. Lee and T.B.M.J. Ouarda, An EMD and PCA hybrid approach for separating noise from signal, and signal in climate change detection, International Journal of Climatology 32(4) (2012), 624–634. doi:10.1002/joc.2299.

20. Lin, Keogh, Wei and Lonardi, Experiencing SAX: A novel symbolic representation of time series, Data Mining and Knowledge Discovery 15(2) (2007), 107–144. doi:10.1007/s10618-007-0064-z.

21. Lin and Li, Finding structural similarity in time series data using bag-of-patterns representation, in: Scientific and Statistical Database Management, Springer, 2009, pp. 461–477. doi:10.1007/978-3-642-02279-1_33.

22. Lin, Vlachos, Keogh and Gunopulos, Multiresolution clustering of time series and application to images, in: Multimedia Data Mining and Knowledge Discovery, Springer, 2007, pp. 58–79. doi:10.1007/978-1-84628-799-2_4.

23. Malinowski, Guyet, Quiniou and Tavenard, 1d-SAX: A novel symbolic representation for time series, in: Advances in Intelligent Data Analysis XII, Springer, 2013, pp. 273–284. doi:10.1007/978-3-642-41398-8_24.

24. Martínez-Álvarez, Clustering preprocessing to improve time series forecasting, AI Communications 24(1) (2011), 97–98.

25. Megalooikonomou, Wang, Li and Faloutsos, A multiresolution symbolic representation of time series, in: Data Engineering, ICDE 2005, Proceedings, 21st International Conference on, IEEE, 2005, pp. 668–679.

26. Mueen and Chavoshi, Enumeration of time series motifs of all lengths, Knowledge and Information Systems 45(1) (2015), 105–132. doi:10.1007/s10115-014-0793-4.

27. H.-S. Oh, C.M. Ammann, Naveau, Nychka and B.L. Otto-Bliesner, Multi-resolution time series analysis applied to solar irradiance and climate reconstructions, Journal of Atmospheric and Solar-Terrestrial Physics 65(2) (2003), 191–201. doi:10.1016/S1364-6826(02)00291-2.

28. Rajaraman and J.D. Ullman, Mining of Massive Datasets, Vol. 1, Cambridge University Press, Cambridge, 2012.

29. Rasheed, Alshalalfa and Alhajj, Efficient periodicity mining in time series databases using suffix trees, Knowledge and Data Engineering, IEEE Transactions on 23(1) (2011), 79–94. doi:10.1109/TKDE.2010.76.

30. J.D. Scargle, Wavelet and other multi-resolution methods for time series analysis, in: Statistical Challenges in Modern Astronomy II, Springer, 1997, pp. 333–347. doi:10.1007/978-1-4612-1968-2_19.

31. Schaeffer, de Crouy-Chanel, Wagner, Desplat and Pascal, How to estimate exposure when studying the temperature-mortality relationship? A case study of the Paris area, International Journal of Biometeorology 60(1) (2016), 73–83. doi:10.1007/s00484-015-1006-x.

32. Toshniwal, Feature extraction from time series data, Journal of Computational Methods in Sciences and Engineering 9(1, 2S1) (2009), 99–110.

33. UK Meteorological Office, Met Office Integrated Data Archive System (Midas), land and marine surface stations data (1853-current), NCAS British Atmospheric Data Centre (accessed 01/10/14), 2012.

34. Wang, Megalooikonomou and Faloutsos, Time series analysis with multiple resolutions, Information Systems 35(1) (2010), 56–74. doi:10.1016/j.is.2009.03.006.

35. Wang, Mueen, Ding, Trajcevski, Scheuermann and Keogh, Experimental comparison of representation methods and distance measures for time series data, Data Mining and Knowledge Discovery 26(2) (2013), 275–309. doi:10.1007/s10618-012-0250-5.

36. R.L. Wilby, Orr, Watts, R.W. Battarbee, P.M. Berry, Chadd, S.J. Dugdale, M.J. Dunbar, J.A. Elliott, Extence et al., Evidence needed to manage freshwater ecosystems in a changing climate: Turning adaptation principles into practice, Science of the Total Environment 408(19) (2010), 4150–4164. doi:10.1016/j.scitotenv.2010.05.014.

37. Wojdyłło, Wavelets and Mallat’s multiresolution analysis, Fundamenta Informaticae 34(4) (1998), 469–474.

38. Yuan, Wang, Han and Sun, A lazy associative classifier for time series, Intelligent Data Analysis 19(5) (2015), 983–1002. doi:10.3233/IDA-150754.
