Abstract
Detecting anomalies in patterns of sensor data is important in many practical applications, including domestic activity monitoring for Active Assisted Living (AAL). How to represent and analyse these patterns, however, remains a challenging task, especially when data is relatively scarce and an explicit model is required to be fine-tuned for specific scenarios. This paper, therefore, presents a new approach for temporal modelling of long-term human activities with smart-home sensors, which is used to detect anomalous situations in a robot-assisted environment. The model is based on wavelet transforms and used to forecast smart sensor data, providing a temporal prior to detect unexpected events in human environments. To this end, a new extension of Hybrid Markov Logic Networks has been developed that merges different anomaly indicators, including activities detected by binary sensors, expert logic rules, and wavelet-based temporal models. The latter in particular allows the inference system to discover deviations from long-term activity patterns, which cannot be detected by simpler frequency-based models. Two new publicly available datasets were collected using several smart-sensors to evaluate the approach in office and domestic scenarios. The experimental results demonstrate the effectiveness of the proposed solutions and their successful deployment in complex human environments, showing their potential for future smart-home and robot integrated services.
Keywords
Introduction
Modelling temporal series is important to represent sensor data in robotics and many other technical applications, for example to monitor and understand human behaviours with wireless sensor networks in smart environments [34]. The nature of the process, the amount of data required and the extent of the forecasting challenge determine the kind of model finally chosen. Temporal models should be able to capture the frequencies of important event occurrences – e.g. the daily routine of activities performed by an elderly person at home [4]. Methods for frequency analysis (e.g. the Fourier transform) can reveal periodic patterns in the sensor data but, when these patterns occur only within specific time intervals, such methods fail to determine when the periodicities start and end. In this paper, therefore, we propose a new wavelet-based method that is suitable for modelling sparse periodic and/or very short events in sensor data.
Wavelet analysis has been successfully used for time series classification [23], showing the advantage of simultaneously providing temporal and frequency information of a signal with very little loss of information, and it is therefore more powerful than standard Fourier analysis in capturing and forecasting sensor data for many real-world applications. One such application is Active Assisted Living (AAL) and Assistive Robotics, where good temporal representations of human activities can enable the implementation of many useful well-being services [14].
However, the use of wavelet-based techniques for activity monitoring in AAL scenarios has not been sufficiently investigated in previous literature. To this end, the wavelet-based temporal model proposed in this paper can be used to identify patterns of human activity from the smart sensors of a robot-assisted environment, detecting anomalies that could trigger opportune robot interventions (e.g. if an elderly user spends more time in bed than usual). In this context, anomalies are defined as deviations from typical human activities. The latter have a significant temporal component, which is often periodic, but with occasional variations and possibly very short-term events (e.g. repeatedly opening/closing the fridge in the morning, but only on weekdays). In particular, we adopt the anomaly definition by [22], which considers the amount of motion in specific locations as a normalized entropy beyond some given thresholds. Note that the term “motion” is used in a broad sense to include the activation of various binary sensors, such as passive infrared (PIR) motion detectors or contact sensors on doors, cupboards, etc. We also refer to this type of motion in the environment as activity level, or intensity.

Wavelet-based anomaly detection system. The expected and actual normalized entropies from wavelet-based temporal models and sensor data, respectively, are compared by a HMLN-based inference module that contains expert rules. In particular, for N sensors (blue boxes), N wavelet-based temporal models (green boxes) are created and used to predict the expected output of each sensor. Real and expected outputs are used to compute, respectively, the real and expected normalized entropies (yellow blocks), which in turn represent real and expected levels of human activity in the environment. These two activity levels, together with the original sensor data, are finally processed by a rule-based hybrid Markov logic network (HMLN, orange block) to detect potential anomalies.
In this work, we apply our wavelet-based representation of human activities to a new anomaly detection system for AAL (see Fig. 1), complementing and extending our previous assistive robotic system [14]. The system predicts sensor data and detects anomalies when the expected intensity of human activities differs from the real one. In particular, given a set of smart-home sensors (i.e. motion detectors, temperature, light and contact sensors), we build accurate temporal models to represent and forecast their expected output. Then, using an entropy-based method [22], we estimate the current and expected levels of human activity. These two levels are finally compared by a new inference system, based on a Hybrid Markov Logic Network (HMLN) [54], to detect and integrate “statistical” anomalies with user-defined expert rules.
The paper includes three main contributions:
First, we describe a new technique for temporal modelling of (long-term) human activities based on wavelet transforms. Among its possible applications, this wavelet-based temporal model enables the forecasting of smart-sensor signals for the detection of potential anomalies, i.e. human activities that deviate significantly from the norm. A software implementation of this temporal modelling tool is made publicly available.
Second, we propose an automatic system for anomaly detection that uses a HMLN to combine three sources of information about human activities, namely i) actual entropy level from smart-home sensors, ii) expected entropy from wavelet-based temporal models, and iii) expert knowledge in the form of logic rules.
Finally, we present extensive experimental results based on two large datasets, one previously recorded in an office environment [22] and a new one from a real elderly home, which we also made publicly available. These datasets were recorded in MongoDB format [9] for easy access and re-usability by the scientific community.
The remainder of the paper is organized as follows. Section 2 reviews state-of-the-art methods for temporal modelling and anomaly detection with smart-home sensors, including relevant public datasets. Section 3 briefly introduces the wavelet transform and describes the respective temporal models of sensor data. Section 4 describes the design of the HMLN-based inference system and its expert rules to analyse and detect anomalies in human activities. Section 5 illustrates the architecture and practical implementation of the anomaly detection system. Section 6 presents datasets and experiments to validate the effectiveness of the temporal models and the anomaly detection in office and AAL scenarios. Finally, Section 7 discusses advantages and disadvantages of the proposed approach, suggesting directions for future work in this area.
Temporal modelling is widely used to detect regular patterns in data. From time series analysis, a relevant tool is the autoregressive integrated moving average model (ARIMA) used, for example, for applications as varied as stochastic wind power modelling [8] and, more recently, for the prediction of human behaviours in smart-home settings [53].
The main problem with these models, though, is that they are only suitable for relatively short temporal windows or known temporal trends [58]. Other non-linear techniques, such as Gaussian Processes, could theoretically achieve the full reconstruction of signals from mixture models. For example, Ghassemi & Deisenroth [25] use periodic Gaussian Processes for long-term forecasting. Similarly, recent work by [27] has used a non-parametric prognostic framework for individualized event prediction based on Poisson processes with a multivariate Gaussian convolution method. These models are typically robust against model instabilities, but they require heavy computational processes.
A technique called FreMEn (Frequency Map Enhancement) has also been proposed for spatio-temporal representations of robot environments in long-term scenarios [30]. It uses Fourier analysis to extract periodicities in sensor data, in combination with a Bernoulli distribution or Poisson processes [29] to represent binary information states. FreMEn is a simple yet effective modelling tool, but it is not suitable to describe sparse or very short events.
Wavelet-based methods have been used for temporal analysis in many different fields such as drought or price forecasting [11], passenger flow prediction [49], human gait detection [2], motion and texture classification [16]. Since wavelets contain both frequency and time domain information, they are particularly suitable to represent sparse non-stationary signals. Indeed, wavelet transforms have been successfully used to model complex data patterns in sensor networks and wearable systems for human activity recognition [45,55]. Their application to sensor modelling and forecasting for activity monitoring in long-term studies, however, is still under-investigated. Given their properties, it is particularly interesting to evaluate whether they can reliably characterise short- and long-term human activities in smart-home AAL settings.
Soulas et al. [47] proposed an Extended Episode Discovery model that defines habits in terms of length, frequency and periodicity for offline processing. In [7], the authors compare three sequential activity models – Hidden Markov Model (HMM), Conditional Random Fields (CRF) and sequential Markov Logic Network (MLN) – where feature vectors were generated during fixed-time windows for on-line processing. Although potentially useful, these sequential activity models were not used for anomaly detection in the aforementioned works.
Typically, anomaly detection systems are designed for the specific sensor(s) used. Depending on the input data, approaches may vary greatly. Wearable activity trackers like the ones used for human activity recognition [37,45], for example, provide continuous motion and pose information without requiring any additional preprocessing.
Markov Logic Networks (MLNs) are both a modelling [32] and inference [28] tool, often used for their flexibility to define rich models. They are able to perform inference using imprecise or incomplete inputs, useful to deal with sensor faults and network errors. In addition, they can blend both sensor data and expert logic rules within a probabilistic framework for robust inference in real-time applications [50]. Compared to SVM and HMM-based systems, the advantage of using MLNs for anomaly detection is that they require a smaller amount of sensor data to build their models and that they better handle uncertain information [24]. SVM have been successfully combined with deep learning techniques for anomaly detection and achieved promising results in high dimensional problems [18], but without exploiting the available temporal information.
An unsupervised anomaly detection for multivariate time series was proposed in [1] based on an adversarially-trained auto-encoder that outperformed previous unsupervised methods, such as Isolation Forests, Autoencoders, LSTM-VAE, DAGMM [62], and OmniAnomaly [48]. A variational representation of an encoder-decoder based LSTM (VLSTM) was proposed in [61] for a supervised anomaly detection in industrial big data, efficiently coping with typical issues of imbalanced high-dimensionality. However, in addition to the significant effort required for labeling, VLSTM was not benchmarked against other frameworks for anomaly detection. The authors in [57] proposed an LSTM-Autoencoder (LSTM-AE) for unsupervised anomaly detection in indoor environments, evaluated only on a single univariate dataset. Another unsupervised attention-based convolutional LSTM-based autoencoder with dynamic thresholding (ACLAE-DT) was proposed by [51] for anomaly detection and diagnosis in multivariate time series applied to smart manufacturing. ACLAE-DT showed better performance compared to other baselines, such as SVM and ARIMA (Auto-Regressive Integrated Moving Average), but it required a large amount of training data, further augmented by a pre-processing phase to capture the intercorrelation between sensor pairs. Although these unsupervised methods achieve good performance, they typically require large amounts of data in order to generalise well, which are not always available. Moreover, it remains unclear how to fine tune and incorporate additional expert knowledge into such models to fit a particular application.
A different non-deep approach for fast, online and accurate anomaly detection was proposed in [56], leveraging an energy-based extreme learning machine for online detection of anomalies in traffic signals. This framework outperformed SVM, artificial neural networks, K-NN, and Random Forest on a DDoS (distributed denial-of-service attacks) dataset. Other works [10,39,43] addressed the explainability, and therefore causes, of detected anomalies. In particular, the approach proposed by [10] achieved the best prognostic performance among several baseline methods for anomaly detection. One of the most recent works in unsupervised anomaly detection for time series data was presented by [41], comparing six deep and traditional methods. The results show that in many cases classical machine learning methods can outperform (in terms of runtime and accuracy) many deep-learning solutions across a wide range of anomaly types. Motivated by the results of this latter study, our paper proposes a non-deep architecture based on HMLN that combines wavelet-based temporal models and expert rules, mixing for the first time discrete and continuous predicates, to infer potential anomalies. These expert rules also help to overcome the lack of data otherwise required to train deep learning-based methods.
Obviously, public datasets with labelled sensor data are important to test and compare different algorithms. Unfortunately though, there are not many such datasets available in the context of smart-sensor networks, especially for AAL applications. The dataset by Tim van Kasteren2
In this section we present a novel approach to forecast sensor data for human activity monitoring using a wavelet-based temporal model. We start with a brief description of the discrete wavelet transform algorithm, and then we explain how to tune and use this algorithm for building our temporal model of the sensor data.
Wavelets provide an alternative representation that overcomes the limitations of Fourier analysis. They decompose signals into individual components, which maintain both frequency and time information. Also, they can effectively represent and provide localized information about discontinuities. These advantages (i.e. time-frequency and discontinuity representations) are very important to handle the non-periodic and often “spiky” nature of real-world sensor data, especially in the context of activity monitoring.
Discrete wavelet transform
A discrete wavelet transform (DWT) is a sampled wavelet transform applicable to digital signals. It uses two sets of weighted and displaced reference signals (wavelets
This is, using the averaging
Sensor data modelling and forecasting
After introducing the wavelet transform and its parameters, we can use them to model smart-home sensors and to forecast their data. Our model is an efficient representation of a generic temporal signal, similar to some compression techniques commonly used in image processing [20].
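The compression analogy can be illustrated with a minimal sketch. It assumes an orthonormal Haar wavelet and a keep-the-largest-coefficients rule; both are illustrative choices, not necessarily the wavelet family or selection criterion used by the model in this paper, and the function names `haar_dwt`, `haar_idwt` and `keep_largest` are hypothetical.

```python
import math


def haar_dwt(signal):
    """Full multi-level orthonormal Haar DWT.

    The signal length must be a power of two. The result is laid out as
    [approximation, coarsest detail, ..., finest details].
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = [float(x) for x in signal]
    details = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        # Detail coefficients capture local differences (discontinuities).
        det = [(a - b) / math.sqrt(2) for a, b in pairs]
        # Approximation coefficients capture local averages.
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        details = det + details  # finer levels end up at the end
    return approx + details


def haar_idwt(coeffs):
    """Inverse of haar_dwt for the same coefficient layout."""
    approx = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        det = coeffs[pos:pos + len(approx)]
        pos += len(approx)
        rebuilt = []
        for a, d in zip(approx, det):
            rebuilt.append((a + d) / math.sqrt(2))
            rebuilt.append((a - d) / math.sqrt(2))
        approx = rebuilt
    return approx


def keep_largest(coeffs, k):
    """Zero all but the k coefficients with the largest magnitude."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:k])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


# A single short activation in an otherwise silent binary sensor signal.
spike = [0, 0, 0, 0, 0, 1, 0, 0]
coeffs = haar_dwt(spike)
reconstructed = haar_idwt(keep_largest(coeffs, 4))
```

For this 8-sample signal only four coefficients are non-zero, and keeping those four reconstructs the spike exactly at its original position. This is the time localization that a purely frequency-based representation does not provide.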
Let us consider the signal
Our wavelet-based model
Once this model is available, it is possible to represent the sensor output at a future time instant
In practice, we will use the set of coefficients
Anomaly detection
Markov Logic Networks can be used to combine different sources of information for probabilistic inference. In this paper, we use both smart-home motion sensors and their wavelet-based models to analyse the difference between actual and expected entropy, respectively, of the environment. The first one represents the current activity level, whereas the second one represents the most likely one. These entropy values, together with direct sensor inputs and expert rules, provide the necessary information for our MLN to detect anomalous situations, as shown also in Fig. 1.
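The comparison between actual and expected activity levels can be sketched as follows. The sketch assumes that the normalized entropy is the Shannon entropy of the distribution of activations across the binary sensors, divided by its maximum value log N; the exact definition in [22] may differ. The function names and the 0.2 threshold are hypothetical and only serve the illustration.

```python
import math


def normalized_entropy(activation_counts):
    """Shannon entropy of sensor activations, scaled to the range [0, 1].

    activation_counts[i] is the number of activations of sensor i within
    the time window of interest (an assumed input format).
    """
    total = sum(activation_counts)
    n = len(activation_counts)
    if total == 0 or n < 2:
        return 0.0  # no activity, or a single sensor: no spread to measure
    entropy = -sum(
        (c / total) * math.log(c / total) for c in activation_counts if c > 0
    )
    return entropy / math.log(n)


def deviates(actual, expected, threshold=0.2):
    """Flag a window whose actual entropy is far from the expected one.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    return abs(actual - expected) > threshold


# Actual counts come from the sensors, expected counts from their models.
actual_level = normalized_entropy([12, 0, 1, 0])
expected_level = normalized_entropy([4, 3, 4, 2])
flag = deviates(actual_level, expected_level)
```

In the system described here this comparison is not a hard threshold test: the two entropy values enter the HMLN as continuous evidence alongside the expert rules.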
Hybrid Markov logic networks
MLNs combine both probabilistic and logical reasoning [42]. Briefly, a MLN consists of a set of weighted first-order logic formulas or clauses. The latter include the following elements:
constants, which are possible objects in the domain of interest;
variables, describing a set of objects in that domain;
functions, mapping relations between different objects;
predicates, defining logical attributes or relationships over the domain’s elements, which can be combined into more complex formulas using logical connectors.
Functions, variables and constants are called terms. If they do not contain variables, they are ground terms. A predicate that contains only ground terms is a ground predicate. When a logical value is assigned to all grounded predicates in a network, we have a possible world.
Using evidence, MLNs can produce Markov networks that describe the probability of all possible combinations of grounded clauses. We can then perform inference on these Markov networks, usually by using approximate methods such as MC-SAT [54]. Besides discrete evidence values, it is also possible to consider continuous ones using an extension called Hybrid Markov Logic Network (HMLN) [54]. Thanks to the latter, we can thus consider predicates based on continuous variables that contain our entropy values of the activity levels.
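How weighted formulas induce a probability over possible worlds can be shown on a toy example. The sketch uses the standard MLN distribution, in which the probability of a world is proportional to the exponential of the summed weights of the ground formulas it satisfies. It enumerates all worlds exhaustively instead of using MC-SAT, handles only discrete predicates (no hybrid extension), and the atoms `DoorLeftOpen` and `Anomaly`, the rule and its weight are hypothetical.

```python
import itertools
import math


def world_probabilities(atoms, formulas):
    """Enumerate every possible world and return (world, probability) pairs.

    formulas is a list of (weight, function) pairs, where each function
    takes a world (dict atom -> bool) and returns True if it is satisfied.
    """
    worlds = [
        dict(zip(atoms, values))
        for values in itertools.product([False, True], repeat=len(atoms))
    ]
    scores = [
        math.exp(sum(weight for weight, holds in formulas if holds(world)))
        for world in worlds
    ]
    partition = sum(scores)  # normalization constant Z
    return [(world, score / partition) for world, score in zip(worlds, scores)]


def conditional(distribution, query, evidence):
    """P(query is True | evidence) computed by summing over matching worlds."""
    matching = [
        (world, p)
        for world, p in distribution
        if all(world[atom] == value for atom, value in evidence.items())
    ]
    total = sum(p for _, p in matching)
    return sum(p for world, p in matching if world[query]) / total


atoms = ["DoorLeftOpen", "Anomaly"]
# Soft rule: DoorLeftOpen => Anomaly, with a hypothetical weight of 1.5.
formulas = [(1.5, lambda w: (not w["DoorLeftOpen"]) or w["Anomaly"])]
distribution = world_probabilities(atoms, formulas)
p_anomaly = conditional(distribution, "Anomaly", {"DoorLeftOpen": True})
```

With a single rule, the conditional probability reduces to the logistic function of the rule weight, so a larger weight makes the rule behave more like a hard constraint. Exhaustive enumeration grows exponentially with the number of ground atoms, which is why approximate inference is used in practice.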
Wavelet model as prior for HMLN
According to caregivers, some of the most concerning behaviours of people with cognitive impairments are related to memory loss [5], wandering and repetitive actions [15]. Memory loss may be linked with problematic behaviours, for example if the user leaves some doors open. If the user leaves the main door open, it will likely affect the house temperature and be a safety concern. Similarly, if the fridge door is left open, it will increase the energy consumption, could damage the fridge itself and spoil the food. Wandering may be another concerning behaviour, especially if it happens during resting hours, when there are no caregivers around and accidents are more likely to happen. Above all, wandering during resting hours keeps patients from resting. Our system is able to detect those situations, considered as “inappropriate behaviours”, using logical rules.
Repetitive actions are also a source of concern, as they are related to dementia. Changes in the intensity of the activity levels or in their usual times can be indicators of cognitive decline. For example, if users change their activity pattern, spending more time in the kitchen than usual or at times that are unusual for them, it may be related to dementia episodes. The wavelet-based sensor data model defined in Section 3.2 can be used to predict the expected output of a particular sensor based on historical data and, again using logical rules, to detect “Statistical anomalies”.
These predictions
We define three clauses to combine our sources of information: one to check whether the current sensor value
In our system,

An advantage of MLNs is that they can combine different logical rules. This allows us to combine these “Statistical anomalies” rules with new ones describing “inappropriate behaviours”. For AAL applications, such rules could be provided by clinicians or professional carers and adapted to the specific person being monitored. In our system these can be monitored by means of motion detector and contact sensors on doors and appliances. Their outputs determine the state of the predicate
The two types of anomaly are finally combined by the following
In conclusion, we have two groups of expert rules: the first group is related to sensors and their outputs –
The solutions described in the previous sections have been implemented in ENRICHME,4

Smart-home sensors integration in ENRICHME.
The embedded PC for data recording and processing is an Intel NUC i7-5557U CPU @ 3.10 GHz with 8 GB of RAM, running Linux OS Ubuntu 14.04 64 bits. The smart-home sensors are commercial Z-Wave wireless devices produced by the Fibar Group.6
The anomaly detection system is implemented as a Robot Operating System7
The inference service can also be triggered on demand by other ROS modules, for example giving the assistive robot the possibility to evaluate anomalies at will, or to trigger evaluations not only periodically but whenever a particular criterion is met. The proposed system could potentially be used as a “space-behavior-health” tool just by identifying the key spatial areas where anomaly detection should be evaluated.
The performance of our proposed solutions was evaluated using real data recorded from different scenarios. In this section, we will first describe two different datasets: one already presented in [22] and one newly recorded. Then, we will use them to evaluate the forecasting capabilities of our wavelet sensor model compared to other forecasting tools from the literature. That is, we will implement wavelet models for all the sensors in each dataset. Based on these wavelet models, we will calculate the expected entropy levels of the testing environments and finally demonstrate their use as priors for anomaly detection. Wavelet models are the basis of our anomaly detection system, but HMLNs allow us to blend them with expert rules derived from caregivers. So, in this section we will also study the impact of the expert rules described in Section 4.2 on our anomaly detection system. The implemented rules (‘door left open’ and ‘activity during resting hours’) are some of the most common in the literature. Finally, the performance of the anomaly detection system was also compared against some state-of-the-art unsupervised anomaly detection methods.
Sensor datasets
All the datasets were recorded using MongoDB, an open-source cross-platform document-oriented database. MongoDB is a NoSQL database that uses JSON-like documents with optional schemas. Compared to traditional log and spreadsheet files, this storage approach offers better data management and manipulation, which is particularly important for long-term datasets like ours. MongoDB can work very efficiently with time series and also provides efficient and flexible querying methods, so we can easily retrieve any data interval, sensor set, or even combine data from other sources [40]. Our datasets are recorded from commercial devices managed by OpenHAB, one of the most popular open-source software platforms for home automation, which ensures the accuracy, consistency and completeness of the stored data. The sensor datasets presented here can be seamlessly augmented with and compared to future recordings from a large range of different domotic sensor standards.
The first dataset was collected for 111 days in an office environment (L-CAS dataset [22]) including: a lounge with sofas and a coffee table; a kitchenette with various appliances and cupboards for storing and preparing food; an entrance and a workshop area. This dataset contains data from ten different physical devices, which provided six different types of sensor data readings: humidity, temperature, light, energy consumption, motion, and binary contact (for door activation). The sensors were located in five different locations, and their data recorded approximately every 30 seconds, generating more than 400,000 data entries in the final dataset.
More than ten people were working in the L-CAS premises during the recording. The sensors were mostly concentrated in places where a rich set of activities was typically performed (entering, exiting, eating, drinking, resting, etc.). The dataset is split in two parts: the first one, used for training, includes sensor data continuously recorded for three and a half months; the second one includes one week of data used for testing.

ENRICHME dataset environment.
The new dataset was recorded in the apartment of an elderly couple within the residential facilities of LACE Housing9
Table 1 summarizes the duration, locations, sensors, and other general characteristics of the recorded datasets. For comparison, the table also includes the main characteristics of two other datasets, Kasteren [52] and CASAS [12], from the related work (Section 2).
Dataset entries summary. In addition to the L-CAS and ENRICHME datasets, two other datasets from the literature are included for comparison
Comparison between predictions from Wavelet (W), FreMEn (F), ARIMA (A) and CNN-based (C) models in the L-CAS dataset
Note that, to evaluate our wavelet-based models and anomaly detection (Section 6.2 and Section 6.4), both continuous and binary sensors were used. However, only the latter were necessary for our entropy-based activity representation (Section 6.3).
We divided our datasets (see Table 1) into two folds: one for training and one for testing the prediction. We trained our models using the first part of the datasets. Once the models were ready, we evaluated them by comparing their predicted outputs with the real data from the testing part of the datasets. In the L-CAS dataset, we used the first 104 days (about three and a half months) of sensor data for training and then one week for testing. The ENRICHME dataset had a smaller number of entries, so we used three weeks for training and 10 days for testing.
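The chronological split described above can be sketched as follows. The record format (timestamp, value), the function name `chronological_split` and the start date are assumptions made for the example; only the 104-day and one-week durations come from the text.

```python
from datetime import datetime, timedelta


def chronological_split(records, train_days):
    """Split (timestamp, value) records into training and testing parts.

    The first train_days days after the earliest timestamp go to training,
    everything later goes to testing, so no future data leaks into the model.
    """
    ordered = sorted(records, key=lambda record: record[0])
    if not ordered:
        return [], []
    cutoff = ordered[0][0] + timedelta(days=train_days)
    train = [record for record in ordered if record[0] < cutoff]
    test = [record for record in ordered if record[0] >= cutoff]
    return train, test


# One synthetic reading per day over 111 days (hypothetical start date).
start = datetime(2017, 1, 1)
readings = [(start + timedelta(days=d), d % 2 == 0) for d in range(111)]
train_part, test_part = chronological_split(readings, train_days=104)
```

Splitting by time rather than at random matters for forecasting, because the models are evaluated on their ability to predict readings that occur after everything they were trained on.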
Comparison between predictions from Wavelet (W), FreMEn (F), ARIMA (A) and CNN-based (C) models in the ENRICHME dataset
RMSE (%) of predictions for continuous sensors using Wavelet (W), FreMEn (F), ARIMA (A) and CNN-based (C) models in the L-CAS dataset
RMSE (%) of predictions for continuous sensors using Wavelet (W), FreMEn (F), ARIMA (A) and CNN-based (C) models in the ENRICHME dataset
For each sensor in a dataset, we took the testing dataset timestamps and forecasted the expected sensor values at those instants using the trained models. For the binary sensors, if both testing data and model forecasting values were ‘true’, the prediction was considered as a ‘True Positive’. Similarly, if both testing data and prediction were ‘false’, it was a ‘True Negative’. Missed activations (‘true’ in the dataset but ‘false’ in the prediction) were marked as ‘False Negative’, and predicted activations that were wrong were considered ‘False Positive’. With those four parameters, we computed the statistics summarized in Tables 2 and 3. For the continuous sensors, we computed the RMSE between testing data and forecasted values, as presented in Tables 4 and 5.
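The evaluation procedure above can be sketched in a few lines. The function names are hypothetical, and the RMSE is computed here in the units of the signal; the tables report RMSE as a percentage, and the normalization used for that is not specified in this section.

```python
import math


def binary_scores(actual, predicted):
    """Precision, recall, F1 and accuracy for binary sensor predictions."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # both 'true'
    tn = sum(1 for a, p in pairs if not a and not p)  # both 'false'
    fn = sum(1 for a, p in pairs if a and not p)      # missed activation
    fp = sum(1 for a, p in pairs if not a and p)      # wrong activation
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


def rmse(actual, predicted):
    """Root mean squared error for continuous sensor predictions."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


scores = binary_scores(
    [True, True, False, False, True],
    [True, False, False, True, True],
)
error = rmse([20.0, 21.0, 22.0], [20.0, 21.0, 24.0])
```

Accuracy alone can be misleading for sensors that are rarely active, since a model that always predicts 'false' would still score highly; this is why precision, recall and F1 are reported alongside it.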
In order to evaluate the prediction quality of our wavelet sensor model, we compared it to three other forecasting tools, considering frequency-, statistics-, and neural network-based approaches. The first one, called Frequency Map Enhancement (FreMEn) [30], was originally developed for robotics applications but then applied also to smart-home sensors [13]. FreMEn is a method that models periodic changes of the environment using Fourier-based spectral analysis. It considers the probability of the environment’s state to be a function of time, represented by a (compressed) combination of harmonic components. The second forecasting tool is based on autoregressive integrated moving average models (ARIMA) [53]. These statistical models combine an autoregressive part to capture the relation between current and past observations, a moving average component that includes lagged forecasting errors in the prediction, and an integration (differencing) part to make the series stationary. Finally, the third one is based upon a fully-convolutional neural network (CNN) [3] using causal convolutions without feedback. This takes a sequence of any length and maps it to an output sequence of the same length.
To start with, Table 2 presents some statistics of the predictions in the L-CAS dataset. For all the considered metrics, we can see that our new wavelet model outperforms the others. In particular, the wavelet model performs much better in terms of accuracy. Table 3 presents the corresponding results on the ENRICHME dataset. In this case, the precision of the FreMEn and CNN models is slightly higher than that of our wavelet model, probably due to the periodic nature of the activities in the considered scenario. FreMEn indeed captures all the most relevant frequency components, so the predicted activations can be very precise (i.e. a high proportion of the predicted activations are true positives). However, for the recall, which considers the correct predictions over the total number of real activations, we can observe a significant improvement of the wavelet models compared to the CNN-based and FreMEn models, since the latter are not able to predict some of the sensor activations. This improvement is further confirmed by the F1 score and the accuracy, also shown in the same table.
Continuous signals such as those from light and temperature sensors are also properly modelled by our wavelet models. The average RMSE of the prediction is the lowest for the light sensors in the L-CAS dataset (see Table 4). On only one occasion, i.e. for the entry sensor, the value predicted by the FreMEn model is slightly better than ours, most likely due to its significant periodic component. The performance of wavelet models is particularly relevant in this case. The accuracy of the model is key for our anomaly detection system, as it will define how different rules are triggered. The temperature predictions are very similar for the wavelet and the CNN-based models, although the latter is more accurate for the entry and the lounge sensors, probably thanks to the large amount of training data available in this dataset. The ARIMA models perform reasonably well but, as expected, not as well as the other two.
The results for the ENRICHME dataset (see Table 5) show a similar situation. The wavelet models are generally better than the FreMEn and the CNN-based ones, although the difference to the latter is minimal. Note also that all the considered models perform better on this dataset than on the L-CAS dataset. In fact, the ENRICHME dataset captures the interactions of only two persons in a small domestic environment, while the L-CAS dataset was recorded in a larger and more crowded office environment.
Measures of similarity between real and predicted entropies of human activity in the L-CAS and ENRICHME datasets using Wavelet (W), FreMEn (F), ARIMA (A) and CNN-based (C) models
In the following sub-sections we illustrate the performance of our system to represent human activities using the normalized entropy method from [22] and comparing the expected levels of activity to the actual ones.
Real vs. predicted entropies
We compared the entropies of human activity predicted by our wavelet model with the actual ones computed on both datasets. We used three popular metrics to measure the statistical similarity between these two entropies: RMSE, correlation coefficient, and explained variance.
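The three similarity measures can be sketched as follows. The sketch assumes the usual definitions (Pearson correlation, and explained variance as one minus the ratio between the variance of the residuals and the variance of the real signal); the paper does not spell out its formulas in this section, and the function names are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _variance(values):
    centre = _mean(values)
    return sum((v - centre) ** 2 for v in values) / len(values)


def rmse(real, predicted):
    """Root mean squared error between real and predicted entropies."""
    return math.sqrt(_mean([(r - p) ** 2 for r, p in zip(real, predicted)]))


def correlation(real, predicted):
    """Pearson correlation coefficient."""
    mean_r, mean_p = _mean(real), _mean(predicted)
    covariance = _mean([(r - mean_r) * (p - mean_p) for r, p in zip(real, predicted)])
    return covariance / math.sqrt(_variance(real) * _variance(predicted))


def explained_variance(real, predicted):
    """1 - Var(real - predicted) / Var(real)."""
    residuals = [r - p for r, p in zip(real, predicted)]
    return 1.0 - _variance(residuals) / _variance(real)


# A prediction with the right shape but a constant offset of 0.1.
real_entropy = [0.1, 0.4, 0.8, 0.5]
predicted_entropy = [0.2, 0.5, 0.9, 0.6]
summary = (
    rmse(real_entropy, predicted_entropy),
    correlation(real_entropy, predicted_entropy),
    explained_variance(real_entropy, predicted_entropy),
)
```

The example shows why the metrics are complementary: a constant offset leaves the correlation and the explained variance at their maximum, while the RMSE still reports the 0.1 bias.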
Table 6 illustrates the good performance of our solution in predicting the entropy of human activities, showing better results than the other approaches. We can also see that the entropy predicted by our wavelet model is mostly improved for the ENRICHME dataset compared to the L-CAS dataset (i.e. lower RMSE; higher correlation and explained variance). Only the RMSE is slightly better for the CNN-based models in the ENRICHME dataset. However, for both datasets, our results confirm that real and predicted entropies are reasonably similar and, therefore, the wavelet-based model is suitable to forecast the level of activity in the environment.
Examples of activity forecasting
As explained in [22], human activities can be represented by the normalized entropy of the environment. Figure 5 illustrates two examples of such entropy calculated from the real sensors and predicted by our wavelet-based model. In particular, the red graph shows the real normalized house entropy (as percentage) based on the available sensor setups. The blue graph is the predicted entropy at the same time, using the wavelet models of our sensors. It also shows the entropy prediction based on FreMEn (green), ARIMA (yellow) and CNN-based (pink) models. We can see the limitations of FreMEn, which is not capable of capturing the complex frequency behaviours of the abrupt slope in Fig. 5a or the low/high combination in Fig. 5b. The ARIMA model suffers from similar frequency limitations, while the CNN-based model shows a better performance, closer to our wavelet solution.
In more detail, Fig. 5a is based on the ENRICHME dataset, collected in the relatively quiet apartment of an elderly couple. The figure refers to a typical morning of the two residents. The predicted entropy of their activities differs from the real one by less than 10%, with only two significant exceptions: in the morning, at around 10:00, the activity level was higher than expected (about 20% error between real and predicted entropies); a little later, around 11:30, the real activity entropy decreased sharply a few minutes after the usual time (still about 20% error). These differences between real and predicted data, however, are understandable under normal variations of the residents’ schedule, which cannot be predicted by our model. It is worth noticing that our solution is able to predict a very sharp transition, where the activity entropy goes from high to no activity at all. The other models are less responsive, which highlights the improved capability of our system to consider high-frequency elements thanks to its wavelet-based model.
Figure 5b refers instead to the activity of a non-typical Friday afternoon in the L-CAS offices. The real entropy (red) shows that it was a particularly busy day, with a high activity level for most of the time. However, a significant decrease of the entropy between 18:00 and 19:00, when most of the researchers left the office, is followed by another increase between 19:00 and 20:00, when some people came back. The activity then remained relatively high for the rest of the evening, which was unusual. The entropy prediction (blue) is able to capture several important trends of the activity levels, including a few small negative peaks between 17:00 and 18:00, probably due to some researchers leaving the office, and the sharp decrease around 18:00, when most of them left. Our model also captures some of the evening activities and the increase in entropy between 19:00 and 20:00. Although after this time there is a significant difference between real and predicted entropies, due to the unusual presence of people on a Friday night, the general trends of the activity levels are correctly captured by our prediction system.

Examples of real vs. predicted entropies computed on the ENRICHME (a) and L-CAS (b) datasets during a 5-hour interval. The first row of plots, in red, depicts real entropies. The second row, in blue, is the wavelet-based prediction. The third row, in green, shows the FreMEn forecast. The fourth row, in yellow, shows the ARIMA estimation and the last row, in pink, shows the CNN-based model output.
In this final set of experiments we compare our HMLN for anomaly detection (Sections 4 and 5), which integrates wavelet and entropy-based activity priors, to other existing approaches. Abnormal activity detection ensures timely intervention and helps mitigate more significant problems, such as health incidents among elderly people and food damage. To this end, emergency assistance methods are triggered when sensor readings are classified as abnormal. Common emergency methods include, among others, alarm systems, email notifications, and SMS notifications. In this work, anomalies are reported using ROS services both to the assistive robot [14] and to the networked care system [4]. This means a double approach is taken for emergency assistance: an immediate robot interaction is requested, while common emergency methods are handled by the networked care system.
The normalized entropy computed by our system can indeed be used as a time series for unsupervised anomaly detection. In particular, we evaluate our HMLN-based anomaly detector against five state-of-the-art unsupervised methods from existing frameworks [6,19,33,59,60], which are the following:
Gaussian1D
LOF – This method relies on local deviations of the density of a given sample with respect to its neighbors. It is local in the sense that the anomaly score depends on how isolated the object is from the surrounding neighborhood [19].
COPOD
LSCP – Another unsupervised method that defines local regions around test instances using the consensus of its nearest neighbors in randomly selected feature subspaces. The top-performing base detectors in these local regions are selected and combined as the model’s final output for detection [59].
β-VAE – A variational autoencoder which uses a modification to the training regime to improve the learning of disentangled representations [6].
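As an illustration of how a one-dimensional entropy series can be screened without labels, the sketch below flags samples that lie far from the mean under a single Gaussian fit. It is only an assumption about what a simple one-dimensional Gaussian detector does; it is not the implementation from the cited frameworks, and the function name and the threshold `k` are hypothetical.

```python
import statistics

def gaussian_1d_anomalies(series, k=3.0):
    """Return indices of samples more than k standard deviations from the mean.

    series: sequence of normalized entropy values (illustrative input).
    k: hypothetical threshold, expressed in standard deviations.
    """
    mu = statistics.fmean(series)
    sigma = statistics.pstdev(series)
    # A constant series has no spread, so nothing can be flagged.
    if sigma == 0:
        return []
    return [i for i, x in enumerate(series) if abs(x - mu) > k * sigma]
```

For example, a series of twenty values equal to 0.5 followed by a single value of 0.95 yields one detection, at index 20. Density-based methods such as LOF differ in that the score depends on the local neighbourhood of each sample rather than on a single global fit.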
Percentage of anomalies detected by a particular method (row) that are also detected by another one (column) in the L-CAS dataset
Percentage of anomalies detected by a particular method (row) that are also detected by another one (column) in the ENRICHME dataset

F1 score comparison for the considered methods using our HMLN anomaly detector with (a) and without (b) expert rules.
To compare our HMLN to the above methods, we account for the number of anomalies that each detector has in common with the others. The results are summarized in Tables 7 and 8 for the L-CAS and ENRICHME datasets, respectively. For a fair comparison, the tables also include a variant of our method (HMLN*) that does not implement any expert rule, but considers only statistical anomalies based on activity entropy. We can see that all the anomalies reported by the HMLN* with no rules are also reported by the original HMLN, but not the opposite, as expected.
The results show that our HMLN approach shares a significant number of detections with the other five methods. In particular, our solutions enable a balanced distribution of detections that captures a reasonable number of anomalies from the other detectors. It is also interesting to note how the HMLN detectors behave similarly to the LOF and the LSCP detectors for the L-CAS dataset (Table 7), whereas for the ENRICHME dataset (Table 8) the HMLN detectors are closer to COPOD and β-VAE. This difference may come from the fact that each dataset captures a very different scenario: the first one is a research office, while the second is an elderly home. The type of anomalies in each dataset is different, and thus they are detected by different algorithms, although the HMLN detectors are still able to perform reasonably well in both cases.
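The pairwise comparison reported in the tables can be sketched as follows. This is a minimal illustration, assuming that each detector's output is available as a set of flagged time indices; the function name and the input format are hypothetical.

```python
def overlap_percentages(detections):
    """Percentage of each row detector's anomalies also found by each column detector.

    detections: dict mapping a detector name to the set of time indices it flagged
    (hypothetical input format, assumed for illustration).
    """
    table = {}
    for row_name, row_set in detections.items():
        table[row_name] = {}
        for col_name, col_set in detections.items():
            if not row_set:
                # A detector with no detections shares nothing with the others.
                table[row_name][col_name] = 0.0
            else:
                shared = len(row_set & col_set)
                table[row_name][col_name] = 100.0 * shared / len(row_set)
    return table
```

The resulting table is not symmetric. With toy sets such as `{"HMLN": {1, 2, 3, 4}, "HMLN*": {1, 2}}`, the HMLN* row shows 100% in the HMLN column, while the HMLN row shows only 50% in the HMLN* column, which mirrors the subset relation between the two variants.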
To identify the best of these detection systems in the absence of a consistent and reliable annotation of true anomalies, we used the method proposed by Lamiroy & Sun [31] to estimate precision and recall, and from these compute the F1 score. Although not accurate in absolute terms, this approach has been shown to be useful for ranking different binary classifiers in the absence of ground truth.
Lamiroy & Sun’s method was applied to our datasets to obtain the results summarised in Fig. 6. In particular, we used F1 estimations to rank different detectors in the absence of ground truth. Note that the ranking is the same in Fig. 6a (with expert rules) and Fig. 6b (without), although the F1 scores obviously differ in terms of numerical values.
Also, although in general our HMLN detectors are the best performing, we should mention that the β-VAE encoder ranks slightly above them in the L-CAS dataset. This could be due to the challenging nature of the dataset itself, which contains many more people, sensors, and time entries. As already seen in previous results, in this case it is more difficult to identify periodic patterns of sensor data, and hence anomalies. Therefore, a neural network-based detector might be more suitable to model such complex scenarios, provided sufficient training data (3 months in the L-CAS case) is available.
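The final ranking step can be sketched as follows. The code only shows how an F1 score is derived from a precision and a recall estimate and how detectors are then ordered; it does not implement the estimation procedure of [31], and the function names and the input format are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def rank_by_f1(estimates):
    """Sort detectors by decreasing F1.

    estimates: dict mapping a detector name to an (estimated precision,
    estimated recall) pair, e.g. as produced by a ground-truth-free estimator.
    """
    scores = {name: f1_score(p, r) for name, (p, r) in estimates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Because the estimates are not accurate in absolute terms, only the order of the returned list should be interpreted, not the individual scores.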
This paper presented a new approach for wavelet-based temporal modelling of smart sensors, which we used to forecast levels of human activity in dynamic and robot-assisted environments. We also proposed an original application of HMLNs combining real and predicted entropies of human activity with expert rules to detect potential anomalies. Our solutions have been evaluated using two large public datasets, one of which was newly collected from a real elderly home, to demonstrate their effectiveness.
We have compared our wavelet-based activity models and anomaly detection to several state-of-the-art solutions, showing improved overall performance. In particular, the wavelet model performed better than three other popular forecasting approaches, while the anomaly detection system has proved to be preferable when compared to alternative unsupervised methods. It is also interesting to note that our solution can be a valid alternative to neural network-based approaches in scenarios with a limited amount of data, like the ones considered here.
Although the proposed wavelet temporal model can be applied to any arbitrary signal, our current implementation focused mostly on static binary sensors, partly because of the subsequent entropy-based representation of human activities. It remains to be studied how analog smart sensors (e.g. light, temperature) and mobile sensors (e.g. from the robot) could be integrated and exploited by our system.
Finally, despite the flexibility of HMLNs, there are still limitations in the way logic rules are formulated and their weights learned, which requires particular attention and fine tuning to guarantee the convergence of the training process. Also, the time required by the latter grows exponentially with the number and complexity of the rules, which can be a problem if a richer spectrum of human activities and sensor data is considered. In our case, indeed, the MLN’s rules are relatively simple to guarantee the real-time performance of our system, an aspect that we analysed in previous work [21]. Possible alternatives combining deep neural networks and symbolic representations, like Logic Tensor Networks [44] and ABBA-LSTM [17], could potentially overcome some of these problems and enable more powerful inference systems for anomaly detection. In addition to the above-mentioned integration of analog and mobile sensors, future work will therefore be devoted to more recent and promising neuro-symbolic architectures [36], with special attention to the constraints posed by data availability and real-time requirements.
Conflict of interest
The authors have no conflict of interest to report.
