Data driven recognition of interleaved and concurrent human activities with nonlinear characteristics

Abstract

Sensor-based human activity recognition has gained a lot of research interest within the field of pervasive computing due to its wide range of application domains. Recognition of complex human activities is a challenging task because humans tend to perform activities in an interleaved and concurrent manner. In this paper, we address the problem of complex activity recognition using a combination of discriminative features called Strong Jumping Emerging Patterns (SJEPs) and fuzzy set theory. The proposed approach is designed to meet the challenges of multi-label classification, nonlinear separation, and recognition of multiple overlaps of interleaved and concurrent activities. In addition, because a training dataset of complex activities is difficult to obtain, the proposed approach uses a training dataset of simple activities to extract two sets of SJEPs for linear and nonlinear activities. Then, a novel SJEP-based recognition approach is presented to recognize simple and complex activities. We evaluate our approach using two datasets collected from two different labs. Experimental results show the efficiency of our approach in recognizing simple and complex human activities, as well as its superiority over competing approaches with regard to recognition accuracy.

Keywords

Complex activities recognition, emerging pattern, multi-label classification, nonlinear separation, fuzzy sets

1 Introduction

One of the promising research themes in pervasive computing is Human Activity Recognition (HAR), which is concerned with monitoring users and their surrounding environment using computing devices, and inferring the performed user activities from the sensor events triggered by the user [1]. The earliest research efforts in HAR were vision-based. In the last few decades, due to the rapid development of sensors and mobile devices, recognizing human activities based on sensors has drawn a lot of research interest. Sensor-based HAR [2] operates by collecting data from sensors attached to the user or installed in the environment to measure user movement, environment variables, or physiological signals [1]. Research in sensor-based HAR is motivated by its wide range of medical, military, and security applications [3]. A common application is monitoring activities of daily life (ADL) for eldercare, healthcare, and assistive living.

Human activities can be categorized along two different dimensions: time span and execution pattern. With regard to the time span, there exist two categories, action and activity [4]. An action refers to any simple indivisible operation, which spans a short time and corresponds to a single event triggered by a sensor, such as holding a cup. An activity spans a longer time and consists of a set of actions (i.e. sensor events), such as cooking. Regarding the execution pattern, a single user can perform activities in two different scenarios, a simple (i.e. sequential) or a complex (i.e. overlapping) scenario [5]. A simple activity is performed by a single user when no other activities are performed simultaneously. In complex activities, a single user performs multiple activities in a concurrent or interleaved fashion. In concurrent activities, actions of multiple activities are carried out simultaneously. Interleaved activities contain actions of various activities carried out in an interwoven (i.e. shuffled) manner. Figure 1 illustrates examples of different types of human activities.

Fig. 1

Different types of activities with Examples.

It is human nature to perform activities in a random overlapping pattern rather than in a sequential pattern. This makes complex activity recognition a more realistic target. From a machine learning perspective, complex activity recognition is a multi-label classification problem. Existing data-driven approaches for complex activity recognition have some limitations. First, they require a training dataset of complex activities in order to recognize complex activities, which is difficult to obtain (for m activities, there exist m (m - 1) different patterns of complex activities). Second, existing data-driven approaches [5–9] assume that activities are linearly separated. In fact, some activities are nonlinearly separated and need to be recognized as such. For example, consider two activities, ‘making coffee’ and ‘preparing breakfast’. The sensor events in the first activity are completely contained within the second activity, and thus a methodology is required to detect both activities from the same set of sensor events. Finally, existing approaches such as [10] can recognize at most two interleaved or concurrent activities, which does not capture the general case of having multiple activities interleaved or executed in parallel.

In this paper, we present a novel SJEP (Strong Jumping Emerging Pattern)-based approach for the recognition of simple and complex activities with linear and nonlinear characteristics. This is accomplished using discriminative features (i.e. SJEPs) that represent the unique sensor features that are exclusively present in each activity class. The proposed approach is composed of two main phases. In the first phase, we use a hierarchical pattern-mining approach to discover SJEPs for linear and nonlinear activities from a training dataset of only simple activities. Moreover, we extract two other features: an activity weight vector, and a correlation matrix for sensor/activity interdependency. In the second phase, our recognition method uses the extracted SJEPs and features, together with fuzzy set theory, to recognize simple and complex activities.

The main contributions of this paper are:

We propose a unified framework for multi-label classification of complex activities using a training dataset of only simple activities.

We apply a hierarchical pattern-mining approach to identify SJEPs for both linear and nonlinear activities.

We generalize the recognition of complex activities to recognize more than two interleaved or concurrent activities.

We validate the efficiency of the proposed method on different datasets and perform comparison with existing research attempts. The experimental results demonstrate superiority of our proposed method.

We organize the remainder of the paper as follows: in section 2, we review the previous research attempts for complex activities recognition. Then, the problem of complex HAR is formulated in section 3. In section 4, we review preliminaries on the pattern mining approach. Then, we present our proposed approach for simple and complex activities recognition in section 5. In section 6, the experimental evaluation and comparisons are illustrated and discussed. Finally, we conclude our work and discuss our plans for future work in section 7.

2 Related works

Research on Human activity recognition generally addresses two main dimensions, activity monitoring devices, and activity recognition techniques. In this section, we review HAR from these dimensions.

Regarding the type of monitoring devices used, HAR is classified into two categories, vision-based HAR and sensor-based HAR. Vision-based methods, the earliest in HAR, use cameras for activity monitoring and apply computer vision techniques for the activity recognition process. In [11], the authors review some of the most important work in vision-based HAR. On the other hand, sensor-based methods use sensors to monitor human activity. These sensors may be wearable, such as accelerometers, or dense sensors distributed in the environment, such as motion sensors. An important survey of sensor-based HAR is presented in [1]. Each monitoring method has its own advantages, disadvantages, and typical applications. Sensor-based methods have gained a lot of research attention because of the rapid development of lightweight wearable and environmental sensors with low computational requirements. This has led to mature applications such as assistive living, healthcare monitoring, sports, and surveillance. In this paper, we are interested in sensor-based HAR.

Existing techniques for sensor-based HAR are divided into two main approaches, knowledge-driven and data-driven [12]. In knowledge-driven approaches, experts capture domain knowledge to construct a model, and then use that model to identify input sensor data. Ontology modeling [13] is one of the main techniques used to represent knowledge models [14–16]. Although knowledge-driven approaches have clear semantics, they require rich domain knowledge to represent activities. Obtaining this knowledge depends primarily on the expertise of the experts. In addition, they are weak in dealing with uncertainty and temporal relationships between data. On the other hand, data-driven approaches use pre-existing datasets to construct an activity model using machine-learning techniques. Then the constructed model is used to identify new sensor data. The advantages of data-driven approaches include their ability to deal with uncertainty using well-established machine learning techniques, and their use of temporal information to capture short-term and long-term temporal dependencies [15]. However, they require large amounts of data to learn the activity models and have problems in reusability and applicability. In this paper, we are interested in data-driven approaches. Next, we review the most important data-driven techniques proposed for complex HAR.

For complex HAR, there are three prominent families of data-driven approaches: those based on probabilistic models, those based on time-series analysis, and those based on pattern mining. Most of the existing data-driven approaches are probabilistic models such as Conditional Random Fields (CRF), Hidden Markov Model (HMM), and Bayes Network (BN) [17]. CRF cannot recognize concurrent activities, so some variants of CRF were proposed such as Factorial CRF (FCRF) [6] and Skip Chain CRF (SCCRF) [7, 8]. HMM is limited in the recognition of interleaved activities. To solve this problem, [18] presented an Interleaved Hidden Markov Model (IHMM). Using the BN, the authors in [9] presented an Interval Temporal Bayesian Network (ITBN) that combines a probabilistic Bayesian network with Allen’s temporal relations. These graphical models are based on time points, and can capture only three relations (i.e., precedes, follows, equals). Therefore, they cannot characterize rich temporal characteristics of activities. In addition, their computational complexity increases with the number of overlapping activities. Regarding time-series based approaches, an attempt is presented in [5]. In this work, the authors extract discriminative subsequences of time series called shapelets to identify activities. However, extracting shapelets is computationally expensive.

Finally, the pattern mining-based approaches extract a set of discriminative patterns for each activity class, and then output the activity label(s) with the highest likelihood score. In [10] and [19], the authors make use of Emerging Patterns (EPs). In [10], the authors proposed a framework for recognizing simple and complex activities, together with a dynamic trace segmentation algorithm. This approach has some shortcomings: it uses separate recognition modules for simple and complex activities and does not specify which module to select for incoming data. Moreover, it can handle only two concurrent or interleaved activities. Finally, the applied recursive segmentation approach is not robust; wrong recognition of a specific segment affects the recognition and segmentation of subsequent segments. In [19], the authors converted the problem of complex activities recognition into the problem of recognizing multiple simple activities. They proposed a dynamic segmentation algorithm that splits the incoming stream of sensor readings into multiple simple activities. Then, for each obtained activity segment, they combined random forests and EPs to identify its activity label. Although this approach simplifies the problem of complex HAR, it has some drawbacks. First, the authors place a great responsibility on the segmentation method; a prediction error in one segment affects the recognition and segmentation of subsequent segments. Second, time and computational complexity increase, because recognizing n simple activities requires training and testing n random forests.

In this paper, we propose an EP-based approach that is somewhat similar to [10] and [19]. However, we discover a discriminative type of EPs called SJEPs that represent the unique features that are present exclusively in each class of activities. Our approach differs from theirs in its ability to handle both linear and nonlinear activities. In addition, we can recognize more than two overlapping (i.e. interleaved or concurrent) activities.

3 Problem formulation

For the task of sensor-based human activity recognition, wearable or environmental sensors monitor human behavior over time. When a person performs a specific activity, a sequence of sensor events is triggered and then sent to an integrating device for further processing.

To recognize complex human activities, we utilize a training dataset D that consists of q observations o of m simple activities (Equation (1)). In dataset D, attributes represent sensor readings and observations represent activities. An observation o = < e₁, e₂, …, e_n > is comprised of a subsequence of sensor events e from n different sensors. Each event is a tuple of four elements e = < ts, sn, sv, a_i >, where ts is the time stamp, sn the sensor name, sv the sensor value, and a_i is the corresponding activity label from a predefined set of labels A = {a₁, a₂, …, a_m}.

Our objective in the training phase is to define a mapping function MF that assigns each new observation of sensor readings, for simple or complex activities, the correct activity label(s) (Equation (2)).

$D = \{o_{1}, o_{2}, \dots, o_{q}\}$ (1)

$MF (o) = \{a_{i}\}, \text{ where } | \{a_{i}\} | \geq 1$ (2)

4 Preliminaries

Before going in depth to the proposed approach, in this section we introduce the definitions required in our approach.

Itemset: For a training dataset D collected by d different sensors {s₁, s₂, … s_d}, an item i refers to a pair of sensor name and its corresponding value. A set of items from the set {i₁, i₂, … i_r} forms an itemset I.

The support: For an itemset X in dataset D, the support of X in D, sup (X, D) = (freq (X))/(|D|), where freq (X) is the frequency of X in D.

The Growth Rate: Given two classes of activities D₁ and D₂, the Growth Rate (GR) of an itemset X from class D₁ (i.e. contrasting class) to class D₂ (i.e. target class) measures its support in D₂ relative to its support in D₁ as follows: $GR (X, D_{1}, D_{2}) = \begin{cases} 0 & \text{if } \sup (X, D_{1}) = 0 \text{ and } \sup (X, D_{2}) = 0 \\ \infty & \text{if } \sup (X, D_{1}) = 0 \text{ and } \sup (X, D_{2}) > 0 \\ \sup (X, D_{2}) / \sup (X, D_{1}) & \text{otherwise} \end{cases}$
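The support and growth rate definitions above can be illustrated with a minimal Python sketch. The representation of an observation as a frozenset of (sensor name, sensor value) pairs, the function names, and the toy data are assumptions made for illustration only; they are not taken from the original implementation.

```python
import math
from typing import FrozenSet, List, Tuple

Item = Tuple[str, str]        # assumed encoding: (sensor name, sensor value)
Itemset = FrozenSet[Item]


def support(x: Itemset, d: List[Itemset]) -> float:
    """sup(X, D): fraction of observations in D that contain every item of X."""
    if not d:
        return 0.0
    return sum(1 for obs in d if x <= obs) / len(d)


def growth_rate(x: Itemset, d1: List[Itemset], d2: List[Itemset]) -> float:
    """GR(X, D1, D2) from contrasting class D1 to target class D2."""
    s1, s2 = support(x, d1), support(x, d2)
    if s1 == 0:
        return 0.0 if s2 == 0 else math.inf
    return s2 / s1


# Hypothetical toy classes: 'making coffee' (target) and 'watching TV' (contrast).
coffee = [
    frozenset({("kettle", "ON"), ("cup", "ON")}),
    frozenset({("kettle", "ON"), ("cup", "ON"), ("fridge", "ON")}),
]
tv = [
    frozenset({("tv", "ON")}),
    frozenset({("tv", "ON"), ("cup", "ON")}),
]

x_cup = frozenset({("cup", "ON")})
x_kettle = frozenset({("kettle", "ON")})

gr_cup = growth_rate(x_cup, tv, coffee)        # 1.0 / 0.5 = 2.0
gr_kettle = growth_rate(x_kettle, tv, coffee)  # never seen in tv, so infinite
```

In this toy data, the kettle item never occurs in the contrasting class, so its growth rate is infinite, which is the condition the jumping emerging patterns defined below rely on.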

The Emerging Pattern (EP) is an important type of discriminative pattern [20]. For HAR, EPs represent the discriminative features between different classes of activities. There are different types of emerging patterns: Minimal Emerging Patterns (MinEPs), Maximal Emerging Patterns (MaxEPs), Jumping Emerging Patterns (JEPs), Strong Jumping Emerging Patterns (SJEPs), Fuzzy Emerging Patterns (FEPs), and Noise-tolerant Emerging Patterns (NEPs), as depicted in Fig. 2. Each class of EPs has its own characteristics and usage; a detailed discussion of these classes is presented in [21].

Fig. 2

Relationships among different types of Emerging Patterns [20].

Consider a pattern X and two different activity classes D₁ and D₂. We have the following definitions.

Emerging Pattern (EP): EP (X, D₁, D₂) = {X : sup (X, D₂) > μ, and GR (X, D₁, D₂) ≥ ρ}, where μ is a minimum support threshold and ρ is a minimum growth rate threshold.

Minimal Emerging Pattern (MinEP): MinEP (X, D₁, D₂) = {X : GR (X, D₁, D₂) ≥ ρ, and ∄ Y|Y ⊂ X and GR (Y, D₁, D₂) ≥ ρ}.

Jumping Emerging Pattern (JEP): JEP (X, D₁, D₂) = {X : GR (X, D₁, D₂) = ∞}

Strong Jumping Emerging Pattern (SJEP): SJEP (X, D₁, D₂) = {X : GR (X, D₁, D₂) = ∞ , and ∄ Y ⊂ X : GR (Y, D₁, D₂) = ∞}

In this paper, we are interested in a specific type of EPs, the SJEPs, which incorporate the properties of JEPs and MinEPs. SJEPs achieve absolute separation and represent the unique features that are exclusively present in a single class.
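To make the SJEP definition concrete, the following Python sketch enumerates minimal itemsets that occur in a target class and never in a contrasting class. It is a brute-force illustration, not the mining algorithm used in this paper; the function names, the string encoding of items, the `max_len` cap, and the toy data are all assumptions.

```python
from itertools import combinations
from typing import FrozenSet, List

Itemset = FrozenSet[str]   # assumed encoding: items as "sensor=value" strings


def occurs(x: Itemset, d: List[Itemset]) -> bool:
    """True if at least one observation in d contains every item of x."""
    return any(x <= obs for obs in d)


def mine_sjeps(target: List[Itemset], contrast: List[Itemset],
               max_len: int = 3) -> List[Itemset]:
    """Brute force: minimal itemsets with support > 0 in target and 0 in contrast."""
    items = sorted(set().union(*target))
    found: List[Itemset] = []
    for size in range(1, max_len + 1):          # smaller patterns first
        for combo in combinations(items, size):
            x = frozenset(combo)
            # A subset of x is already a JEP, so x is not minimal.
            if any(y <= x for y in found):
                continue
            if occurs(x, target) and not occurs(x, contrast):
                found.append(x)                 # GR = infinity and minimal
    return found


# Hypothetical toy classes.
coffee = [
    frozenset({"kettle=ON", "cup=ON"}),
    frozenset({"kettle=ON", "cup=ON", "fridge=ON"}),
]
others = [
    frozenset({"tv=ON"}),
    frozenset({"tv=ON", "cup=ON"}),
    frozenset({"fridge=ON"}),
]

sjeps = mine_sjeps(coffee, others)
# -> [{"kettle=ON"}, {"cup=ON", "fridge=ON"}]
```

Because candidates are visited in order of increasing size, any superset of an already accepted pattern can be skipped, which is exactly the minimality condition in the SJEP definition. The exhaustive enumeration is exponential in the number of items and is only meant for small examples.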

5 The proposed approach

The proposed SJEP-based approach for the recognition of simple and complex human activities is comprised of two main phases, training, and testing as shown in Fig. 3. The input to the training phase is a set of labeled sequences for simple human activities. At the end of this phase, we get three outputs: the extracted SJEPs, an activity weight vector, and sensor/activity correlation matrix. Then, in the testing phase, the incoming stream of sensor readings is segmented, and then each segment is assigned the corresponding activity label(s) using the constructed recognition model. We discuss these phases in detail in the following sections.

Fig. 3

The proposed SJEP-based framework architecture.

5.1 The training phase

Given a labeled dataset of observation sequences for different classes of simple activities, we perform the following steps.

5.1.1 Mining of strong jumping emerging patterns

By their nature, SJEPs achieve absolute class separation (i.e. they completely separate one class from the remaining classes). However, there are some cases where activities are not linearly separable. In order to deal with linearly and nonlinearly separated activities, we extract two classes of SJEPs, One-Versus-All SJEPs (OVA-SJEPs) and N-Versus-All SJEPs (NVA-SJEPs), for linear and nonlinear activities respectively. We present a hierarchical SJEP mining approach that first searches for OVA-SJEPs for linearly separated activities. Then, for nonlinearly separated activities, it discovers NVA-SJEPs (N refers to the number of nonlinear activities). The SJEP mining problem is a multi-objective optimization problem that can be described as follows:

Input parameters:

m: number of simple activities.

$D_{a_{i}}$ : preprocessed observation sequences for activity class a_i, for i = 1, …, m

$D_{a_{i}}^{'}$ : preprocessed observation sequences from the remaining m - 1 activity classes, such that $D_{a_{i}}^{'} = \bigcup_{j = 1, j \neq i}^{m} D_{a_{j}}$

Output parameters:

OVA-SJEPs for linearly separated activities.

NLSA: The set of nonlinearly separated activities.

Decision variables:

For each activity class a_i, we use the FP-growth algorithm [22] to find the set of all possible itemset combinations for events/items in $D_{a_{i}}$, referred to as X. Then, for each candidate x ∈ X, we compute the following variables: $\begin{matrix} Sup (x, D_{a_{i}}) = freq (x, D_{a_{i}}) / | D_{a_{i}} | \\ Sup (x, D_{a_{i}}^{'}) = freq (x, D_{a_{i}}^{'}) / | D_{a_{i}}^{'} | \end{matrix}$

Objectives:

Finding the OVA-SJEPs

Find a pattern x ∈ X that: $\max (Sup (x, D_{a_{i}}))$

s.t. $GR (x, D_{a_{i}}^{'}, D_{a_{i}}) = \infty \text{ and } \nexists \, y \subset x \text{ such that } GR (y, D_{a_{i}}^{'}, D_{a_{i}}) = \infty$

Finding the set NLSA of nonlinear separated activities such that |NLSA| < m

Find: $\arg\max_{a_{i}} (Sup (x, D_{a_{i}}^{'}))$

s.t. $GR (x, D_{a_{i}}^{'}, D_{a_{i}}) \neq \infty$

After obtaining the OVA-SJEPs and the set of nonlinear separated activities NLSA, the previously defined mathematical model is recursively repeated to find the NVA-SJEPs for the set of nonlinear activities in NLSA against the remaining activities.

A graphical illustration of the difference between OVA-SJEPs and NVA-SJEPs is presented in Fig. 4. Consider five activity classes {c₁, c₂, c₃, c₄, c₅}, such that activities c₁, c₂, c₃ can be separated linearly, but activities c₄, c₅ are nonlinearly separated (i.e. c₅ is completely contained within c₄). In such case, we extract OVA-SJEPs for c₁, c₂, c₃ as illustrated in Figs. 4 a, b, c, and NVA-SJEPs for c₄, c₅ in Fig. 4d.

Fig. 4

Graphical Illustration of the difference between OVA-SJEPs and NVA-SJEPs.
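The two-level mining procedure can be sketched in Python as follows. This is one plausible reading under stated assumptions, not the authors' implementation: an activity is treated as nonlinearly separated when no OVA-SJEP is found for it (the criterion in the formulation above may be broader), NVA-SJEPs are mined for the group of such activities against all remaining activities, and a brute-force search stands in for FP-growth. All names and the toy data are hypothetical.

```python
from itertools import combinations


def mine_sjeps(target, contrast, max_len=3):
    """Brute-force stand-in for the miner: minimal itemsets seen only in target."""
    items = sorted(set().union(*target))
    found = []
    for size in range(1, max_len + 1):
        for combo in combinations(items, size):
            x = frozenset(combo)
            if any(y <= x for y in found):
                continue                          # not minimal
            in_target = any(x <= obs for obs in target)
            in_contrast = any(x <= obs for obs in contrast)
            if in_target and not in_contrast:
                found.append(x)
    return found


def hierarchical_mining(data, max_len=3):
    """data: dict mapping activity -> list of observations (frozensets of items)."""
    ova, nlsa = {}, []
    # Level 1: one-versus-all patterns for each activity.
    for act, target in data.items():
        contrast = [o for other, obs in data.items() if other != act for o in obs]
        patterns = mine_sjeps(target, contrast, max_len)
        if patterns:
            ova[act] = patterns
        else:
            nlsa.append(act)                      # assumed NLSA criterion
    # Level 2: the N nonlinear activities, taken together, versus the rest.
    nva = []
    if nlsa:
        target = [o for a in nlsa for o in data[a]]
        contrast = [o for a in data if a not in nlsa for o in data[a]]
        nva = mine_sjeps(target, contrast, max_len)
    return ova, nlsa, nva


# Hypothetical toy data: two activities trigger the same sensors.
data = {
    "watch_tv": [frozenset({"tv=ON"})],
    "make_coffee": [frozenset({"kettle=ON", "cup=ON"})],
    "make_tea": [frozenset({"kettle=ON", "cup=ON"})],
}
ova, nlsa, nva = hierarchical_mining(data)
```

In this toy data, 'watch_tv' receives an OVA-SJEP, while 'make_coffee' and 'make_tea' cannot be separated from each other and end up in NLSA with N = 2; their shared NVA-SJEPs still separate the pair from 'watch_tv'.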

5.1.2 Computation of activity weight vector

The second task in the training phase is to build an activity weight vector representing the discriminative power for each activity.

For m activities, we produce an m × 1 weight vector that is computed as follows: $weight (a_{i}) = \begin{cases} 1 & \text{if } a_{i} \text{ has OVA-SJEPs} \\ 1 / N & \text{if } a_{i} \text{ has NVA-SJEPs} \end{cases}$ (3)

5.1.3 Computation of correlation matrix

Finally, we build the correlation matrix representing the interdependency between the sensors and the activities. In our approach, we measure the correlation using the support. We compute the support of each sensor within each activity class. For a specific activity, a sensor with high support is highly correlated with the activity, and vice versa. For a training dataset of m activities, measured by n different sensors, we produce an m × n correlation matrix as follows: $\begin{matrix} corr (s_{j}, a_{i}) = \sup (s_{j}, D_{a_{i}}) \\ \text{for } 1 \leq i \leq m, 1 \leq j \leq n \end{matrix}$ (4)
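A minimal Python sketch of Equations (3) and (4) is given below. It assumes that each observation is reduced to the set of sensors that fired in it, and that the weight vector and correlation matrix are stored as dictionaries; these representations, the function names, and the toy data are illustrative assumptions.

```python
def weight_vector(activities, nlsa):
    """Equation (3): 1 for activities with OVA-SJEPs, 1/N for the N activities in NLSA."""
    n = len(nlsa)
    return {a: (1.0 / n if a in nlsa else 1.0) for a in activities}


def correlation_matrix(data, sensors):
    """Equation (4): corr[a][s] = fraction of observations of activity a in which s fires."""
    corr = {}
    for act, observations in data.items():
        total = len(observations)
        corr[act] = {
            s: (sum(1 for obs in observations if s in obs) / total if total else 0.0)
            for s in sensors
        }
    return corr


# Hypothetical toy data: each observation is the set of sensors that fired.
data = {
    "make_coffee": [{"kettle", "cup"}, {"kettle"}],
    "watch_tv": [{"tv"}, {"tv", "cup"}],
}
sensors = ["cup", "kettle", "tv"]

corr = correlation_matrix(data, sensors)
weights = weight_vector(["a1", "a2", "a3"], nlsa=["a2", "a3"])
```

Here the cup sensor has support 0.5 in both activities, so it is only weakly tied to either one, whereas the kettle and TV sensors each have support 1.0 in a single activity.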

5.2 The testing phase

The input to this phase can be a testing dataset or an online trace of sensor readings. In both cases, the input traces are segmented, and then the activities within each segment are recognized. These steps are explained in the following subsections.

5.2.1 Trace segmentation

There are two main trace segmentation approaches, fixed-size segmentation [5], and dynamic segmentation [23]. Fixed-size segmentation approach includes two methods, time-based, and event-based segmentation. In this paper, we use fixed-size segmentation that uses fixed-time sliding window to split the incoming activity traces into equal-sized time segments.

There are two main challenges with fixed-time segmentation. The first is the size of the sliding window used for segmentation. Activities occur with different durations, so it is difficult to find the exact size for the sliding window. The second challenge is the distribution of an activity over different segments. Therefore, while computing the score for an activity a at a specific segment s_t, we should take into account its score in the previous segment.
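Fixed-time segmentation can be sketched in Python as follows. The sketch assumes non-overlapping windows, events given as (timestamp, sensor, value) tuples sorted by time, and that empty windows are simply skipped; none of these details are specified in the text, and the function name is hypothetical.

```python
def segment_by_time(events, window):
    """Split time-ordered (timestamp, sensor, value) events into fixed-time segments.

    `window` is the segment length in the same unit as the timestamps.
    Windows that contain no events are skipped.
    """
    if not events:
        return []
    start = events[0][0]
    segments = {}
    for event in events:
        index = int((event[0] - start) // window)   # which window the event falls in
        segments.setdefault(index, []).append(event)
    return [segments[i] for i in sorted(segments)]


# Hypothetical trace with timestamps in seconds and a 60-second window.
trace = [
    (0, "kettle", "ON"),
    (30, "cup", "ON"),
    (65, "fridge", "ON"),
    (130, "tv", "ON"),
    (200, "tv", "OFF"),
]
segments = segment_by_time(trace, window=60)
```

With a 60-second window, the first two events fall in the same segment and each remaining event lands in its own segment, which illustrates how one activity can be spread over several consecutive segments.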

5.2.2 Simple and complex activities recognition

For each test segment, to recognize the performed simple and complex activities, we perform the following steps. First, we identify the candidate activities by computing the matching score that measures the percentage of SJEPs contained in the test segment (Equation (5)). Then, we use our scoring function to measure the likelihood of the suggested candidates with respect to the input test segment (Equation (6)). Finally, we decide whether a specific activity exists in the test segment according to the formula in Equation (10).

The proposed approach for simple and complex activities recognition can be mathematically formulated as follows:

Input parameters:

s_t: a preprocessed trace segment of sensor readings.

m: number of activity classes.

n: number of sensors.

OVA-SJEPs: SJEPs for linearly separated activities.

NVA-SJEPs: SJEPs for nonlinearly separated activities.

weight: an m × 1 activity weight vector.

corr: an m × n correlation matrix.

Output parameters:

Win: The set of winning activities occurring at segment s_t such that |Win| ≥ 1.

Decision Variables:

For each test segment, the following steps are performed. First, we identify the candidate activities by computing the matching score, referred to as mat_score, that measures the percentage of SJEPs contained in the test segment as follows: ${mat}_{score} (S_{1}, S_{2}) = \begin{cases} 0 & \text{if } S_{1} \cap S_{2} = \emptyset \\ 1 & \text{if } S_{1} \cap S_{2} = S_{2} \\ \frac{| S_{1} \cap S_{2} |}{| S_{2} |} & \text{otherwise} \end{cases}$ (5) where S₁ represents the SJEPs, and S₂ represents the test segment. The value of this score lies in the interval [0, 1], and indicates one of three cases: not contained, fully contained, or partially contained. From this score, we obtain the set of k candidate activities with mat_score > 0.
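The candidate selection step can be sketched in Python as a plain set computation. The sketch follows Equation (5) as written, with the ratio taken over S₂. It assumes that the SJEPs of an activity are flattened into a single set of items and that the segment is a set of items; both choices, the function names, and the toy data are assumptions.

```python
def mat_score(s1, s2):
    """Equation (5): |S1 ∩ S2| / |S2|, which is 0 when disjoint and 1 when S2 ⊆ S1."""
    if not s2:
        return 0.0
    return len(s1 & s2) / len(s2)


def candidates(sjep_items, segment):
    """Activities whose SJEP items overlap the segment, i.e. mat_score > 0."""
    return [a for a, items in sjep_items.items() if mat_score(items, segment) > 0]


# Hypothetical SJEP items per activity and one test segment.
sjep_items = {
    "make_coffee": {"kettle=ON", "cup=ON"},
    "watch_tv": {"tv=ON"},
}
segment = {"kettle=ON", "door=OPEN"}

score_coffee = mat_score(sjep_items["make_coffee"], segment)   # 1 shared item / 2
found = candidates(sjep_items, segment)
```

The three cases of Equation (5) collapse into the single ratio, since an empty intersection gives 0 and an intersection equal to S₂ gives 1.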

If all the resulting candidates are linearly separated activities with OVA-SJEPs, then they are the winning activities. Otherwise, if the candidates contain nonlinearly separated activities, we compute the likelihood score for candidate activity a_i at segment s_t, referred to as lik_score (a_i, s_t), which is defined as follows: ${lik}_{score} (a_{i}, s_{t}) = {dis}_{score} (a_{i}, s_{t}) \times weight (a_{i})$ (6)

Complex activity recognition is a multi-label classification task in which multiple activities could be performed within a specific test segment. Moreover, using fixed-time segmentation, a specific activity could be distributed along multiple time segments. Therefore, at a specific test segment s_t, an activity a_i could be not contained, partially contained, or fully contained in that segment. From a fuzzy set perspective [24], each activity has a degree of membership in the incoming test segment according to the percentage of its sensor readings/events contained in that segment, which lies in the interval [0, 1] as shown in Fig. 5. On this basis, we define the discriminative score for activity a_i at segment s_t, referred to as dis_score (a_i, s_t), which represents the fuzzy membership degree of activity a_i within segment s_t. It equals the summation of the fuzzy membership values $μ_{a_{i}} (e)$ of all events e in s_t to activity a_i. ${dis}_{score} (a_{i}, s_{t}) = \sum_{v = 1}^{L} μ_{a_{i}} (e_{v})$ (7) where L is the number of sensor events within s_t, and $μ_{a_{i}} (e)$ is the membership of event e to activity a_i, measured by the correlation between sensor event e and activity a_i as follows. $\begin{matrix} μ_{a_{i}} (e) = \frac{corr (e, a_{i})}{\sum_{j = 1}^{k} corr (e, a_{j})} \\ \text{where } \sum_{j = 1}^{k} μ_{a_{j}} (e) = 1 \end{matrix}$ (8)
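Equations (6) to (8) can be sketched in Python as follows. The sketch assumes that the correlation matrix is a dictionary indexed by activity and then by sensor, that each event in a segment is identified by its sensor name, and that an event unknown to all candidates receives membership 0; these are illustrative assumptions, as are the names and toy values.

```python
def membership(event, activity, cands, corr):
    """Equation (8): correlation of the event with the activity, normalised over candidates."""
    total = sum(corr[a].get(event, 0.0) for a in cands)
    if total == 0:
        return 0.0          # assumption: event unrelated to every candidate
    return corr[activity].get(event, 0.0) / total


def dis_score(activity, segment, cands, corr):
    """Equation (7): sum of fuzzy memberships of all events in the segment."""
    return sum(membership(e, activity, cands, corr) for e in segment)


def lik_score(activity, segment, cands, corr, weight):
    """Equation (6): discriminative score scaled by the activity weight."""
    return dis_score(activity, segment, cands, corr) * weight[activity]


# Hypothetical correlation values, weights, and test segment.
corr = {
    "make_coffee": {"kettle": 1.0, "cup": 0.5},
    "make_tea": {"kettle": 1.0, "cup": 0.5, "teapot": 1.0},
}
weight = {"make_coffee": 0.5, "make_tea": 0.5}     # both in NLSA with N = 2
cands = ["make_coffee", "make_tea"]
segment = ["kettle", "cup", "teapot"]

dis_coffee = dis_score("make_coffee", segment, cands, corr)   # 0.5 + 0.5 + 0.0
dis_tea = dis_score("make_tea", segment, cands, corr)         # 0.5 + 0.5 + 1.0
lik_tea = lik_score("make_tea", segment, cands, corr, weight)
```

For each event, the memberships over the candidate activities sum to 1, so the shared kettle and cup events are split evenly, while the teapot event is credited entirely to 'make_tea'.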

Fig. 5

Fuzzy membership function for an activity within a test segment.

Finally, due to the challenges associated with fixed-time segmentation, while computing the score for an activity a_i at a specific segment s_t, we should take into account its score in the previous segment. This score is referred to as the cumulative score cum_score, and is calculated using the correlation coefficient α ∈ [0, 1].

$\begin{matrix} {cum}_{score} (a_{i}, s_{t}) = α \times {lik}_{score} (a_{i}, s_{t - 1}) \\ + (1 - α) \times {lik}_{score} (a_{i}, s_{t}) \end{matrix}$ (9)

Objectives:

Our main objective is to output the activity label(s) that occurs within the incoming test segment.

Find $\arg\max_{a_{i}} {cum}_{score} (a_{i}, s_{t})$

s.t. ${cum}_{score} (a_{i}, s_{t}) \geq λ \times mean ({lik}_{score})$ (10)

where λ is a user-defined parameter that tunes the threshold (i.e. mean (lik_score)).
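The decision step in Equations (9) and (10) can be sketched in Python as follows. The sketch assumes that mean (lik_score) is taken over the candidates of the current segment, that an activity absent from the previous segment has a previous score of 0, and that every candidate meeting the threshold is returned (a multi-label reading of the objective). The function names, defaults, and toy values are hypothetical.

```python
def cum_score(lik_prev, lik_curr, alpha):
    """Equation (9): blend of the previous and current likelihood scores."""
    return alpha * lik_prev + (1 - alpha) * lik_curr


def winning_activities(lik_prev, lik_curr, alpha=0.5, lam=1.0):
    """Equation (10): keep candidates whose cumulative score reaches lam * mean(lik_score)."""
    if not lik_curr:
        return []
    threshold = lam * (sum(lik_curr.values()) / len(lik_curr))
    winners = []
    for activity, score in lik_curr.items():
        previous = lik_prev.get(activity, 0.0)     # assumption: 0 if not seen before
        if cum_score(previous, score, alpha) >= threshold:
            winners.append(activity)
    return winners


# Hypothetical likelihood scores for segments s_{t-1} and s_t.
lik_prev = {"make_coffee": 1.0, "make_tea": 0.0}
lik_curr = {"make_coffee": 0.5, "make_tea": 1.0, "watch_tv": 0.0}

win = winning_activities(lik_prev, lik_curr, alpha=0.5, lam=1.0)
```

In this toy case the threshold is 0.5; 'make_coffee' stays above it only because of its high score in the previous segment, which is the carry-over effect the cumulative score is meant to capture.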

In Fig. 6, we present an illustrative example of the proposed recognition approach. In this example, for a specific test segment, there are three possibilities: existence of linear activities (Fig. 6a), nonlinear activities (Fig. 6b), or both linear and nonlinear activities (Fig. 6c).

Fig. 6

Illustrative examples for the proposed SJEP-based approach for simple and complex activities recognition.

5.3 Computational analysis of the proposed SJEP-based recognition approach

As mentioned before, our proposed framework requires two main phases, training and testing. The main goal of the training phase is the extraction of SJEPs, which is a computationally expensive process, so this phase should be performed offline. In this section, we analyze the complexity of the proposed SJEP-based recognition approach.

Assume the following: we have m activity classes with s sets of SJEPs, s ≤ m. Suppose that computing the matching score between a test segment and one set of SJEPs requires O (A), which is linear in the number of events in the segment. Then checking the existence of s sets of SJEPs requires s · O (A). The worst case occurs when both linear and nonlinear activities exist in the same segment. For k candidate activities, we require O (B) to compute the score for each candidate, and k · O (B) for all the candidates. As a result, the testing phase requires s · O (A) + k · O (B) for classifying a specific test segment.

Assume that the input stream is segmented into t segments; then the total complexity of the proposed method is t (s · O (A) + k · O (B)), which is linear in the number of sensor events in the test segment, with s, k ≤ m.

6 Experimental results and evaluations

In this section, we evaluate the proposed approach for recognizing simple and complex human activities. First, we describe the datasets used, the implementation settings, and the evaluation settings. Then, the obtained results from experiments are presented and discussed.

6.1 Datasets

In the following experiments, we evaluate our SJEP-based approach using two datasets collected from two different labs. The first dataset is “The interleaved Activities of Daily Living” benchmarking dataset. This dataset was collected at the CASAS [25] smart home repository at Washington State University [26] (CASAS for short) and has been used in many studies such as [23, 27]. The second dataset, from [10], contains sequential, interleaved, and concurrent activities (SICA for short).

The CASAS dataset contains eight activities listed in Table 1, collected from 20 subjects in a smart home equipped with eight different modalities of environmental sensors. To collect sequential activities, the participants (one at a time) performed each activity separately in the same sequential order. Then, the participants were allowed to perform any number of activities in an interwoven fashion to obtain a sequence of interleaved and concurrent activities.

Table 1
Activities in CASAS Dataset [26]

No.	Activity
1	take medicine
2	watch DVD
3	water plants
4	answer phone
5	write birthday card
6	prepare meal
7	Clean
8	select an outfit

The SICA dataset contains 26 activities listed in Table 2. These activities were performed by four subjects using two sets of wearable sensors: an IMote2 set and an RFID reader set. Each subject wore the IMote2 set on the hands and waist to measure user movement and other environment-related information such as temperature, humidity, and light level. Each subject wore the RFID readers on the hands to capture human-object interaction. Every day for two weeks, each subject performed the activities in any order they chose, with a maximum of two interwoven activities at the same time [10].

Table 2

Activities in SICA Dataset [10]

No.	Activity	No.	Activity
1	making coffee	14	ironing
2	making tea	15	eating meal
3	making oatmeal	16	Drinking
4	frying eggs	17	taking medication
5	making a drink	18	cleaning a dining table
6	applying makeup	19	Vacuuming
7	brushing hair	20	taking out trash
8	Shaving	21	using phone
9	Toileting	22	watching TV
10	brushing teeth	23	watching DVD
11	washing hands	24	using computer
12	washing face	25	reading book
13	washing clothes	26	listening music

We selected these datasets for three main reasons. First, the two datasets used different methods for activity monitoring; the CASAS dataset used environmental sensors while the SICA dataset used wearable sensors. Second, the number of activities and the size of the dataset differ between the two; the CASAS dataset has fewer activities and samples than the SICA dataset, as illustrated in Table 3. Finally, the number of interwoven activities that form a complex activity was not limited in the CASAS dataset, but the SICA dataset limited that number to a maximum of two interwoven activities.

Table 3

Number of samples in the CASAS and SICA datasets

Type of activities	Number of Samples
	CASAS Dataset	SICA Dataset
Simple	160	422
Complex	108	110
Total	268	532

6.2 Experimental settings

In this section, we describe the basic steps of simple and complex HAR, as shown in Fig. 7. To make the performance evaluation robust, we perform two experiments: the first uses the CASAS benchmark dataset, and the second uses the SICA dataset. In both experiments, we evaluate how well the proposed approach recognizes simple and complex human activities.

Fig. 7

Visualization of the main components of the recognition phase.

Sensor data preprocessing: mobile and ubiquitous environments make sensor data noisy and unreliable, so the data may contain errors such as missed or unreliable readings. To clean these errors, we apply Extensible receptor Stream Processing (ESP), a pipelined toolkit for sensor data cleaning proposed in [28].

Trace segmentation: given a stream of sensor readings, we apply a fixed segmentation approach using a fixed-time sliding window. For each experiment, we compute the average activity duration, referred to as L_avg, and use it as a baseline for the sliding time window t_win. We have L_avg = 212 seconds for the CASAS dataset and L_avg = 70 seconds for the SICA dataset.
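The fixed-time segmentation step can be illustrated with a minimal sketch. It assumes consecutive, non-overlapping windows and a time-ordered list of (timestamp, sensor, value) events; the function name `segment_fixed_window` and the event layout are illustrative assumptions, not the implementation used in the experiments.

```python
from typing import List, Tuple

# Hypothetical event layout: (timestamp in seconds, sensor name, sensor value).
Event = Tuple[float, str, str]


def segment_fixed_window(events: List[Event], t_win: float) -> List[List[Event]]:
    """Split a time-ordered event stream into consecutive windows of t_win seconds.

    Assumption: windows do not overlap and start at the first event's timestamp.
    Windows that contain no events are kept as empty lists.
    """
    if t_win <= 0:
        raise ValueError("t_win must be positive")
    if not events:
        return []
    start = events[0][0]
    segments: List[List[Event]] = []
    for event in events:
        index = int((event[0] - start) // t_win)
        while len(segments) <= index:
            segments.append([])
        segments[index].append(event)
    return segments


L_AVG_CASAS = 212.0       # seconds, as reported for the CASAS dataset
t_win = 2 * L_AVG_CASAS   # window length used in the parameter analysis
```

Deriving t_win as a multiple of L_avg keeps the window length comparable across datasets whose activities have very different durations.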

Feature extraction and selection: in this phase, we extract sensor features from the obtained activity traces. The features extracted depend on the type of sensor used; for example, for motion sensors, we use the sensor name and value (e.g. ON/OFF) as a feature. From each activity trace, we obtain a feature vector that consists of a set of items. Each item is represented as a pair of feature name and value, in which the value can be nominal or numeric. Numeric features are discretized using an entropy-based discretization method [29]. Finally, we encode the extracted features with a simple encoding scheme for use in the SJEP mining process.
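As an illustration of the item encoding, the sketch below maps each (feature name, value) pair to an integer identifier so that a trace becomes an itemset. The text does not specify the encoding scheme, so the `ItemEncoder` class is a hypothetical stand-in; it also assumes numeric features have already been discretized into nominal bin labels.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# An item is a (feature name, value) pair; values are assumed to be nominal
# (numeric features already discretized into bin labels such as "high").
Item = Tuple[str, str]


class ItemEncoder:
    """Hypothetical encoder: assigns each distinct item a stable integer id."""

    def __init__(self) -> None:
        self._ids: Dict[Item, int] = {}

    def encode(self, items: Iterable[Item]) -> FrozenSet[int]:
        """Return the itemset of integer ids for one activity trace."""
        encoded = set()
        for item in items:
            if item not in self._ids:
                # New items receive the next unused id.
                self._ids[item] = len(self._ids)
            encoded.add(self._ids[item])
        return frozenset(encoded)


encoder = ItemEncoder()
trace_a = encoder.encode([("M01", "ON"), ("temperature", "high")])
trace_b = encoder.encode([("M01", "ON"), ("M02", "OFF")])
```

Representing traces as sets of integers makes subset tests cheap, which is the basic operation when checking whether a pattern occurs in a trace.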

Comparison with other approaches: for simple activity recognition, we compared our SJEP-based approach with five other popular classifiers: Support Vector Machine (SVM), Naïve Bayes (NB), K-Nearest Neighbor (K-NN), Decision Tree (DT), and Hidden Markov Model (HMM). For complex activity recognition, we compared our SJEP-based approach with two prominent approaches that use the concept of emerging patterns, [10] and [19], discussed in Section 2.

6.3 Evaluation metrics

We applied n-fold cross-validation to evaluate our approach. We used 3-fold cross-validation for the CASAS dataset and 10-fold cross-validation for the SICA dataset, as shown in Figs. 8 and 9 respectively. In each experiment, we use the simple activities from n - 1 folds for training and the simple and complex activities from the remaining fold for testing. Fig. 10 presents sample activity records from the CASAS dataset used for training, whereas Fig. 11 presents a segment of the sensor data stream used for testing. For evaluation, we used three standard metrics: precision (PR), recall (RC), and F-measure.
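The train/test protocol described above can be sketched as follows. This is a simplified illustration: it assumes a round-robin assignment of samples to folds and that complex samples are partitioned into the same number of folds as simple ones; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_folds(samples: Sequence[T], n: int) -> List[List[T]]:
    """Assign samples to n folds in round-robin order (an assumption)."""
    return [list(samples[i::n]) for i in range(n)]


def cross_validation_splits(
    simple: Sequence[T], complex_: Sequence[T], n: int
) -> List[Tuple[List[T], List[T]]]:
    """Build n (train, test) pairs.

    Training uses only simple activities from the other n - 1 folds;
    testing uses the simple and complex activities of the held-out fold.
    """
    simple_folds = split_folds(simple, n)
    complex_folds = split_folds(complex_, n)
    splits = []
    for k in range(n):
        train = [s for i, fold in enumerate(simple_folds) if i != k for s in fold]
        test = simple_folds[k] + complex_folds[k]
        splits.append((train, test))
    return splits
```

Complex samples never enter a training set in this scheme, which mirrors the stated goal of learning from simple activities only.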

Fig. 8

3-Fold Cross Validation for CASAS Dataset.

Fig. 9

10-Fold Cross Validation for SICA Dataset.

Fig. 10

Sample records of simple activities used for training from CASAS Dataset.

Fig. 11

Segment of stream of sensor readings used for testing from CASAS Dataset.

For complex activities recognition, we have a multi-label classification problem [30]. Therefore, we modify these evaluation metrics to deal with our problem as defined below.

$Precision = \frac{\text{number of correctly identified activities}}{\text{total number of detected activities}}$ (11)

$Recall = \frac{\text{number of correctly identified activities}}{\text{total number of expected activities}}$ (12)

$F\text{-}Measure = 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}$ (13)
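A minimal sketch of how Eqs. (11)–(13) could be computed for multi-label output is given below. It assumes that each test segment has a set of predicted labels and a set of expected labels, and that the counts are pooled over all segments before the ratios are taken; the pooling choice and the function name `multilabel_scores` are assumptions.

```python
from typing import List, Set, Tuple


def multilabel_scores(
    predicted: List[Set[str]], expected: List[Set[str]]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) pooled over all test segments."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    # A label is correctly identified when it is both predicted and expected.
    correct = sum(len(p & e) for p, e in zip(predicted, expected))
    detected = sum(len(p) for p in predicted)
    wanted = sum(len(e) for e in expected)
    precision = correct / detected if detected else 0.0
    recall = correct / wanted if wanted else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


predicted = [{"watch DVD", "answer phone"}, {"clean"}]
expected = [{"watch DVD"}, {"clean", "water plants"}]
pr, rc, f1 = multilabel_scores(predicted, expected)
```

In this toy example, two of the three detected labels are correct and two of the three expected labels are found, so precision, recall, and F-measure all equal 2/3.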

6.4 Experiment I: CASAS dataset

In this experiment, we validate our SJEP-based approach using the CASAS dataset. In the training phase, during the SJEP extraction process, all the activities are linearly separable except two: water plants and clean. As a result, we extract OVA-SJEPs for the linear activities and NVA-SJEPs for these nonlinear activities with N = 2. Our approach has two key parameters, α and λ. To analyze the sensitivity of these parameters, we evaluate our approach on fixed time segments with t_win = 2 × L_avg. The results are shown in Fig. 12, where the X-axis represents λ and the Y-axis represents the F-measure.

The results show that our approach reaches the best F-measure at α = 0.3 and λ = 0.1; these values are used for the rest of this experiment.
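A parameter analysis of this kind can be sketched as an exhaustive search over candidate (α, λ) pairs. The candidate grids and the `evaluate` callback below are hypothetical placeholders: `evaluate` stands for a full train-and-test run that returns the F-measure for a given parameter pair, and the grid actually explored in the experiments is not stated in the text.

```python
from itertools import product
from typing import Callable, Sequence, Tuple


def grid_search(
    evaluate: Callable[[float, float], float],
    alphas: Sequence[float],
    lambdas: Sequence[float],
) -> Tuple[float, float, float]:
    """Return (alpha, lambda, score) for the pair with the highest score."""
    best = None
    for alpha, lam in product(alphas, lambdas):
        score = evaluate(alpha, lam)
        # Keep the first pair that reaches the highest score seen so far.
        if best is None or score > best[2]:
            best = (alpha, lam, score)
    if best is None:
        raise ValueError("parameter grid is empty")
    return best


def toy_evaluate(alpha: float, lam: float) -> float:
    """Stand-in for a real evaluation run; peaks at alpha=0.3, lambda=0.1."""
    return 1.0 - abs(alpha - 0.3) - abs(lam - 0.1)


best_alpha, best_lambda, best_score = grid_search(
    toy_evaluate, alphas=[0.1, 0.3, 0.5], lambdas=[0.1, 0.2]
)
```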

For simple activity recognition, the average results of our SJEP-based approach compared with five popular classifiers (i.e. SVM, DT, NB, KNN, and HMM) are shown in Table 4. The results are close between our approach and the remaining classifiers, except for NB and HMM. Nevertheless, our approach outperforms the others, with 100% accuracy for linear activities and an F-measure of almost 96% for nonlinear activities.

Table 4

The comparison results for simple activity recognition (CASAS dataset)

	SJEP-based		SVM		NB		KNN		DT		HMM
	PR	RC	PR	RC	PR	RC	PR	RC	PR	RC	PR	RC
Take Medicine	100	100	100	100	55	60.1	95.2	100	87	100	68	60.4
Watch DVD	100	100	100	100	62	68.3	100	100	85	100	78	65
Water Plants	100	93	95.8	100	43	59.5	85.1	100	77	83	63	61
Answer phone	100	100	91.6	80.9	60	65.7	94.4	70.6	100	85	66	55.4
Write Birthday Card	100	100	85.1	90.4	78	68.6	84.2	95.2	81	83	80	83.2
Prepare Meal	100	100	100	100	83	77.8	100	90.4	87	100	82	79
Clean	90	100	100	94.4	89	78.1	100	84.1	66	83	81	87.4
Select an outfit	100	100	100	100	67	70	91	100	100	100	67	77
Average	98.7	99.1	96.5	95.7	67.1	68.5	93.7	92.5	85.3	91.7	73.1	71
F-Measure	98.6		96.15		67.8		93.16		88.4		72.07

For complex activity recognition, we compare our SJEP-based approach with the approach in [19]. The results for precision, recall, and F-measure are shown in Figs. 13–15 respectively, where the X-axis refers to the window size. We vary the length of the time window t_win from 0.1 × L_avg to 3.5 × L_avg.

Fig. 12

Parameter Analysis (CASAS dataset).

Fig. 13

The comparison of Precision in [19] and SJEP-based approach using CASAS dataset.

Fig. 14

The comparison of Recall in [19] and SJEP-based approach using CASAS dataset.

Fig. 15

The comparison of F-Measure in [19] and SJEP-based approach using CASAS dataset.

6.5 Experiment II: SICA dataset

In this experiment, we follow the same evaluation steps as in Experiment I, using the SICA dataset.

First, in the training phase, we find that all the activities are linearly separable except two: washing hands and washing face. Then, we analyze our key parameters α and λ. The best F-measure at t_win = 2 × L_avg is obtained with α = 0.3 and λ = 0.1, as shown in Fig. 16.

Fig. 16

Parameter analysis (SICA Dataset).

For simple activity recognition, we compare our approach with five common classifiers, as shown in Table 5. Our SJEP-based approach outperforms the others with an overall F-measure of 96.34%, and achieves an acceptable performance of 90% on the nonlinear activities.

Table 5

The comparison results for simple activity recognition (SICA dataset)

	SJEP-based		SVM		NB		KNN		DT		HMM
	PR	RC	PR	RC	PR	RC	PR	RC	PR	RC	PR	RC
Making coffee	100	100	100	85	100	55	100	100	30	47	66	70
Making tea	100	100	100	80	80	80	100	100	34	79	88	81
Making oatmeal	100	100	100	100	66	25	84	100	21	31	78	80
Frying eggs	100	100	100	83	48	83	100	100	20	61	64	64
Making a drink	100	100	50	100	66	71	100	96	32	55	80	78
Applying makeup	100	91	100	100	76	80	100	85	78	71	59	61
Brushing hair	89	100	75	77	66	45	75	100	51	62	77	76
Shaving	100	100	100	100	50	75	77	83	28	36	62	55
Toileting	100	100	100	60	100	30	83	88	66	28	79	71
Brushing teeth	100	100	100	100	57	80	100	100	65	65	45	58
Washing hands	82	96	58	100	58	28	75	81	71	55	67	66
Washing face	96	86	86	70	53	75	63	70	33	20	54	45
Washing clothes	100	100	100	100	100	50	100	100	100	46	75	79
Ironing	100	100	100	100	60	37	100	100	65	51	77	79
Eating meal	82	100	83	100	100	66	100	100	57	77	50	56
Drinking	78	100	75	91	56	25	93	100	44	47	60	59
Taking medication	100	85	83	100	55	35	100	90	72	39	65	72
Cleaning a dining table	100	83	100	100	59	85	80	85	55	30	71	73
Vacuuming	94	100	100	100	49	66	75	90	33	22	50	61
Taking out trash	100	86	100	100	100	50	85	85	50	20	78	69
Using phone	100	100	83	83	50	30	81	73	28	36	86	88
Watching TV	98	100	100	88	60	60	80	68	48	47	76	75
Watching DVD	100	85	100	100	66	50	100	100	50	63	88	89
Using computer	100	79	100	60	52	66	95	81	50	31	78	80
Reading book	100	100	80	100	77	66	100	100	33	55	77	77
Listening music	100	100	100	100	66	25	80	82	100	22	89	84
Average	96.8	95.8	91.2	91.4	68	55.3	89.5	90.6	50.5	46	70.7	71
F-Measure	96.34		91.34		61.03		90.05		48.16		70.8

For the recognition of complex activities, we compare our SJEP-based approach with the method in [10]. The results for our evaluation metrics are shown in Figs. 17–19, where the X-axis represents the window size, which ranges from 0.1 × L_avg to 3.5 × L_avg.

Fig. 17

The comparison of Precision in [10] and SJEP-based approach using the SICA dataset.

Fig. 18

The comparison of Recall in [10] and SJEP-based approach using the SICA dataset.

Fig. 19

The comparison of F-Measure in [10] and SJEP-based approach using the SICA dataset.

6.6 Discussion

The results obtained on two different datasets indicate the efficiency of the proposed approach and its superiority over other state-of-the-art and EP-based approaches for recognizing simple and complex activities.

For simple activity recognition, we compared our SJEP-based approach with five popular classifiers (i.e. SVM, KNN, DT, HMM and NB). Our approach outperformed the others, achieving an F-measure of 98.6% in Experiment I and 96.34% in Experiment II, followed by SVM and KNN, with DT, NB, and HMM trailing further behind. We attribute this to the following. In sensor-based HAR, the collected dataset is relatively small compared to real-world situations, and SJEP mining looks for differences between activity classes rather than similarities, which suits this situation. In contrast, the discriminative SVM, KNN, and DT search for similarities, which decreased their performance. In addition, the generative HMM and NB overfit the training data and rely on an attribute independence assumption.

For complex activity recognition, we compared our proposed SJEP-based approach with two other EP-based approaches [10, 19]. Activities occur with different durations: there are short-term activities and long-term activities. With small window sizes, long-term activities are spread across many segments, while large window sizes reduce the classification accuracy of short-term activities. Consequently, all approaches show a degradation of the F-measure at small and large windows. However, the combination of SJEPs and fuzzy set theory within the proposed model mitigates this problem, and the F-measure results indicate the efficiency and superiority of our SJEP-based approach over the others [10, 19], reaching 94.02% and 92.04% in Experiments I and II respectively.

7 Conclusion

In this paper, our main objective is to design a general framework that can recognize any number of overlapping activities with linear and nonlinear characteristics using a training dataset of simple activities. The proposed SJEP-based approach consists of two phases: training and testing. In the training phase, we mine for irregularities in the training dataset and extract the discriminative SJEPs for linear and nonlinear activities. In the testing phase, for each test segment, we apply our scoring function to compute the score of the candidate activities, and then output the activity label(s) that meet a specific criterion. We evaluate our approach using two datasets with different characteristics, collected from two different labs. The results show the efficiency of the proposed approach in recognizing both simple and complex activities. Compared with two common approaches that incorporate the concept of Emerging Patterns, the results reflect the superiority of our approach with respect to recognition accuracy.

The use of SJEPs increases system scalability: a new activity class is easy to append through its SJEPs. Regarding noise tolerance, sensor data often contains randomly distributed noise, so mining the differences between data is more effective than mining the regularities. To summarize, this paper proposed an efficient, scalable, error- and noise-tolerant model for sensor-based complex activity recognition. In future work, we plan to devise a dynamic segmentation algorithm that exploits both context and spatial information. In addition, we shall extend our work to deal with multi-resident applications.

References

1. Chen, Hoey, Nugent C.D., Cook D.J. and Yu, Sensor-based activity recognition, IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews 42 (2012), 790–808.

2. Sunny J.T., George S.M. and Kizhakkethottam J.J., Applications and Challenges of Human Activity Recognition using Sensors in a Smart Environment, IJIRST – International Journal for Innovative Research in Science & Technology 2 (2015), 50–57.

3. Ranasinghe, Al Machot and Mayr H.C., A Review on Applications of Activity Recognition Systems with Regard to Performance and Evaluation, International Journal of Distributed Sensor Networks 12 (2016), 1–21.

4. Liu, Nie, Liu and Rosenblum D.S., From action to activity: Sensor-based activity recognition, Neurocomputing 181 (2016), 108–115.

5. Liu, Peng, Wang, Liu and Huang, Complex activity recognition using time series pattern dictionary learned from ubiquitous sensors, Information Sciences (2016), 1–17.

6. Lian and Hsu J.Y.J.Y.Y., Joint recognition of multiple concurrent activities using factorial conditional random fields, Proc 22nd Conf on Artificial Intelligence (AAAI-2007) (2007), 82–87.

7. Hao Hu, Pan S.J., Zheng V.W., Liu N.N. and Yang, Real world activity recognition with multiple goals, Proceedings of the 10th International Conference on Ubiquitous Computing UbiComp 08(344) (2008), 30.

8. D.H. and Yang, CIGAR: Concurrent and Interleaving Goal and Activity Recognition, in: AAAI Conference on Artificial Intelligence (2008), 1363–1368.

9. Zhang, Zhang, Swears, Larios, Wang and Ji, Modeling temporal interactions with interval temporal bayesian networks for complex activity recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (2013), 2468–2483.

10. Wang, Wu, Tao and Lu, A pattern mining approach to sensor-based human activity recognition, IEEE Transactions on Knowledge and Data Engineering 23 (2011), 1359–1372.

11. Zhang, Wei, Nie, Huang, Wang and Li, A Review on Human Activity Recognition Using Vision-Based Method, Journal of Healthcare Engineering 2017 (2017).

12. Okeyo, Chen, Wang and Sterritt, A Knowledge-Driven Approach to Composite Activity Recognition in Smart Environments, Ubiquitous Computing and Ambient Intelligence 6th International Conference, UCAmI 2012 (2012), 322–329.

13. Chen and Nugent, Ontology-based activity recognition in intelligent pervasive environments, 5 (2009), 410–430.

14. Helaoui, Riboni and Stuckenschmidt, A probabilistic ontological framework for the recognition of multilevel human activities, Ambient Intelligence and Future Trends-International Symposium on Ambient Intelligence (ISAmI 2010) (2013), 247–254.

15. Okeyo, Chen and Wang, Combining ontological and temporal formalisms for composite activity modelling and recognition in smart homes, Future Generation Computer Systems 39 (2014), 29–43.

16. Riboni, Sztyler, Civitarese and Stuckenschmidt, Unsupervised recognition of interleaved activities of daily living through ontological and probabilistic reasoning, in: Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing – UbiComp ’16, ACM Press, New York, New York, USA, 2016, 1–12.

17. C.-H. and Fu L.-C., Robust Location-Aware Activity Recognition Using Wireless Sensor Network in an Attentive Home, IEEE Transactions on Automation Science and Engineering 6 (2009), 598–609.

18. Modayil, Bai and Kautz, Improving the recognition of interleaved activities, Proceedings of the 10th International Conference on Ubiquitous Computing – UbiComp ’08 (2008), 40.

19. Tabatabaee Malazi and Davari, Combining emerging patterns with random forest for complex activity recognition in smart homes, Applied Intelligence 48 (2018), 315–330.

20. García-Vico A.M., Carmona C.J., Martín, García-Borroto and del Jesus M.J., An overview of emerging pattern mining in supervised descriptive rule discovery: taxonomy, empirical study, trends, and prospects, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 8 (2018), 1–22.

21. García-Borroto C-OJ. and Martínez-Trinidad M.J., A survey of emerging patterns for supervised classification, Artificial Intelligence Review 42 (2014), 705–721.

22. Nasreen, Azam M.A., Shehzad, Naeem and Ghazanfar M.A., Frequent pattern mining algorithms for finding associated frequent patterns for data streams: A survey, Procedia Computer Science 37 (2014), 109–116.

23. Wan, O’Grady M.J. and O’Hare G.M.P., Dynamic sensor event segmentation for real-time activity recognition in a smart home context, Personal and Ubiquitous Computing 19 (2015), 287–301.

24. Bede, Mathematics of fuzzy sets and fuzzy logic, 2013.

25. CASAS Datasets, (n.d.). http://casas.wsu.edu/datasets/ (accessed 21 April 2019).

26. Singla, Cook D.J. and Schmitter-Edgecombe, Tracking Activities in Complex Settings Using Smart Environment Technologies, International Journal of Biosciences, Psychiatry, and Technology (IJBSPT) 1 (2009), 25–35.

27. A.S. B, Deng J.D. and Woodford B.J., Online Hidden Conditional Random Fields to Recognize Activity-Driven Behavior Using Adaptive Resilient Gradient Learning, (2017), 515–525.

28. Jeffery S.R., Franklin M.J. and Berkeley U.C., A Pipelined Framework for Online Cleaning of Sensor Data Streams, (2005), 8–10.

29. K.B.I. U. Fayyad, Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning, (1993), 1022–1029.

30. Ward J.A., Lukowicz and Gellersen H.W., Performance metrics for activity recognition, ACM Transactions on Intelligent Systems and Technology (TIST) 2 (2011), 6.
