Abstract
With the development of wearable sensors, it has become possible to collect and analyze data for the recognition of different human activities. Activity recognition is used to monitor human activity in applications such as assistance for elderly and disabled persons, health care, physical activity monitoring, and the identification of a physical attack on a person. This paper presents techniques for distinguishing data recorded during normal activity from data recorded during a violent attack on a victim. To solve this problem, the paper focuses on classifying different activities using supervised machine learning techniques. Various experiments have been conducted using wearable inertial fabric sensors for different activities. These wearable e-textile sensors were woven onto a jacket worn by a healthy subject. The main steps of the activity recognition process are sensor placement, pre-processing of the data, and activity classification. Three supervised techniques were used, namely Decision Tree, k-NN classifier and Support Vector Machine (SVM). The experimental results show that, compared with k-NN and Decision Tree, the SVM algorithm offers good overall performance for all activities, with an accuracy of 97.6% and a computation time of 0.85 seconds.
Introduction
Statistics show that crimes and misconduct against women are rampant across the world. It is undeniable that women feel unsafe on the streets, in workplaces, in markets, etc. [1]. A broad variety of wearable sensors and apps is available for applications such as smart healthcare, patient tracking, monitoring the physical activity of athletes, rehabilitation, elderly care, surveillance, etc. There is no system designed or created for the safety or protection of women using wearable sensors [28]. Numerous approaches have been developed for women's protection, such as smartphones, SOS services, and smart watches that work as wearable safety devices and can be used to send a message or to contact a guardian (Koshmak et al., 2013; Cornacchia et al., 2017). Once the target is attacked, however, it is not easy to use these devices in time because the target is instantly overpowered by the attacker. This warrants a change in design towards a system that automatically gathers and analyzes data and sends an emergency message to the guardian. During the struggle to escape, the clothing worn by the target experiences stretching, bending and twisting movements [28]. To address the problem, fabric sensors were woven onto a jacket or pullover to record data such as stretching, bending, twisting, etc. Many methods have been used to classify human activity using sensors such as pressure sensors, stretch sensors, accelerometers, cameras, etc. (Cyuong et al., 2018), together with machine learning algorithms such as support vector machine (SVM), random forest (RF), artificial neural network (ANN), Hidden Markov Model (HMM), Decision Tree (DT), Naïve Bayes (NB), etc. Wearable accelerometers were instrumental in recognizing activities such as sitting, walking, standing and other routine activities (Attalah et al., 2001). The device was designed using these methods to distinguish a violent attack from normal motion.
The paper is laid out as follows. Section 2 reviews the literature on the different sensors and methods used for activity classification. Section 3 covers sensor placement, the pre-processing technique, and the classification techniques used for activity recognition. Section 4 shows the results achieved with different machine learning algorithms. Finally, a conclusion is given in Section 5.
Related Work
Various studies have shown the mechanism of sensors based on the electrical properties such as resistance, capacitance, pressure or voltage. Using this principle, e-textile sensors are constructed in which electronic components are integrated into the fabric itself (Cyuong et al., 2018). With the advent of technology, developments are taking place in the field of wearable textiles, which has applications in various fields, such as the recognition of activities using wearable sensors. Table 1 summarizes some of the activities using accelerometers and other types of sensors. In activity detection, data collection using a variety of wearable sensors is followed by other data analytical processes such as data processing, data integration, mining of important features and, finally, activity classification.
Related work on the classification of human activity detection
Various methods have been used in previous studies for the pre-processing of data to make it suitable for the machine learning model. The Segmentation process is used to split the data to extract important features and remove irrelevant attributes from the data set, which minimizes errors and reduces computation time.
This paper discusses the design and implementation of a functional device that acts as a safety measure against violent assault. This device, intended as a smart jacket, does not guarantee that violence is avoided, but it may be expected to reduce a significant percentage of crimes, as the proposed system acts as a deterrent [28]. This paper also features an analysis of wearable fabric sensors for measuring pressure, stretch, etc. Figure 1 shows the sensor position on the vest.

Sensor location.
The data acquisition module consists of the hardware and software architecture for data collection from the wearables. For data acquisition, a Flora chip is used instead of standard Arduino electronics; all the wearable sensors attach to it through connectors, and it can be sewn to the fabric for accuracy. Flora was developed by Limor Fried (Ladyada), founder and engineer of Adafruit. Our main objective is to design and implement a system that is capable of collecting different types of data from wearable sensors, pre-processing the data, and classifying a set of attack patterns.
Classification of sensors
Three kinds of fabric sensors, i.e. pressure sensors, stretch sensors and a 3 DOF accelerometer, are used for sensing and monitoring the activities.
Pressure Sensor
The pressure sensor is formed by sandwiching Velostat between two layers of conductive fabric. Its electrical properties, i.e. its resistance, change when it is pressed. Figure 2 shows how pressure sensors are created using conductive fabrics.

Pressure sensor.
Stretch Sensor
The fabric used to make the stretch sensors is commercially available online from the Less EMF Safety Shop. It is composed of 76% nylon and 24% elastic, which gives it the distinctive ability to stretch in both directions. When the fabric is stretched in either direction, its electrical resistance changes. Figure 3 shows how the conductivity of the fabric changes as the material is stretched. Following this principle, when the victim is attacked there will be sudden stretching of, or sudden pressure on, the fabric itself, which provides the raw data.

Stretch v/s conductivity of stretched fabric sensor.
Accelerometer
A high-precision 3-axis Flora fabric piezo-resistive accelerometer (LSM303) with acceleration ranges of ±2 g/±4 g/±8 g/±16 g, manufactured by Adafruit, is used to add direction and orientation sensing to the wearable jacket. The accelerometer is sewn onto the wrist of the fabric using conductive thread, and the data produced by the device are transmitted to the Flora chip, which has SCL/SDA pins for interfacing. Figure 4 shows the A/M/G Flora accelerometer.

Flora 3 axis accelerometer.
The data from the wearable sensors are transmitted to the Flora Arduino chip. The chip has all the pins needed for the interface and the power supply. A constant 3.3 V supply is used to power the entire circuit. The circuit is designed in such a way that the sensors are placed between the power source and the ground. A set of resistors is also connected in parallel to the sensors to measure voltage changes. A voltage change indicates a physical change in the properties of the fabric sensor. The data collected from the sensors are received by the Flora on pins 9, 10, 11 and 12. The data acquisition module for the system is shown in Figure 5. The data obtained from the wearable electronics are passed to the software module, i.e. the Arduino IDE software platform, for data processing.

Data acquisition module.
Experiments are carried out on three female subjects performing various activities, namely remaining stationary, walking, brisk walking, hand twisting and a violent attack, with data recorded from the fabric pressure, stretch and inertial wearable sensors. These activities are selected in order to distinguish between normal motion and violent activity. To collect the data, the sensors are woven onto the fabric using conductive thread. Two stretch sensors are placed close to the shoulders, two on the elbows, and two pressure sensors on the wrists. Acceleration, stretch and pressure signals are read from the sensors and analyzed using the Arduino IDE software. Raw data are collected in the Flora chip over conductive bus-lines. Figure 6 shows the functional modules for the classification of the activity. The human motion data set consists of 1K-2K motion samples labelled with five different activities: static/stationary, walking, brisk walking, twisting and violent motion. The subjects carried out the activities in a controlled environment and were constrained only in the order of the activities, which followed the sequence AR1, AR2, AR3, AR4 and AR5 as shown in Table 2.

Hardware and software architecture.
Dataset for human activity recognition from various sensors
The fabric sensors are positioned so as to ensure that the maximum stretching movements are recorded on the Flora chip. Real-time data from the accelerometer, the stretch sensors and the pressure sensors are collected and stored. For the inertial sensor, three axes are recorded for each of the accelerometer, gyroscope and magnetometer readings (e.g. ax, ay, az), and a total of fifteen features are recorded once the pressure and stretch sensors (elbows and shoulders) are included. The five types of predefined activities are shown in Fig. 7.

Classification of the different activities.
Various machine learning applications require feature extraction and the selection of important features when pre-processing data for classification (Suto et al., 2017; Juan et al., 2017; Nhan et al., 2018). Feature extraction is the processing step in which different types of features are derived from the raw data. In the first phase, the raw data are split into short intervals, a step known as windowing. A window usually covers an interval of one or two seconds, and the number of samples it contains is determined by the sampling frequency. In our study, each window contains 20 samples, with a 50% overlap between consecutive windows during the training phase and no overlap in the testing phase. To improve the accuracy of our model, PCA (principal component analysis) is used to select features and to remove redundant or irrelevant features from the data. Table 3 shows the ranking of the features by their contribution to the classification of an activity. The highest-ranked contribution is 0.733 and the lowest is 0.0331. Only 11 features are chosen for the classification and for the model design. The ranking shows that the magnetometer is a redundant feature, which can be removed from the feature vectors without affecting our data set.
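The windowing step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the synthetic single-channel signal, and the choice of mean and standard deviation as per-window features are assumptions made purely for illustration. Only the window size of 20 samples and the 50%/0% overlap settings come from the text.

```python
from statistics import mean, pstdev


def sliding_windows(samples, size=20, overlap=0.5):
    """Split a sequence of samples into fixed-size windows.

    overlap is the fraction of each window shared with the next one
    (0.5 for training and 0.0 for testing in the text above).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    step = max(1, int(size * (1.0 - overlap)))
    # Only full windows are kept; a trailing partial window is dropped.
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, step)]


def window_features(window):
    """Hypothetical per-window features: mean and population std."""
    return mean(window), pstdev(window)


# Hypothetical single-channel signal standing in for one sensor stream.
signal = list(range(100))
train_windows = sliding_windows(signal, size=20, overlap=0.5)
test_windows = sliding_windows(signal, size=20, overlap=0.0)
```

With 100 samples, the overlapping setting yields more windows than the non-overlapping one, which is one reason overlap is commonly used to enlarge the training set.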
Ranked features using PCA
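As a rough illustration of how PCA quantifies contribution, the following Python sketch computes the first principal component and the fraction of total variance it explains, using power iteration on the covariance matrix. This is an assumption-laden sketch rather than the ranking procedure used in the study: the function name, the toy data and the use of power iteration are all illustrative choices.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, explained_variance_ratio) of the first PC.

    rows is a list of equal-length feature vectors.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    total_variance = sum(cov[a][a] for a in range(d))
    if total_variance == 0.0:
        raise ValueError("data has zero variance")
    # Power iteration converges to the eigenvector with the largest
    # eigenvalue, provided the start vector is not orthogonal to it.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    cov_v = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
    eigenvalue = sum(v[a] * cov_v[a] for a in range(d))
    return v, eigenvalue / total_variance


# Hypothetical two-feature data in which the second feature is exactly
# twice the first, so a single component explains all of the variance.
toy = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
direction, ratio = first_principal_component(toy)
```

A feature that is almost a linear function of the others adds little variance of its own, which is the sense in which a redundant feature can be dropped.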
The human motion dataset is composed of 5K-10K samples distributed over five classes: stationary, walking, brisk walking, twisting and aggressive motion. In MATLAB, the data set is split into 70 percent for the training set and 30 percent for the test set using the cvpartition function (cvpartition, 2018) [2]. Since the majority of previous studies focused on a small number of machine learning models for multi-class classification, this study also uses those models to test the accuracy. The models are implemented in MATLAB 2017b. The algorithms discussed below have been chosen because each approaches the problem in a different manner. KNN uses Euclidean distances, and decision trees use the concept of entropy reduction. Naive Bayes relies on conditional probabilities between feature vectors and classes, and SVM maps the problem into a vector space in which the feature vectors are separated by a hyperplane. All four algorithms are based on distinct mathematical principles. A comparison between them enables us to examine the classification problem at a deeper level. Several studies in Table 1 show that these classification algorithms perform best (Arnold et al., 2017; Cyuong et al., 2018).
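The 70/30 hold-out split can be sketched in Python as follows. The study itself performs the split in MATLAB; this sketch only mirrors the idea. The function name, the fixed seed, the per-class stratification and the toy labels (ten samples for each of AR1 to AR5) are assumptions for illustration.

```python
import random
from collections import defaultdict


def holdout_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx), sampling each class separately."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical label vector: ten samples for each of the five activities.
labels = ["AR1"] * 10 + ["AR2"] * 10 + ["AR3"] * 10 + ["AR4"] * 10 + ["AR5"] * 10
train_idx, test_idx = holdout_split(labels)
```

Splitting within each class keeps the proportion of every activity the same in the training and test sets, which matters when some activities, such as violent motion, have fewer samples.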
KNN algorithm
This algorithm is based on the Euclidean distance, which measures how close two data points are, and it classifies a sample according to the points in its close proximity. Let x = (x1, x2, …, xn) and y = (y1, y2, …, yn) be two points. The distance between the two points is measured by:
$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
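A minimal Python sketch of nearest-neighbour classification with the Euclidean distance is given here. The function names, the two-feature training points and the "normal"/"violent" labels are hypothetical and serve only to show the mechanics; this is not the MATLAB model used in the study.

```python
import math
from collections import Counter


def euclidean(x, y):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of query by majority vote of its k nearest points."""
    neighbours = sorted(zip(train_x, train_y),
                        key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    # On a tie, the label encountered first (the closer neighbour) wins.
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training data with two well-separated groups.
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
           (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["normal"] * 3 + ["violent"] * 3
```

For example, `knn_predict(train_x, train_y, (0.3, 0.3), k=3)` votes among the three points near the origin, while a query near (5, 5) is assigned to the other group.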
The parameter K is varied from 1 to 10 to obtain the best accuracy in the KNN algorithm. In this case, a 5-fold cross-validation method is used to produce the training and testing data. In the testing set, the predicted classes are compared with the true classes to calculate the accuracy, which is used to measure performance and is given by:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where TN (true negatives) is the number of correctly predicted negative instances and TP (true positives) is the number of correctly predicted positive instances. FN (false negatives) and FP (false positives) are the numbers of incorrectly classified instances from the positive and negative classes, respectively. The accuracy is 93.8 percent on our dataset. Fig. 8 shows the confusion matrix for each activity. It shows that there is confusion in a few cases, such as between walking and brisk walking, and between twisting and aggressive assault. Some activities, such as brisk walking and violent activity, are more easily recognized.

Confusion matrix for different algorithms for misclassification.
Naive Bayes is a probabilistic classification technique that uses probabilities to make predictions for the classification of an activity. It is based on Bayes' theorem, together with the assumption that each pair of features is independent given the class and contributes equally to the classification; e.g. by using acceleration, pressure or stretch sensors alone, the class cannot be accurately predicted. Mathematically, the theorem states:
$$P(Y \mid X) = \frac{P(X \mid Y)\, P(Y)}{P(X)}$$
where Y is the class and X is the feature vector. So basically, P(Y|X) here means the probability of “being attacked” given that the pressure sensor is pressed.
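To make the role of Bayes' theorem concrete, the following Python sketch computes the posterior probability of each class from a prior and a likelihood. All numbers are hypothetical and are not taken from the dataset; the function names are likewise illustrative.

```python
def naive_likelihood(per_feature_probs):
    """Naive assumption: P(X | Y) is the product of per-feature terms."""
    result = 1.0
    for p in per_feature_probs:
        result *= p
    return result


def posterior(prior, likelihood):
    """Bayes' rule for discrete classes.

    prior maps class -> P(Y); likelihood maps class -> P(X | Y).
    Returns a dict mapping class -> P(Y | X).
    """
    joint = {c: prior[c] * likelihood[c] for c in prior}
    evidence = sum(joint.values())  # P(X), the normalising constant
    return {c: joint[c] / evidence for c in joint}


# Hypothetical numbers: attacks are rare, but the pressure sensor is
# far more likely to be pressed during an attack than in normal motion.
prior = {"attack": 0.1, "normal": 0.9}
likelihood = {"attack": naive_likelihood([0.8]),
              "normal": naive_likelihood([0.1])}
post = posterior(prior, likelihood)
```

In this made-up case the posterior for "attack" is 0.08 / 0.17, roughly 0.47, even though the prior was only 0.1, which shows how a single informative sensor reading shifts the prediction.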
A decision tree is an algorithm that uses a set of classification rules. It uses the Gini index, a measure of inequality in the class distribution. Each internal node of the tree corresponds to an attribute, i.e. one of our sensor parameters, and each leaf node corresponds to a class label, i.e. Normal or Violent Movement. The dataset consists of 13 attributes, and the algorithm selects which attribute to place at the root and which at the lower nodes. Measures such as information gain and the Gini index are evaluated for all attributes, and the attribute with the highest value is placed at the root. In our study, both stretch and pressure sensors were shown to be more important in deciding whether or not a person is being attacked than the gyroscope X-axis and magnetometer Y-axis data, which are redundant in the feature vector.
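The attribute selection step can be sketched in Python using Gini impurity. In this sketch a split is scored by the weighted impurity of the two resulting groups, so the best split is the one with the lowest score (equivalently, the largest impurity reduction). The function names and the toy stretch readings are assumptions for illustration only.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) of the best split on one feature."""
    pairs = sorted(zip(values, labels))
    best = (None, gini(labels))  # no split at all is the baseline
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical normalised stretch readings and their activity labels.
stretch = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
activity = ["normal"] * 3 + ["violent"] * 3
threshold, score = best_threshold(stretch, activity)
```

Repeating this search over every attribute and picking the attribute with the best score is how a feature such as stretch or pressure would end up at the root of the tree.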
Support vector machine
This algorithm uses the concept of linear separability, i.e. separating the data points in n-dimensional space using a hyperplane. The points that are closest to the decision surface (hyperplane) are known as support vectors. The objective of the SVM algorithm is to minimize the error function.
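Mathematically, this is denoted as:
$$\min_{w,\, b,\, \zeta} \; \frac{1}{2} w^{T} w + C \sum_{i=1}^{N} \zeta_i \quad \text{subject to} \quad y_i \left( w^{T} x_i + b \right) \ge 1 - \zeta_i, \quad \zeta_i \ge 0$$
where C is the capacity constant, w is the vector of coefficients, b is a constant, ζ_i are the parameters for handling non-separable data (inputs), x_i are the independent variables and y_i are the class labels.
The index i labels the N training cases.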
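The error function can be evaluated directly for a given hyperplane, as in the Python sketch below. It assumes the standard linear soft-margin form, in which the slack of each training case is max(0, 1 - y(w·x + b)) with labels y in {-1, +1}; the function name and the toy data are hypothetical, and no optimisation is performed here.

```python
def svm_objective(w, b, xs, ys, C=1.0):
    """Soft-margin objective 0.5 * ||w||^2 + C * sum of slacks.

    Returns (objective_value, slacks). Labels ys must be +1 or -1.
    """
    slacks = []
    for x, y in zip(xs, ys):
        margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
        # A point on the correct side with margin >= 1 needs no slack.
        slacks.append(max(0.0, 1.0 - margin))
    value = 0.5 * sum(wj * wj for wj in w) + C * sum(slacks)
    return value, slacks


# Hypothetical data: two points are outside the margin and one is inside it.
xs = [(2.0, 0.0), (-2.0, 0.0), (0.5, 0.0)]
ys = [1, -1, 1]
value, slacks = svm_objective([1.0, 0.0], 0.0, xs, ys, C=1.0)
```

A larger C penalises the slack terms more heavily, trading a wider margin for fewer margin violations on the training set.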
This section presents the results obtained using the different machine learning techniques after the extraction of all the features from the dataset. The performance of our method for the classification of violent motion and a comparison with the references are also discussed. In addition, four evaluation metrics, namely accuracy, precision, recall and F1-score, are employed to estimate the performance.
Confusion matrix
A confusion matrix shows how many occurrences of the different classes of activities have been misclassified by the model. The rows of the confusion matrix indicate the number of occurrences in each true activity class, whereas the columns indicate the number of occurrences for each predicted activity class. For performance evaluation, each row of the matrix compares the actual class with the predicted classes. Figure 8 shows the confusion matrices of the four algorithms. It is observed that, in most classes, confusion occurs between different activities, such as walking, brisk walking and violent attacks. It is noted that basic activities, such as stationary, twist and walking, are easier to identify. Figure 9 shows the accuracy of the classification using different machine learning approaches. The support vector machine provides the best accuracy of 97.6 percent with a computation time of 0.85 seconds.
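Using the row and column convention described above (rows for true classes, columns for predicted classes), the evaluation metrics can be derived from a confusion matrix as in the following Python sketch. The function name and the 3-class matrix are hypothetical; the values are not taken from Fig. 8.

```python
def per_class_metrics(matrix):
    """Compute overall accuracy and per-class (precision, recall, F1).

    matrix[i][j] is the number of samples of true class i predicted as j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[i][i] for i in range(n)) / total
    report = []
    for k in range(n):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                        # rest of row k
        fp = sum(matrix[i][k] for i in range(n)) - tp   # rest of column k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report.append((precision, recall, f1))
    return accuracy, report


# Hypothetical 3-class confusion matrix (not the values from the study).
example = [[8, 2, 0],
           [1, 9, 0],
           [0, 0, 10]]
accuracy, report = per_class_metrics(example)
```

Per-class precision and recall expose the kind of pairwise confusion, for instance between two similar activities, that a single overall accuracy figure hides.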

Classification accuracy using various algorithms.
From the confusion matrices in Fig. 8, it can be seen that KNN and Decision Tree have the highest number of misclassifications, while Naive Bayes is more accurate when distinguishing between certain specific classes. SVM gives a balanced model and is the most likely candidate for a production-level model. However, there is still considerable scope for further study. Open questions include why Naive Bayes was perfect in distinguishing between Stationary and Violent Attack, where SVM misclassified three times. At the same time, when distinguishing between Walking and Brisk Walking, Naive Bayes performed poorly with 7 misclassifications, while SVM performed well with only one misclassification. Understanding the reasons for these errors can lead to better algorithm tuning for specific data sets and problem classes. A novel method based on sensor activation is proposed, in which the peak times and the setting times of the sensor values are recorded and added as a new dimension. Such a new dimension may provide a more decisive element in reducing misclassification.
