A hybrid evolutionary learning classification for robot ground pattern recognition

Abstract

In the field of intelligent robot engineering, whether it is humanoid, bionic or vehicle robots, the driving forms of standing, moving and walking, and the consciousness discrimination of the environment in which they are located have always been the focus and difficulty of research. Based on such problems, Naive Bayes Classifier (NBC), Support Vector Machine(SVM), k-Nearest-Neighbor (KNN), Decision Tree (DT), Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) were introduced to conduct experiments. The six individual classifiers have an obvious effect on a particular type of ground, but the overall performance is poor. Therefore, the paper proposes a “Novel Hybrid Evolutionary Learning” method (NHEL) which combines every single classifier by means of weighted voting and adopts an improved genetic algorithm (GA) to obtain the optimal weight. According to the fitness function and evolution times, this paper designs the adaptively changing crossover and mutation rate and applies the conjugate gradient (CG) to enhance GA. By making full use of the global search capabilities of GA and the fast local search ability of CG, the convergence speed is accelerated and the search precision is upgraded. The experimental results show that the performance of the proposed model is significantly better than individual machine learning and ensemble classifiers.

Keywords

Hybrid classification model improved GA machine learning ground recognition

1 Introduction

Ground classification and recognition are one of the key technologies for robot environment perception. By realizing ground identification and perceiving non-geometric hazards, the robot can better complete navigation control, ensure its operation in the best state and safely reach the work destination through the various unknown ground [1]. The ground environment in which the robot is exposed to different materials directly affects the sensitivity of the robot driving device. Even if the “joints” of the robot’s feet are at the same degree of freedom, the same speed, and acceleration, the effects of various grounds are different, which also causes the robot to move normally or out of control. In this paper, through the analysis and recognition strategy research on different material surfaces, the robot can change the control signal according to the feedback of the sensor, improve the movement and navigation state of the robot, so that the robot can complete the task stably, accurately and perfectly under various environmental situations.

Most navigation technologies rely on external resources (e.g., GPS, GNSS, BeiDou Navigation System). However, whether it is the above-mentioned navigation technology (indirect eyes of robots) or various infrared, ultrasonic, and camera devices (direct eyes) assembled by intelligent robots, there are certain problems such as delays, blind spots, shadows and reflections. All these make the intelligent machine lack the ability to “directly and autonomously” perceive and cause the limitation to dynamically recognize the surrounding environment and cognition. Therefore, a new data source (robot foot sensor) can not only improve navigation accuracy but also enhance the machine cognition and learning capabilities. These data sources are not directly connected to locations, but act as agents for data fusion. The agent of the position is to measure the ground environment where the intelligent automatic robot is in different media materials.

Human beings interplay with surrounding substances and the environment, so as to further comprehend that the most essential information of interacting objects is perception and tactile impression. Similarly, for robots, it is also crucial to connect with and distinguish the unstructured and changeable circumstances. In this research field, surface tactile exploration has attracted a lot of attention [2]. In order to enable robots with this capability, a mass of scholars [3 –9] have attempted to design systems that employ perception exploration to obtain relevant information about objects in the environment, and apply intelligent algorithms to analyze information to discern objects.

Traditional ground identification methods are mainly based on vision and lidar [10, 11]. However, these methods are affected by some factors such as light intensity, weather, smoke, dust, and surface covering.For the past few years, ground classification methods based on sound or vibration signals of sensors have emerged, which are characterized by simplicity, low cost, and low power consumption. The signal-based recognition methods have been widely applied in military and civilian fields, with applications focused on wheeled and legged robots [12, 13].

In recent years, with the improvement of deep learning and computer hardware performance, many deep networks (such as CNN, FCN, RNN, LSTM, GCN, and their various variants) [14 –19] applied to intelligent classification, pattern recognition, and perceived interaction, and achieved certain results. Autonomous mobile robot systems are usually embedded devices. However, due to its dependence on memory and computing resources, deep learning will limit the robot’s rapid response in terms of time cost. In [20], the authors made use of the signals collected by the foot sensors of a mobile robot to differentiate interior multiple grounds. In [21], a method to discriminate outdoor ground types at different speeds with the Z-direction signals of an IMU device was proposed, but it is barely valid for indoor. In [22], a strategy for learning outdoor terrains from videos was established. The robot employs these messages to learn from various visual perception units to categorize uncharted topographies. In [23], a deep recursive neural network based on a semi-supervised learning method was proposed by using the acceleration signal of the machine, and a variety of surface textures were effectively classified. The [24] compared the classification effects of time-domain amplitude analysis, fast Fourier transform (FFT), power spectral density (PSD), and singular value decomposition (SVD), using principal component analysis (PCA) to reduce the dimensions of ground features, and then using SVM for ground recognition. The [25] took advantage of the ground classification technology of the multi-class relevance vector machine (MRVM) and the filtering algorithm based on the Bayes framework to classify the robot driving ground. Other terrains’ classification manners consist of the sonic vibration, such as the [26], which takes advantage of the pulsation feedback from the frictional contact between the robot and the terrain to identify the ground style. The [27, 28] also fused the signals collected by the sensors and the visual position information forlearning.

The research on ground recognition can not only enhance the perception and autonomous interaction capabilities of intelligent robots but gradually get rid of human dependence and enable them to work in a broader environment. In addition to realizing the integration of machines and the environment and autonomous movement and evolution, there are many realistic application scenarios: 1) Repair the position in the map, make the navigation data more reliable and realize multi-agent and collaboration [29]. 2) The intelligent prosthesis equipment based on the ground recognition method can automatically adjusts the sensitivity of joints according to the detected ground environment to help people with walking difficulties and amputees obtain a comfortable gait pattern [30 –32]. 3) Planetary exploration robots (e.g., Mars or moon rover) will encounter challenging terrain and ground environment during operation. Terrain classification and recognition can help boost the autonomous abilities of mobile robots [33]. The contributions of this paper are as follows:

A Novel Hybrid Evolutionary Learning (NHEL) was designed to solve the ground classification problem and a comprehensive comparison was implemented with six individual classifiers.

The main parameters of the GA change adaptively with the increase of iterations, so that NHEL has enough diversity in the initial training, which helps to find the optimal solution in the later stage and has excellent convergence. Besides, NHEL also combines the advantages of GA and CG. After the GA has evolved to a certain iteration, the CG is employed to accelerate the approach to the optimal value, thereby boosting the efficiency.

The hybrid of different algorithms will reduce the interpretability. However, the differences between the hybridized algorithms will increase, and it may be possible to obtain better accuracy than single. In the IMU data, there is a problem of a serious imbalance of categories. In the absence of resampling, NHEL can also successfully learn and correctly classify rare categories.

This paper is organized as follows. We describe our problem in Section 2. Section 3 introduces the data set and exploratory feature extraction in variant dimensions. Section 4 provides hybrid strategies, and the NHEL method is described in detail. Section 5 is the experiment and analysis. Section 6 presents a summary and future work.

2 Problem formulation

The individual learner in the homogeneous ensemble is also called “base learner”. The learners in heterogeneous (NHEL) are generated by different algorithms which are called the individual (single) learner. Whether it is a traditional ensemble or a fresh hybrid, certain combination strategies are required. There are three common ones: averaging method (e.g., simple or weighted averaging), voting method (e.g., majority voting, plurality voting, and weighted voting), and learning methods. The hybrid of different algorithms will reduce the interpret-ability and raise the complexity to a certain extent. However, the differences between diverse algorithms to be mixed will be very large, and it may be possible to achieve better and more surprising accuracy than a single algorithm while ensuring that the time cost is within an acceptable range. So based on this idea, this paper takes advantage of the characteristics of genetic algorithm heuristic search to hybridize 6 single classifiers. The output of a final hybrid learner can be expressed as: $h_{final} (x_{i}) = f [c_{t}, h_{t} (x_{i}, y)]$ (1)

Among them: h_t (x_i, y) represents a classifier learned from a set of observation samples {x_i, y} through some kind of learning machines (the six learners in this paper are NBC, SVM, KNN, DT, RF and XGBoost). t = 1, …, T means that the experiment is repeated T times, and T output results of the classifiers are obtained respectively. C_t represents the weighted value of each classifier during hybridizing, and the function f is used to mix the final T functions (f in this paper is a weighted voting function).

The optimal weights of the six learners are determined by the evolutionary learning algorithm, that is, the single classifiers are hybridized to obtain a hybrid classifier model through the weighted voting method. The complete description is shown in Fig. 1.

Fig. 1

The block diagram of complete description.

3 IMU data set and feature engineering

This paper adopts an open-source dataset which is from Tampere University, Finland [34]. The robot time series data is collected from the Inertial Measurement Units (IMU) with XSENS MTi-300, which is an electronic device that solves orientation, navigation, and motion carrier control. The IMU data set includes accelerometer and gyroscope data, and magnetometer. Specifically, it is mainly divided into 10 sensor channels, as shown in Table 1. In particular, the data set has a total of 7,626 groups, each of which has a signal sample length of 128, and is collected on 10 channels. The number of data rows is about one million (7,626*128 = 976,128). The nine different surfaces and the corresponding 10 sensor channel signal distributions are shown in Table 2 and Fig. 2.

Table 1
Robot IMU sensor data composition structure

Meta data Position data Motion/movement data

Row id Series id Measurement number Orientation Angular velocity Linear acceleration

(x, y, z, w) (x, y, z) (x, y, z)

Meta data	Position data	Motion/movement data
Row id	Series id	Measurement number	Orientation	Angular velocity	Linear acceleration
			(x, y, z, w)	(x, y, z)	(x, y, z)

Table 2

Number and proportion of the nine surface types

Label number	Surface type	Percentage
0	fine_concrete	9.53%
1	concrete	20.45%
2	soft_tiles	7.80%
3	tiled	13.49%
4	soft_pvc	19.21%
5	hard_tiles_large_space	8.08%
6	carpet	4.96%
7	hard_tiles	0.55%
8	wood	15.93%

Fig. 2

The signal distribution of ten sensor channels on nine types of surfaces. The top 4 red lines represent the orientation (x, y, z, w) is to encode the current angle as a quaternion to indicate the direction, and the middle 3 green lines are the angular velocity (x, y, z) to describe the angle and speed of movement. The bottom 3 blue lines represent linear acceleration (x, y, z) to describe the change of speed at variant times.

3.1 Time domain feature extraction

The Root Mean Square (RMS) of the signal is also called the effective value. As shown in Fig. 3, the RMS seems to be different across the entire group. The RMS distribution of angular velocity is scattered and the characteristics are obvious. Although the difference is not so large that it can be classified according to the RMS. However, the RMS distribution of acceleration is dense, so we will perform feature extraction in the frequency domain. $X_{rms} = \sqrt{\frac{\sum_{i = 1}^{N} X_{i}^{2}}{N}} = \sqrt{\frac{X_{1}^{2} + X_{2}^{2} + \dots + X_{N}^{2}}{N}}$ (2)

Fig. 3

Root mean square value characteristics of angular velocity and acceleration.

3.2 Frequency domain feature extraction

For a specific signal, there are two expressions in the time domain and frequency domain. The former is the change of the signal with time, and the latter is the signal’s components at distinct frequencies. For signals that are uncertain in random processes, their frequency can be expressed by the power spectral density (PSD). Power spectrum estimation is to use a given set of sample data to estimate the PSD of a stationary random signal. It can give the energy distribution of the analyzed object with frequency, so it is widely used in engineering. The main steps of Welch method power spectrum estimation are:

Divide the data segment of length N into L segments, each with M points, and adjacent segments overlap by M/2 points, so the number of segments is: $L = \frac{N - M / 2}{M / 2}$ (3)

Add the same smoothing window w (n) to each segment and find the Fourier transform:

$X_{i} (e^{j ω}) = \sum_{M - 1}^{n = 0} x_{i} (n) w (n) e^{- j ω n}, i = 1, 2, \dots, L$ (4)

Find the average of the power spectrum of each segment: ${\hat{S}}_{x} (e^{j ω}) = \frac{1}{L} \sum_{L}^{i = 1} \frac{1}{MU} {| X_{i} (e^{j ω}) |}^{2}$ (5)

Among them, U is the average power of w (n), and MU is the energy of M long window function w (n). $U = \frac{1}{M} \sum_{M - 1}^{n = 0} w^{2} (n), MU = \sum_{M - 1}^{n = 0} w^{2} (n)$ (6)

As shown in Fig. 4, the frequencies of angular velocity (x, y, z) with the PSD maximum amplitude are 10HZ, 2HZ & 15HZ, 2HZ, and the acceleration (x, y, z) when the PSD maximum are 2HZ &27HZ, 2HZ, 10HZ. The characteristic discrimination degree of acceleration in the frequency domain is obviously more distinct than that in the time domain, and the PSD amplitude of accelerations is generally larger than angular velocities. The PSD with the hard_tiles_large surface is the largest on both angular velocity (x, y) and acceleration (x, z) sensors, because the robot has a louder sound and higher energy when passing through such a hard surface.

Fig. 4

The estimate power spectral density using Welch method.

3.3 Exploratory time series mining

We have extracted some features (RMS & PSD) from the signal motion data, but we can also explore many other potentially meaningful features. Tsfresh (Time Series Feature Extraction on basis of Scalable Hypothesis tests) is a module of Python time-series feature mining, which automatically extracts and selects features from data [35]. By default, Tsfresh can calculate a total of 794 time-series features. There are 63 features to choose from through combination optimization. This paper selects the following 8 from the 63 features, as shown in Table 3.

Table 3
The features extracted from Tsfresh

Feature Name (abbreviation) Mathematical Definition Interpretation

Aggregate statistics of all levels of auto-correlation coefficients (agg_auto-*correlation) $R (l) = \frac{1}{(n - l) σ^{2}} \sum_{i = 1}^{n - l} (x_{i} - μ) (x_{i + l} - μ)$ The aggregation statistical characteristics between the different values of the time series data, and the auto-correlation after the aggregation function (variance or mean) is calculated, which can measure the periodic nature of the data to a certain extent.

$\bar{x} = \sum_{i = 1}^{N} x_{i} / N$

Time series data statistics (change_quantiles) $s^{2} = \sum_{i = 1}^{N} \frac{{(x_{i} - \bar{x})}^{2}}{(N - 1)} X_{median} = {\begin{matrix} x_{(n + 1) / 2}, (n \mod 2 = 1) \\ \frac{x_{n} + x_{(n + 1) / 2}}{2}, (n \mod 2 = 0) \end{matrix}$ Calculate the mean, variance, standard deviation, median, etc. of time series data in a given time series interval.

Time series data non-linearity measurement (c3) $\frac{1}{n - 2 lag} \sum_{i = 0}^{n - 2 log} x_{i + 2 log}^{2} x_{i + lag} x_{i}$ Physics-based data non-linearity measurement, which measures the non-linearity of time series data.

Time series data complexity (cid_ce) $\sqrt{\sum_{i = 0}^{n - 2 lag} {(x_{i} - x_{i + 1})}^{2}}$ Evaluate the complexity of time series, the more complex the series, the more valleys and peaks.

Absolute Fourier transform spectral statistics (fft_aggregated) centroid variance skew kurtosis Spectral centroid, kurtosis, skewness, etc. after absolute Fourier transform.

Fourier transform system (fft_coefficient) $A_{k} = \sum_{m = 0}^{n - 1} a_{m} exp (- 2 π i \frac{mk}{n})$ Calculating the coefficients of one-dimensional discrete Fourier sequence based on the fast Fourier transform algorithm FFT.

Standard deviation and range multiple (large_standard_deviation) std (x) > r (max(x) - min(x)) Whether the standard deviation is r times the data range.

quantile p(X >Za)=α Calculate the q quantile of x, the value of x that is greater than the first q% of the ordinal value of x.

Feature Name (abbreviation)	Mathematical Definition	Interpretation
Aggregate statistics of all levels of auto-correlation coefficients (agg_auto-correlation)	$R (l) = \frac{1}{(n - l) σ^{2}} \sum_{i = 1}^{n - l} (x_{i} - μ) (x_{i + l} - μ)$	The aggregation statistical characteristics between the different values of the time series data, and the auto-correlation after the aggregation function (variance or mean) is calculated, which can measure the periodic nature of the data to a certain extent.
	$\bar{x} = \sum_{i = 1}^{N} x_{i} / N$
Time series data statistics (change_quantiles)	$s^{2} = \sum_{i = 1}^{N} \frac{{(x_{i} - \bar{x})}^{2}}{(N - 1)} X_{median} = {\begin{matrix} x_{(n + 1) / 2}, (n \mod 2 = 1) \\ \frac{x_{n} + x_{(n + 1) / 2}}{2}, (n \mod 2 = 0) \end{matrix}$	Calculate the mean, variance, standard deviation, median, etc. of time series data in a given time series interval.
Time series data non-linearity measurement (c3)	$\frac{1}{n - 2 lag} \sum_{i = 0}^{n - 2 log} x_{i + 2 log}^{2} x_{i + lag} x_{i}$	Physics-based data non-linearity measurement, which measures the non-linearity of time series data.
Time series data complexity (cid_ce)	$\sqrt{\sum_{i = 0}^{n - 2 lag} {(x_{i} - x_{i + 1})}^{2}}$	Evaluate the complexity of time series, the more complex the series, the more valleys and peaks.
Absolute Fourier transform spectral statistics (fft_aggregated)	centroid variance skew kurtosis	Spectral centroid, kurtosis, skewness, etc. after absolute Fourier transform.
Fourier transform system (fft_coefficient)	$A_{k} = \sum_{m = 0}^{n - 1} a_{m} exp (- 2 π i \frac{mk}{n})$	Calculating the coefficients of one-dimensional discrete Fourier sequence based on the fast Fourier transform algorithm FFT.
Standard deviation and range multiple (large_standard_deviation)	std (x) > r * (max(x) - min(x))	Whether the standard deviation is r times the data range.
quantile	p(X >Za)=α	Calculate the q quantile of x, the value of x that is greater than the first q% of the ordinal value of x.

4 Novel hybrid evolutionary learning

4.1 Combine and hybrid strategies

In terms of classification algorithms, when single learning means to solve problems, it will always have its insurmountable defects such as slow convergence, weak global or local search capabilities. Ensemble learning (EL) is to upgrade the speed and accuracy of prediction by integrating the results of multiple algorithm learning in some way. Compared with a single algorithm, EL is not prone to over-adaptation. Although there are many manners to form EL, the current research on ensemble methods is mainly focused on the way of learning by changing the sample structure. For example, Boosting (e.g., AdaBoost, GBDT, XGBoost, LightGBM, and CatBoost) and Bagging (e.g., random forest) are representative algorithms by changing learning samples, which are currently both well-established. So far, in addition to the above two methods, EL also includes Bayesian Averaging, Stacking,and Blending. Most of the studies are still focused on the same algorithm to an ensemble.Because distinct algorithms have diverse learning mechanisms, it is difficult to utilize a unified standard to measure their accuracy.

4.2 Hybrid learning with genetic algorithms

Evolutionary learning adopts the genetic algorithm (GA), which is a parallel heuristic search algorithm that imitates the survival of the fittest in Darwin’s theory of biological evolution. GA generates a large number of initial solutions in the solution space and obtains the optimal unit through the cross-mutation of a large number of initial individuals. Moreover, GA has fine global search ability, but its local search capacity is weak. The gradient method is the most effective method for solving large-scale nonlinear programming problems, which has remarkable local search capabilities but poor global search. Therefore, this paper combines the advantages of the two algorithms. When the number of evolution reaches a multiple of 5, the conjugate gradient method is employed once, so that the optimal solution of the function in a defined domain can be acquired more efficiently.

The hybrid method of GA weight optimization is to adjust the weight of each weak learner with the help of the GA global search feature. The weights initially determined by the six single classifiers are introduced as the incipient input of GA for optimization and adjustment. The offspring of GA is the potential best-weight subsets of every single classifier. The flow of generating offspring by genetics is the process of searching for the best weight coefficient subset.

4.3 Specific optimization process

First of all, initializing the population which contains several individuals. If the maximum number of iterations is not reached, an iteration is entered. In the iterative process, calculate the fitness value of individuals in the population, and determine whether to enter the mating pool to form a parent-child group according to the fitness value. Then carry out crossover and mutation operations according to the probability to ensure the diversity of the population and ultimately produce the excellent offspring. The specific process is divided into the following 8 steps (from a to h ):

(a) Population Initialization: The selection of the initial population number is extremely significant. If it is too large, although the diversity of the individual population is boosted, it will make the entire search space broad, which raises the amount of calculation in the search process and affects the overall speed. If too small, it may fall into the local optimum at the beginning and the search result is not the global optimum, making it impossible to accurately obtain the minimum or maximum points. The population value of this paper is considered to be 60.

(b) Individual Evaluation: The course of inheritance is the procedure of iteratively generating the optimal subset of weights. In this study, the F1 harmonic mean is introduced to measure the effect of the individual classifier under the current weight value, which also represents the individual adaptive value in the population.

The individual fitness steps are as follows:

Input: training set T, integer n represents n-fold cross-validation (n = 10). Output: F1 value.

Divide T into the majority sample set M′ and the minority sample set S′.

Divide M′ and S′ into n equal parts respectively to get $M_{1}^{'}$ , $M_{2}^{'}$ , ... , $M_{n}^{'}$ and $S_{1}^{'}$ , $S_{2}^{'}$ , ... , $S_{n}^{'}$ , respectively, merge $S_{i}^{'}, M_{i}^{'}$ (i = 1, 2,..., n), get n new data sets N: N₁ … N_n.

Operate each N_i, adopt N_i as the test set, and other data sets in N as the training. Employ individual classifiers for training, test N_i, and get the classification result of N_i.

The results of each N_i are weighted and hybridized to obtain the classification result of the entire sample set T, and then the F1 value is calculated.

After the above calculation, the F1 value is obtained, that is, the individual fitness value.

(d) Individual Selection: The selection operation is mainly to select the better parent chromosomes for mutation operation to upgrade the diversity and prevent the community from falling into the local optimal solution. All individuals are sorted in descending order according to fitness value, and the first ath individual is selected to enter the mating pool to form a parent. The number of mutations that each individual enters into the mating pool is calculated according to the formula l_i = α (a + 1 - i). Among them, l_i represents the number of pools where the ith individual enters the mating pool, α is the multiplier factor, and the total number of individuals entering the mating pool is Σl_i. They act as parents to generate new individuals. This strategy helps to preserve the best individuals of the past.

(e) Crossover Operator: Set crossover probability, perform parental individual crossover operation based on the probability and ensure that the individual obtained after crossover is not in the parent population. If the crossover rate is too high, there will be more chances that the well-adapted strings in the group will be destroyed, and even finally converge to the local optimum. If is too low, the search procedure will stagnate and cannot converge quickly. According to the population characteristics, the population types should be more random and diverse at the beginning and gradually converge in the later period, and the community similarity increases, so the crossover operation has a greater effect in the early stage than later. Therefore, as iterations upgrades, the number of chromosomes and crossover rate should be reduced accordingly, which is beneficial to reduce the computational cost of NHEL. The number of chromosomes P_i for crossover operation in each iteration is shown in formula (7), and the rate is updated as formula (8). The max crossover rate is 0.8, and the min is 0.2. $P_{i} = ⌊ {swsize}^{*} (1 - \frac{g}{g_{\max}}) ⌋$ (7)

$\begin{matrix} Croprobability = & ({pr}_{max} - {pr}_{min}) * {(1 - \frac{g}{g_{\max}})}^{2} \\ + {pr}_{min} \end{matrix}$ (8)

Among them, swsize represents the number of all chromosomes, g represents the current iteration, and g_max represents the max iterations. pr_max and pr_min are the max and min probability respectively.

(f) Mutation Operator: Random mutation operations are performed on individuals according to mutation rate. Choosing an appropriate mutation can ensure the diversity of a group. But if the mutation rate is too small, it is easy to fall into the local optimal solution; the rate is too large, which may increase the randomness, and finally cannot guarantee the effective convergence. In this paper, the mutation probability is adaptively changed, and the initial mutation rate is 0.05. When the fitness difference is greater than 0, in other words, a better new weight is found. At this time, the mutation rate is reduced to prevent the larger mutation rate from missing the global optimum. When the fitness difference is less than 0, the new weight found causes the classification effect worse. At this time, boost the mutation, raise the randomness and expand the global search ability, and try to find the new weight as quickly as possible to make the classification effect is enhanced. However, the mutation rate cannot be infinity, and the max threshold is set to 0.5. $p_{coefficient} = {\begin{matrix} \frac{f_{max} - f_{i}}{f_{max} - f_{avg}} f_{i} > f_{avg} \\ f_{coefficient}, f_{i} ⩽ f_{avg} \end{matrix}$ (9)

Among them, f_max is the max fitness in the group; f_avg is the average fitness; f_i is the population fitness of the individual performing the mutation; f_coefficient is a constant less than or equal to 0.5.

(g) GA_CG: In the process of solving the problem, after each evolution of GA for N generations (N = 5), the central individual of the population is searched locally with the conjugate gradient method (CG) to obtain the optimal solution efficiently. CG is a manner between the steepest descent and Newton’s method which only needs to apply the first derivative information.However, it overcomes the shortcomings of slow convergence of the steepest descent method, and avoids the shortcomings of Newton’s method that need to store and calculate the Hesse matrix and find the inverse. The process is as follows:

Set a₀ as the initial search point of CG, set k = 1, set the number of iterations n₀, and set the accuracy requirement ɛ=1×10^–8.

If the formula (10) is satisfied, stop the calculation (a^(k) is the solution of the wireless constraint problem), otherwise, follow the (11). $∥ \nabla f (a^{(k)}) ∥ ⩽ ɛ$ (10) $d^{(k)} = \nabla f (a^{(k)}) + β_{k - 1} d^{(k - 1)}$ (11)

$\begin{matrix} β_{k - 1} \\ = {\begin{matrix} 0, k = 1 \\ \frac{{∥ \nabla f (x^{(k)}) ∥}^{2}}{{∥ \nabla f (x^{(k - 1)}) ∥}^{2}} = \frac{\nabla f {(x^{(k)})}^{T} (\nabla f (x^{(k)}) - \nabla f (x^{(k - 1)}))}{\nabla f {(x^{(k - 1)})}^{T} \nabla f (x^{(k - 1)})}, k > 1 \end{matrix} \end{matrix}$ (12)

Perform a one-dimensional search according to (13) and (14) to solve the problem and obtain α _k : $max φ (α) = f (a^{(k)} + α d^{(k)})$ (13) $a^{(k + 1)} = a^{(k)} + α_{k} d^{(k)}$ (14)

Set k = k+1, go to step ②, and record the result as a⁽ⁿ⁾.

Calculate the fitness function of a⁽ⁿ⁾, and generate an individual according to a⁽ⁿ⁾. Add it to the population, and replace the individual with the smallest fitness in the population.

(h) Optimal Weight: The parent population generates new offspring. Let g = g+1, skip to step 4) to perform a new iteration until g = g_max. The offspring obtained after iteration is the optimal individual classifier weight subset.

From the description in Fig. 5, it can be seen that the hybrid GA based on CG only needs to alternately execute GA and CG on the central individual of the population. In this way, the global search feature of GA and the local fast convergence of CG can be fully utilized, so that the algorithm can quickly converge to the global optimal solution.

Fig. 5

An improved genetic algorithm optimization process.

5 Experiment and analysis of results

The experimental environment of this paper is configured as Quad-Core Intel (R) Xeon (R) E-2224 G CPU @3.50 GHz and 16.0 GB RAM. The experiment was performed under Python3.8 and TensorFlow1.14 with Scikit-learn framework to train six single classification models. This paper uses iterative random 10-fold cross-validation to average the results of each time (training set: test = 7 : 3).

5.1 Evaluation metric

The accuracy and error rate are the two most commonly introduced evaluation indicators. Since both of them are equally vital for each category, it is not suitable for analyzing the unbalanced data. In IMU data, it is more meaningful to correctly classify rare classes than to most classes. At this time, the precision rate P and the recall rate R are more suitable than the accuracy and error rate, but both of them are usually contradictory. In order to comprehensively consider P and R, their harmonic mean F1 measurement is proposed.

ROC curve, the full name is “Receiver Operating Characteristics”, which sorts the samples according to the prediction results of the classifier and selects different cut-off points (i.e., discrimination threshold) one by one to predict the samples as positive examples in this order. ROC can be obtained by calculating TPR and FPR of the current classifier each time, and then plotting them on the vertical and horizontal axes respectively. The AUC (Area Under ROC Curve) can be adopted as an accuracy evaluation index which measures the performance of the classifier in predicting sample sorting.

$P = \frac{TP}{TP + FP}, R = \frac{TP}{TP + FN}, F_{1} = \frac{2 TP}{2 TP + FN + FP}$ (15) $TPR = (\frac{TP}{TP + FN}) \times 100 %, FPR = (\frac{FP}{FP + TN}) \times 100 %$ (16)

5.2 Results of six individual classifiers

The first classifier is the Naive Bayes and the probability of features is assumed to be Gaussian. It can be seen that the effect of NBC is not satisfactory. When the surface type is carpet, P, R and F1 are all 0%. Among the 9 surface types, soft_tiles has the best result, and P, R and F1 are respectively 59%, 86% and 70%. Furthermore, when the type is hard_tiles, R reaches 100%.

The second classifier is SVM. The selection of the kernel function of SVM is particularly vital. According to the scope of application of different kernels, this paper selects 3 of the 5 commonly used kernel functions (linear, RBF, and Sigmoid kernels) for testing. In the learning phase, the IMU signals are extracted and normalized, and SVM is trained using the feature matrix composed of feature vectors. During the training process, the SVM parameters (c,g) are determined by particle swarm optimization (PSO) and 10-fold cross-validation. So as to get an optimized SVM that can be performed for robot contacting ground classification. Among them, g is the kernel parameter that represents the width of the function. The classification performance of SVM largely depends on the choice of c and g. The classification result of SVM is obviously more outstanding than that of the NBC model. When the surface type is carpet, P reaches 100%. Among the 9 surface types, the soft_tiles effect is still the best. The P, R, and F1 are respectively 73%, 92%, and 81%, indicating a certain enhancement.

The third classifier is K-nearest-neighbor learning. KNN finds k nearest neighbor points of the sample points to be tested from the sample set with existing label categories, and determines the category of the sample points based on similarity. KNN is classified by measuring the distance between sample points. When the surface type is hard_tiles, the P reaches 100%, which is significantly superior to SVM.

The fourth classifier is the decision tree. The DT employs CART algorithm, which generates a tree by introducing maximum information gain at each node (i.e., Gini index to divide attributes) and threshold values. The soft_tiles works best among the nine surface types, with the P, R and F1 reaching 80%, 82% and 81%, respectively. Finally, the mean value of the three evaluation indexes of the DT is 50%, and the classification effect is more balanced.

The fifth classifier is the random forest which takes advantage of ensemble ideas to boost the performance of DT. The RF in this paper utilizes both training sample and input attribute perturbation ways (i.e., mixed perturbation) to ensure that the individual learner is outstanding enough to enhance diversity, that is, to elevate generalization ability and robustness. The main hyper-parameters for RF is n_estimators, which is the number of trees in the forest. Generally, the larger quantity, the better effect, but the calculation time will also increase. In addition, when the number of trees exceeds a critical value, the effect will not be significantly raised.

The sixth classifier is XGBoost. The final classification performance and RF are approximately the same, but the generalization capability for each class is different. The optimal parameters of each individual classifier are shown in Table 4.

Table 4
The optimal parameters of each model

Model Optimal hyper-parameters

NBC Naive Bayesian network types=’Gaussian’

SVM Kernel=’rbf’ penalty and kernel parameters (c,g)=(0.93,4.79)

KNN Number of neighbors K = 10

DT types=’CART’

RF n_estimators=300, random_state = 42

XGBoost N_estimators=150 (i.e., number of generated trees), learning_rate = 0.025, max_depth = 5,min_child_weight=1.2 and subsample = 0.8

Model	Optimal hyper-parameters
NBC	Naive Bayesian network types=’Gaussian’
SVM	Kernel=’rbf’ penalty and kernel parameters (c,g)=(0.93,4.79)
KNN	Number of neighbors K = 10
DT	types=’CART’
RF	n_estimators=300, random_state = 42
XGBoost	N_estimators=150 (i.e., number of generated trees), learning_rate = 0.025, max_depth = 5,min_child_weight=1.2 and subsample = 0.8

5.3 Comparison of sorting performance

Until now, due to the small number of samples of the hard_tiles type, only the KNN has a certain effect on this. The other 5 methods are all performed very poorly for this type. In addition, the six classifiers have stronger generalization capabilities for the soft_tiles type, and the classification accuracy is excellent. The average value of each classifier evaluation measurement is shown in Table 5. Note: The avg/total mentioned in the last row of Table 5 is a prevalence-weighted macro-average across classes. This is because there is a serious imbalance in the data, which turns the problem into unbalanced classification. This paper introduces evaluation indicators that consider sample distribution. The ROC with AUC values for 6 individual classifiers are shown in Fig. 6.

Table 5
The results of six individual classifiers

Surface Type Naive Bayes classifier SVM with RBF kernel classifier KNN classifier

P R F1 P R F1 P R F1

Carpet 0.00 0.00 0.00 1.00 0.11 0.19 0.37 0.15 0.21

Concrete 0.66 0.11 0.19 0.57 0.79 0.66 0.57 0.68 0.62

Fine_concrete 0.23 0.19 0.21 0.58 0.08 0.14 0.37 0.20 0.26

Hard_tiles 0.05 1.00 0.10 0.00 0.00 0.00 1.00 0.20 0.33

Hard_tiles_large_space 0.60 0.39 0.47 0.62 0.49 0.55 0.60 0.45 0.52

Soft_pvc 0.53 0.60 0.56 0.57 0.87 0.69 0.60 0.79 0.68

Soft_tiles 0.59 0.86 0.70 0.73 0.92 0.81 0.69 0.91 0.78

Tiled 0.30 0.67 0.42 0.53 0.46 0.49 0.49 0.42 0.45

Wood 0.53 0.31 0.39 0.73 0.53 0.64 0.68 0.64 0.66

Avg / total 0.48 0.40 0.38 0.63 0.61 0.57 0.57 0.58 0.56

Surface Type Decision tree classifier Random forest classifier XGBoost classifier

P R F1 P R F1 P R F1

Carpet 0.37 0.38 0.37 0.80 0.26 0.39 1.00 0.11 0.19

Concrete 0.52 0.52 0.52 0.57 0.79 0.66 0.57 0.79 0.66

Fine_concrete 0.21 0.21 0.21 0.52 0.14 0.22 0.58 0.08 0.14

Hard_tiles 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Hard_tiles_large_space 0.44 0.42 0.43 0.62 0.52 0.56 0.62 0.49 0.55

Soft_pvc 0.59 0.62 0.61 0.64 0.87 0.74 0.57 0.87 0.69

Soft_tiles 0.80 0.82 0.81 0.81 0.92 0.86 0.73 0.92 0.81

Tiled 0.41 0.40 0.40 0.58 0.47 0.52 0.53 0.46 0.49

Wood 0.52 0.51 0.52 0.76 0.70 0.73 0.73 0.58 0.64

Avg / total 0.50 0.50 0.50 0.64 0.64 0.62 0.63 0.61 0.57

Surface Type	Naive Bayes classifier	SVM with RBF kernel classifier	KNN classifier
Carpet	0.00	0.00	0.00	1.00	0.11	0.19	0.37	0.15	0.21
Concrete	0.66	0.11	0.19	0.57	0.79	0.66	0.57	0.68	0.62
Fine_concrete	0.23	0.19	0.21	0.58	0.08	0.14	0.37	0.20	0.26
Hard_tiles	0.05	1.00	0.10	0.00	0.00	0.00	1.00	0.20	0.33
Hard_tiles_large_space	0.60	0.39	0.47	0.62	0.49	0.55	0.60	0.45	0.52
Soft_pvc	0.53	0.60	0.56	0.57	0.87	0.69	0.60	0.79	0.68
Soft_tiles	0.59	0.86	0.70	0.73	0.92	0.81	0.69	0.91	0.78
Tiled	0.30	0.67	0.42	0.53	0.46	0.49	0.49	0.42	0.45
Wood	0.53	0.31	0.39	0.73	0.53	0.64	0.68	0.64	0.66
Avg / total	0.48	0.40	0.38	0.63	0.61	0.57	0.57	0.58	0.56
Surface Type	Decision tree classifier	Random forest classifier	XGBoost classifier
	P	R	F1	P	R	F1	P	R	F1
Carpet	0.37	0.38	0.37	0.80	0.26	0.39	1.00	0.11	0.19
Concrete	0.52	0.52	0.52	0.57	0.79	0.66	0.57	0.79	0.66
Fine_concrete	0.21	0.21	0.21	0.52	0.14	0.22	0.58	0.08	0.14
Hard_tiles	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00
Hard_tiles_large_space	0.44	0.42	0.43	0.62	0.52	0.56	0.62	0.49	0.55
Soft_pvc	0.59	0.62	0.61	0.64	0.87	0.74	0.57	0.87	0.69
Soft_tiles	0.80	0.82	0.81	0.81	0.92	0.86	0.73	0.92	0.81
Tiled	0.41	0.40	0.40	0.58	0.47	0.52	0.53	0.46	0.49
Wood	0.52	0.51	0.52	0.76	0.70	0.73	0.73	0.58	0.64
Avg / total	0.50	0.50	0.50	0.64	0.64	0.62	0.63	0.61	0.57

Fig. 6

The ROC curves with AUC values for six individual classifiers.

5.4 NHEL classification result

GA is a global search algorithm, which adopts the selection mechanism of natural survival of the fittest, and can always find the global optimal in most cases. The NHEL model optimizes the weights of six individual classifiers by employing GA, and applies the weighted voting method to output the hybrid classifiers. At present, in most genetic operators, the crossover and mutation rates during population evolution are constant and remain unchanged in the iterative process, which will affect the convergence of evolution at the later stage and lead to the increase of NHEL training time. In order to accelerate the search of chromosomes in the process of evolution and promote the GA to find the optimal solution as soon as possible, this paper determines the crossover and mutation probability according to the different stages of community evolution, and introduces the conjugate gradient method to enhance the search mode, so as to accelerate the global optimum.

The weights of individual classifiers and the classification results of NHEL model are shown in Tables 6 and 7, respectively. Among the six classifiers, RF has the best performance, with P, R and F1 reaching 64%, 64% and 62% respectively. Finally, through the weighted voting method, the single classifiers are hybridized to obtain NHEL model, and its performance is dramatically elevated. The P, R and F1 all have remarkable improvement compared with 6 individual classifiers. The generalization ability and robustness of the NHEL model to the 9 surface environments is also boosted significantly.

Table 6
The optimized weights of each classifier

Individual classification Optimal weight Individual classification Optimal weight

NBC 0.05 DT 0.09

SVM 0.30 RF 0.36

KNN 0.06 XGBoost 0.14

Individual classification	Optimal weight	Individual classification	Optimal weight
NBC	0.05	DT	0.09
SVM	0.30	RF	0.36
KNN	0.06	XGBoost	0.14

Table 7

The NHEL classifier result

Surface type	Precision	Recall	F1
Carpet	0.98	0.45	0.65
Concrete	0.68	0.97	0.72
Fine_concrete	1.00	1.00	1.00
Hard_tiles	0.98	1.00	0.99
Hard_tiles_large_space	1.00	0.98	0.99
Soft_pvc	0.99	1.00	0.99
Soft_tiles	1.00	1.00	1.00
Tiled	0.97	0.96	0.98
Wood	0.95	0.52	0.67
Avg / total	0.91	0.87	0.86

In addition, the IMU has a severe imbalance of the categories, and NHEL can correctly classify the minority/rare categories (i.e., hard_tiles) without using over-sampling treatment for a few samples (e.g., classic SMOTE) or under-sampling for most classes (e.g., classic RUS). The 10-fold-cross validation with P, R, and F1 values of the NHEL and specific confusion matrix are shown in Fig. 7.

Fig. 7

The 10-fold-cross validation and confusion matrix.

6 Conclusion

Under the background of the booming development of big data and the intelligent robot industry, the existing navigation technology has a certain delay in identifying the driving form and environment of the robot. For this reason, it is very crucial to enhance the autonomous tactile perception and recognition of robots. In this paper, through the IMU data, effective feature extraction is carried out in multi-dimension, including RMS, PSD and Tsfresh time-series feature mining. Moreover, the parameters of the six individual classifiers (NBC, SVM, KNN, DT, RF, XGBoost) were adjusted and trained. The GA optimized by the conjugate gradient is applied to optimize weights. The amount of storage required in the number of optimization iterations is small, has step convergence, high stability, and does not require any external parameters.

NHEL achieves satisfactory results in robot ground recognition. The P, R, and F1 reach 91%, 87%, and 86% respectively, which boosted by 25% on average compared with the optimal single classifier. The proposed method has some limitations for the real realization of data fusion and evolution between the machine and the environment. In the future, we will conduct in-depth research to enhance the interpretability of the NHEL classifier and find a more systematic hybrid method.

Footnotes

Acknowledgments

This work was supported in part by the National Key Research and Development Program of China under Grant 2018YFB2100800.

References

Ali Al-dabbagh

H.A.

, Ronsse

Renaud

, A review of terrain detection systems for applications in locomotion assistance, Robotics and Autonomous Systems 133 (2020), 103628.

Sun

, Back

and Liu

, Combining Contact Forces and Geometry to Recognize Objects During Surface Haptic Exploration, in, IEEE Robotics and Automation Letters 3(3) (2018), 2509–2514.

Luo

, Mou

, Althoefer

and Liu

, Novel tactile-SIFT descriptor for object shape recognition, IEEE Sensors J 15(9) (2015), 5001–5009.

Bimbo

, Luo

, Althoefer

and Liu

, In-hand object pose estimation using covariance based tactile to geometry matching, IEEE Robot Autom Lett 1(1) (2016), 570–577.

Friedl

K.E.

, Voelker

A.R.

, Peer

and Eliasmith

, HumanInspired Neurorobotic System for Classifying Surface Textures by Touch, in, IEEE Robotics and Automation Letters 1(1) (2016), 516–523.

Giguere

and Dudek

, A Simple Tactile Probe for Surface Identification by Mobile Robots, in, IEEE Transactions on Robotics 27(3) (2011), 534–544.

Sun

, Liu

, Huang

and Zhang

, Object Classification and Grasp Planning Using Visual and Tactile Sensing, in IEEE Transactions on Systems, Man, and Cybernetics: Systems, 46(7) (2016), 969–979.

Zhang

, Liu

, Yin

, Wang

, Gui

and Chen

, Intelligent Collaborative Localization Among Air-Ground Robots for Industrial Environment Perception, in, IEEE Transactions on Industrial Electronics 66(12) (2019), 9673–9681.

Bhattacharya

, et al., Surface-Property Recognition With Force Sensors for Stable Walking of Humanoid Robot, in, IEEE Access 7 (2019), 146443–146456.

10.

Asif

, Bennamoun

and Sohel

F.A.

, RGB-D Object Recognition and Grasp Detection Using Hierarchical Cascaded Forests, in, IEEE Transactions on Robotics 33(3) (2017), 547–564.

11.

Yang

, Dryanovski

, Valenti

R.G.

, et al., RGB-D camera calibration and trajectory estimation for indoor mapping, Auton Robot 44 (2020), 1485–1503.

12.

X.A.

, Huh

T.M.

, Mukherjee

and Cutkosky

, Integrated Ground Reaction Force Sensing and Terrain Classification for Small Legged Robots, in, IEEE Robotics and Automation Letters 1(2) (2016), 1125–1132.

13.

X.A.

, Huh

T.M.

, Sabin

, Suresh

S.A.

and Cutkosky

M.R.

, Tactile Sensing and Terrain-Based Gait Control for Small Legged Robots, in, IEEE Transactions on Robotics 36(1) (2020), 15–27.

14.

, Tian

, Wang

, Chen

and Kang

, Joint Input and Output Space Learning for Multi-Label Image Classification, in IEEE Transactions on Multimedia, 2020. doi: 10.1109/TMM.2020.3002185.

15.

Wei

and Wang

, Visual Navigation Using Projection of Spatial Right-Angle In Indoor Environment, in, IEEE Transactions on Image Processing 27(7) (2018), 3164–3177.

16.

Sleaman

W.K.

and Yavuz

, Indoor Mobile Robot Navigation Using Deep Convolutional Neural Network, Journal of Intelligent & Fuzzy Systems 39(4) (2020), 5475–5486.

17.

Yuan

, Tang

and Zhao

, Road Detection in Image by Fusion Laser Points Based on Fuzzy SVM for a Small Ground Mobile Robot, Journal of Intelligent & Fuzzy Systems 29(6) (2015), 2677–2688.

18.

Lee

U.H.

, Bi

, Patel

and Fouhey

, Image Transformation and CNNs: A Strategy for Encoding Human Locomotor Intent for Autonomous Wearable Robots, in, IEEE Robotics and Automation Letters 5(4) (2020), 5440–5447.

19.

Sezgin

and Çetin

Ö.

, Autonomous Line Follower Robot with Fuzzy Based Hybrid Controller, Journal of Intelligent & Fuzzy Systems 39(5) (2020), 6021–6031.

20.

Kertesz

, Rigidity-based surface recognition for a domestic legged robot, IEEE Robotics and Automation Letters 1(1) (2016), 309–315.

21.

Oliveira

F.G.

, Santos

E.R.S.

, Neto

A.A.

, Campos

M.F.M.

and Macharet

D.G.

, Speed-invariant terrain roughness classification and control based on inertial sensors, 2017 Latin American Robotics Symposium(LARS) and 2017 Brazilian Symposium on Robotics, Curitiba, 2017, pp.1–6.

22.

Brooks

C.A.

and Iagnemma

, Self-supervised terrain classification for planetary surface exploration rovers, Journal of Field Robotics, 29(3) (2012), 445–468.

23.

Friedl

K.E.

, Voelker

A.R.

, Peer

and Eliasmith

, Human-Inspired Neurorobotic System for Classifying Surface Textures by Touch, in, IEEE Robotics and Automation Letters 1(1) (2016), 516–523.

24.

Li.

, Research on the ground classification method of wheeled robots based on vibration signals, Harbin Engineering University, 2013.

25.

Bai.

, Research on ground classification of mobile robots based on Bayesian framework. Harbin Engineering University, 2017.

26.

Valada

and Burgard

, Deep spatiotemporal models for robust proprioceptive terrain classification, Journal of Robotics Research 36 (2017), 1521–1539.

27.

Walas

, Terrain classification and negotiation with a walking robot, Journal of Intelligent & Robotic Systems 78(3-4) (2015), 40–423.

28.

Mount

, Xu

, Dawes

and Milford

, Unsupervised Selection of Optimal Operating Parameters for Visual Place Recognition Algorithms Using Gaussian Mixture Models, in, IEEE Robotics and Automation Letters 6(2) (2021), 343–350.

29.

Pham

, Yang

and Sheng

, A Sensor Fusion Approach to Indoor Human Localization Based on Environmental and Wearable Sensors, in, IEEE Transactions on Automation Science and Engineering 16(1) (2019), 339–350.

30.

Chang

, Kim

and Jeon

, Research on Terrain Identification of the Smart Prosthetic Ankle by Fuzzy Logic, in, IEEE Transactions on Neural Systems and Rehabilitation Engineering 27(9) (2019), 1801–1809.

31.

, Song

, Guo

, Xi

and Wu

, Robotic-Assisted Rehabilitation Trainer Improves Balance Function in Stroke Survivors, in, IEEE Transactions on Cognitive and Developmental Systems 12(1) (2020), 43–53.

32.

and Wang

, On-board Training Strategy for IMU-Based Real-Time Locomotion Recognition of Transtibial Amputees With Robotic Prostheses, Front Neurorobot, 2020.

33.

Kolvenbach

, Bärtschi

, Wellhausen

, Grandia

and Hutter

, Haptic Inspection of Planetary Soils With Legged Robots, in, IEEE Robotics and Automation Letters 4(2) (2019), 1626–1632.

34.

www.zenodo.org/record/2653918#.XMgP2MRS-Uk

35.

Maximilian

, Nils

, Julius

and Kempa-Liehr

A.W.

, Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh-A Python package), Neurocomputing 307(9) (2018), 72–77.

A hybrid evolutionary learning classification for robot ground pattern recognition

Abstract

Keywords

1 Introduction

2 Problem formulation

Table 1 Robot IMU sensor data composition structure Meta data Position data Motion/movement data Row id Series id Measurement number Orientation Angular velocity Linear acceleration (x, y, z, w) (x, y, z) (x, y, z)

4.1 Combine and hybrid strategies

4.2 Hybrid learning with genetic algorithms

4.3 Specific optimization process

5.1 Evaluation metric

Table 6 The optimized weights of each classifier Individual classification Optimal weight Individual classification Optimal weight NBC 0.05 DT 0.09 SVM 0.30 RF 0.36 KNN 0.06 XGBoost 0.14

Footnotes

Acknowledgments

References

Table 1
Robot IMU sensor data composition structure

Meta data Position data Motion/movement data

Row id Series id Measurement number Orientation Angular velocity Linear acceleration

(x, y, z, w) (x, y, z) (x, y, z)

Table 6
The optimized weights of each classifier

Individual classification Optimal weight Individual classification Optimal weight

NBC 0.05 DT 0.09

SVM 0.30 RF 0.36

KNN 0.06 XGBoost 0.14