Fault identification of rolling bearing based on improved salp swarm algorithm

Abstract

Owing to the rapid development of industrial manufacturing technology, modern mechanical equipment involves complex operating conditions and structurally complex hardware systems, and the state of individual components directly affects the stable operation of the whole machine. To improve engineering reliability and economic benefit, bearing diagnosis has long been a concern in mechanical engineering. This article therefore studies an effective machine learning method that extracts useful fault feature information from actual bearing vibration signals and identifies bearing faults. First, variational mode decomposition (VMD) decomposes the source signal into several intrinsic mode functions, from which the bearing vibration signal is reconstructed. By iteratively solving the variational model, the optimal mode functions are obtained, which better describe the characteristics of the original signal. Then, the feature subset is searched efficiently using the wrapper method of feature selection and the improved binary salp swarm algorithm (IBSSA), which reduces redundant feature vectors and thereby allows fault feature frequency signals to be extracted accurately. Finally, support vector machines (SVM) are used to classify and identify fault types, and their advantages are verified through extensive experiments, improving the ability to search globally for potential solutions. The experimental findings demonstrate the superior fault recognition performance of the IBSSA algorithm, with a highest recognition accuracy of 97.5%. A comparison with different recognition methods shows that the proposed method can accurately identify bearing failures.

Keywords

Fault diagnosis; salp swarm algorithm; feature selection; variational mode decomposition

1. Introduction

With the development of the industrial sector, there is an increasing demand for various types of machines. Rolling bearings are widely acknowledged as key mechanical elements found in nearly all types of rotating machinery, and their condition directly affects the operation of the entire equipment [1,2]. Due to prolonged high-temperature and high-speed operation, the possibility of mechanical failure in rolling bearings is quite high [3]. Bearing failures account for more than 30% of all failures in rotating machinery. When these failures become severe, they can lead to increased production losses and maintenance costs. Consequently, the accurate identification and assessment of rolling bearing failures are of utmost importance in maintaining the reliable and safe operation of equipment [4,5].

Extracting the characteristic information of bearing faults from non-stationary signals is crucial for identifying bearing faults [6]. Commonly employed signal analysis methods include time-domain analysis, frequency-domain analysis, and time-frequency domain reconstruction techniques. Xue et al. [7] proposed using an adaptive interpolation-based Fourier transform to efficiently analyze and model pulse eddy current signals, thereby calculating changes in harmonic impedance. However, the Fourier transform has limited processing capabilities for non-stationary signals: it can only obtain the overall frequency components contained in a segment of the signal, without knowledge of when each component occurs. Additionally, Wang et al. [8] combined empirical mode decomposition (EMD) with time-frequency peak filtering (TFPF) to address noise in the accelerometer calibration procedure. This method proves successful in suppressing random noise that arises during the calibration process. However, EMD also has drawbacks, such as mode mixing, endpoint effects, and difficult-to-determine stopping conditions. To overcome these drawbacks, scholars have proposed variational mode decomposition (VMD). Luo et al. [9] used VMD in wind turbine gearbox fault diagnosis to decompose current signals and obtain intrinsic mode functions related to faults, which has significant advantages in terms of classification accuracy, simplicity, and efficiency. He et al. [10] effectively extracted fault information of flywheel energy storage system bearings using a parameter-optimized VMD energy entropy method, taking into account the nonlinearity and non-stationarity of bearing signals. The results show that VMD can accurately extract the main modes, mitigate the mode mixing and endpoint effects of EMD, and is more robust to noise.
Given the successful experience mentioned above, this article uses VMD to decompose the source signal into multiple intrinsic mode functions, which are a combination of different frequency and amplitude features in the original signal. By using these intrinsic mode functions to decompose and reconstruct the bearing vibration signal, and through iterative search, adjusting the parameters of the VMD model and the selection of intrinsic mode functions, the optimal mode function that can better describe the characteristics of the original vibration signal is ultimately found.

The effective application of machine learning algorithms in fault recognition technology can solve many specific problems. For rolling bearing faults, machine learning methods can eliminate irrelevant and redundant feature information. Feature selection refers to the process of eliminating redundant features and identifying an optimal subset of features for addressing a given problem [11]. This elimination of features serves multiple purposes: reducing data size, improving feature quality, and reducing complexity, which all contribute to the effective performance of diagnostic models [12]. Based on their evaluation criteria, feature selection methods are classified into three main categories: wrapper, filter, and embedded methods [13]. Classic feature selection methods include univariate feature selection, linear modeling and regularization, recursive feature elimination, etc. [14]. However, because bearing faults involve various complex sensor data such as vibration, sound, and temperature, these multi-dimensional features are often highly correlated and interact with one another, and traditional feature selection methods adapt poorly to such specific problems and data characteristics. Metaheuristic algorithms have good global search ability and can effectively handle complex feature selection problems. Therefore, metaheuristic search algorithms have attracted great attention for solving feature selection problems. This article adopts a commonly used optimization algorithm to search for feature subsets, filtering the decomposed signal to obtain maximum accuracy with fewer signal features.

Metaheuristic optimization algorithms are divided into evolutionary-based, physics-based, swarm intelligence-based, and human-based algorithms [15]. The most noteworthy of these are swarm intelligence (SI)-based algorithms, which solve optimization problems by imitating the cooperative social behaviors of birds, animals, fish, and insects. Some popular SI-based algorithms include the genetic algorithm (GA), particle swarm optimization (PSO), the bat algorithm (BA), etc. They are all stochastic optimization techniques that attempt to obtain better solutions by exploiting feedback and heuristic information. However, these traditional optimization algorithms converge slowly and are sensitive to parameter settings when dealing with complex fault data, requiring many parameters to be tuned to balance the exploration and exploitation capabilities of the algorithm. More importantly, due to the lack of diversity in the search process, it is difficult for them to fully explore the whole solution space, so they fall into local optima more easily. The salp swarm algorithm (SSA) is a relatively new swarm intelligence algorithm that simulates the foraging behavior of salp swarms in the ocean. As a new heuristic optimization algorithm, SSA requires few parameters and is effective for both continuous and discrete problems. Integrating multiple random operators into SSA can effectively improve the initial random solution, helping the algorithm avoid becoming trapped in local solutions in multimodal search environments and thereby improving optimization efficiency and accuracy. Since bearing faults are often complex and uncertain, once a bearing fault occurs, fault diagnosis must be carried out immediately according to the fault characteristic frequency signal [16]. However, the performance of SSA in bearing fault diagnosis is rarely reported in the literature. 
Therefore, this study introduces an improved binary salp swarm algorithm (IBSSA). IBSSA utilizes the advantages of the SSA algorithm in solving search space difficulties and unknown practical problems, overcoming the binary encoding problem of the original SSA algorithm in processing binary space optimization problem sets, efficiently searching feature subsets, effectively reducing redundant feature vectors, and accurately extracting fault feature frequency signals.

Once useful fault features have been extracted, machine learning methods can be used to automatically identify the type of rolling bearing fault. The support vector machine (SVM) is a powerful classifier, and SVM models can compress and reduce the dimensionality of raw data to reduce the impact of redundant information and noise. Owing to their remarkable capability in extracting latent knowledge from limited samples, handling nonlinear data, and operating in high-dimensional spaces, SVMs have gained significant popularity as a vital data processing and classification technique across diverse pattern recognition applications, especially in fault state recognition for complex industrial processes [17]. Researchers have studied an adaptive scheme for accurate microgrid fault identification based on support vector machines. The approach utilizes a two-step SVM classifier framework, which can accurately detect unstable changes in microgrids under normal and fault conditions [18]. Nadji Hadroug et al. [19] proposed a fault monitoring and detection technique for gas turbine systems that uses SVM as its underlying framework. By using SVM for gas turbine vibration monitoring, faults can be effectively located and identified. Therefore, this paper uses SVM for fault type classification and recognition, and the advantage of SVM is verified through a large number of experiments, improving the classification accuracy and the ability to globally search potential solutions.

Overall, this article has the following contributions:

1. This article proposes a new method for identifying rolling bearing faults. First, the vibration signals of different bearings are preprocessed, and the denoising stage uses a VMD filtering algorithm to remove noise. Then, the wrapper method is used for feature selection to identify the optimal combination of features and extract the most valuable information from the signal. The improved salp swarm algorithm effectively reduces redundant feature vectors, making the extraction of fault feature frequency signals more accurate. Finally, support vector machines are used for fault classification and recognition, achieving rolling bearing fault diagnosis.
2. This study applies SSA to the feature selection problem in bearing fault diagnosis, benefiting from the algorithm's flexibility and high randomness in handling parameters and local solutions over different ranges. Compared with other traditional metaheuristic algorithms, SSA requires few parameters, converges quickly, and is effective for both continuous and discrete problems, enabling more accurate processing of complex fault data. Integrating several random operators into SSA can effectively improve the initial random solution, avoid becoming trapped in local solutions in multimodal search environments, and improve the accuracy and efficiency of global optimization, thus better meeting the feature extraction requirements of the bearing fault diagnosis process.
3. In IBSSA, an improved chaotic mapping is used to distribute the initial population positions uniformly, which helps avoid local optima and improves global convergence. IBSSA transforms the continuous optimization problem into binary form, allowing the salps to move freely to any point in the binary search space. At the same time, using dynamic weight factors for position updates makes the population less likely to fall into local optima. Experimental analysis shows that IBSSA can effectively solve multi-objective optimization problems and has a strong ability to search for optimal solutions.

The remainder of this article is organized as follows: Chapter 2 introduces the essential theories of the relevant algorithms, including variational mode decomposition (VMD), the salp swarm algorithm (SSA), and the support vector machine (SVM). Chapter 3 provides an in-depth introduction to the feature selection process based on IBSSA. Chapter 4 presents a comprehensive comparative experimental analysis of the proposed bearing fault identification method, demonstrating its superiority. Chapter 5 is the conclusion.

2. Related basic theories

2.1 Variational mode decomposition

VMD iteratively seeks the optimal solution of a variational model in order to decompose the real-valued input signal $f$ into discrete sub-signals $u_{k}$ , thus determining the intrinsic mode functions (IMFs) and their corresponding central frequencies [20]. The optimal solution is a set of mode functions that describes the frequency components of the original signal well. Each mode is compact around a central frequency with a finite bandwidth, and each IMF can be represented as an amplitude-modulated, frequency-modulated signal:

\begin{aligned} u_{k} (t) = A_{k} (t) \cos (\phi_{k} (t)) \end{aligned}

(1)

where $A_{k} (t)$ is the instantaneous amplitude, $\phi_{k} (t)$ is a non-decreasing phase function that varies with time, and $\phi_{k}^{'} (t)$ represents the instantaneous frequency.

The bandwidth of each modal component is evaluated as follows: (1) The Hilbert transform is employed to compute the analytic signal of each mode, resulting in a one-sided spectrum. (2) The spectrum of each mode is shifted to baseband by multiplying with an exponential tuned to the estimated center frequency. (3) The bandwidth is estimated through the smoothness of the demodulated signal, specifically the squared norm of its gradient. Assuming that the signal is decomposed into K intrinsic mode functions (IMFs) through VMD, the K modal functions $u_{k} (t)$ are obtained such that the sum of their estimated bandwidths is minimized. The variational constrained model is given by:

\begin{aligned} \min_{\{u_{k}\}, \{w_{k}\}} \left\{ \sum_{k} \left\| \partial_{t} \left[ \left( \delta (t) + \frac{j}{\pi t} \right) * u_{k} (t) \right] e^{- j w_{k} t} \right\|_{2}^{2} \right\} \end{aligned}

(2)

\begin{aligned} \text{s.t.} \quad \sum_{k} u_{k} (t) = f \end{aligned}

(3)

where $t$ represents time, $*$ represents the convolution operation, $\partial_{t}$ is the partial derivative with respect to $t$, $\{u_{k}\}$ is the set of modal components, $\{w_{k}\}$ is the set of central frequencies, $\delta (t)$ is the unit impulse function, and $j$ is the imaginary unit.

To solve the variational constrained model, a quadratic penalty term and a Lagrange multiplier are incorporated, transforming it into an unconstrained optimization problem. The quadratic penalty is a classical way to encourage reconstruction fidelity, typically in the presence of additive Gaussian noise, wherein the weight of the penalty term is inversely related to the level of noise inherent in the data. The Lagrange multiplier is a commonly used method for strictly enforcing constraints. The unconstrained problem therefore benefits from both the good convergence of the quadratic penalty and the strict enforcement of constraints by the Lagrange multiplier. The augmented Lagrangian is as follows:

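\begin{aligned} L (\{u_{k}\}, \{w_{k}\}, \lambda) & = \alpha \sum_{k} \left\| \partial_{t} \left[ \left( \delta (t) + \frac{j}{\pi t} \right) * u_{k} (t) \right] e^{- j w_{k} t} \right\|_{2}^{2} \\ & \quad + \left\| f (t) - \sum_{k} u_{k} (t) \right\|_{2}^{2} + \left\langle \lambda (t), f (t) - \sum_{k} u_{k} (t) \right\rangle \end{aligned}

(4)

where $f (t)$ represents the signal, $\lambda$ is the Lagrange multiplier, and $\alpha$ is the penalty factor.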

The modal components are updated in the Fourier domain according to Eq. (5), and the central frequency of each IMF is updated as the center of gravity of its power spectrum, as shown in Eq. (6).

\begin{aligned} \hat{u}_{k}^{n + 1} (w) = \frac{\hat{f} (w) - \sum_{i \neq k} \hat{u}_{i} (w) + \frac{\hat{\lambda} (w)}{2}}{1 + 2 \alpha (w - w_{k})^{2}} \end{aligned}

(5)

\begin{aligned} w_{k}^{n + 1} = \frac{\int_{0}^{\infty} w \left| \hat{u}_{k} (w) \right|^{2} d w}{\int_{0}^{\infty} \left| \hat{u}_{k} (w) \right|^{2} d w} \end{aligned}

(6)
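The two update rules in Eqs. (5) and (6) can be illustrated with a minimal NumPy sketch. It assumes the spectra are already sampled on a grid of non-negative frequencies. The function name `update_mode`, the array layout, and the synthetic Gaussian spectrum are hypothetical choices made for illustration and are not taken from the article.

```python
import numpy as np

def update_mode(f_hat, u_hat, lam_hat, omega, freqs, k, alpha):
    """One update of mode k (Eq. 5) and of its center frequency (Eq. 6).

    f_hat   : spectrum of the input signal on the non-negative grid `freqs`
    u_hat   : array of shape (K, len(freqs)) with the current mode spectra
    lam_hat : spectrum of the Lagrange multiplier
    omega   : current center frequencies, shape (K,)
    """
    # Sum of all modes except mode k.
    others = u_hat.sum(axis=0) - u_hat[k]
    # Eq. (5): filter the residual around the current center frequency.
    numerator = f_hat - others + lam_hat / 2.0
    u_k = numerator / (1.0 + 2.0 * alpha * (freqs - omega[k]) ** 2)
    # Eq. (6): center of gravity of the power spectrum of the updated mode.
    power = np.abs(u_k) ** 2
    omega_k = np.sum(freqs * power) / np.sum(power)
    return u_k, omega_k

# Illustrative data: a spectrum concentrated around a normalized frequency of 0.2.
freqs = np.linspace(0.0, 0.5, 501)
f_hat = np.exp(-((freqs - 0.2) / 0.01) ** 2)
u_hat = np.zeros((2, freqs.size))
lam_hat = np.zeros(freqs.size)
omega = np.array([0.2, 0.4])
u_k, omega_k = update_mode(f_hat, u_hat, lam_hat, omega, freqs, k=0, alpha=2000.0)
```

In a full decomposition these two steps would be repeated for every mode, followed by an update of the multiplier, until the modes stop changing; that outer loop is omitted here.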

The VMD method alleviates the end effects and mode mixing problems of the EMD method. It splits the original signal into multiple relatively stable sub-sequences at different frequency scales, which makes it suitable for handling time series with high complexity and strong nonlinearity [21]. The decomposed modes mainly comprise signal modes and noise modes, and the modes containing the main signal are reconstructed to achieve denoising. VMD decomposes the signal non-recursively into a small number of modes represented by smaller feature subsets. Therefore, the VMD algorithm yields reliable and robust decomposition results.

2.2 Salp swarm algorithm

Mirjalili et al. [22] proposed the salp swarm algorithm, which divides the whole population into leaders and followers. The leader is located at the front of the chain and guides the followers in searching for the best solution in the multidimensional search space, as shown in Fig. 1.

Figure 1.

Structure diagram of the salp swarm.

In this algorithm, the whole population is situated in an $n \times d$ dimensional search space, where $n$ is the number of salps and $d$ is the number of dimensions (problem variables). The location of the population is represented as a two-dimensional matrix $X$ , whose $i$ -th row gives the position of the $i$ -th salp, where $i = 1, 2, \dots, n$ .

\begin{aligned} X = \begin{bmatrix} x_{1}^{1} & x_{2}^{1} & \dots & x_{d}^{1} \\ x_{1}^{2} & x_{2}^{2} & \dots & x_{d}^{2} \\ \dots & \dots & \dots & \dots \\ x_{1}^{n} & x_{2}^{n} & \dots & x_{d}^{n} \end{bmatrix} \end{aligned}

(7)

The random initialization group formula is as follows:

\begin{aligned} X_{n \times d} = lb + \mathrm{rand} (n, d) \cdot (ub - lb) \end{aligned}

(8)

Given that the search range of the search space is $u b = [{u b}_{1}, {u b}_{2}, \dots, {u b}_{d}]$ and $l b = [{l b}_{1}, {l b}_{2}, \dots, {l b}_{d}]$ , which represent the upper and lower bounds respectively, during the search process, the boundaries must not be exceeded, otherwise the search will be pulled back to the designated range. The search target of the group in the search space can be defined as: $F = [F_{1}, F_{2}, \dots, F_{d}]$ .

The leader's position is updated as follows:

\begin{aligned} x_{j}^{1} = \begin{cases} F_{j} + c_{1} (({ub}_{j} - {lb}_{j}) c_{2} + {lb}_{j}), & \text{if } c_{3} \geq 0.5 \\ F_{j} - c_{1} (({ub}_{j} - {lb}_{j}) c_{2} + {lb}_{j}), & \text{if } c_{3} < 0.5 \end{cases} \end{aligned}

(9)

where $j = 1, 2, \dots, d$ ; $x_{j}^{1}$ and $F_{j}$ are the positions of the leader and the food source in the $j$ -th dimension, respectively; and $c_{2}$ and $c_{3}$ are random numbers uniformly distributed in $[0, 1]$ . The coefficient $c_{1}$ decreases nonlinearly over the iterations and is calculated by the following equation:

\begin{aligned} c_{1} = 2 \exp \left( - \left( \frac{4 l}{L} \right)^{2} \right) \end{aligned}

(10)

where $l$ is the current iteration and $L$ is the maximum number of iterations. The follower’s position update formula is as follows:

\begin{aligned} x_{j}^{i} = \frac{1}{2} (x_{j}^{i} + x_{j}^{i - 1}) \end{aligned}

(11)

where $i \geq 2$ and $x_{j}^{i}$ represents the position of the $i$ -th follower in the $j$ -th dimension. After each update, the termination condition is checked: if it is met, updating stops and the optimization result is output; otherwise, the iteration continues. The pseudocode for the SSA procedure is given in Algorithm 1.
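The SSA steps described above (random initialization, leader update, follower update, and boundary handling) can be sketched in a few lines of Python. This is a minimal illustration, not the article's Algorithm 1. The leader step follows the form given by Mirjalili et al. [22], and the code assumes that $c_{2}$ and $c_{3}$ are drawn uniformly from $[0, 1]$ for each dimension. It also assumes a single leader and a fixed iteration budget as the stopping rule. The function name `ssa_minimize` and the sphere test objective are hypothetical.

```python
import numpy as np

def ssa_minimize(objective, lb, ub, n_salps=20, max_iter=100, seed=0):
    """Minimal salp swarm sketch; returns best position, best value, history."""
    rng = np.random.default_rng(seed)
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    d = lb.size
    # Eq. (8): random initialization inside the bounds.
    X = lb + rng.random((n_salps, d)) * (ub - lb)
    fitness = np.array([objective(x) for x in X])
    best = int(fitness.argmin())
    F, F_val = X[best].copy(), float(fitness[best])  # food source
    history = [F_val]
    for l in range(1, max_iter + 1):
        # Eq. (10): nonlinearly decreasing coefficient.
        c1 = 2.0 * np.exp(-((4.0 * l / max_iter) ** 2))
        # Eq. (9): the leader moves around the food source.
        c2, c3 = rng.random(d), rng.random(d)
        step = c1 * ((ub - lb) * c2 + lb)
        X[0] = np.where(c3 >= 0.5, F + step, F - step)
        # Eq. (11): each follower moves halfway toward its predecessor.
        for i in range(1, n_salps):
            X[i] = 0.5 * (X[i] + X[i - 1])
        # Pull salps that left the search range back to the boundary.
        X = np.clip(X, lb, ub)
        fitness = np.array([objective(x) for x in X])
        best = int(fitness.argmin())
        if fitness[best] < F_val:
            F, F_val = X[best].copy(), float(fitness[best])
        history.append(F_val)
    return F, F_val, history

def sphere(x):
    """Hypothetical test objective with its minimum at the origin."""
    return float(np.sum(x ** 2))

F, F_val, history = ssa_minimize(sphere, lb=[-5, -5, -5], ub=[5, 5, 5])
```

Because the food source is replaced only when a better salp is found, the recorded best value never increases from one iteration to the next.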

2.3 Support vector machines

In the linear SVM classifier, the decision hyperplane correctly separates the data points in the training set. The two classes of data are separated by the linear function $f (x) = \omega^{T} x + b$ , where $\omega$ is the normal vector of the hyperplane and $b$ is the bias, and the hyperplane in the data space is represented as $\omega^{T} x + b = 0$ . The training set is denoted by $\{(x_{1}, y_{1}), \dots, (x_{m}, y_{m})\}$ , where $m$ is the number of samples in the training set, $x_{i} \in R^{d}$ , and $y_{i} \in \{- 1, 1\}$ . For the hyperplane to classify all samples correctly and to ensure that no data point lies between the two supporting planes, the following condition must be satisfied:

\begin{aligned} y_{i} (ω^{T} x_{i} + b) ⩾ 1, i = 1, 2, \dots, m \end{aligned}

(12)

When the features in the feature space have non-linear correlations, a straight line is no longer able to achieve maximum separation between the two classes of data points. To address this issue, the non-linear problem is transformed into a linear one by introducing a kernel function into the support vector machine. In this article, SVM is combined with the Gaussian kernel function:

\begin{aligned} K (x_{i}, x_{j}) = \exp (- g \| x_{i} - x_{j} \|_{2}^{2}), \quad g > 0 \end{aligned}

(13)

That is, the inner product of $x_{i}$ and $x_{j}$ in the feature space is equal to the value of the function $K (x_{i}, x_{j})$ computed in the original sample space.
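As a small illustration of the Gaussian kernel in Eq. (13), the following NumPy sketch builds the kernel matrix for a handful of samples. The function name `gaussian_kernel`, the sample points, and the value of `g` are hypothetical and chosen only for demonstration.

```python
import numpy as np

def gaussian_kernel(X, g):
    """Kernel matrix K[i, j] = exp(-g * ||x_i - x_j||^2) for the rows of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances between all pairs of rows.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    # Guard against tiny negative values caused by rounding.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-g * sq_dists)

# Three illustrative two-dimensional samples.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = gaussian_kernel(X, g=0.5)
```

Each sample has kernel value 1 with itself, and the value decays toward 0 as two samples move apart, with `g` controlling how quickly.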

To tolerate samples that violate the margin, slack variables $\xi_{i} \geq 0$ are used. To limit the range of $\xi_{i}$ , the penalty factor $C > 0$ is added to the objective function; it controls the generalization ability of the SVM and prevents overfitting. The objective function for the optimal hyperplane becomes:

\begin{aligned} \min \frac{1}{2} \omega^{T} \omega + C \sum_{i = 1}^{m} \xi_{i} \end{aligned}

(14)

The SVM algorithm operates on high-dimensional data and makes full use of the existing detection data to find the hyperplane with the maximum margin between faulty and normal data, which forms the fault diagnosis model.

Figure 2.

Flowchart of IBSSA.

3. Feature selection based on improved SSA

This section uses feature selection to extract the prominent signals that represent the underlying features, and introduces an improved salp swarm algorithm for the iterative search of feature subsets. The continuous version of SSA is converted into a binary representation, and chaotic mapping is introduced to distribute the initial population evenly within the search range. At the same time, dynamic weight factors are used in the position updates to prevent the population from falling into local optima. This method can find values closer to the expected value in fewer iterations, improving the accuracy and efficiency of the algorithm’s solution. The flowchart of IBSSA is shown in Fig. 2.

3.1 Improved binary salp swarm algorithm

3.1.1 Binary salp swarm algorithm

This article optimizes the feature selection problem through SSA, benefiting from the algorithm's flexibility and high randomness in handling parameters and local solutions over different ranges. SSA is transformed into a binary form to perform feature selection: the salps are restricted to move within the binary space, which allows the algorithm to search and optimize the solution space more effectively.

In this paper, a V-shaped transfer function (TF) is used to transform the continuous algorithm into a binary version. The function gives the probability of changing an element of the feature subset between 0 and 1. Eq. (15) maps the real-valued continuous position to this probability.

\begin{aligned} T (x_{i}^{k} (t)) = | \frac{2}{π} \arctan (\frac{2}{π} x_{i}^{k} (t)) | \end{aligned}

(15)

Here, $x_{i}^{k} (t)$ is the position of the $i$ -th individual in the $k$ -th dimension after $t$ iterations.

Based on this function, the following formula converts the position into a binary vector in which each element has a value of 1 or 0: 1 indicates that the corresponding feature is selected, and 0 indicates that it is not. The position update is as follows:

\begin{aligned} x_{i}^{k} (t + 1) = \begin{cases} \neg X_{t}, & \text{if } U (0, 1) < T (x_{i}^{k} (t + 1)) \\ X_{t}, & \text{otherwise} \end{cases} \end{aligned}

(16)

where $X_{t}$ is the current binary value of the element, $\neg X_{t}$ is its complement, and $U (0, 1)$ denotes a uniformly distributed random number between 0 and 1.
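The binarization in Eqs. (15) and (16) can be sketched with the Python standard library. The sketch assumes that an element is flipped to its complement when the random draw falls below the transfer value, which is the usual reading of a V-shaped transfer function. The names `v_transfer` and `binarize` and the example vectors are hypothetical.

```python
import math
import random

def v_transfer(x):
    """V-shaped transfer function of Eq. (15)."""
    return abs((2.0 / math.pi) * math.atan((2.0 / math.pi) * x))

def binarize(position, current_bits, rng):
    """Eq. (16): flip a bit when a uniform draw is below the transfer value."""
    new_bits = []
    for x, bit in zip(position, current_bits):
        if rng.random() < v_transfer(x):
            new_bits.append(1 - bit)  # complement of the current bit
        else:
            new_bits.append(bit)      # keep the current bit
    return new_bits

rng = random.Random(0)
# A zero continuous position has transfer value 0, so no bit is flipped.
unchanged = binarize([0.0, 0.0, 0.0], [1, 0, 1], rng)
# Larger magnitudes make a flip more likely, whatever the sign.
flipped = binarize([8.0, -8.0, 0.1], [1, 0, 1], rng)
```

The transfer value depends only on the magnitude of the continuous position, so large moves in either direction make a change in the selected feature more likely.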

Figure 3.

The basic bifurcation diagram of the Tent map.

3.1.2 Improved tent chaotic map

In the standard SSA, the population is initialized randomly, which increases the risk of premature convergence and can hinder the exploration of the global optimal solution. To address this issue, a chaos strategy is used so that the initial population covers the search space widely and evenly, with both randomness and ergodicity. This paper introduces Tent map initialization for the population, which has better chaotic properties than the Logistic map. The basic bifurcation diagram of the Tent map is shown in Fig. 3, which demonstrates the different chaotic behaviors of the x sequence obtained through iterative calculation for various values of r. The original Tent map is given in Eq. (17).

\begin{aligned} x_{n + 1} = \begin{cases} 2 x_{n}, & \text{if } 0 \leq x_{n} < 0.5 \\ 2 (1 - x_{n}), & \text{if } 0.5 \leq x_{n} \leq 1 \end{cases} \end{aligned}

(17)

However, the Tent map is prone to getting stuck in fixed points and small cycles, resulting in simple, repetitive, and less diverse output. Therefore, this paper introduces an improved Tent map formula, as shown in Eq. (18). When $μ = 1 / 32$ and $θ = 4 π$ , the initial population is uniformly distributed.

\begin{aligned} x_{n + 1} = \begin{cases} 2 x_{n} + \mu \sin (\theta x_{n}), & \text{if } 0 \leq x_{n} < 0.5 \\ 2 (1 - x_{n}) + \mu \sin (\theta x_{n}), & \text{if } 0.5 \leq x_{n} < 1 \end{cases} \end{aligned}

(18)
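A minimal sketch of population initialization with the improved Tent map of Eq. (18) is given below, using the stated values $μ = 1 / 32$ and $θ = 4 π$ . Several details are assumptions made for illustration and are not stated in the article: the seed value `x0`, the clamp that keeps rounding errors inside $[0, 1]$ , and the scaling of the chaotic sequence to the search bounds. The function names are hypothetical.

```python
import math

def improved_tent(x, mu=1.0 / 32.0, theta=4.0 * math.pi):
    """One iteration of the improved Tent map of Eq. (18)."""
    if x < 0.5:
        y = 2.0 * x + mu * math.sin(theta * x)
    else:
        y = 2.0 * (1.0 - x) + mu * math.sin(theta * x)
    # Assumption: clamp to [0, 1] to absorb floating-point rounding.
    return min(max(y, 0.0), 1.0)

def chaotic_population(n, d, lb, ub, x0=0.37):
    """Fill an n-by-d population by scaling the chaotic sequence to [lb, ub]."""
    x = x0  # hypothetical seed of the chaotic sequence
    population = []
    for _ in range(n):
        row = []
        for j in range(d):
            x = improved_tent(x)
            row.append(lb[j] + x * (ub[j] - lb[j]))
        population.append(row)
    return population

lower, upper = [0.0, 0.0, 0.0], [1.0, 2.0, 3.0]
population = chaotic_population(5, 3, lower, upper)
```

The resulting matrix could be used in place of the uniform random initialization of Eq. (8).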

The effectiveness of the binary SSA algorithm can be enhanced by avoiding local optima through the use of an improved Tent map, thereby enhancing global convergence. The improved Tent map introduces a sine function and a $θ$ parameter, which transforms the mapping from a linear to a nonlinear function, improving its chaotic properties. The $θ$ parameter is used to control the complexity and properties of the mapping, making the improved Tent map more flexible. It is worth noting that under certain conditions, the improved Tent map exhibits a bimodal distribution, which provides great assistance in diagnosing bearing faults. Under normal conditions, the vibration signal generated by the bearing exhibits a relatively stable unimodal distribution. However, if a bearing fault occurs, the vibration signal may exhibit a bimodal distribution. Based on the different shapes of the bimodal distribution, the type and severity of the bearing fault can be analyzed, and future trends in its operation can be predicted, greatly improving the accuracy of fault prediction. The pseudocode for IBSSA is shown in Algorithm 2.

3.1.3 Position update strategy

To enhance convergence accuracy and mitigate the likelihood of encountering local optima, a weight factor $ω$ is incorporated into the follower position update formula. This factor governs the magnitude of individual follower movements. As the number of iterations increases, the value of $ω$ progressively decreases. This approach facilitates the population’s ability to accelerate its search in the initial stages and to focus on achieving relatively accurate results in the later stages. As a result, a balance is struck between global exploration and local convergence capabilities. The functional formula for the nonlinearly decreasing $ω$ is as follows:

\begin{aligned} ω (t) = ω_{min} + (ω_{max} - ω_{min}) * \exp (- 10 t / T) \end{aligned}

(19)

Here, $ω (t)$ represents the weight value in the $t$ -th iteration, and $T$ represents the maximum number of iterations. After multiple experiments, the final values chosen were $ω_{min} = 0.5$ and $ω_{max} = 0.9$ . Therefore, the new follower update formula is as follows:

\begin{aligned} x_{j}^{i} = \frac{1}{2} * ω (t) * (x_{j}^{i} + x_{j}^{i - 1}) \end{aligned}

(20)
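The weight schedule of Eq. (19) and the weighted follower update of Eq. (20) can be checked numerically with a short sketch. The values $ω_{min} = 0.5$ and $ω_{max} = 0.9$ are those reported in the text, while the function names and the example positions are hypothetical.

```python
import math

def weight(t, T, w_min=0.5, w_max=0.9):
    """Eq. (19): nonlinearly decreasing weight factor."""
    return w_min + (w_max - w_min) * math.exp(-10.0 * t / T)

def update_follower(x_i, x_prev, t, T):
    """Eq. (20): weighted average of a follower and its predecessor."""
    w = weight(t, T)
    return [0.5 * w * (a + b) for a, b in zip(x_i, x_prev)]

# Weight values over 100 iterations, from the first to the last.
weights = [weight(t, 100) for t in range(101)]
# Follower update at the first iteration, where the weight equals w_max.
new_pos = update_follower([1.0, 2.0], [3.0, 4.0], t=0, T=100)
```

The weight starts at $ω_{max}$ and decays quickly toward $ω_{min}$ , so follower steps are largest in the early iterations.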

3.2 Feature subset representation

Because feature selection helps eliminate redundant features and improves the accuracy of fault diagnosis, finding representative features contributes substantially to fault diagnosis. In this section, every signal is decomposed into K feature vectors, so the number of features across the different fault types is a multiple of K, which forms a huge feature space that requires a thorough search. Therefore, IBSSA is used to adaptively search this space for the optimal subset of all features. The ideal feature subset ensures the minimum classification error rate while selecting the minimum number of features. Accordingly, IBSSA uses a fitness function to evaluate search individuals, defined as follows:

\begin{aligned} \mathrm{fit}_{\theta} = \alpha \cdot \mathrm{Err} + (1 - \alpha) \frac{\sum_{i} \theta_{i}}{N} \end{aligned}

(21)

where $\mathrm{fit}_{\theta}$ is the fitness value of the feature subset $\theta$ , $\alpha$ is a constant between 0 and 1 that controls the trade-off between classification performance and the number of selected features, Err indicates the classification error rate, $\theta_{i} \in \{0, 1\}$ indicates whether the $i$ -th feature is selected, and $N$ is the total number of features. In each generation, the calculated fitness values are used to evaluate the quality of the feature subsets. Minimizing the fitness function maximizes accuracy while favoring a smaller subset. This article therefore uses support vector machines as the training model, evaluates the classification error after searching for feature subsets, and uses the classifier error rate as the evaluation criterion to demonstrate its classification performance.
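The fitness function of Eq. (21) reduces to a one-line computation once the error rate of the classifier is known. In the sketch below, the error rate is passed in as a number rather than computed from an SVM. The value `alpha=0.9` is an illustrative assumption, since the article does not state the value it used, and the function name and feature masks are hypothetical.

```python
def fitness(error_rate, theta, alpha):
    """Eq. (21): weighted sum of the error rate and the selected-feature ratio.

    error_rate : classification error of the model trained on the subset
    theta      : binary mask, 1 if the feature is selected and 0 otherwise
    alpha      : trade-off constant between 0 and 1
    """
    n_features = len(theta)
    return alpha * error_rate + (1.0 - alpha) * sum(theta) / n_features

# Two masks with the same error rate: the smaller subset scores lower (better).
small_subset = fitness(error_rate=0.1, theta=[1, 0, 1, 0], alpha=0.9)
full_subset = fitness(error_rate=0.1, theta=[1, 1, 1, 1], alpha=0.9)
```

With equal error rates, the mask that selects fewer features receives the lower fitness value, which is the behavior the wrapper search relies on.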

3.3 Fault identification process

The proposed IBSSA-based rolling bearing diagnosis model is illustrated in Fig. 4. The process begins with data preprocessing, since the energy of the original fault vibration signal varies across frequencies. To obtain a high-dimensional set of fault features, the VMD method decomposes the signal into K IMF components. Redundant features are then eliminated from the high-dimensional feature vector using the wrapper method of feature selection: the iterative search of the IBSSA algorithm generates a low-dimensional feature subset, which is subsequently divided into a training set and a testing set. The SVM classification algorithm assesses the classification efficiency of the selected feature subset. Once training is complete, the trained SVM model is applied to the test dataset to identify the different bearing vibration fault states. Figure 4 shows the overall process of the proposed approach for the incipient fault diagnosis of rolling bearings.

Figure 4.

Overall process of the proposed method for incipient fault diagnosis of rolling bearings.

4. Experimental analysis

4.1 Dataset description

Because the CWRU (Case Western Reserve University) dataset [23] is commonly used as the standard case for experiments in the literature and contains test samples with different types of faults, results obtained on it are easy to compare with previous work. The dataset uses accelerometers to collect bearing vibration data, with the accelerometer attached by a magnetic base to the motor housing at the 6 o’clock position of the drive-end bearing. Due to measurement noise interference, the initial health status of each unit also differs during data collection. The CWRU rolling bearing test stand is shown in Fig. 5.

For each test case, the dataset is divided according to motor speed: 1797 rpm (0 HP), 1772 rpm (1 HP), 1750 rpm (2 HP), and 1730 rpm (3 HP). Each operating condition has a complete dataset covering the different fault locations: normal, inner race, outer race, and rolling element (ball). Single-point faults with defect diameters of 7 mils, 14 mils, and 21 mils and a fault depth of 0.011 inches were introduced in the test bearings, and the vibration signals were sampled at 12 kHz.

The experiment was conducted using a single-machine setup consisting of a 2 horsepower electric motor with a motor torque sensor and encoder connected to the drive end, and a generator connected to the shaft, both using a 6205-2RS JEM SKF type bearing. Four different types of faults were distinguished based on different loads and defect diameters, as shown in Table 1 of the CWRU bearing dataset. For each operating condition, 50 datasets were selected, each containing 2000 data points and 24 features, including 11 time-domain features and 13 frequency-domain features.

Figure 5.

The CWRU rolling bearing test stand.

4.2 Feature extraction

In this article, to obtain accurate results, the VMD parameters must be defined in advance. First, the parameter $α = 2000$ is fixed, and the noise tolerance tau $=$ 0, DC $=$ 0, init $=$ 1, and convergence tolerance tol $=$ 1e-7 are defined. The method decomposes the collected vibration signal and separates each type of signal into a set of intrinsic mode functions (IMFs). The VMD mode number K is determined using the center frequencies of the decomposed modes. The value of K determines not only the number of IMF components in the signal decomposition, but also the dimension of the fault feature vector in practical applications. By mixing the VMD modes after decomposition, a one-to-one correspondence between the modes and input signals is established.

Table 1
Data set description.

Fault type	Defect diameter (inches)	0 HP	1 HP	2 HP	3 HP
Normal	0	✓	✓	✓	✓
Inner race	0.007	✓	✓	✓	✓
	0.014	✓	✓	✓	✓
	0.021	✓	✓	✓	✓
Outer race	0.007	✓	✓	✓	✓
	0.014	✓	✓	✓	✓
	0.021	✓	✓	✓	✓
Ball	0.007	✓	✓	✓	✓
	0.014	✓	✓	✓	✓
	0.021	✓	✓	✓	✓

Figure 6.

The original signal and the time-domain waveform and frequency spectrum of IMF1-IMF4 (from top to bottom) decomposed by VMD.

Figure 7.

Envelope spectra of IMF1-IMF4 with different fault types.

With $K =$ 4 chosen for the subsequent experiments, the center frequency features are distinct, indicating that the modes contain most of the primary features of the original signal. Because the value of K affects the effectiveness of the signal decomposition, VMD decomposes the bearing vibration signal into 4 modal components. In practical applications, parameter selection accounts for most of the time, and improper parameter selection reduces the accuracy of the results.

Entropy analysis evaluates changes in the frequency characteristics of data to ensure the accuracy of coarse-grained sequence information and provide more reliable entropy estimates. To further validate the optimal number of modes for the measured signals, envelope spectrum analysis is performed on the IMFs obtained by VMD decomposition of the inner race, outer race, and rolling element fault signals. The VMD algorithm decomposes the fault signal into components $u_{k}$ . The greater the noise, the weaker the sparsity of $u_{k}$ , that is, the greater its amplitude spectral entropy. The envelope entropy of the amplitude spectrum of a VMD modal component is given by:

\begin{aligned} H_{p} (m) & = - \sum_{i = 1}^{N} p_{i} \ln p_{i} \end{aligned}

(22)

\begin{aligned} p_{i} & = \frac{a (i)}{\sum_{j = 1}^{N} a (j)}, \quad i = 1, 2, \dots, N \end{aligned}

(23)

Here, $a (i)$ is the envelope signal of the modal component obtained through Hilbert demodulation, N is the number of samples, and m is the embedding dimension.
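
The entropy in Eqs (22) and (23) can be sketched in a few lines. The function below assumes the envelope $a (i)$ has already been computed (for example as the magnitude of the analytic signal) and is passed in as a list of non-negative numbers; the function name and the test envelopes are illustrative only.

```python
import math


def envelope_entropy(envelope):
    """Entropy of an envelope a(i) after normalising it to p_i (Eqs. 22-23)."""
    total = sum(envelope)
    if total <= 0:
        raise ValueError("envelope must contain positive values")
    entropy = 0.0
    for a in envelope:
        p = a / total
        if p > 0:  # the term 0 * ln(0) is taken as 0
            entropy -= p * math.log(p)
    return entropy


# A flat envelope (no dominant impulse) gives the maximum entropy ln(N).
flat = envelope_entropy([1.0] * 8)
# A single sharp impulse (very sparse envelope) gives zero entropy.
spiky = envelope_entropy([0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0])
print(flat, spiky)
```

The two extremes match the statement in the text: the sparser the component, the smaller its entropy.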

When the optimal VMD parameter combination [K, $α$ ] $=$ [4, 2000] is obtained, the local minimum entropy values are compared, since the local minimum is the quantity sought here. As can be seen from Fig. 7(a), the fault characteristic frequency obtained by envelope spectrum analysis of the inner race fault is clearly distinguished, at 107.7 Hz. Harmonic analysis shows spectral peaks at multiples of the fault characteristic frequency, indicating periodic fault characteristics caused by inner race defects.

Because the IMF components beyond IMF4 contain little fault information, adjacent components are prone to mode mixing. If the fault characteristic frequency and its harmonic amplitudes are not prominent, the background noise is severe, and there are too many interfering spectral lines, further analysis is impaired. Figures 6(a) to (d) show the time-domain waveforms and frequency spectra of the original signal and IMF1-IMF4, respectively (from top to bottom). The amplitude spectrum expresses the amplitude at the instantaneous frequency of the best component. It appears as a single peak, indicating that the characteristic frequency of the vibration signal has been completely separated without multi-peak phenomena and that the demodulation effect is best.

From Fig. 7(b), the actual fault characteristic frequency obtained by envelope spectrum demodulation of the outer race fault is 77.46 Hz, and its fault characteristic harmonics appear at 77.64 Hz, 137.7 Hz, 215.3 Hz, and 293 Hz. Therefore, we believe that this fault reflects periodic characteristics caused by outer race defects. Figure 7(c) clearly distinguishes the fault characteristic frequency of the rolling element, with actual fault characteristic frequencies of 41.02 Hz, 98.88 Hz, 146.5 Hz, and 185.3 Hz, approximately 2, 3, and 4 times the fault characteristic frequency, respectively. This indicates periodic characteristics caused by rolling element defects.

Signal analysis shows that feature extraction plays an important role in fault diagnosis and affects the accuracy and efficiency of fault diagnosis models. This article combines three groups of features: time-domain features and frequency-domain features extracted from the original vibration signal, and features extracted from the IMFs obtained through VMD decomposition [24]. The feature values are then used as input subsets for building and testing the model.
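
To make the idea of time-domain features concrete, the sketch below computes a few statistics that are widely used for vibration signals. The article does not list its 11 time-domain features individually, so the selection here (mean, RMS, peak, crest factor, kurtosis) and the function name are assumptions chosen for illustration.

```python
import math


def time_domain_features(x):
    """Return a few common time-domain statistics of a non-constant signal x."""
    n = len(x)
    mean = sum(x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    peak = max(abs(v) for v in x)
    var = sum((v - mean) ** 2 for v in x) / n
    # Non-excess kurtosis: fourth central moment divided by squared variance.
    kurtosis = sum((v - mean) ** 4 for v in x) / (n * var ** 2)
    return {
        "mean": mean,
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms,
        "kurtosis": kurtosis,
    }


# Synthetic check signal: ten full periods of a unit-amplitude sine wave.
signal = [math.sin(2 * math.pi * k / 100) for k in range(1000)]
feats = time_domain_features(signal)
print(feats)
```

For a pure sine wave the crest factor is the square root of 2 and the kurtosis is 1.5; impulsive bearing faults typically raise both values, which is why such statistics are popular fault indicators.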

4.3 Fault identification

To validate the performance of the IBSSA method, the CWRU dataset was divided into training and test sets at a ratio of 4:1. The experiment in this section first evaluates the optimization ability of the algorithms used for selecting important features: 23 benchmark functions of different dimensions are used to test IBSSA, which is compared with the binary salp swarm algorithm (BSSA), binary particle swarm optimization (BPSO), binary genetic algorithm (BGA), and binary bat algorithm (BBA). All algorithms are tested under the same computing conditions, with a population size of $N =$ 30 and a maximum of 100 iterations. To reduce the effect of randomness, each algorithm is run independently 30 times, and the mean and standard deviation of the results are used as the evaluation criteria.

Table 2 lists the 23 benchmark functions with different dimensions from CEC2005, where F1-F7 are unimodal functions, F8-F13 are simple multimodal functions, and F14-F23 are composite functions. Figure 8 shows two-dimensional versions of the 23 benchmark functions. A unimodal function has a unique optimal solution and is used to evaluate the basic search and convergence ability of an optimization algorithm. Multimodal and composite functions have multiple optima and are used to test the algorithm’s ability to avoid local optima and search for the global optimum; composite functions resemble actual search spaces more closely, making them more challenging. The dimension of F1-F13 is set to 30 to test the performance of the algorithm on challenging problems with a large number of variables.
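
As an illustration, three of the functions in Table 2 ($f_{1}$, $f_{9}$, and $f_{10}$) can be written directly from their formulas. The function names `sphere`, `rastrigin`, and `ackley` are the names these functions are commonly known by, not labels used in this article.

```python
import math


def sphere(x):
    """f1: sum of squares, unimodal, minimum 0 at the origin."""
    return sum(v * v for v in x)


def rastrigin(x):
    """f9: multimodal, minimum 0 at the origin."""
    return sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) + 10.0 for v in x)


def ackley(x):
    """f10: multimodal, minimum 0 at the origin."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return (-20.0 * math.exp(-0.2 * math.sqrt(mean_sq))
            - math.exp(mean_cos) + 20.0 + math.e)


origin = [0.0] * 30  # D = 30, as used for F1-F13
print(sphere(origin), rastrigin(origin), ackley(origin))
```

All three evaluate to (numerically) zero at the origin, which matches the $f_{min}$ column of Table 2.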

Table 2
List of the benchmark functions used in the experiments of the paper and their related information.

Function	D	Search space	$f_{min}$
$f_{1} (x) = \sum_{i = 1}^{n} x_{i}^{2}$	30	$[- 100, 100]^{D}$	0
$f_{2} (x) = \sum_{i = 1}^{n} | x_{i} | + \prod_{i = 1}^{n} | x_{i} |$	30	$[- 10, 10]^{D}$	0
$f_{3} (x) = \sum_{i = 1}^{n} {(\sum_{j = 1}^{i} x_{j})}^{2}$	30	$[- 100, 100]^{D}$	0
$f_{4} (x) = \max_{i} \{ | x_{i} |, 1 ⩽ i ⩽ n \}$	30	$[- 100, 100]^{D}$	0
$f_{5} (x) = \sum_{i = 1}^{n - 1} [100 (x_{i + 1} - x_{i}^{2})^{2} + (x_{i} - 1)^{2}]$	30	$[- 30, 30]^{D}$	0
$f_{6} (x) = \sum_{i = 1}^{n} {([x_{i} + 0.5])}^{2}$	30	$[- 100, 100]^{D}$	0
$f_{7} (x) = \sum_{i = 1}^{n} i x_{i}^{4} + \mathrm{random} [0, 1)$	30	$[- 1.28, 1.28]^{D}$	0
$f_{8} (x) = \sum_{i = 1}^{n} - x_{i} \sin (\sqrt{| x_{i} |})$	30	$[- 500, 500]^{D}$	$-$ 12569.5
$f_{9} (x) = \sum_{i = 1}^{n} [x_{i}^{2} - 10 \cos (2 π x_{i}) + 10]$	30	$[- 5.12, 5.12]^{D}$	0
$f_{10} (x) = - 20 \exp (- 0.2 \sqrt{\frac{1}{n} \sum_{i = 1}^{n} x_{i}^{2}}) - \exp (\frac{1}{n} \sum_{i = 1}^{n} \cos (2 π x_{i})) + 20 + e$	30	$[- 32, 32]^{D}$	0
$f_{11} (x) = \frac{1}{4000} \sum_{i = 1}^{n} x_{i}^{2} - \prod_{i = 1}^{n} \cos (\frac{x_{i}}{\sqrt{i}}) + 1$	30	$[- 600, 600]^{D}$	0
$f_{12} (x) = \frac{π}{n} \left\{ 10 \sin^{2} (π y_{1}) + \sum_{i = 1}^{n - 1} (y_{i} - 1)^{2} [1 + 10 \sin^{2} (π y_{i + 1})] + (y_{n} - 1)^{2} \right\} + \sum_{i = 1}^{n} u (x_{i}, 10, 100, 4)$	30	$[- 50, 50]^{D}$	0
$y_{i} = 1 + \frac{x_{i} + 1}{4}, \quad u (x_{i}, a, k, m) = \begin{cases} k (x_{i} - a)^{m} & x_{i} > a \\ 0 & - a < x_{i} < a \\ k (- x_{i} - a)^{m} & x_{i} < - a \end{cases}$
$f_{13} (x) = 0.1 \left\{ \sin^{2} (3 π x_{1}) + \sum_{i = 1}^{n - 1} (x_{i} - 1)^{2} [1 + \sin^{2} (3 π x_{i + 1})] + (x_{n} - 1)^{2} [1 + \sin^{2} (2 π x_{n})] \right\} + \sum_{i = 1}^{n} u (x_{i}, 5, 100, 4)$	30	$[- 50, 50]^{D}$	0
$f_{14} (x) = {(\frac{1}{500} + \sum_{j = 1}^{25} \frac{1}{j + \sum_{i = 1}^{2} (x_{i} - a_{i j})^{6}})}^{- 1}$	2	$[- 65.536, 65.536]^{D}$	0.99800383779445
$f_{15} (x) = \sum_{i = 1}^{11} {[a_{i} - \frac{x_{1} (b_{i}^{2} + b_{i} x_{2})}{b_{i}^{2} + b_{i} x_{3} + x_{4}}]}^{2}$	4	$[- 5, 5]^{D}$	0.0003075
$f_{16} (x) = 4 x_{1}^{2} - 2.1 x_{1}^{4} + \frac{1}{3} x_{1}^{6} + x_{1} x_{2} - 4 x_{2}^{2} + x_{2}^{4}$	2	$[- 5, 5]^{D}$	$-$ 1.03162845348988
$f_{17} (x) = {(x_{2} - \frac{5.1}{4 π^{2}} x_{1}^{2} + \frac{5}{π} x_{1} - 6)}^{2} + 10 (1 - \frac{1}{8 π}) \cos x_{1} + 10$	2	$[- 5, 10] \times [0, 15]$	0.397887357729738
$f_{18} (x) = [1 + (x_{1} + x_{2} + 1)^{2} (19 - 14 x_{1} + 3 x_{1}^{2} - 14 x_{2} + 6 x_{1} x_{2} + 3 x_{2}^{2})] \times [30 + (2 x_{1} - 3 x_{2})^{2} (18 - 32 x_{1} + 12 x_{1}^{2} + 48 x_{2} - 36 x_{1} x_{2} + 27 x_{2}^{2})]$	2	$[- 2, 2]^{D}$	2.99999999999992
$f_{19} (x) = - \sum_{i = 1}^{4} c_{i} \exp (- \sum_{j = 1}^{3} a_{i j} (x_{j} - p_{i j})^{2})$	3	$[0, 1]^{D}$	$-$ 3.86278214782076
$f_{20} (x) = - \sum_{i = 1}^{4} c_{i} \exp (- \sum_{j = 1}^{6} a_{i j} (x_{j} - p_{i j})^{2})$	6	$[0, 1]^{D}$	$-$ 3.32199517158424
$f_{21} (x) = - \sum_{i = 1}^{5} {[(x - a_{i}) (x - a_{i})^{T} + c_{i}]}^{- 1}$	4	$[0, 10]^{D}$	$-$ 10.153199679
$f_{22} (x) = - \sum_{i = 1}^{7} {[(x - a_{i}) (x - a_{i})^{T} + c_{i}]}^{- 1}$	4	$[0, 10]^{D}$	$-$ 10.4029405667869
$f_{23} (x) = - \sum_{i = 1}^{10} {[(x - a_{i}) (x - a_{i})^{T} + c_{i}]}^{- 1}$	4	$[0, 10]^{D}$	$-$ 10.5364

Figure 8.

2D images of 23 test functions.

Table 3

CEC2005 test results.

Function		IBSSA	BSSA	BGA	BPSO	BBA
$F_{1}$	mean	2.3524e+03	3.2718e+03	1.7386e+05	2.8659e+04	2.2323e+05
	std	2.6858e-13	3.7219e-09	2.9250e-11	5.2945e+04	7.2372e+01
$F_{2}$	mean	5.0829e+00	2.5254e+01	1.4728e+11	2.9925e+10	1.5075e+11
	std	8.3824e-10	1.5231e-05	6.1343e-05	2.9871e+11	1.1585e+12
$F_{3}$	mean	2.4778e+03	5.4604e+03	2.7432e+05	1.3415e+04	1.5400e+05
	std	1.6742e-15	9.4798e-11	1.1700e-10	1.2730e+04	8.7751e-11
$F_{4}$	mean	2.0514e+01	2.0455e+01	8.2217e+01	2.7948e+01	9.0596e+01
	std	6.0576e-02	0.0000e+00	1.0392e+00	1.0353e+01	1.5237e-01
$F_{5}$	mean	7.1936e+02	9.2535e+05	2.7001e+08	6.2061e+06	2.6435e+08
	std	3.5625e-09	7.4460e+05	1.7971e-07	3.1243e+07	2.0292e+05
$F_{6}$	mean	6.7458e+02	2.9960e+03	5.2016e+04	4.6366e+03	7.3397e+04
	std	0.0000e+00	3.3584e+02	2.1938e-11	8.3132e+03	1.2782e+02
$F_{7}$	mean	1.1734e-02	4.9969e-01	1.0850e+02	2.3823e+00	6.6254e+01
	std	8.8196e-11	1.2438e-02	5.2880e+00	1.1385e+01	4.8856e-02
$F_{8}$	mean	$-$ 5.3669e+03	$-$ 3.5544e+03	$-$ 2.1832e+03	$-$ 3.2806e+03	$-$ 2.6394e+03
	std	1.8847e-15	4.2608e+02	1.3711e-12	3.4408e+02	8.5603e+01
$F_{9}$	mean	1.5629e+02	2.1503e+02	4.6684e+02	2.6537e+02	3.7135e+02
	std	1.1136e-17	1.2618e+01	1.7139e-13	2.9352e+01	1.0621e+01
$F_{10}$	mean	1.1988e+01	1.2119e+01	2.0450e+01	1.1707e+01	1.9987e+01
	std	2.3771e+00	1.1900e+00	1.0712e-14	4.3812e-14	1.2010e-01
$F_{11}$	mean	1.4369e+01	2.6722e+01	5.5898e+02	4.7059e+01	5.7026e+02
	std	1.9035e-13	7.3129e+00	2.2852e-13	7.8043e+01	1.8736e-02
$F_{12}$	mean	8.0971e+01	3.2363e+02	4.0981e+08	6.9167e+06	6.3298e+08
	std	4.7993e-11	4.7580e+02	1.1981e-07	5.3200e+07	5.6998e+06
$F_{13}$	mean	5.4227e+04	2.4494e+05	9.4323e+08	1.6650e+07	6.1313e+08
	std	1.8325e-07	3.5407e+05	5.1619e+07	1.0381e+08	2.3962e-07
$F_{14}$	mean	6.8476e+00	7.5282e+00	1.9602e+01	6.2708e+00	1.1339e+01
	std	3.8829e+04	5.6812e+00	1.7553e+01	3.8673e-17	1.7853e-15
$F_{15}$	mean	1.0435e-02	2.3940e-02	6.8066e-02	3.6279e-03	7.1055e-02
	std	3.3224e+03	2.7010e-02	1.3948e-17	1.7018e-19	2.9177e-02
$F_{16}$	mean	$-$ 1.0316e+00	$-$ 3.0647e+00	2.2400e+00	$-$ 9.9425e-01	$-$ 4.3097e-01
	std	0.0000e+00	3.5960e-08	3.6978e-13	3.0103e-01	9.8054e-02
$F_{17}$	mean	2.5846e-03	–	–	4.3155e-01	–
	std	1.1269e-04	–	–	1.8935e-01	–
$F_{18}$	mean	3.0000e+00	3.0001e+00	2.8589e+01	3.8277e+00	7.6449e+00
	std	9.1884e-09	2.1271e-04	2.5688e+01	7.5953e+00	9.6514e+00
$F_{19}$	mean	$-$ 3.8613e+00	$-$ 3.8075e+00	$-$ 3.6815e+00	$-$ 3.8323e+00	$-$ 3.5194e+00
	std	0.0000e+00	3.7802e-02	8.9265e-16	1.0875e-01	6.0023e-02
$F_{20}$	mean	$-$ 3.3185e+00	$-$ 2.9400e+00	$-$ 2.5803e+00	$-$ 3.2405e+00	$-$ 2.0541e+00
	std	2.4993e-17	2.0530e-01	1.3390e-15	2.6529e-01	4.7126e-02
$F_{21}$	mean	$-$ 8.2272e+00	$-$ 5.1866e+00	$-$ 5.2563e-01	$-$ 4.2760e+00	$-$ 5.9337e-01
	std	4.2436e-11	2.7603e+00	3.6062e-02	1.0788e+00	1.4888e-02
$F_{22}$	mean	$-$ 7.0703e+00	$-$ 4.8129e+00	$-$ 3.1449e+00	$-$ 2.5443e+00	$-$ 1.0844e+00
	std	1.7100e-03	2.3744e+00	6.1382e-01	3.5381e-01	9.4798e-02
$F_{23}$	mean	$-$ 3.0329e+00	$-$ 2.6944e+00	$-$ 8.3048e-01	$-$ 8.4686e+00	$-$ 1.5132e+00
	std	1.7658e+00	1.9809e-01	8.6988e-02	9.5054e-05	9.0979e-02

Table 3 provides the mean and standard deviation of the objective function values for each algorithm, showing the average performance and stability of the IBSSA algorithm over all runs. From Table 3, it can be seen that the mean and standard deviation of the IBSSA algorithm are better than those of the other algorithms on most test functions, indicating that IBSSA finds and converges to the global optimum more easily on unimodal problems, which verifies its stronger global exploration ability. The results for the multimodal test functions $F_{8}$ - $F_{13}$ show that the IBSSA algorithm is again superior to the other algorithms on most test functions. These quantitative results indicate that the algorithm can effectively search the space and avoid local optima. For the composite functions $F_{14}$ - $F_{23}$ , IBSSA is slightly weaker than BPSO on $F_{14}$ and $F_{15}$ , which may be due to the decrease in the number of followers. Overall, the results of IBSSA on these functions demonstrate its universality and adaptability to different problem domains and complexities.

The Wilcoxon rank sum test [25] is a nonparametric statistical test, used here to determine whether there are significant differences between the IBSSA algorithm and the other algorithms. The results of 30 independent runs on the 23 test functions for each of the four comparison algorithms are taken as samples, and the test is performed at a significance level of 0.05. The test results are shown in Table 4. When $P < 0.05$ , the null hypothesis is rejected, indicating a significant difference between the two compared algorithms, while a $P$ value greater than 0.05 means that the search performance of the two algorithms is comparable. From Table 4, it can be seen that the IBSSA algorithm differs significantly from the other algorithms on most functions. In summary, IBSSA has substantial advantages over BSSA, BGA, BPSO, and BBA, indicating that the superiority of the IBSSA algorithm is statistically significant.
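
The test can be sketched with the standard library alone. The version below uses the normal approximation of the rank-sum statistic with average ranks for ties and no tie correction of the variance; the function name `rank_sum_test` and the sample values are hypothetical, and this is not necessarily the implementation behind Table 4.

```python
import math


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p). Tied values receive the average of their ranks.
    """
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg
        i = j + 1
    n1, n2 = len(a), len(b)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return z, p


# Hypothetical best-fitness values from five runs of two algorithms.
z, p = rank_sum_test([1.0, 2.0, 3.0, 4.0, 5.0], [6.0, 7.0, 8.0, 9.0, 10.0])
print(z, p, p < 0.05)
```

With 30 runs per algorithm, as in this article, the normal approximation is generally considered adequate; for very small samples an exact test would be preferable.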

Table 4

CEC2005 Wilcoxon rank sum test results.

Function	BSSA	BGA	BPSO	BBA
$F_{1}$	1.92e-05 $<$ 0.05	1.64e-04 $<$ 0.05	4.726e-01	1.64e-04 $<$ 0.05
$F_{2}$	3.28e-05 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05
$F_{3}$	2.51e-05 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	3.40e-04 $<$ 0.05
$F_{4}$	1.02e-04 $<$ 0.05	1.73e-05 $<$ 0.05	1.73e-05 $<$ 0.05	1.64e-04 $<$ 0.05
$F_{5}$	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	2.39e-04 $<$ 0.05
$F_{6}$	3.32e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05
$F_{7}$	5.5966e-01	2.2e-04 $<$ 0.05	4.36e-05 $<$ 0.05	7.69e-05 $<$ 0.05
$F_{8}$	1.00e+00	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05
$F_{9}$	1.76e-03 $<$ 0.05	1.64e-04 $<$ 0.05	5.795e-03 $<$ 0.05	2.46e-04 $<$ 0.05
$F_{10}$	1.57e-04 $<$ 0.05	2.36e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05
$F_{11}$	1.152e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05
$F_{12}$	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	2.83e-03 $<$ 0.05
$F_{13}$	2.152e-04 $<$ 0.05	1.64e-04 $<$ 0.05	2.39e-04 $<$ 0.05	1.64e-04 $<$ 0.05
$F_{14}$	1.297e-01	1.64e-04 $<$ 0.05	4.32e-02 $<$ 0.05	1.64e-04 $<$ 0.05
$F_{15}$	1.64e-04 $<$ 0.05	1.01e-03 $<$ 0.05	2.202e-03 $<$ 0.05	2.1224e-01
$F_{16}$	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05
$F_{17}$	9.15e-06 $<$ 0.05	1.64e-04 $<$ 0.05	1.32e-03 $<$ 0.05	3.76e-02 $<$ 0.05
$F_{18}$	1.0825e-01	5.71e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05
$F_{19}$	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.83e-03 $<$ 0.05
$F_{20}$	2.96e-05 $<$ 0.05	7.69e-04 $<$ 0.05	7.69e-05 $<$ 0.05	4.27e-01
$F_{21}$	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05
$F_{22}$	9.15e-04 $<$ 0.05	3.35e-04 $<$ 0.05	6.28843e-01	1.64e-04 $<$ 0.05
$F_{23}$	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05	1.64e-04 $<$ 0.05

As shown in Fig. 9, IBSSA began to converge at the 20th iteration, and the improvement speed of IBSSA was significantly faster than that of other algorithms, enabling the algorithm to quickly reach a balance during the search process, indicating that IBSSA’s optimization performance has good robustness.

The IBSSA-based model uses SVM as the fault classifier, with the penalty parameter $C$ controlling the generalization ability of the SVM and $g$ the kernel parameter of the Gaussian kernel function. Based on the optimal position and fitness value of IBSSA, the optimal kernel parameter $g$ was determined to be 1.716 and the penalty factor $C$ to be 49.712. To demonstrate the performance of IBSSA, this paper compares classic feature selection algorithms with meta-heuristic-based feature selection methods. The classic algorithms are hybrid recursive feature elimination (Hybrid-RFE), the mutual-information-based mRMR, and Lasso, while the meta-heuristic-based methods are BBA, BPSO, BGA, and BSSA. The final feature importance list for each method is averaged over five-fold cross-validation, and the methods are tested on SVM classifiers to find the feature set that produces the highest classification accuracy. As shown in Table 5, the optimal fitness value of IBSSA is 0.9689, and compared with the other meta-heuristic-based optimizers, its classification accuracy is improved by about 2%–5%. IBSSA also stands out because it selects the fewest features while achieving the best F-value. This means that IBSSA is a more suitable feature selection method, producing a smaller feature subset and higher classification accuracy.

Table 5

Comparison of different feature selection methods.

Algorithm	Number of features	F-value	Fitness value
Hybrid-RFE	23	0.8104	–
mRMR	16	0.9166	–
Lasso	15	0.9295	–
BBA	20	0.9642	0.9362
BPSO	10	0.9811	0.9257
BGA	12	0.9771	0.9499
BSSA	15	0.9428	0.9277
IBSSA	8	0.9870	0.9689

Figure 9.

Convergence curve of fitness value of different algorithms.

To further evaluate the influence of feature subsets on the algorithm, Fig. 10 shows the accuracy fluctuations corresponding to different optimization algorithms as the number of features varies. Both IBSSA and BPSO showed relatively stable accuracy when extracting optimal feature subsets, and IBSSA had already extracted the most effective feature subset when the feature quantity was set to 8. The experimental results demonstrate the good performance of the IBSSA method.

To demonstrate the superiority of the algorithm, multiple classification evaluation metrics are used to analyze IBSSA, SSA, and standard SVM. In addition to accuracy, two useful indicators are precision and recall [26]. The formulas for accuracy, precision, recall, and F-score are given below, where TP represents true positives, FP false positives, FN false negatives, and TN true negatives. On the one hand, high recall combined with low precision produces false alarms and an unnecessary cost burden, so a low false positive rate is required. On the other hand, if only clear faults are labeled so that no false positives are reported, precision is high but recall is low. For each type of fault, the F-score is calculated from the precision and recall of that fault type and provides a comprehensive balance of the two indicators, so that an alarm is triggered when an actual fault occurs, with neither missed nor false alarms.

\begin{aligned} \mathrm{accuracy} & = \frac{T P + T N}{T P + F P + T N + F N} \end{aligned}

(24)

\begin{aligned} \mathrm{precision} & = \frac{T P}{T P + F P} \end{aligned}

(25)

\begin{aligned} \mathrm{recall} & = \frac{T P}{T P + F N} \end{aligned}

(26)

\begin{aligned} \mathrm{F\text{-}score} & = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \end{aligned}

(27)
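
Equations (24) to (27) translate directly into code. In the sketch below the confusion counts are hypothetical, and the helper names are chosen for illustration; the last lines apply Eq. (27) to the IBSSA precision and recall reported in Table 6.

```python
def f_score_from(precision, recall):
    """Eq. (27): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Eqs. (24)-(27) for one class, computed from its confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f_score": f_score_from(precision, recall),
    }


# Hypothetical counts for a single fault class.
m = classification_metrics(tp=45, fp=5, fn=3, tn=147)
print(m)

# Precision 0.9763 and recall 0.9668 (IBSSA column of Table 6).
ibssa_f = round(f_score_from(0.9763, 0.9668), 4)
print(ibssa_f)
```

For a multi-class problem such as the four bearing states, the same function can be applied to each class in turn, treating that class as positive and all others as negative.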

To ensure the fairness of the experiments and the reliability of the results, we conducted 30 experiments using the same dataset. As shown in Table 6, the standard SVM has no feature selection process and therefore requires no iterative optimization over the features, resulting in a relatively shorter training time but lower accuracy. The IBSSA algorithm selects effective features, improving overall efficiency. Moreover, as traditional algorithms perform poorly on multi-class tasks and cannot adaptively learn fault features, their classification effect is inferior, leading to significantly lower testing accuracy than the other models. The experiments demonstrate that IBSSA performs well on all evaluation indicators.

Table 6

Comparison of different algorithms.

Algorithm	IBSSA	SSA	Standard SVM
F-score	0.9715	0.9193	0.9008
Recall	0.9668	0.9132	0.8967
Accuracy	0.9750	0.9250	0.9033
Precision	0.9763	0.9255	0.8519

Figure 10.

Feature selection F value of different algorithms.

As shown in Figs 11 and 12, fault label 1 represents the normal type, label 2 the inner race fault, label 3 the outer race fault, and label 4 the rolling element fault. Samples of the same category cluster well in the feature space, while samples of different categories are far apart. The misclassifications of the IBSSA-based fault diagnosis algorithm are mainly concentrated on test samples labeled 2. After multiple runs, we found that the diagnosis errors of SSA were distributed over the inner race, outer race, and rolling element faults, showing unstable performance. Because SSA lacks the ability to find the global optimum, its convergence is poor. It is also weak at handling outliers and missing values, making it difficult to ensure stability when processing massive real-time data. In contrast, IBSSA significantly reduces diagnosis errors. The results show that the IBSSA algorithm has the fewest misdiagnoses and good stability.

Figure 11.

IBSSA algorithm diagnosis result.

Figure 12.

SSA algorithm diagnosis result.

The diagnostic results of the various algorithms show that the feature selection technique proposed in this paper, IBSSA, surpasses the alternative approaches, with the most stable results and the highest F-value of 98.7%. The results show that IBSSA can effectively improve the algorithm’s efficiency. Therefore, the SVM model based on IBSSA has advantages in all aspects of fault diagnosis, and the proposed feature selection method helps eliminate redundant features and improve the accuracy of fault diagnosis.

The good performance of IBSSA has been verified above. To provide additional evidence, we compared the model developed in this study with three alternative fault recognition models. The results are summarized in Table 7. For all models, the experimental dataset receives no additional processing and is split in exactly the same way. As shown in Table 7, Double SVM, which combines wavelet denoising and machine learning for bearing fault diagnosis, achieved an experimental accuracy of 96%. IMDE extracts multi-scale fault features from the original signal through an improved multi-scale diffusion entropy (IMDE) method and automatically selects sensitive features using the maximum correlation minimum redundancy algorithm, with an accuracy of 95.15%. The PCA-based method uses Principal Component Analysis (PCA) to reduce the dimensionality of the feature subsets and inputs the filtered feature subset into a neural network for diagnosis, achieving a highest recognition accuracy of 94.17% on the same dataset. These results indicate that the rolling bearing fault recognition method based on IBSSA proposed in this article improves on the other fault recognition methods, with a higher fault diagnosis accuracy, which verifies the effectiveness of this method.

Table 7

Comparison of the accuracy of different classification fault models.

Diagnosis method	Accuracy	Fault type
Double SVM [27]	0.960	OR; IR; Ball; Normal
IMDE [28]	0.9515	OR; IR; Ball; Normal
PCA [29]	0.9417	OR; IR; Ball; Normal
IBSSA	0.9750	OR; IR; Ball; Normal

In this paper, the IMFs obtained by VMD decomposition were chosen as the initial features for fault diagnosis, and the feasibility of this method was demonstrated through experiments. To eliminate redundant and unrelated information, a feature selection method based on IBSSA was proposed to extract the significant features from all features. We observed that the difference in testing accuracy between our proposed solution and the other studies varies in magnitude. Nevertheless, owing to the efficient iterative calculation of the algorithm, the recognition accuracy is improved to some extent and the features of the test samples are extracted efficiently, which enhances the extraction of signals with practical physical significance and achieves global optimization of classification performance.

5. Conclusion and future work

As industrialization continues to advance, high-quality processing of industrial big data has become particularly important. In order to accurately identify different fault states of bearings in massive data and prevent production stagnation and increased maintenance costs due to faults, this article proposes a new method for bearing fault diagnosis based on an improved salp swarm algorithm. Through performance testing of the algorithm and instance simulation analysis, the results show that:

Fault diagnosis from vibration signals is divided into three main stages: data preprocessing, feature selection, and fault recognition. VMD is used to preprocess the collected vibration signals: the amplitude of the impact pulses and the spectrogram of the time-domain waveform are examined, and the number of modes K is determined from the center frequencies of the decomposed modes.
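One common way to determine K from center frequencies is to increase K until two neighbouring modes have nearly the same center frequency (a sign of over-decomposition) and to keep the last K before that happens. The sketch below illustrates this heuristic only; the exact criterion is not spelled out here, so the threshold `min_rel_gap`, the function name `choose_k`, and the example frequencies are all assumptions rather than values from the experiments.

```python
def choose_k(center_freqs_by_k, min_rel_gap=0.1):
    """Pick the largest K reached before two neighbouring center frequencies nearly coincide.

    center_freqs_by_k: {K: [center frequency of each mode, in Hz]}
    min_rel_gap: hypothetical threshold; two neighbouring modes whose gap is at
                 most this fraction of the higher frequency count as over-decomposed.
    """
    best_k = None
    for k in sorted(center_freqs_by_k):
        freqs = sorted(center_freqs_by_k[k])
        too_close = any(
            (hi - lo) <= min_rel_gap * hi
            for lo, hi in zip(freqs, freqs[1:])
        )
        if too_close:
            break  # modes start to overlap: stop at the previous K
        best_k = k
    return best_k


# Illustrative center frequencies (Hz) for K = 2..5; not measured data.
example = {
    2: [1200, 3500],
    3: [600, 1250, 3500],
    4: [600, 1250, 2700, 3550],
    5: [600, 1250, 2700, 3400, 3600],
}
print(choose_k(example))  # prints 4
```

With these illustrative numbers, the modes at 3400 Hz and 3600 Hz for K = 5 fall within the threshold, so K = 4 is retained.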

The IBSSA algorithm maps the continuous search space into a binary space and introduces a chaotic map and an inertia weight factor to balance global and local search, thereby achieving higher search efficiency and a better global optimum. The wrapper method of feature selection is used to reduce the dimensionality of the fault feature components, with IBSSA iteratively searching the feature space for low-dimensional feature subsets. Finally, BPSO, BGA, BBA, and BSSA are selected for comparison, and the performance of the algorithms is compared under different evaluation criteria. The results show that the method proposed in this paper achieves the best classification performance and is superior to the other methods in terms of accuracy and stability.
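The three ingredients named above (binary conversion, chaotic mapping, and an inertia weight) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the sigmoid transfer function, the logistic map used for initialisation, the linearly decreasing inertia weight, and the stand-in `toy_fitness` are all assumptions. In a real wrapper setup the fitness would come from training a classifier such as an SVM on the selected features.

```python
import math
import random


def logistic_map(x, r=4.0):
    """One step of the logistic chaotic map on (0, 1)."""
    return r * x * (1.0 - x)


def sigmoid(x):
    """S-shaped transfer function: continuous position -> probability of bit 1."""
    x = max(-60.0, min(60.0, x))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))


def binarize(position, rng):
    """Convert a continuous position into a 0/1 feature mask."""
    return [1 if rng.random() < sigmoid(v) else 0 for v in position]


def toy_fitness(mask, relevant):
    """Hypothetical stand-in for a wrapper fitness (lower is better):
    missed relevant features play the role of classification error,
    plus a small penalty on the number of selected features."""
    if not any(mask):
        return 1.0
    missed = sum(1 for r, m in zip(relevant, mask) if r and not m)
    error = missed / max(1, sum(relevant))
    return 0.99 * error + 0.01 * sum(mask) / len(mask)


def binary_ssa(fitness, dim, n_salps=20, n_iter=60, lb=-4.0, ub=4.0,
               w_max=0.9, w_min=0.4, seed=0):
    rng = random.Random(seed)
    # Chaotic initialisation: iterate the logistic map to spread salps over [lb, ub].
    c = 0.7
    swarm = []
    for _ in range(n_salps):
        pos = []
        for _ in range(dim):
            c = logistic_map(c)
            pos.append(lb + c * (ub - lb))
        swarm.append(pos)

    best_mask, best_fit, food = None, float("inf"), swarm[0][:]
    for t in range(1, n_iter + 1):
        # Evaluate binarised positions and update the food source (best so far).
        for pos in swarm:
            mask = binarize(pos, rng)
            f = fitness(mask)
            if f < best_fit:
                best_mask, best_fit, food = mask, f, pos[:]
        c1 = 2.0 * math.exp(-((4.0 * t / n_iter) ** 2))  # standard SSA coefficient
        w = w_max - (w_max - w_min) * t / n_iter         # assumed inertia schedule
        for i, pos in enumerate(swarm):
            for j in range(dim):
                if i == 0:
                    # Leader explores around the food source.
                    step = c1 * ((ub - lb) * rng.random() + lb)
                    pos[j] = food[j] + step if rng.random() >= 0.5 else food[j] - step
                else:
                    # Follower: inertia-weighted average with the salp ahead of it.
                    pos[j] = 0.5 * (w * pos[j] + swarm[i - 1][j])
                pos[j] = max(lb, min(ub, pos[j]))
    return best_mask, best_fit


relevant = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]  # hypothetical ground truth for the toy fitness
mask, fit = binary_ssa(lambda m: toy_fitness(m, relevant), dim=len(relevant))
print(mask, round(fit, 4))
```

The fixed `seed` makes runs repeatable, which is convenient when comparing against other binary optimizers under the same evaluation budget.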

By analyzing bearing vibration signals and identifying the fault type, the method addresses the industrial bearing fault problem effectively, reducing the rate of fault misidentification and allowing faults to be repaired more efficiently.

It should be noted that this method has so far been validated only on signals acquired on a test rig. In practical engineering, the signals emitted by faulty equipment will be even weaker and more complex. Therefore, further signal preprocessing is needed before the method can be applied in practice. This is our future research direction.

References

1. Gao, Zhang and Pei, Rolling bearing fault diagnosis based on SSA optimized self-adaptive DBN, ISA Transactions 128(B) (2022), 485–502.

2. Zhang and Yang, Rolling bearing faults identification based on multiscale singular value, Advanced Engineering Informatics 57 (2023), 102040.

3. Zheng, Cao and Pan, Spectral envelope-based adaptive empirical Fourier decomposition method and its application to rolling bearing fault diagnosis, ISA Transactions 129 (2022), 476–492.

4. Alsalaet J.K., Hajnayeb and Bahedh A.S., Bearing fault diagnosis using normalized diagnostic feature-gram and convolutional neural network, Measurement Science and Technology 34(4) (2023), 045901.

5. Joint learning system based on semi-pseudo-label reliability assessment for weak-fault diagnosis with few labels, Mechanical Systems and Signal Processing 189 (2023), 110089.

6. Gao, Zhang and Pei, Rolling bearing fault diagnosis based on SSA optimized self-adaptive DBN, ISA Transactions 128 (2022), 485–502. doi: 10.1016/j.isatra.2021.11.024. https://www-sciencedirect-com-443.web.bisu.edu.cn/science/article/pii/S0019057821005942.

7. Xue, Fan and Cao, Efficient analytical modeling for pulsed eddy current signals using adaptive interpolation-based Fourier transform, Nondestructive Testing and Evaluation 38 (2023), 631–647.

8. Wang, Cui, Liu and Shen, High-G MEMS Accelerometer Calibration Denoising Method Based on EMD and Time-Frequency Peak Filtering, Micromachines 14(5) (2023), 970.

9. Luo, Chen, Huang and Zhang, Joint Application of VMD and IWOA-PNN for Gearbox Fault Classification via Current Signal, IEEE Sensors Journal 23(12) (2023), 13155–13164.

10. Liu, Jin, Chen and Shan, Fault diagnosis of flywheel bearing based on parameter optimization variational mode decomposition energy entropy and deep learning, Energy 239 (2022), 122108.

11. Sadeghian, Akbari and Nematzadeh, A hybrid feature selection method based on information theory and binary butterfly optimization algorithm, Engineering Applications of Artificial Intelligence 97 (2021), 104079.

12. Wang, Xue, Liang and Zhang, Feature clustering-Assisted feature selection with differential evolution, Pattern Recognition 140 (2023), 109523.

13. Zhou, Zhu and Jiao, A two-stage hybrid ant colony optimization for high-dimensional feature selection, Pattern Recognition 116 (2021), 107933.

14. Dinov I.D., Variable importance and feature selection, in: Data Science and Predictive Analytics: Biomedical and Health applications using R, Springer, 2023, pp. 579–639.

15. Bairathi and Gopalani, An Improved Salp Swarm Algorithm for Complex Multi-Modal Problems, Soft Computing 25(15) (2021), 10441–10465.

16. Mishra and Panigrahi R.R., Advanced signal processing and machine learning techniques for voltage sag causes detection in an electric power system, International Transactions on Electrical Energy Systems 30(1) (2020), 12167.

17. Deng, Zhang and Zhao, Intelligent identification of incipient rolling bearing faults based on VMD and PCA-SVM, Advances in Mechanical Engineering 14(1) (2022), 1–18.

18. Aiswarya, Nair D.S., Rajeev and Vinod, A novel SVM based adaptive scheme for accurate fault identification in microgrid, Electric Power Systems Research 221 (2023), 109439.

19. Nadji Hadroug, Iratni, Hafaifa, Bachir and Colak, Implementation of Vibrations Faults Monitoring and Detection on Gas Turbine System Based on the Support Vector Machine Approach, International Journal of Engineering Technologies IJET (2023).

20. Wang, Research on Fault Diagnosis of Gearbox with Improved Variational Mode Decomposition, Sensors 18(10) (2018), 3510.

21. Wang, Liu and Huang, Removal of AM-FM harmonics using VMD technology for operational modal analysis of milling robot, Mechanical Systems and Signal Processing 200 (2023), 110475.

22. Mirjalili, Gandomi A.H., Mirjalili S.Z., Saremi, Faris and Mirjalili S.M., Salp Swarm Algorithm: A bio-inspired optimizer for engineering design problems, Advances in Engineering Software 114 (2017), 163–191.

23. Lei, Fault diagnosis of rotating machinery based on multiple ANFIS combination with GAs, Mechanical Systems and Signal Processing 21(5) (2007), 2280–2294.

24. Smith W.A. and Randall R.B., Rolling element bearing diagnostics using the Case Western Reserve University data: A benchmark study, Mechanical Systems and Signal Processing 64–65 (2015), 100–131.

25. Derrac, García, Molina and Herrera, A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms, Swarm and Evolutionary Computation 1(1) (2011), 3–18.

26. Grandini, Bagli and Visani, Metrics for multi-class classification: an overview, arXiv preprint arXiv:2008.05756 (2020).

27. Wang and Mao, A Bearing Fault Diagnosis Method Based on Wavelet Denoising and Machine Learning, Applied Sciences 13(10) (2023), 5936.

28. Yan and Jia, Intelligent fault diagnosis of rotating machinery using improved multiscale dispersion entropy and mRMR feature selection, Knowledge-Based Systems 163 (2019), 450–471.

29. Yue, Aidong, Kai, Xiaojia, Haifeng and Wei, A novel bearing fault diagnosis method based on principal component analysis and BP neural network, 2019 14th IEEE International Conference on Electronic Measurement & Instruments (ICEMI) (2019), 1125–1131.