Deep convolution neural network for machine health monitoring using spectrograms of vibration signal and its EMD-intrinsic mode functions

Abstract

A spectrum-image-based representation of machine vibration signals combined with a deep convolution neural network is proposed for machine fault classification, in which the convolution layer performs automatic feature extraction as an alternative to conventional feature-based methods. Two forms of spectrum representation are proposed: one based on the short time Fourier transform of the original signals, and the other based on the short time Fourier transform of the intrinsic mode functions obtained by empirical mode decomposition. Empirical mode decomposition has its own merits in discriminating non-stationary signals, and the novelty of the work lies in using the short time Fourier transform of intrinsic mode functions with a deep convolution neural network model. The classification and validation accuracies of the model are investigated with respect to epochs. Both spectrum-based techniques perform well, with 100% model accuracy, in a numerical experiment of binary classification on a bearing dataset that comprises normal and faulty signals. In another experiment using a milling dataset, the short time Fourier transform of intrinsic mode functions performs better, with a training accuracy of 100% and an F1 score of 0.8933, compared with the short time Fourier transform of raw signals, whose training accuracy is 64% and F1 score is 0.7486. The numerical study shows that the empirical mode decomposition based spectrum representation delivers the highest accuracy in the learning model, obviating the need for independent feature extraction, feature selection, and dimension reduction. The numerical experiment is extended to multi-class classification on the bearing dataset using empirical mode decomposition based spectrums. The confusion matrix obtained for 10 classes shows that the validation accuracy is 100% for all classes. The performance comparison highlights the merits of the empirical mode decomposition spectrum method over other state-of-the-art methods.

Keywords

Convolutional neural network (CNN); empirical mode decomposition (EMD); intrinsic mode function (IMF); short-time Fourier transform (STFT)

1 Introduction

Vibration analysis is a powerful approach to condition monitoring of industrial machines and processes, used to identify and predict failures. The majority of portable vibration measuring devices have a Fast Fourier Transform (FFT) processing mechanism to analyse the vibrations. Each flaw produces its own vibration signature, which depends on the severity and location of the machine problem [4]. Machinery investigation is required for the endurance of rotating machine parts, as it allows continuous information on their status to be recorded and optimized maintenance procedures to be scheduled. This process is named predictive maintenance. Predictive maintenance on rotating equipment frequently uses vibration monitoring. Unbalanced bearings, gears, and other components are the sources of the anomalies in the machine’s vibration. The time for repair can be precisely calculated if a condition monitoring and predictive maintenance system is put in place, since it can detect in advance the parts that are about to malfunction. Figure 1 shows the benefits of predictive maintenance.

Fig. 1

Benefits of predictive monitoring.

Rolling elements generate vibration signals that carry a wealth of information that can aid condition monitoring, fault diagnosis, and machine health management by identifying the possibility of a series of failures. Placing accelerometers at specific locations to quantify vibration is a typical practice. Machine learning techniques, particularly neural networks, have recently become popular in feature engineering to extract useful information from vibration signals. Fault diagnosis is a pattern recognition problem that consists of two main steps: feature extraction and classification [15]. Feature extraction involves computing statistical and signal parameters and requires many computations and pre-processing steps [29]. Moreover, it is a challenge to select the right features, and a wrong selection of features can affect the learning accuracy [15]. Feature selection also introduces bias in the learning process [7]. The deep convolution neural network (DCNN) is a popular machine learning method for image classification and segmentation and is experimentally proven to give good classification results [18]. In a DCNN, the convolution layers extract the useful features directly from the input images, so a separate feature extraction stage is not required.

Much research has been carried out [27] to represent vibration signals in the form of images and to learn the images using machine learning algorithms to classify machine faults. The most primitive method is to reshape the one-dimensional time-domain vibration signals into two-dimensional images and to use the images to train a DCNN model [17]. A frequency-domain image representation discriminates the good and faulty classes better, and modern approaches such as wavelet analysis have been used in recent work to create signal images from time-domain vibration signals [27]. In this research, we propose to represent vibration signals as spectrums and apply DCNN directly on the spectrum images, eliminating the feature extraction signal processing block. In a recent work, sub-spectrograms of vibration signals are used with DCNN with improved classification accuracy [31]. Many researchers are attracted by the method of analysing vibration signals using empirical mode decomposition (EMD), which has proven robust in bringing out machine faults [5] in non-stationary signals with the help of intrinsic mode functions (IMF). The IMF signals of the EMD analysis are time-domain signals, and in one work [21] the entropy of each IMF signal is calculated and the set of entropies is used as additional features along with the features learned by a CNN from the Fourier spectrums of the original vibration signals. In this paper, the following research question is raised in the setting of DCNN learning: will learning the spectrum images of the time-domain IMF signals obtained by EMD analysis of the vibration signal lead to more accurate classification than learning the spectrum images of the original vibration signals? It is expected that the EMD-based method will perform better, as EMD is a better analysis tool for non-stationary signals. It is also expected that the proposed method will benefit from the automatic feature extraction capability of DCNN. The objectives of the proposed research are to show the efficacy of the proposed CNN method of using spectrum images of EMD-IMF to classify machine faults for a benchmark vibration dataset and to confirm the repeatability of the precision of the proposed method for different kinds of datasets.

In this paper, the short-time Fourier transform (STFT) is investigated as the spectrum image representation technique for vibration signals, and the images are used in a DCNN to develop a model that discriminates the good and faulty conditions of the machine. It is proposed to integrate EMD analysis with the imaging technique to represent the signal fault distinctly. The proposed STFT imaging is performed in two ways: one on the original vibration signal and the other on the IMF signals obtained using EMD. The images obtained are used to train the DCNN to classify normal and faulty signals, and the experiment is conducted on two different datasets to obtain models with better training and validation accuracies than other state-of-the-art methods. This research answers the research question by bringing out the merits of EMD-based spectrum imaging of vibration signals. The experimental results show that the EMD-based spectrum imaging technique used in DCNN is very promising and gives almost 100% classification accuracy. The validation accuracy plotted over epochs indicates that the results obtained for EMD-IMF₀ are far better than those of the other IMFs and also better than the direct method that uses STFT analysis of the original vibration signals. To confirm the efficacy of the method, the numerical experiment is repeated on another small dataset, the milling dataset, to classify milling tools as good or bad using the proposed approach. The selected milling dataset is not an ideal problem, as it contains many dissimilar images in the same class and its signals are highly non-stationary. In one trial of the numerical experiments, learning the spectrum of the EMD-IMF₀ signal gives a validation accuracy of 81.25%, which is better than the direct method based on STFT of the original signal, whose validation accuracy is around 50%. These classification methods are analysed statistically for the milling data by performing 50 trials of the same CNN experiment, and EMD-IMF₀ is found to give the best average validation accuracy of 74.29%. There are 42 testing samples in the experiment, and under the binomial-distribution-based sign test [10] a minimum of approximately 28 wins is required to have p < 0.05. The experiment shows that the EMD-IMF₀-based method has 37 wins and is superior to the direct method, which has only 28 wins.

The best method, using EMD-IMF₀ spectrums in DCNN, is extended to a multiple-fault classification problem on the bearing dataset with 10 different fault classes. It is also found to be very accurate, giving a diagonal confusion matrix that implies 100% validation accuracy. The experimental results obtained for the CWRU bearing dataset are compared with other state-of-the-art methods that used the same CWRU dataset, and it is inferred that the proposed CNN learning method on EMD-IMF₀-based spectrum images outperforms all the other methods. The objective of the research is to validate the proposed idea on two different datasets, and the objective is achieved by bringing out the merits of highly accurate DCNN learning on STFT images of the EMD-IMF₀ signals. The novelty of the paper is the idea of using the STFT image representation of the IMF signals of EMD analysis in DCNN, which is a leading tool for image classification. In previous work on EMD-based vibration analysis, the IMF signals are not used completely; only the energy of the signals is used as a feature vector in the machine learning problem [26]. The proposed work differs in that the whole spectrum information of the IMF signals is used. The proposed method of using complete STFT images of EMD intrinsic mode functions is novel and highlights the merits of EMD analysis of vibration signals in the machine learning problems of machine monitoring.

2 Related works

The aim of this research is to study efficient methods of representing vibration signals produced by machines in the form of images that can be used for accurate fault classification using DCNN. As DCNN is a very good classifier for visual images, the method is expected to lead to very accurate machine fault classification. Case Western Reserve University’s (CWRU) bearing vibration signal database [33] is a popular benchmark dataset used by many researchers [8,9,11,13,14,16,18,26–28], and in this work it is proposed to use the same CWRU dataset. The simplest method of representing vibration signals as images is to arrange the one-dimensional time-domain signal obtained from the dataset as a two-dimensional image and to use it in CNN learning. In a recent work [17], the one-dimensional CWRU vibration data are reshaped into two-dimensional time-domain images and used in a CNN to obtain an accuracy of more than 96%. In another work [13], a similar method of reshaped time-domain images is used with an AlexNet-based transfer learning architecture and an accuracy of 98% is achieved. Frequency-domain analysis works better than time-domain analysis, as it brings out many hidden details of the signal. In another work [18], the frequency-domain sub-spectrograms of the vibration signals are used as images in a similar AlexNet-based CNN transfer learning architecture and a classification accuracy of 99% is demonstrated. The key merit of both methods [17,18] is that they use a transfer learning architecture and are thus characterised by very high accuracy, leveraging the previous learning done in AlexNet. In another work [27], the continuous wavelet transform (CWT) is applied to obtain signal spectrums, and using these in a CNN a very high accuracy of 99% is demonstrated. Using the IMFs obtained by EMD of vibration signals for further analysis is attracting many researchers, as it is much better suited to non-stationary signals [8,9]. In another work [26], the basic entropy feature of the IMF signals and other time-domain features are used along with CNN-learned features in an SVM-based classifier and an accuracy of 99% is demonstrated. That work is based on a multi-stage architecture in which the CNN learns features from STFT images of the vibration signals, and these are used along with the EMD-IMF energies and other time-domain features in an SVM classifier for further classification. The IMFs of the EMD of a signal are time-domain signals, and as a novel method this paper proposes to use STFT spectrum images of the IMF signals in DCNN for fault classification. It is proposed to test STFT-based spectrums of the raw signal and also the STFT of the IMFs of the EMD of vibration signals for better classification of machine faults. To demonstrate the robustness of the proposed EMD-IMF method and its merit over the raw-signal STFT method, a milling dataset [34] is also used in this paper. This dataset comprises many other data along with the vibration signal to study the wear and tear of the tool used in the milling process, and it has been used in many works [1,3,6,12,19,22,24,25,30,32] to develop machine learning algorithms that estimate the remaining useful life (RUL) of the tool using all the data columns of the database. In this work, only the spindle vibration signal is used to classify the tool as good or faulty.

This extended experiment is to verify the repeatability of the proposed method of CNN learning on spectrum images of EMD-IMF signals to classify the tools as good or faulty.

3 Proposed work

In this research, the CNN directly learns the useful parameters to classify the vibration sample signal, thus eliminating a separate feature extraction stage. The STFT is applied to both the original signal and EMD-IMF signals on two different data sets, the bearing data set [33] and the milling data set [34] which are discussed in previous section. The process involved in the proposed experiment is represented as an algorithm using standard block diagram structure as in Fig. 2.

Fig. 2

Algorithm of proposed work.

3.1 Creation of dataset images for vibration signals

3.1.1 Spectrum image formation method

The vibration signals corresponding to good and faulty conditions are fetched from the dataset and subdivided into K overlapping short signals, each lasting T time units. Every such signal is considered a member of either the good or the faulty class. The spectrum image for a member signal is then obtained by applying STFT with FFT length N and 50% overlap to two different forms of the signal: (i) the original signal and (ii) the M intrinsic mode functions obtained by EMD. K depends on the dataset selected; in this study, two datasets are considered [33,34]. In case (i), STFT transforms the original time-domain signal seen through the window into the frequency domain, and peaks are formed corresponding to the spectrum components. In a grayscale image, the peaks appear as white points. STFT is complex-valued, and its magnitude is used in the spectrum display. The spectrum depicts the change of frequency content with respect to time. Figure 3 shows the spectrum image formation using STFT for case (i). In case (ii), empirical mode decomposition decomposes the time-domain signal into separate intrinsic mode functions and STFT is applied to each function. The minute details of the time-domain signal are distributed among the IMFs. Each IMF is still a time-domain function, and the IMFs serve as an orthogonal basis for the signal [2]. Figure 4 shows the spectrum image formation using EMD.
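The image formation of case (i) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the Hann window, the naive DFT, the toy 1.5 kHz tone, the short FFT length of 64, and the function names `stft_magnitude` and `to_grayscale` are all assumptions made for illustration; in practice a library routine such as `scipy.signal.stft` would normally be used with N = 256.

```python
import cmath
import math

def hann(n):
    """Periodic Hann window of length n (window choice is an assumption)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def stft_magnitude(signal, n_fft=256, overlap=0.5):
    """Return |STFT| as a list of frames, each with n_fft // 2 + 1 bins."""
    hop = int(n_fft * (1 - overlap))  # 50% overlap gives a hop of n_fft / 2
    window = hann(n_fft)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        seg = [signal[start + i] * window[i] for i in range(n_fft)]
        # Naive DFT of the windowed segment, non-negative frequencies only.
        frames.append([
            abs(sum(seg[t] * cmath.exp(-2j * math.pi * k * t / n_fft)
                    for t in range(n_fft)))
            for k in range(n_fft // 2 + 1)
        ])
    return frames

def to_grayscale(frames):
    """Scale magnitudes to 0..255 so that spectral peaks appear white."""
    peak = max(max(row) for row in frames) or 1.0
    return [[round(255 * v / peak) for v in row] for row in frames]

# Toy input (assumed values): a 1.5 kHz tone sampled at 12 kHz.
fs, f0, n_fft = 12000, 1500, 64
x = [math.sin(2 * math.pi * f0 * i / fs) for i in range(4 * n_fft)]
image = to_grayscale(stft_magnitude(x, n_fft=n_fft))
```

Each row of `image` is one time frame and each column one frequency bin; for the toy tone the brightest pixel of every frame falls in bin f0 · n_fft / fs = 8.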

Fig. 3

Spectrum image formation using STFT.

Fig. 4

Spectrum image formation using STFT-EMD.
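For case (ii), the core of EMD is the sifting step, which subtracts the mean of the upper and lower envelopes from the signal. The sketch below shows a single sifting pass under simplifying assumptions: the envelopes are piecewise linear, whereas standard EMD uses cubic splines, and the names `local_extrema`, `envelope` and `sift_once` are hypothetical. A full decomposition would repeat the pass until a stopping criterion is met to obtain IMF₀, subtract IMF₀ from the signal, and repeat on the residue for the later IMFs.

```python
import math

def local_extrema(x):
    """Indices of strict interior local maxima and minima."""
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] < x[i + 1]]
    return maxima, minima

def envelope(x, knots):
    """Piecewise-linear envelope through x at the knots, flat beyond the ends.

    Simplification: standard EMD interpolates with cubic splines instead.
    """
    env = [x[knots[0]]] * len(x)
    for a, b in zip(knots, knots[1:]):
        for i in range(a, b + 1):
            w = (i - a) / (b - a)
            env[i] = (1 - w) * x[a] + w * x[b]
    for i in range(knots[-1], len(x)):
        env[i] = x[knots[-1]]
    return env

def sift_once(x):
    """One sifting pass: subtract the mean of the two envelopes."""
    maxima, minima = local_extrema(x)
    if len(maxima) < 2 or len(minima) < 2:
        return list(x)  # too few extrema to build envelopes
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    return [xi - 0.5 * (u + l) for xi, u, l in zip(x, upper, lower)]

# Toy signal (assumed): a fast oscillation riding on a slow linear trend.
x = [math.sin(2 * math.pi * i / 8) + 0.01 * i for i in range(80)]
h = sift_once(x)
```

Away from the signal edges, `h` recovers the fast oscillation and the slow trend is left to the residue, which is the behaviour that lets the first IMFs isolate the high-frequency content of a vibration signal.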

3.1.2 Spectrum image formation for bearing data set

To show the efficiency of our approach, experiments were performed on the bearing vibration signal database of Case Western Reserve University (CWRU). Figure 5 depicts the components of the test stand, which include a 2 HP motor, a torque transducer, control electronics, and a dynamometer. The test bearings support the motor shaft. The test bearings were subjected to electro-discharge machining to produce single-point flaws with diameters of 7, 14, 21, and 28 mils (one mil equals 0.001 inch). For drive-end bearing failures, digital data were captured at a rate of 12,000 samples per second. The torque transducer was used to measure speed and horsepower, which had to be manually recorded. Vibration data were measured using accelerometers. According to reference [33], the bearing vibration signal database includes numerous signal components and specifications. It covers a range of fault types, such as inner race (IR), ball (BA) and outer race (OR) faults. In this experiment, the normal signal and the BA fault were taken into consideration. In the proposed experiment, drive end (DE) accelerometer data are used for analysis of the vibration signals corresponding to normal conditions and the BA fault.

Fig. 5

Block diagram of experimental setup of CWRU bearing dataset.

As explained in section 3.1.1 case (i), the original DE vibration signals of the bearing dataset corresponding to normal and BA fault are transformed to image data with N = 256, giving 4737 normal images and 1096 images corresponding to BA fault as shown in Table 1. A sample of normal and faulty spectrum images is given in Fig. 6.

Table 1

Configuration of normal and ball fault bearing dataset

Data set images	Normal	Fault
Total images	4737	1096
Training images	3553	821
Testing images	1184	275

Fig. 6

(a) STFT normal. (b) STFT fault.

As explained in section 3.1.1 case (ii), the original DE vibration signals of the bearing dataset are decomposed into M components using EMD, the first 4 IMF signals are considered, and STFT is applied to every IMF signal to transform it to image data with N = 256, giving 4737 × 4 normal images and 1096 × 4 images corresponding to BA fault as shown in Table 2. A sample of normal and faulty spectrum images is given in Fig. 7 for all 4 IMFs. The columns of each image correspond to frequency, so the frequency information varies along the x axis. These images are obtained for just one good sample and one faulty sample, and such images can be used in DCNN for classification. A quick analysis reveals that the frequency component distribution in Fig. 7e for the faulty signal resembles a noisy pattern.

Table 2

Configuration of normal and ball fault bearing dataset IMF(0-3)

Intrinsic mode function	IMF₀		IMF₁		IMF₂		IMF₃
Data set images	Normal	Fault	Normal	Fault	Normal	Fault	Normal	Fault
Total images	4737	1096	4737	1096	4737	1096	1240	1096
Training images	3553	821	3553	821	3553	821	930	821
Testing images	1184	275	1184	275	1184	275	310	275

Fig. 7

(a) STFT of IMF₀ of normal bearing. (b) STFT of IMF₁ of normal bearing. (c) STFT of IMF₂ of normal bearing. (d) STFT of IMF₃ of normal bearing. (e) STFT of IMF₀ of fault bearing. (f) STFT of IMF₁ of fault bearing. (g) STFT of IMF₂ of fault bearing. (h) STFT of IMF₃ of fault bearing.

3.1.3 Spectrum image formation for milling data set

Another dataset considered to demonstrate the efficacy of the proposed method is the milling data [34]. The milling data comprises a 1×167 struct array with fields: Case, run, VB, time, DOC, feed, material, smcAC, smcDC, vib_table, vib_spindle, AE_table, AE_spindle. This struct array corresponds to 167 recordings of the experiment, and of these fields we use the vibration data vib_spindle for analysis. For every recording, the VB data represents the flank wear of the tool at the end of the experiment. In our analysis, we considered two cases, Normal and Fault. The data with less wear and tear, corresponding to VB in the range (0-2), are considered Normal, and the data with high wear and tear, corresponding to VB > 3, are considered Fault.
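The labelling rule above can be written as a small function. This is a sketch under stated assumptions: the function name `label_by_flank_wear`, the treatment of the boundary value VB = 2 as Normal, the exclusion of recordings that fall between the two thresholds or have no VB value, and the toy records are all illustrative and are not taken from the dataset.

```python
def label_by_flank_wear(vb, normal_max=2.0, fault_min=3.0):
    """Map a flank-wear value VB to 'Normal', 'Fault', or None (excluded)."""
    if vb is None or vb != vb:  # missing value or NaN
        return None
    if 0 <= vb <= normal_max:   # VB in the range (0-2)
        return "Normal"
    if vb > fault_min:          # VB > 3
        return "Fault"
    return None                 # between the thresholds: not used

# Toy records with made-up VB values, for illustration only.
records = [{"run": 1, "VB": 0.5}, {"run": 2, "VB": 2.5}, {"run": 3, "VB": 3.5}]
labels = [label_by_flank_wear(r["VB"]) for r in records]
```

Leaving a gap between the two thresholds keeps borderline wear levels out of both classes, although each class still spans a range of wear values.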

As explained in section 3.1.1 case (i), the original Milling vibration signals are transformed to image data with N = 256, to get 54 normal images and 70 fault images as shown in Table 3.

Table 3
Configuration of normal and fault Milling dataset

Data set images	Normal	Fault
Total images	54	70
Training images	34	48
Testing images	20	22

As explained in section 3.1.1 case (ii), the original milling vibration signals are decomposed into M components using EMD, the first 4 intrinsic mode function signals are considered, and STFT is applied to every component signal to transform it to image data with N = 256, giving 54 × 4 normal images and 70 × 4 fault images as shown in Table 4.

4 Proposed numerical experiment

In this paper, it is proposed to analyse the raw vibration data represented in the form of a spectrum image to determine the condition of the machine. This eliminates the need for a separate feature extraction process. Since the frequency spectrum captured over time comprises the complete details of the signal of interest, we propose to use the spectrum images to learn the machine condition. A binary classification problem is assumed for vibration data samples obtained in two use cases: (i) a bearing setup with two classes of signals, normal and faulty, and (ii) a milling process with two classes of signals, normal (less wear and tear of the tool) and faulty (more wear and tear). The first use case uses the CWRU dataset and is a straightforward problem, as the dataset contains normal and fault vibration signals. The second use case is slightly more complex, as samples with a wear and tear factor less than 2 are assumed to form the normal class and those greater than 3 the fault class. In this use case, both the normal and fault classes accommodate signals with a range of wear and tear factors, which makes the classification difficult, and the samples of this dataset are highly non-stationary. Further, it is proposed to use two types of STFT-based spectrum image techniques: (a) applying STFT to the raw signals and (b) applying STFT to the IMFs of the raw signals obtained using EMD. Recent work [21] demonstrates that EMD analysis of vibration signals is very promising for determining the machine condition, and our proposed work aims to demonstrate the merits of EMD-based spectrum imaging techniques. It is proposed to use the spectrum images obtained using (a) & (b) in machine learning problems to classify the samples as normal or fault for the use cases (i) & (ii).
DCNN is a promising AI architecture that gives robust image classification results, as good as humans and in some cases better than humans [23]. As the approach is image-based classification, DCNN suits the numerical demonstration well. The convolutional neural network (CNN) is a feed-forward neural network that preserves the connectivity pattern of neurons. Its design is inspired by the nervous system of living creatures, and it now has applications in machine vision, recommender systems, and natural language processing. The weights of the convolution operation are linked (shared), so they can be learned during training.

CNNs have a remarkable capacity to learn features in a robust and sensitive manner, since they automatically obtain the properties of the images through convolutional filters. The three layer types of the CNN architecture are the convolutional layer, the sub-sampling layer, and the fully connected layer. In a CNN, categorization is done in a single body, removing the need for manual handling while extracting data features. The CNN model parameters used in this numerical demonstration are given in Table 5. The suggested deep CNN model is composed of three stages. Convolution, activation, and pooling layers are included in each stage, which represents a feature learning stage with various feature levels. The proposed deep CNN model contains two convolutional layers with 64 filters of size (3,3) and 128 filters of size (3,3), respectively, as illustrated in Fig. 8. Additionally, two max-pooling layers with a 2×2 pooling size were utilised. To feed the input images into the CNN model, they are resized to 64×64. The ReLU (Rectified Linear Unit) activation function was used to add nonlinearity to each level, enabling the CNN to train sophisticated models. Subsampling was introduced through pooling layers, and the proposed model used max-pooling to reduce the number of extracted features and to minimise overfitting. The feature maps were flattened into a column vector after the three steps. A feed-forward neural network received the flattened output vector, and backpropagation was used during each training cycle. The suggested model was able to distinguish between the dominant and low-level features in texture images during training and classify them using a fully connected layer. A gradient-based optimization technique was utilised to estimate the parameters of the suggested model: in order to obtain quicker convergence, the parameters were updated using the Adam optimizer.
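The feature-map sizes implied by this description can be traced with a short calculation. The sketch assumes unpadded ("valid") convolutions with stride 1 and a 3-channel 64×64 input, which the text does not state explicitly; the helper names are hypothetical, and the dense layers are left out because their sizes are not given.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Spatial output size of a convolution (assumes 'valid' padding)."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, pool=2):
    """Spatial output size of non-overlapping max-pooling."""
    return size // pool

def conv_params(in_ch, out_ch, kernel=3):
    """Trainable parameters of a conv layer: weights plus one bias per filter."""
    return kernel * kernel * in_ch * out_ch + out_ch

size, channels = 64, 3          # assumed input: 64x64 image with 3 channels
shapes, total_params = [], 0
for filters in (64, 128):       # the two convolutional layers in the text
    total_params += conv_params(channels, filters)
    size = pool_out(conv_out(size))   # 3x3 conv followed by 2x2 max-pool
    channels = filters
    shapes.append((size, size, channels))
flattened = size * size * channels    # length of the vector fed to Dense
```

Under these assumptions the spatial size goes 64 → 62 → 31 → 29 → 14, so the flattened vector passed to the fully connected part has 14 × 14 × 128 = 25088 entries.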

Table 5
CNN model parameters

Parameters	Values
Model Type	Sequential
Layers	Conv2D, Max-pooling2D, Flatten, Dense
Activation Layer	Relu, sigmoid
Class Mode	Binary
Loss	Binary Cross Entropy
Flip Type	Horizontal
Metrics	Accuracy
Optimizer	Adam
Kernel Size	3×3
Target size	64, 64
Rescale size	1./255
Batch Size	32
Epoch	100
Filter Size	64,64,3
Shear Range	0.2
Zoom Range	0.2
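Table 5 lists a sigmoid output activation with binary cross-entropy loss. As a reminder of what that loss measures, the following is a minimal sketch; the clipping constant `eps` and the function names are assumptions made for illustration, not the framework's internals.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy; eps clipping avoids log(0) (assumed value)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

# An uninformative classifier (p = 0.5 everywhere) has a loss of ln 2.
chance_loss = binary_cross_entropy([1, 0], [0.5, 0.5])
# Confident, correct predictions drive the loss towards zero.
good_loss = binary_cross_entropy([1, 0], [sigmoid(10), sigmoid(-10)])
```

A loss near ln 2 ≈ 0.693 therefore corresponds to chance-level output on a binary problem, while values close to zero indicate confident, correct predictions.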

Fig. 8

CNN architecture.

Principally using this CNN model, learning experiments were conducted for the following cases:

(1) ball bearing use case with STFT done for original raw data using the image dataset in Table 1.

(2) ball bearing use case with STFT done for EMD-IMF (0-3) using the image dataset in Table 2.

(3) milling use case with STFT done for original raw data using the image dataset in Table 3.

(4) milling use case with STFT done for EMD-IMF (0-3) using the image dataset in Table 4.

The results of CNN model learning accuracies for (1) & (2) are tabulated in Table 6. For all the cases in Table 6, the classification accuracy and validation accuracy are plotted against epoch; Fig. 9 shows the accuracy and validation accuracy for the bearing dataset for both STFT and all IMFs in a stack plot for better visualisation.

Table 4

Configuration of normal and fault Milling dataset IMF(0-3)

Intrinsic mode function	IMF₀		IMF₁		IMF₂		IMF₃
Data set images	Normal	Fault	Normal	Fault	Normal	Fault	Normal	Fault
Total images	54	70	54	70	54	70	54	70
Training images	34	48	34	48	34	48	34	48
Testing images	20	22	20	22	20	22	20	22

Table 6

Model accuracies for CNN classification of normal & fault for bearing dataset

Bearing	Original spectrum	IMF₀ Spectrum	IMF₁ Spectrum	IMF₂ Spectrum	IMF₃ Spectrum
Loss	4.14*10^-12	1.26*10^-8	0.0339	2.40*10^-4	7.16*10^-9
Accuracy	1	1	1	1	1
Validation-loss	5.55*10^-12	1.25*10^-8	0.0343	3.11*10^-4	3.15*10^-9
Validation-accuracy	1	1	1	1	1

Fig. 9

Consolidated stack plot of model accuracy with respect to epochs for bearing dataset using STFT spectrum of IMF₀, IMF₁, IMF₂, IMF₃ and the STFT spectrum of raw data. (a) Training accuracy. (b) Validation accuracy.

The results of CNN learning accuracies for (3) & (4) are tabulated in Table 7. For all the cases in Table 7, the classification accuracy, validation accuracy, classification loss and validation loss are plotted against epoch; Figs. 10 & 11 show the accuracy, validation accuracy, training loss and validation loss for the milling dataset for both STFT and all IMFs in a stack plot for better visualisation.

Table 7

Model accuracies for CNN classification of milling dataset

Milling	Original spectrum	IMF₀ Spectrum	IMF₁ Spectrum	IMF₂ Spectrum	IMF₃ Spectrum
Loss	0.6257	0.0039	0.1019	0.0342	0.1544
Accuracy	0.64	1	0.9844	0.98	0.98
Validation-loss	0.7668	0.4469	1.3704	0.682	1.1628
Validation-accuracy	0.4062	0.8125	0.7188	0.75	0.6875

Fig. 10

Model accuracy & validation-accuracy with respect to epochs using the STFT spectrum of raw data, STFT spectrum of IMF₀, IMF₁, IMF₂, IMF₃ of milling dataset.

Fig. 11

Model loss & validation-loss with respect to epochs using the STFT spectrum of raw data, STFT spectrum of IMF₀, IMF₁, IMF₂, IMF₃ of milling dataset.

5 Result and discussion

So far, the numerical experiments conducted build DCNN models for binary classification, and the inferences from the results obtained are discussed here. The multiclass extension of the problem is also addressed for the bearing dataset, and the performance of the proposed method is compared with other state-of-the-art methods.

5.1 Inference from the numerical experiments for binary classification

From the classification accuracy in Table 6 and Fig. 9 for the bearing dataset, it can be inferred that both forms of spectrum formation are very promising. The classification and validation accuracies are found to be 100% when STFT is applied to the original signal and also when STFT is applied to the EMD-IMFs. This 100% accuracy is achieved without any feature extraction stage, since the convolution layers of the DCNN function as a feature extractor. The stack plot in Fig. 9 shows that the results obtained using EMD-IMF₀ are consistently good through all epochs, but overall all the methods have similar performance. As STFT on the original signal and STFT on the EMD-IMFs perform similarly, the experiment using the bearing dataset does not bring out the merits of the EMD application.

Table 11
The accuracy of CWRU bearing fault diagnosis by different classification algorithms

Author	Features	Methodology	Validation accuracy
Yuan Xie, et al. (2017) [26]	11 features based on time-domain analysis and EMD energy	SVM	83.14%
	80 CNN features	CNN+SVM	99.05%
	91 combined features	CNN+SVM	99.75%
Mingyong Li, et al. (2019) [17]	Reshaped time-domain images	Convolutional neural network model	more than 96%
Juying Dai, et al. (2019) [9]	EMD entropy features	Stacked sparse denoising autoencoder (SSDAE)-SSDAE	99.55%
Tao Lu, et al. (2020) [18]	Sub-spectrograms	CNN with transfer learning (AlexNet)	99.9%
Luis A. Pinedo-Sánchez, et al. (2020) [20]	Reshaped time-domain images	CNN with transfer learning (AlexNet)	99%
Yang Xu, et al. (2021) [27]	Continuous wavelet transforms (CWT)	Hybrid DCNN and gcForest	99.8%
Proposed work	STFT-spectrum image	DCNN	100%
	EMD-IMF₀ spectrum	DCNN	100%
	EMD-IMF₁ spectrum	DCNN	100%
	EMD-IMF₂ spectrum	DCNN	100%
	EMD-IMF₃ spectrum	DCNN	100%
	EMD-IMF₀ spectrum (multi-class)	DCNN	100%

The outcome of the experiment on the milling dataset is given in Table 7 and Figs. 10-11. Here the maximum classification and validation accuracy is achieved when STFT is applied to IMF₀ of EMD. The model accuracy for STFT on the original signal is 64%, which is quite low, and it improves to 100% for STFT applied to IMF₀ of EMD. This brings out the merit of EMD in highlighting the defective portion of the vibration signal.

To substantiate this inference, the conventional performance metrics of all methods used on the milling dataset are calculated, and a simple nonparametric statistical analysis [10] of the numerical experiment is carried out. The results are tabulated in Tables 8 and 9. In Table 8, the F1 score of the method using EMD-IMF₀ is 0.8933, considerably higher than the 0.7486 obtained with the STFT of the raw data. This indicates that EMD-IMF₀ is better than the direct method in terms of sensitivity and selectivity. The total number of test samples is $n = 42$ and, assuming a binomial distribution, the sign-test rule of thumb [10] requires a minimum of $\frac{n}{2} + \sqrt{n} \approx 27.5$ wins, that is 28 wins, for a probability < 0.05. As shown in Table 8, EMD-IMF₀ scores 37 wins, well above this requirement, whereas the direct STFT method scores exactly 28 wins.

To assess the repeatability of the proposed methods, training and testing are repeated 50 times, and the maximum, minimum and mean validation accuracies are given in Table 9. The EMD-IMF₀ method has the highest maximum validation accuracy (84.38%) and a mean validation accuracy of 74.29%, comparable to the 75.30% of IMF₂ and well above the 62.62% of the original spectrum. It is encouraging that the spectrum image of the vibration signal with EMD applied works well with the DCNN-based image classification system in classifying the vibration signal as faulty or good for both use cases considered.
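The sign-test rule of thumb can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the function names are hypothetical, and reading the rule as a one-sided binomial tail under a fair-coin null hypothesis is an assumption, not something stated in [10] as quoted here.

```python
import math


def sign_test_threshold(n: int) -> int:
    """Rule-of-thumb minimum number of wins: n/2 + sqrt(n), rounded up."""
    return math.ceil(n / 2 + math.sqrt(n))


def binomial_tail(n: int, wins: int) -> float:
    """Exact one-sided P(X >= wins) for X ~ Binomial(n, 0.5)."""
    return sum(math.comb(n, k) for k in range(wins, n + 1)) / 2 ** n


n = 42  # number of test samples in the milling experiment
print("threshold:", sign_test_threshold(n))

# Win counts taken from the "Total success" row of Table 8.
for label, wins in [("direct STFT", 28), ("EMD-IMF0", 37)]:
    print(label, wins, binomial_tail(n, wins))
```

With $n = 42$ the threshold evaluates to 28, and the tail probability for 37 wins is far smaller than for 28 wins, which is the sense in which EMD-IMF₀ clears the requirement comfortably.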

Table 8

Performance metrics of CNN classification of milling dataset using n = 42 samples

Parameters	Original spectrum	IMF₀ Spectrum	IMF₁ Spectrum	IMF₂ Spectrum	IMF₃ Spectrum
True Positive (TP)	21	21	20	21	21
True Negative (TN)	7	16	16	13	7
False Positive (FP)	1	1	2	1	1
False Negative (FN)	13	4	4	7	13
Precision P = TP/(TP + FP)	0.954	0.9540	0.909	0.9540	0.954
Recall R = TP/(TP + FN)	0.617	0.84	0.833	0.75	0.617
F1 score (2 * (P * R)/(P + R))	0.7486	0.8933	0.869	0.8397	0.7486
Total success (S)/n	28/42	37/42	36/42	34/42	28/42
Total failure/n	14/42	5/42	6/42	8/42	14/42
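The metrics in Table 8 follow from the four confusion counts. The sketch below recomputes them. The function name and dictionary layout are hypothetical, and because it works from unrounded precision and recall, the F1 values it prints can differ from the tabulated ones in the third decimal place.

```python
def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Precision, recall, F1 and accuracy from binary confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# (TP, TN, FP, FN) per input representation, copied from Table 8.
table8 = {
    "Original spectrum": (21, 7, 1, 13),
    "IMF0 spectrum": (21, 16, 1, 4),
    "IMF1 spectrum": (20, 16, 2, 4),
    "IMF2 spectrum": (21, 13, 1, 7),
    "IMF3 spectrum": (21, 7, 1, 13),
}

for name, counts in table8.items():
    m = binary_metrics(*counts)
    print(
        f"{name}: P={m['precision']:.3f} R={m['recall']:.3f} "
        f"F1={m['f1']:.3f} accuracy={m['accuracy']:.3f}"
    )
```

The accuracy printed for each column equals the "Total success (S)/n" row, since the number of successes is TP + TN.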

Table 9

Statistical performance analysis of CNN classification of milling dataset for 50 trials

Parameters	Original spectrum	IMF₀ Spectrum	IMF₁ Spectrum	IMF₂ Spectrum	IMF₃ Spectrum
Maximum Validation Accuracy	0.6979	0.8438	0.8125	0.8125	0.7188
Minimum Validation Accuracy	0.5321	0.6572	0.625	0.6875	0.5312
Mean Validation Accuracy	0.6262	0.7429	0.712	0.753	0.6205

5.2 Multiclass classification

The results discussed in Section 5.1 show that feeding the spectrum images obtained by STFT on EMD-IMF₀ to the DCNN gives robust results for a binary classification problem. The method is therefore extended to a multiclass classification problem, and the class confusion matrix is analysed. As explained in Section 3.1.2, case (ii), IMF₀ of the bearing data is taken for 10 different faults, as described in the dataset description [33] and tabulated in Table 10. After training and testing, the DCNN model reaches a model accuracy of 100%. The classification of the different faults predicted by the model is shown in Fig. 12 as a confusion matrix, which is perfectly diagonal. This shows that using the spectrum of the EMD intrinsic mode functions directly with the DCNN yields promising model accuracies without the need for a feature extraction stage.

Table 10
Configuration of multi class bearing dataset using IMF₀

Intrinsic mode function	IMF₀	IMF₀	IMF₀	IMF₀	IMF₀	IMF₀	IMF₀	IMF₀	IMF₀	IMF₀
Fault diameter (inch)	0.007	0.007	0.007	0.007	0.007	0.021	0.021	0.021	0.021	0.021
Motor load (hp)	3	3	3	3	3	3	3	3	3	3
Motor speed (rpm)	1730	1730	1730	1730	1730	1730	1730	1730	1730	1730
Fault name	Inner race	Ball	Centered	Orthogonal	Opposite	Inner race	Ball	Centered	Orthogonal	Opposite
Total images	1100	1096	1100	1099	1103	1096	1102	1105	1100	1108
Training (80%)	880	876	880	879	882	876	881	884	880	886
Testing (20%)	220	220	220	220	221	220	221	221	220	222
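The training and testing counts in Table 10 are consistent with taking 80% of each class, rounding down, and assigning the remainder to testing. This rounding convention is inferred from the numbers rather than stated in the text, and the helper name below is hypothetical.

```python
# Total images per class, copied from Table 10.
totals = [1100, 1096, 1100, 1099, 1103, 1096, 1102, 1105, 1100, 1108]


def split_counts(total: int, train_percent: int = 80) -> tuple:
    """Return (train, test) counts, rounding the training share down."""
    train = total * train_percent // 100  # integer arithmetic avoids float rounding issues
    return train, total - train


splits = [split_counts(t) for t in totals]
train_counts = [train for train, _ in splits]
test_counts = [test for _, test in splits]

print("training:", train_counts)
print("testing: ", test_counts)
```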

Fig. 12

Confusion matrix of validation of bearing dataset.
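A purely diagonal confusion matrix means every test sample was assigned to its true class. The sketch below shows how such a matrix and the per-class accuracies can be computed. The labels are hypothetical stand-ins built from the per-class test counts of Table 10 with perfect predictions, not the model's actual outputs, and the function names are illustrative.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def is_diagonal(cm):
    """True when every off-diagonal entry is zero."""
    return all(v == 0 for i, row in enumerate(cm) for j, v in enumerate(row) if i != j)


def per_class_accuracy(cm):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]


# Hypothetical labels: test counts per class from Table 10, all predicted correctly.
test_counts = [220, 220, 220, 220, 221, 220, 221, 221, 220, 222]
y_true = [cls for cls, count in enumerate(test_counts) for _ in range(count)]
y_pred = list(y_true)

cm = confusion_matrix(y_true, y_pred, num_classes=10)
print(is_diagonal(cm), per_class_accuracy(cm))
```

A single misclassified sample would place a nonzero entry off the diagonal and lower the accuracy of the corresponding row below 1.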

6 Conclusions

In this work, machinery condition monitoring is demonstrated using vibration analysis with the help of machine learning algorithms. Instead of extracting signal parameters and features from the vibration signal, a spectrum-based image representation is proposed for use with a DCNN, in which the convolution layer performs automatic feature extraction and the input is classified as good or faulty. Two types of spectrum representation are proposed: one based on the STFT of the original signal, and the other based on the STFT of the IMF signals obtained using EMD. A CNN trained with the Adam optimizer is used for training and classification, and the validation accuracies are studied. The numerical experiments are conducted on two different datasets. In the experiment on the CWRU ball-bearing dataset, which has clearly demarcated normal and faulty signals, both spectrum-based methods perform very well, with 100% model accuracies. In the experiment on the milling dataset, in which vibration signals corresponding to less wear and tear are taken as good and those corresponding to more wear and tear as faulty, EMD-based STFT analysis performs better, with 100% training accuracy. The F1 scores of the two methods show that the second, EMD-based representation is better than the first. The statistical analysis confirms that the EMD-based method is also more consistent. The numerical experiments demonstrate that the spectrum image representation obtained by applying EMD performs very well with DCNN-based signal classification, which also eliminates the need for separate feature extraction, feature selection, and dimension reduction. The improvement of classification accuracy with respect to epochs is studied and reported.
This imaging method is extended to multiclass fault classification using the CWRU bearing dataset, in which 10 different fault classes are considered. The confusion matrix obtained is purely diagonal, implying 100% validation accuracy for all 10 classes. The performance of the proposed method is compared with state-of-the-art methods of the same kind that used the CWRU dataset, and the proposed method using EMD-IMF spectrums achieves a higher validation accuracy than the other methods listed in Table 11. The merits of EMD are well known, but this is the first time that the spectrum images of EMD-IMFs are used with a DCNN to classify machine faults, which is the novelty of this work. One limitation of the proposed work is that joint learning of the individual IMFs of EMD is not performed. This can be considered for future work.

References

1. Aghazadeh, Tahan, Thomas, Tool condition monitoring using spectral subtraction algorithm and artificial intelligence methods in milling process, Int J Mech Eng Robot Res 7 (2018), 30–34.

2. Ahmadi, Ekhlasi, Types of EMD Algorithms, 5th Iran. Conf. Signal Process. Intell. Syst. ICSPIS 2019, (2019), 18–19.

3. Assafo, Langendorfer, A TOPSIS-Assisted Feature Selection Scheme and SOM-Based Anomaly Detection for Milling Tools under Different Operating Conditions, IEEE Access 9 (2021), 90011–90028.

4. Azeez N.I., Alex A.C., Detection of rolling element bearing defects by vibration signature analysis: A review, 2014 Annu. Int. Conf. Emerg. Res. Areas Magn. Mach. Drives, AICERA/ICMMD 2014 - Proc, (2014).

5. Barbosh, Singh, Sadhu, Empirical mode decomposition and its variants: A review with applications in structural health monitoring, Smart Mater Struct 29 (2020).

6. Bazi, Benkedjouh, Habbouche, Rechak, Zerhouni, A hybrid CNN-BiLSTM approach-based variational mode decomposition for tool wear monitoring, Int J Adv Manuf Technol 119 (2022), 3803–3817.

7. Cerrada, Sánchez R.V., Pacheco, Cabrera, Zurita, Li, Hierarchical feature selection based on relative dependency for gear fault diagnosis, Appl Intell 44 (2016), 687–703.

8. Chen, Zhang, Luo, Sun, A novel rolling bearing fault diagnosis and severity analysis method, Appl Sci 9 (2019).

9. Dai, Tang, Shao, Huang, Wang, Fault diagnosis of rolling bearing based on multiscale intrinsic mode function permutation entropy and a stacked sparse denoising autoencoder, Appl Sci 9 (2019).

10. Derrac, García, Molina, Herrera, A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms, Swarm Evol Comput 1 (2011), 3–18.

11. Eren, Ince, Kiranyaz, A Generic Intelligent Bearing Fault Diagnosis System Using Compact Adaptive 1D CNN Classifier, J Signal Process Syst 91 (2019), 179–189.

12. Gittler, Glasder, Öztürk, Lüthi, Weiss, Wegener, International Conference on Advanced and Competitive Manufacturing Technologies milling tool wear prediction using unsupervised machine learning, Int J Adv Manuf Technol 117 (2021), 2213–2226.

13. Hendriks, Dumond, Knox D.A., Towards better benchmarking using the CWRU bearing fault dataset, Mech Syst Signal Process 169 (2022), 108732.

14. Huo, Zhang, Jombo, Shu, Adaptive Multiscale Weighted Permutation Entropy for Rolling Bearing Fault Diagnosis, IEEE Access 8 (2020), 87529–87540.

15. Koukoura, Carroll, McDonald, Weiss, Comparison of wind turbine gearbox vibration analysis algorithms based on feature extraction and classification, IET Renew Power Gener 13 (2019), 2549–2557.

16. […] D.C., Zhang, Kang T.B., Li, Xiang H.B., Wang K.S., Pei Z.L., Tang X.Y., Wang, Fault diagnosis of rotating machinery based on dual convolutional-capsule network (DC-CN), Meas J Int Meas Confed 187 (2022).

17. Li, Wei, Wang, Zhang, Research on fault diagnosis of time-domain vibration signal based on convolutional neural networks, Syst Sci Control Eng 7 (2019), 73–81.

18. Lu, Yu, Han, Wang, A generic intelligent bearing fault diagnosis system using convolutional neural networks with transfer learning, IEEE Access 8 (2020), 164807–164814.

19. Mohanraj, Yerchuru, Krishnan, Nithin Aravind R.S., Yameni, Development of tool condition monitoring system in end milling process using wavelet features and Hoelder’s exponent with machine learning algorithms, Meas J Int Meas Confed 173 (2021), 108671.

20. Pinedo-Sánchez L.A., Mercado-Ravell D.A., Carballo-Monsivais C.A., Vibration analysis in bearings for failure prevention using CNN, J Brazilian Soc Mech Sci Eng 42 (2020), 1–17.

21. Shah A.K., Yadav, Malik, EMD and ANN based intelligent model for bearing fault diagnosis, J Intell Fuzzy Syst 35 (2018), 5391–5402.

22. Shao, Deng, Yuan, Mechefske C.K., Chen, Characteristic recognition of chatter mark vibration in a rolling mill based on the non-dimensional parameters of the vibration signal, J Mech Sci Technol 28 (2014), 2075–2080.

23. Wang, Xu, Liu, Bearing fault diagnosis based on improved VMD and DCNN, J Vibroengineering 22 (2020), 1055–1068.

24. Wright, Dornfeld, Ota, Green Manufacturing and Sustainable Manufacturing Wireless Sensor Networks (WSNs), (2008).

25. Xie, Wu, Optimal transport-based transfer learning for smart manufacturing: Tool wear prediction using out-of-domain data, Manuf Lett 29 (2021), 104–107.

26. Xie, Zhang, Fault Diagnosis for Rotating Machinery Based on Convolutional Neural Network and Empirical Mode Decomposition, Shock Vib 2017 (2017).

27. Xu, Li, Wang, Li, Sarkodie-Gyan, Feng, A hybrid deep-learning model for fault diagnosis of rolling bearings, Meas J Int Meas Confed 169 (2021), 108502.

28. Yang, Fu, He, Bearing Fault Automatic Classification Based on Deep Learning, IEEE Access 6 (2018), 71540–71554.

29. Yeh C.W., Chen, Using convolutional neural network for vibration fault diagnosis monitoring in machinery, Proc. 2018 IEEE Int. Conf. Adv. Manuf. ICAM 2018, (2019), 246–249.

30. […], Kim I.Y., Mechefske, Remaining useful life estimation using a bidirectional recurrent neural network based autoencoder scheme, Mech Syst Signal Process 129 (2019), 764–780.

31. Zhao, Yuan, Zhang, An improved scheme for vibration-based rolling bearing fault diagnosis using feature integration and adaboost tree-based ensemble classifier, Appl Sci 10 (2020).

32. Zhou, Sun, Tool Wear Condition Monitoring in Milling Process Based on Current Sensors, IEEE Access 8 (2020), 95491–95502.

33. Download a Data File | Case School of Engineering | Case Western Reserve University, (n.d.).

34. NASA Milling Dataset | Kaggle, (n.d.).
