Abstract
Fault diagnosis is an important link in the intelligent development of industrial robots. To address the weak diagnostic performance caused by insufficient training samples, a fault diagnosis model based on a triplet network is proposed. First, we combine the multiscale convolutional neural network (MSCNN) with a channel attention network (squeeze-and-excitation network, SENet) and use the result to construct the triplet sub-network structure MS-SECNN, which adaptively extracts features from the original fault signal. Then, feature similarity is calculated with triplet loss in a low-dimensional space to carry out the fault classification task. The experiments are based on a real industrial robot operation data set. We use a Few-shot learning strategy to test the diagnostic performance of the model under small samples and compare it with the WDCNN, FDCNN and MSCNN models. Experimental results show that the proposed model classifies faults more effectively under small samples. In addition, when the training sample size is 1400, the average accuracy of MS-SECNN reaches 99.21%.
Introduction
Industrial robots are highly automated modern manufacturing machines that integrate mechanical engineering, electronic technology, material science, software control, artificial intelligence and other disciplines. An industrial robot arm is usually used in place of human hands for continuous day-and-night work in fields such as welding, assembly and construction, which keeps the whole production system running efficiently. Once an industrial robot breaks down, it can delay the whole production cycle or cause casualties. Accurate and effective fault diagnosis is therefore very important for keeping the industrial robot in its best working state.
Deep learning is a new method in the field of fault diagnosis [15] and has made great progress in many aspects [10,11]. It has two main advantages in this field. First, its powerful self-learning ability enables end-to-end fault diagnosis, which avoids the loss of original information caused by manually extracting and selecting fault features. Second, a deep network model can learn the internal characteristics of the data more comprehensively and capture richer fault information. Fault diagnosis models based on deep learning are effective on the premise that the training set and the test set follow the same data distribution and that the sample size of the target data is sufficient [13]. In practical applications, however, the high reliability of mechanical equipment, complex working environments, changeable operating conditions and the difficulty of direct signal measurement lead to the small sample problem. For example, mechanical equipment works in a normal state most of the time, so only limited fault data can be collected, and sometimes no fault data at all. The signals collected from the same type of mechanical equipment also follow different distributions under different operating conditions, such as variable load and variable speed. These factors often cause the training data and the test data to be distributed differently, so the trained model overfits easily, generalizes poorly, or is no longer applicable at all. Therefore, accurate fault diagnosis of industrial robots under small samples is an urgent problem to be solved.
In recent years, deep learning methods have been applied more and more widely in mechanical fault diagnosis research, and many studies have begun to pay attention to mechanical fault diagnosis under small samples. Li et al. [12] proposed a novel domain adaptation method for the limited labeled data problem, which used a deep balanced domain adaptation neural network (DBDANN) to diagnose faults of planetary gearboxes. Zhang et al. [21] proposed a Siamese neural network model, which learned from pairs of samples of the same or different classes to diagnose bearing faults with limited data. Wang et al. [19] proposed a feature space metric-based meta-learning model (FSM3), a mixture of general supervised learning and episodic metric meta-learning, for bearing and gearbox fault diagnosis under limited data. Li et al. [9] proposed a novel meta-learning fault diagnosis method (MLFD) based on model-agnostic meta-learning. This method first converted the raw signals into time-frequency images and then achieved fast and accurate bearing fault diagnosis with few samples through meta-learning. The methods above use deep learning models to improve diagnosis accuracy for bearings and gearboxes. Some problems can also be solved by expanding the sample size. Xiao et al. [20] proposed a health assessment and state prediction algorithm based on a hidden Markov model and temporal convolutional networks for mechanical axis health management. Sonal et al. [5] proposed an improved conditional variational autoencoder (CVAE) to generate synthetic samples. The synthetic training samples increased the size of the data set, and a centroid loss was introduced to keep the synthetic samples reasonable. Chicco et al. [3] used SVM to predict and expand the data under small sample conditions, which could alleviate the small sample problem without prior knowledge. Long et al. [14] used centralized learning of attitude data for fault diagnosis of a multi-axis robot, based on a hybrid deep learning architecture that combined a sparse auto-encoder (SAE) with a support vector machine (SVM). Deebak et al. [4] proposed a digital-twin-assisted fault diagnosis approach that used deep transfer learning to analyze the operational conditions of machining tools. Cabrera et al. [2] used a GAN model to evaluate the data distribution of each failure mode with few samples and to solve the problem of data imbalance.
Fault diagnosis based on deep learning has achieved some results under small samples, but fault diagnosis of multi-axis industrial robots under small samples remains difficult. In actual production, a multi-axis industrial robot runs as a whole machine and its mechanical axes are interrelated: once one axis fails, it affects the operation data of the other axes. Therefore, faults of multi-axis industrial robots must be judged from the overall operation data. However, current methods pay more attention to the diagnosis of single bearings, which makes them difficult to apply directly to small sample fault diagnosis of multi-axis industrial robots. As a result, it is very important to study a fault diagnosis method for multi-axis industrial robots under small samples.
In this paper, a fault diagnosis model based on triplet network is proposed to solve the fault diagnosis problem of multi-axis industrial robots under the condition of small samples. The main contributions of this paper include:
An improved multi-scale convolutional neural network (MS-SECNN) structure based on SENet is proposed. The attention mechanism of the Squeeze-and-Excitation (SE) module in SENet is added to the multi-scale convolutional neural network to strengthen the guiding role of important features during feature learning. The proposed structure is used to extract fault features of multi-axis industrial robots, and multi-scale feature extraction and classification can be carried out at the same time.
A triplet network model based on MS-SECNN is proposed for end-to-end fault diagnosis of multi-axis industrial robots under small samples. With limited data, the diagnosis accuracy of the model reaches 99.21% under the Few-shot learning strategy.
The rest of the paper is organized as follows: Section 2 describes the structure and method of triplet network model based on MS-SECNN. Section 3 presents the experiments, results and discussion. Section 4 concludes the paper.
Fault diagnosis model
Correlation theory
Squeeze-and-Excitation module
The Squeeze-and-Excitation Network (SENet) was proposed by Momenta at CVPR 2017 and won the image recognition championship at ImageNet 2017 [8]. When a multi-scale convolution network (MSCNN) [7] is used for industrial robot fault diagnosis, the network also extracts some invalid information. This invalid information is distributed over some channels of the feature map, which distracts the "attention" of the CNN [1] and hinders the improvement of fault diagnosis accuracy. To solve this problem, this paper uses the channel attention mechanism of SENet to improve the structure of MSCNN. Each channel of the feature map is given its own weight to adjust its importance, so that the model focuses on the channels that carry more effective feature information. This structure adaptively learns the importance of each feature channel, which can improve the fault diagnosis accuracy of the model. The typical structure is shown in Fig. 1.

Channel attention model SENet.
The SENet module includes two parts: Squeeze and Excitation. The squeeze operation uses a single value with a global receptive field to summarize each channel. This value is obtained by global average pooling over each channel of the feature map. The excitation operation applies fully connected layers to the squeezed features to predict the importance of each channel. The resulting importance weights are then applied to the corresponding channels, which models the correlation between channels.
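The squeeze and excitation steps can be illustrated with a minimal NumPy sketch. It assumes a one-dimensional feature map of shape (channels, length), a two-layer bottleneck with ReLU and sigmoid activations as in the usual SE design, and random weights in place of trained ones. The names `se_block`, `w1`, `w2` and the reduction ratio are illustrative and not taken from the paper.

```python
import numpy as np

def se_block(feature_map, w1, b1, w2, b2):
    """Apply squeeze-and-excitation to a feature map of shape (channels, length)."""
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = feature_map.mean(axis=1)                      # (C,)
    # Excitation: bottleneck FC + ReLU, then FC + sigmoid -> weights in (0, 1).
    hidden = np.maximum(0.0, w1 @ squeezed + b1)             # (C // r,)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))      # (C,)
    # Scale: reweight every channel by its predicted importance.
    return feature_map * weights[:, None], weights

rng = np.random.default_rng(0)
channels, length, reduction = 8, 32, 4   # illustrative sizes (assumption)
x = rng.normal(size=(channels, length))
w1 = rng.normal(size=(channels // reduction, channels)) * 0.1
b1 = np.zeros(channels // reduction)
w2 = rng.normal(size=(channels, channels // reduction)) * 0.1
b2 = np.zeros(channels)

scaled, weights = se_block(x, w1, b1, w2, b2)
print(scaled.shape, weights.round(3))
```

The output keeps the shape of the input, so a block of this kind can be placed after any convolutional stage without changing the layers that follow.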
Brief description of triplet loss
The fault diagnosis model based on the triplet network uses triplet loss [16] as the loss function. The input of triplet loss consists of an anchor example, a positive example and a negative example. Similarity between samples is learned by making the distance between the anchor and the positive example smaller than the distance between the anchor and the negative example. At the same time, to keep the features of the samples from collapsing into a very small space, the negative example is required to be at least a margin farther from the anchor than the positive example of the same class. As shown in formula (1):
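Assuming the standard form of triplet loss from [16], with d denoting a distance between feature embeddings (the choice of distance, for example Euclidean, is an assumption here), formula (1) can be written as:

```latex
L(a, p, n) = \max\bigl( d(a, p) - d(a, n) + \mathrm{margin},\ 0 \bigr) \tag{1}
```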
where a is the anchor example, p is a positive example that is similar to a, n is a negative example that is different from a, and margin is a constant greater than 0. The final optimization goal is to shorten the distance between a and p and to widen the distance between a and n, so as to realize the fault classification of industrial robots.
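As a small numerical illustration of this loss, the sketch below assumes the squared Euclidean distance between embeddings and a margin of 1.0. Both choices are assumptions for the example, and the function name `triplet_loss` is illustrative.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss with squared Euclidean distance (assumed)."""
    d_ap = np.sum((anchor - positive) ** 2, axis=-1)   # anchor-positive distance
    d_an = np.sum((anchor - negative) ** 2, axis=-1)   # anchor-negative distance
    # Zero once the negative is at least `margin` farther away than the positive.
    return np.maximum(d_ap - d_an + margin, 0.0)

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])
n_far = np.array([2.0, 0.0])    # well separated negative
n_near = np.array([0.5, 0.0])   # negative inside the margin

loss_far = triplet_loss(a, p, n_far)
loss_near = triplet_loss(a, p, n_near)
print(loss_far, loss_near)
```

Only the triplet whose negative lies inside the margin contributes a non-zero loss, so training effort concentrates on samples that are still hard to separate.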
When a general convolutional neural network uses a convolution kernel of a single size to extract fault features, it may miss important local features, which results in low accuracy and poor generalization performance. Besides, the fault data of multi-axis industrial robots often contain complex feature information. Therefore, multi-scale learning is integrated into the traditional CNN architecture and the SENet attention mechanism is added to it to build the MS-SECNN network structure, as shown in Fig. 2. MS-SECNN consists of three parallel branches, and each branch uses a convolution kernel scale different from the others: 3 × 1, 5 × 1 and 7 × 1. The operational signals of industrial robots are divided into a training set, a test set and a validation set. A random sample X is taken from the training set as the input signal and is fed to all three branches at the same time. The three convolution kernels of different sizes are used for the convolution operation, and different filters capture fault features in different frequency bands of the signal to improve diagnostic accuracy. Each branch has two convolution layers and two pooling layers. After feature extraction, a concatenation layer joins the features of the three multiscale branches. The SENet attention mechanism then reweights the concatenated features to obtain deeper feature information. Finally, low-dimensional features are output through global pooling. With this structure, features can be extracted from the fault signal more effectively, which plays a key role in the feature classification stage of the fault diagnosis model.
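The multi-scale part of this pipeline can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the authors' implementation: the weights are random, each branch uses 4 filters, pooling is max pooling of size 2, the final pooling is a global average, and the SE reweighting step is left out for brevity. The helper names `conv1d_same`, `max_pool` and `branch` are hypothetical.

```python
import numpy as np

def conv1d_same(x, kernels):
    """x: (C_in, L); kernels: (C_out, C_in, K), K odd. Returns ReLU output (C_out, L)."""
    k = kernels.shape[2]
    padded = np.pad(x, ((0, 0), (k // 2, k // 2)))
    # windows has shape (C_in, L, K): one length-K window per output position.
    windows = np.lib.stride_tricks.sliding_window_view(padded, k, axis=1)
    return np.maximum(np.einsum("clk,ock->ol", windows, kernels), 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling along the time axis."""
    usable = (x.shape[1] // size) * size
    return x[:, :usable].reshape(x.shape[0], -1, size).max(axis=2)

def branch(x, kernel_size, rng, channels=4):
    """Two convolution layers and two pooling layers at one kernel scale."""
    k1 = rng.normal(size=(channels, x.shape[0], kernel_size)) * 0.1
    h = max_pool(conv1d_same(x, k1))
    k2 = rng.normal(size=(channels, channels, kernel_size)) * 0.1
    return max_pool(conv1d_same(h, k2))

rng = np.random.default_rng(0)
signal = rng.normal(size=(2, 64))   # 2 input variables, 64 time steps (illustrative)

# The same input goes to three branches with kernel sizes 3, 5 and 7.
features = [branch(signal, k, rng) for k in (3, 5, 7)]
merged = np.concatenate(features, axis=0)   # concatenate along the channel axis
embedding = merged.mean(axis=1)             # global pooling -> low-dimensional vector
print(merged.shape, embedding.shape)
```

Because every branch uses "same" padding and identical pooling, the three outputs share one length and can be concatenated channel-wise without cropping.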

Structure of MS-SECNN.
The improved MS-SECNN is used as the sub-network structure of the triplet network fault diagnosis model and has two advantages: (1) the hierarchical learning structure of multiple convolution and pooling layers can effectively learn high-level fault features; (2) the multiscale learning scheme provides additional diagnostic information at different scales. It can therefore greatly improve feature learning and the fault diagnosis performance for multi-axis industrial robots with small samples.
The structure of the MS-SECNN fault diagnosis model based on the triplet network is shown in Fig. 3. The triplet network consists of three feedforward networks with shared weights [6]. The model input is a triple of samples: Anchor, Positive and Negative. First, a random sample is selected from the training dataset and called the Anchor (denoted as x). Then we randomly select one sample of the same class as the Anchor and one sample of a different class, called Positive (denoted as x+) and Negative (denoted as x−) respectively. Training adjusts the parameters so that x moves close to x+ and away from x−, which realizes the fault classification task. The main body of the model is composed of three identical MS-SECNN feedforward networks with shared parameters. Triplet loss is used as the loss function to measure the difference between the triplet samples. The input dimension of the model is the actual dimension of the industrial robot operation data, and the output dimension is the number of machine fault categories.
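The sampling of an (Anchor, Positive, Negative) triple can be sketched as below. The function name `sample_triplet` and the toy label array are illustrative. The sketch assumes uniform random sampling and at least two samples in the Anchor's class, neither of which is specified in the text.

```python
import numpy as np

def sample_triplet(labels, rng):
    """Return indices (anchor, positive, negative) drawn from a label array."""
    labels = np.asarray(labels)
    indices = np.arange(len(labels))
    anchor = int(rng.integers(len(labels)))
    # Positive: a different sample of the same class as the anchor.
    same = indices[(labels == labels[anchor]) & (indices != anchor)]
    # Negative: any sample of a different class.
    diff = indices[labels != labels[anchor]]
    # Assumption: the anchor's class holds at least two samples, so `same` is not empty.
    return anchor, int(rng.choice(same)), int(rng.choice(diff))

rng = np.random.default_rng(0)
labels = np.repeat(np.arange(7), 5)   # 7 fault states, 5 samples each (toy data)
a, p, n = sample_triplet(labels, rng)
print(a, p, n, labels[a], labels[p], labels[n])
```

All three samples would then pass through the same MS-SECNN weights, and only the resulting embeddings enter the triplet loss.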

MS-SECNN fault diagnosis model based on triplet network.
In most fault diagnosis data sets, the number of samples per category is hard to balance, and fault samples are much more difficult to obtain than normal samples, which leads to poor diagnosis for fault types with few samples. With small samples, the triplet network increases the number of training instances by forming many different triples from the same data, mines the relationships between samples more deeply and avoids the underfitting caused by an insufficient number of samples. It is therefore better suited to industrial robot fault diagnosis under small samples in actual production.
Dataset description
To verify the fault diagnosis model for multi-axis industrial robots, the real operation data set of a six-axis industrial robot is used. This robot has a highly rigid arm, advanced servo control and fast movement, and is suitable for grinding, handling, welding and other applications. The six-axis industrial robot used in this experiment is shown in Fig. 4.

Six-axis industrial robot.
The faults of multi-axis industrial robots are characterized by low incidence and high impact. Because the factory lacks fault data records, we obtained real-time running data of the six joint axes of the industrial robot through fault injection experiments. The running data of each axis contain 31 characteristic variables. The experiment focuses on faults of the robot reducer and the servo motor, and two effective variables are selected: the feedback torque (tfb) and the feedback current (flow). The failure dataset is shown in Table 1. The data set includes seven types of states: normal, 1-axis reducer and 2-axis motor failure, 1 and 3-axis reducer failure, 3 and 4-axis reducer failure, 3-axis reducer failure, 2-axis motor failure, and 4-axis reducer failure. In the experiment, the data set containing the seven categories is divided into a training set, a test set and a validation set in the proportions 60%, 20% and 20%.
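A 60%/20%/20% split of this kind can be sketched as follows. The text does not say whether the split is random or stratified by class, so the plain random permutation below is an assumption, as are the function name `split_indices` and the sample count of 1000.

```python
import numpy as np

def split_indices(n_samples, rng, train=0.6, test=0.2):
    """Shuffle sample indices and cut them into train / test / validation parts."""
    order = rng.permutation(n_samples)
    n_train = int(round(n_samples * train))
    n_test = int(round(n_samples * test))
    # Whatever remains after train and test becomes the validation set.
    return order[:n_train], order[n_train:n_train + n_test], order[n_train + n_test:]

rng = np.random.default_rng(0)
train_idx, test_idx, val_idx = split_indices(1000, rng)   # 1000 is a placeholder size
print(len(train_idx), len(test_idx), len(val_idx))
```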
Industrial robot fault data set
To verify the performance of the MS-SECNN fault diagnosis model based on the triplet network, we evaluated it with the Few-shot [17] learning method. Subsets of 7, 35, 70, 350 and 1400 samples were randomly selected from the training samples of the whole data set for a series of comparative experiments.
To examine the impact of the training sample size on the performance of the fault diagnosis model, we use Few-shot (One-shot, Three-shot, Five-shot) learning strategies to compare the proposed method with the deep learning algorithms WDCNN [21], FDCNN [18] and MSCNN [7] under different sample sizes, as shown in Table 2. MS-SECNN and MSCNN are both compared as sub-network structures of the proposed model. Table 2 shows that the accuracy of the proposed MS-SECNN is higher than that of the other three algorithms under all three strategies and is highest under the Five-shot strategy. The accuracy of MS-SECNN, WDCNN, FDCNN and MSCNN increases with the number of training samples. With a training set of 350 samples, only MS-SECNN achieves 91.03% accuracy under the Five-shot learning strategy. With a training set of 1400 samples, both MS-SECNN and MSCNN reach 99% accuracy under the Five-shot learning strategy, but MS-SECNN is better than MSCNN, with 99.21% accuracy. These results show that the MS-SECNN fault diagnosis model based on the triplet network performs better on limited data sets. The accuracy of the model in Five-shot learning is always higher than in One-shot and Three-shot learning.
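The text does not spell out how a K-shot prediction is made from the learned embeddings. One common approach, shown here purely as an assumption, compares a query embedding with K randomly chosen support embeddings per class and picks the class with the smallest mean squared distance. The function name `k_shot_predict` and the toy two-dimensional embeddings are illustrative.

```python
import numpy as np

def k_shot_predict(query, support, support_labels, k, rng):
    """Predict the class whose k sampled support embeddings lie closest to the query."""
    classes = np.unique(support_labels)
    scores = []
    for c in classes:
        idx = np.flatnonzero(support_labels == c)
        chosen = rng.choice(idx, size=min(k, idx.size), replace=False)
        # Mean squared Euclidean distance from the query to the chosen supports.
        scores.append(np.sum((support[chosen] - query) ** 2, axis=1).mean())
    return int(classes[int(np.argmin(scores))])

rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])    # toy class centers
support_labels = np.repeat(np.arange(3), 10)
support = centers[support_labels] + rng.normal(scale=0.1, size=(30, 2))
query = centers[1] + rng.normal(scale=0.1, size=2)           # true class is 1

predictions = {k: k_shot_predict(query, support, support_labels, k, rng) for k in (1, 3, 5)}
print(predictions)
```

Averaging over more support samples reduces the influence of any single unrepresentative sample, which is one plausible reason for a Five-shot strategy to score higher than One-shot.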
Performance comparison of MS-SECNN with WDCNN, FDCNN and MSCNN under different learning strategies of few-shot (one-shot, three-shot and five-shot)
The confusion matrix for the method presented in this paper is shown in Fig. 5. The ordinate is the actual label, and each row gives the number of correct and incorrect classifications for that fault type. Each row adds up to 250, representing 250 test samples. The abscissa is the predicted label, and each column gives the number of samples predicted as that fault type. As can be seen from Table 2, Five-shot has the best evaluation performance among the Few-shot learning strategies, so it is used for the confusion matrix test. Panels (a), (b) and (c) in Fig. 5 are the test results with 7, 350 and 1400 training samples respectively. The diagnostic performance is best when the training sample size is 1400.
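The layout of such a confusion matrix, with actual labels on the rows and predicted labels on the columns, can be reproduced with a short sketch. The labels below are toy values chosen for illustration, not results from the paper.

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes):
    """Rows index the actual label, columns index the predicted label."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (actual, predicted), 1)   # count every (actual, predicted) pair
    return cm

actual = np.array([0, 0, 1, 1, 2, 2])       # toy ground-truth labels
predicted = np.array([0, 1, 1, 1, 2, 0])    # toy model outputs

cm = confusion_matrix(actual, predicted, 3)
row_totals = cm.sum(axis=1)                 # test samples per actual class
accuracy = np.trace(cm) / cm.sum()          # diagonal entries are correct predictions
print(cm, row_totals, accuracy)
```

Each row total equals the number of test samples of that class, which is the property the 250-per-row figure in the text reflects.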

Confusion matrix.
In this paper, a fault diagnosis method for multi-axis industrial robots with small samples is proposed, based on the triplet network MS-SECNN fault diagnosis model. The method effectively addresses the low accuracy and underfitting of fault diagnosis models for industrial robots under small samples. The sub-network structure of the triplet network improves on the traditional CNN: the combination of multi-scale learning and the SENet channel attention mechanism enhances feature learning and effectively improves the performance of the fault diagnosis model. By comparing the similarity of triplet signals, the proposed method also achieves better classification with small samples.
In the future, we will further study the imbalance of data distribution based on small samples to improve the performance of learning algorithms. Fault diagnosis research of industrial robots under multiple working conditions is also considered to expand the scope of application scenarios.
Conflict of interest
None to report
Funding
This study was funded by the Guangdong Basic and Applied Basic Research Fund Project (No. 2020B1515120010), Key Technology Project of Foshan City (No. 1920001001367), the Guangdong Science and Technology Plan Project (No. 2019B010139001), the Guangzhou Science and Technology Plan Project (No. 201902020016), and the Guangdong Natural Science Fund Project (No. 2021A1515011243).
