Abstract
In the actual industrial application of robots, the characteristics of robot malfunctions change accordingly as the working environment becomes increasingly diverse and complex. Utilizing the original fault diagnosis models in new working environments correspondingly leads to a decline in the performance and the generalization capability of the model. Moreover, the monitoring data collected in new working processes often has limited or no labels, making the diagnosis models trained with this data unable to identify faults accurately. In this paper, we propose a Domain adaptive Cross-process Fault Diagnosis method (DCFD) to leverage knowledge from existing working processes for diagnosing faults in new working processes. DCFD uses Multi-Kernel Maximum Mean Discrepancy (MK-MMD) to measure the difference between the current working processes and the previous working processes, enhancing the fault diagnosis capability of the robotic system in cross-process scenarios. DCFD achieves an average fault classification accuracy of 98% on 12 types of migration tasks, which demonstrates the effectiveness of DCFD on cross-process fault diagnosis classification tasks in real-time industrial application scenarios.
Introduction
Industrial robots are used in many different working scenarios. As process environments continue to diversify, robots are exposed to different working conditions, such as different temperatures, humidity levels, and loads. These factors may affect the robots’ performance and fault characteristics; correspondingly, fault features and patterns often change when robots operate under different working conditions. As a result, the fault labels known from the previous process cannot be applied directly to fault diagnosis in the target process, which makes diagnosing robot faults more difficult. Furthermore, the generated fault data is usually unlabeled, and the lack of labeled data becomes a significant obstacle to effective fault diagnosis. To enable fault diagnosis models to be trained and tested under diverse working conditions, it is necessary to establish a cross-process fault diagnosis model based on transfer learning, enabling the model to diagnose faults in the target domain without labeled data. Cross-process fault diagnosis can provide a more comprehensive understanding of system faults under different operating conditions, thereby enhancing the accuracy and reliability of fault diagnosis.
Motivated by this, we propose a transfer learning method, Domain adaptive Cross-process Fault Diagnosis (DCFD), to diagnose faults of industrial robots. By continually learning from source domain knowledge and using MK-MMD to constrain the distance between the source and target domains, the model can accurately diagnose faults even in the unlabeled target domain, demonstrating strong generalization capability. Different feature extraction methods are applied to the two domains. The source domain, with abundant labeled data, employs the Triplet MS-SECNN to learn more detailed and discriminative features through deep comparative learning. The target domain, which lacks labeled data, uses the MS-SECNN network structure to extract features. In addition, MK-MMD is used to minimize the distance between the two domains. MK-MMD uses a set of kernels to measure the distribution difference between the domains, which allows the model to capture and adapt to domain differences across different feature representations and scales and provides more flexible and comprehensive domain adaptation. Overall, the main contributions of this paper can be summarized as follows:
1) The combined network structure of Triplet MS-SECNN is used as the feature extractor to enhance the fault diagnosis performance. Specifically, the Triplet MS-SECNN is used for the feature extraction layer in the source domain, and a separate MS-SECNN network is used for feature extraction in the target domain. The Triplet MS-SECNN network can extract practical features to obtain high fault diagnosis performance in the case of a small sample dataset.
2) Multi-Kernel Maximum Mean Discrepancy (MK-MMD) is used to improve the fault diagnosis ability in cross-process scenarios. A similar distribution in feature space is obtained through feature transformation or domain alignment of the data in different processes. Utilizing domain adaptation to minimize MK-MMD between multiple layers of two kinds of domains, the model can adapt the learned representations from the source domain and apply them to the target domain.
3) To optimize model effectiveness, we employ a joint utilization of triplet loss and MK-MMD loss as the loss function. The triplet loss is applied to enhance fault classification accuracy, while the MK-MMD loss mitigates the distribution discrepancy between both the source and target domains. This reduction in distribution difference ensures that the cross-process fault diagnosis model exhibits satisfactory classification results on target domain data, enhancing the model’s generalization performance.
4) Extensive experimental results show that the proposed method can extract cross-domain invariant features and significantly improve cross-domain testing performance. The fault classification accuracy of the cross-process fault diagnosis model on 12 types of migration tasks is 98% on average.
The remainder of the paper is structured as follows. Section 2 delves into the existing literature pertinent to transfer learning. Section 3 provides a comprehensive background on transfer learning, domain adaptation, and Multi-Kernel Maximum Mean Discrepancy (MK-MMD). Section 4 outlines the proposed structure of the DCFD and the cross-process fault diagnosis model. Finally, Section 5 discusses the experimental findings, and Section 6 concludes the paper with insights into potential research directions.
Related works
Industrial robot cross-process fault diagnosis is a typical cross-domain problem. In intelligent fault diagnosis, Yang et al. [23] proposed a feature-based transfer neural network model that transferred diagnostic knowledge from laboratory machines to actual machines for fault diagnosis. Wang et al. [18] targeted bearing operating under various conditions and combined CNN with Deep Long Short-Term Memory (DLSTM) to build the model. They utilized transfer learning strategies for identifying bearing fault types under new conditions. Wu et al. [19] used meta-learning to develop a few-shot transfer learning method suitable for fault diagnosis of rotating machinery under variable conditions. Dong et al. [6] generated a large amount of diverse simulated data using dynamic models of bearings. They then applied diagnostic knowledge learned from this simulated data to real-world scenarios using CNN and a parameter transfer strategy. Han et al. [8] proposed a limited data deep transfer learning method for mechanical fault diagnosis. The network adapted to the target diagnostic task without overfitting using rich source data and sparse target data. They also designed multiple domain discriminators to learn domain-invariant features in each corresponding fault category. Qian et al. [13] introduced a deep transfer learning network based on convolutional auto-encoders. The convolutional autoencoder, serving as a feature extractor, removes noise, enabling fault transfer diagnosis of planetary gearboxes under different loads. G. D’Angelo et al. [5] proposed a network attack classification method based on Recurrence Plots and CNN-Autoencoders. This approach involves extracting meaningful features from the constructed plots to achieve effective identification and classification of network attacks. G. D’Angelo, E. Farsimadan, M. Ficco, et al. [4] presented a method for privacy-preserving malware detection in Android-based IoT devices through federated Markov chains. 
This approach leverages the capabilities of federated learning and Markov chains to detect and classify malicious software without sharing user data, providing a novel approach to feature extraction and classification using deep networks. Yan et al. [22] proposed a new general domain adaptation method for diagnosing unknown bearing faults. This method involves extracting features from each type of fault sample to form a diagnostic feature center and addressing the harmful transfer problem of unknown fault-type samples with three optimization functions. Previous research has focused on learning domain-invariant features through domain adaptation.
However, domain alignment methods cannot eliminate domain shift, and target samples may be incorrectly classified by the decision boundaries learned from the source domain, ultimately leading to the domains being aligned in the wrong direction. Wang et al. [16] proposed a deep adversarial domain adaptation network (DADAN) to transfer fault diagnosis knowledge. DADAN used domain-adversarial training based on the Wasserstein distance to learn domain-invariant features from the raw signal. Chen et al. [3] proposed a Domain Adversarial Transmission Network (DATN), which utilized feature learning networks to learn hierarchical representations of the source and target domains. The network weights learned from the source task were transferred to the target task, and domain adversarial training techniques were introduced to minimize the differences between the source and target distributions. Wang et al. [17] proposed a Multi Discriminator Deep Weighted Adversarial Network (MDWAN) method, which was especially suitable for partial transfer learning where the target domain has fewer categories than the source domain. Li et al. [12] used two deep learning methods, Convolutional Neural Network (CNN) and Multi-Layer Perceptron (MLP), to train several base models with a large amount of source data, which improved the diagnostic accuracy not only across working conditions of the same component but also across different components. Zhang et al. [25] proposed a universal source-free domain adaptation method that can handle cross-process fault diagnosis scenarios without access to the source data and is free of explicit assumptions about the target fault modes. Data acquisition and labeling are costly in industrial applications, and datasets are often imbalanced. To address this, Cao et al. [1] proposed a Pseudo-Classifier Maximum Mean Discrepancy (PCMMD) approach, which is utilized to drive a Multi-Input Multi-Output Convolutional Network (MIMOCN). This method aims to reduce the cross-domain distribution differences among various categories within the deep feature space.
Existing methods typically focused on acquiring diagnostic knowledge from a single source domain, often overlooking the rich foundational information available from multiple source domains. Wang et al. [15] developed an innovative multi-source domain feature adaptation network (MDFAN) for bearing fault diagnosis under time-varying conditions. This network employed feature extractors to learn transferable features from various source and target domain pairs, incorporating a domain distribution alignment module to minimize the domain shifts, thereby enabling knowledge acquisition across multiple domains. Many learning approaches for variable speed fault diagnosis overlooked the task-specific decision boundaries, making it challenging to align feature distributions across different domains completely. Li et al. [11] proposed an adversarial domain adaptation with CORAL alignment and asymmetric mapping. This asymmetric mapping feature extractor could extract more distinctly different features specific to each domain.
Although fault diagnosis based on transfer learning has achieved some results in the fault classification of robots, the problem of fault diagnosis under cross-process conditions remains to be solved. The operational states of robots vary across processes, and the generated data differ accordingly, which degrades the accuracy of cross-process fault diagnosis. In practical applications, however, the same robot is often used in different process scenarios. Consequently, studying a cross-process fault diagnosis method based on transfer learning is of practical significance.
Background
Transfer learning
Transfer learning is a widely adopted approach to tackle the challenge of constructing high-performance predictive models when faced with limited data or a lack of labels in the target domain. It can apply the knowledge learned from the source domain to the target domain [10]. However, the potential differences in data distribution between different domains may degrade the learning performance in the target domain. Transfer learning seeks to enhance the learning performance in the target domain by leveraging knowledge from the source domain, thus mitigating the distribution discrepancy between the source and target domain.
Domain adaptation
Domain adaptation is a transfer learning method that learns knowledge from the source domain and applies it to a related target domain to achieve cross-domain classification [7]. Since industrial robots may encounter various faults in different processes, it is essential to consider the differences between these processes. Domain adaptation methods transform features or align domains by processing data from different processes, so that the data follow similar distributions in feature space. In this way, the data in the source domain can be used to improve the diagnosis performance in the target domain.
Multi-kernel maximum mean discrepancy (MK-MMD)
To reduce the distribution difference between the source domain and the target domain, the Maximum Mean Discrepancy (MMD) is widely used in transfer learning. MMD measures the distance between two probability distributions and can also be applied to tasks such as feature selection and anomaly detection. The basic idea of MMD is to quantify the difference between two distributions by the distance between their means after mapping into a feature space. In applications of MMD [2], the squared distance between the empirical kernel mean embeddings is usually used as the estimate, as shown in formula (1):
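A standard way to write this estimate is sketched below. The notation is assumed here and may differ from the authors' formula (1): $x_i^s$ and $x_j^t$ are source and target samples, $n_s$ and $n_t$ are the sample counts, and $\phi$ is the feature map into a reproducing kernel Hilbert space $\mathcal{H}$.

```latex
\widehat{\mathrm{MMD}}^2(X^s, X^t)
  = \left\| \frac{1}{n_s}\sum_{i=1}^{n_s}\phi\left(x_i^s\right)
          - \frac{1}{n_t}\sum_{j=1}^{n_t}\phi\left(x_j^t\right) \right\|_{\mathcal{H}}^2
```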
In the formula,
In the cross-process fault diagnosis of industrial robots, there may be some problems in the direct use of MMD due to the significant differences in the distribution of robot motion data under different processes. The Multi-Kernel Maximum Mean Discrepancy (MK-MMD) algorithm is proposed to enhance the model’s generalization ability. The basic idea of MK-MMD is to reduce the difference in feature distributions by comparing the difference between two probability distributions. To achieve this goal, MK-MMD employs multiple kernel functions to construct an overall kernel function. Different kernel functions can generate different distributions, and the combination of these distributions can better capture the differences between the source domain and the target domain. It can enable feature transformations under different processes, thereby improving domain adaptation performance. The formula for the Gaussian kernel is presented in Eq. (2):
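One common way to write such a combined Gaussian kernel is sketched below. The weights $\beta_u$ and bandwidths $\sigma_u$ are assumed notation that does not come from the source; uniform weights $\beta_u = 1/U$ are a frequent choice.

```latex
K(x, y) = \sum_{u=1}^{U} \beta_u \, k_u(x, y),
\qquad
k_u(x, y) = \exp\left(-\frac{\lVert x - y \rVert^2}{2\sigma_u^2}\right),
\qquad
\beta_u \ge 0,\ \sum_{u=1}^{U}\beta_u = 1
```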
Where K represents the combined kernel obtained from multiple Gaussian kernel functions, and U represents the number of kernels.
The calculation process of MK-MMD can be divided into the following steps: 1) Mapping the dataset to the high-dimensional feature space; 2) Calculating the inner product matrix in the feature space; 3) Based on the inner product matrix, the value of each mapping function f is calculated and combined, ultimately yielding the value of MK-MMD. The calculation formula for MK-MMD is shown in Eq. (3):
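Using the kernel trick, the squared MK-MMD between two sample sets is commonly expanded as sketched below (the biased estimator). The symbols $n_s$, $n_t$, $x_i^s$ and $x_j^t$ are assumed notation for the sample counts and samples of the source and target domains, and K is the combined kernel.

```latex
\widehat{d}_K^{\,2}(X^s, X^t)
  = \frac{1}{n_s^2}\sum_{i=1}^{n_s}\sum_{j=1}^{n_s} K\left(x_i^s, x_j^s\right)
  + \frac{1}{n_t^2}\sum_{i=1}^{n_t}\sum_{j=1}^{n_t} K\left(x_i^t, x_j^t\right)
  - \frac{2}{n_s n_t}\sum_{i=1}^{n_s}\sum_{j=1}^{n_t} K\left(x_i^s, x_j^t\right)
```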
Where

The structure diagram of DCFD.
MK-MMD can better deal with the data under nonlinear distribution by combining the MMD algorithm with the kernel method. Specifically, MK-MMD maps the data into a high-dimensional feature space and calculates the inner product in the feature space through the kernel function to obtain the difference between the two probability distributions. By selecting appropriate kernel functions and tuning parameters, MK-MMD can adapt more effectively to different data distributions, thereby enhancing fault diagnosis accuracy.
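The computation described in this subsection can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the uniform kernel weights, and the geometric spacing of bandwidths around a fixed `base_sigma` are assumptions made for illustration. Only the defaults `kernel_mul=2.0` and `kernel_num=5` echo values reported in the experimental section.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def multi_kernel(x, y, base_sigma=1.0, kernel_mul=2.0, kernel_num=5):
    """Uniformly weighted sum of Gaussian kernels (assumed weighting).

    Bandwidths are spread geometrically around base_sigma, which is an
    assumption of this sketch rather than a detail taken from the paper.
    """
    sigmas = [base_sigma * kernel_mul ** (u - kernel_num // 2)
              for u in range(kernel_num)]
    return sum(gaussian_kernel(x, y, s) for s in sigmas) / kernel_num


def mk_mmd2(source, target, **kernel_args):
    """Biased estimate of the squared MK-MMD between two sample sets."""
    def mean_kernel(a, b):
        total = sum(multi_kernel(x, y, **kernel_args) for x in a for y in b)
        return total / (len(a) * len(b))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))
```

With identical sample sets the estimate is zero, and it grows as the two sets move apart, which is the property that makes it usable as a domain alignment loss.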
Structure of DCFD

Construction of domain adaptive cross-process fault diagnosis model.
The proposed cross-process fault diagnosis model structure based on transfer learning is shown in Fig. 1. To enhance the generality of the fault diagnosis, universal features (transferable fault features) are extracted using the Multi-Scale Enhanced Convolutional Neural Network with attention (MS-SECNN) [7] structure. The Triplet MS-SECNN [21] network structure is used in the feature extraction layer of the source domain, and a separate MS-SECNN network structure is used in the target domain. First, the model pre-trained on source domain data is used as a feature extractor; because the pre-training input consists of triplet samples, triplet loss [24] is used as its loss function. Then, the model is fine-tuned on the target domain to adapt it to the characteristics of the target domain. This process realizes the weight sharing of the feature extraction layer. Subsequently, deep features from the source and target domains are mapped to a common high-dimensional feature space in the domain adaptive layer. To extract effective transferable features, the feature distribution difference between the two domains must be reduced as far as possible. In this paper, the MK-MMD method is adopted for domain adaptation. Meanwhile, the MK-MMD loss is used to assess the effect of domain adaptation, with smaller values indicating smaller domain distribution differences and better domain adaptation performance.
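The triplet loss used for pre-training can be illustrated with a minimal sketch of the standard triplet margin formulation with Euclidean distance. The function names are hypothetical, and how the authors combine this term with the MK-MMD loss is not specified here. The default margin of 4 follows the value reported in the experimental settings.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=4.0):
    """Standard triplet margin loss for a single (anchor, positive, negative).

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is.
    """
    return max(euclidean(anchor, positive)
               - euclidean(anchor, negative)
               + margin, 0.0)
```

Minimizing this quantity pulls embeddings of the same fault class together and pushes embeddings of different classes apart by at least the margin.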
The end-to-end cross-process fault diagnosis of industrial robots is constructed by feature transfer learning and domain adaption. The construction process of the cross-process fault diagnosis model is shown in Fig. 2.
The specific detailed steps are as follows:
1) Data processing. The original data come from the real-time operation of the partner plant, and the four types of processes are combined into 12 types of migration tasks. For the source domain and target domain datasets of each migration task, 70% is used as the training set, 15% as the testing set, and 15% as the validation set.
2) Model training. First, a domain adaptive cross-process fault diagnosis model is built, and the MS-SECNN network structure is used for multi-scale feature extraction and learning. The weights of the feature extraction layers in the source and target domains are shared. Then, the labeled source domain features and the unlabeled target domain features are mapped to a high-dimensional space, where the MK-MMD is computed for deep domain adaptation. Finally, the trained model is obtained and saved.
3) Fault diagnosis. The cross-domain fault diagnosis results are output. First, feature indexes are randomly drawn from the training set, and the testing samples and feature indexes are transformed into low-dimensional features by the trained model. Next, Euclidean distance is used to measure the distances between the low-dimensional features, and the average distance between each testing sample and the feature indexes is calculated to complete the fault classification.
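The final classification step can be sketched as follows, assuming that a testing sample is assigned to the class whose reference embeddings (the feature indexes) have the smallest average Euclidean distance to it. This reading of the description, the function name, and the data layout are assumptions rather than details confirmed by the source.

```python
import math
from collections import defaultdict


def classify_by_mean_distance(query, references):
    """Assign `query` to the label with the smallest mean Euclidean distance.

    `references` is a list of (embedding, label) pairs, standing in for
    low-dimensional features produced by the trained model.
    """
    distances = defaultdict(list)
    for embedding, label in references:
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(query, embedding)))
        distances[label].append(dist)
    return min(distances,
               key=lambda lab: sum(distances[lab]) / len(distances[lab]))
```

For example, with reference embeddings labeled by axis (such as "J1" or "J2"), a testing sample is labeled with whichever axis its embedding lies closest to on average.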
Experimental results
Dataset and experimental setting
The dataset is collected from the operating current signals of a six-axis industrial robot manufactured by a robotics company in Guangdong, China, to establish a model capable of performing cross-process fault diagnosis for industrial robots. The experimental subjects, six-axis industrial robots, are primarily used in grinding, welding, and handling tasks; their faults have a low incidence rate but a high impact when they occur. Due to the lack of failure data records in the factory, real-time operational data of the six joint axes of the industrial robot was obtained through fault injection experiments. The specific fault injection experiment setup is as follows: (1) Utilize multiple six-axis industrial robots of the same model. (2) Select faulty components from different axes and install them on robots operating under normal conditions (for example, a faulty reducer in the J2 axis, a faulty motor in the J4 axis). (3) Set the machine operation parameters to be the same but with different running speeds, and collect feedback current signals from the six-axis industrial robots under normal and faulty conditions. The dataset can be divided into types A, B, C, and D according to process conditions. The variable collected under all four types is the feedback current. The fault type studied in this paper is a motor fault. The faulty components are on four mechanical axes, J1, J2, J4, and J6, with 140 samples in each type and 560 samples in total, as shown in Table 1.
Industrial robot cross-process fault diagnosis dataset
The waveform of the feedback current variable under four different processes of the six-axis industrial robot collected in this experiment is shown in Fig. 3. The feedback current has a different fluctuation period and a different variation amplitude under each process. Therefore, diagnosing faults under cross-process conditions is more complex than under fixed-process conditions. The difference in signal distribution means that training data and test data follow different distributions under cross-process conditions, which easily causes domain shift and affects the accuracy of the fault diagnosis model.

Four types of process feedback current waveforms.
To validate the DCFD cross-process fault migration diagnosis model, the experiments use 12 types of migration tasks, A-B, A-C, A-D, B-A, B-C, B-D, C-A, C-B, C-D, D-A, D-B, and D-C, to assess the effectiveness of the proposed method. The dataset division of each migration task follows the same setup. Taking the migration task A-B as an example, the dataset of the class A process is used as the labeled source domain, and the dataset of the class B process is used as the unlabeled target domain. The source and target domain datasets of each class are randomly divided into 70% as the training set (98 samples), 15% as the test set (21 samples), and 15% as the validation set (21 samples). The source and target domain datasets for the 12 migration tasks are divided as shown in Table 2.
Source and target domain dataset partitioning for 12 types of migration tasks
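The task construction and split sizes described above can be reproduced with a short sketch. The function names are hypothetical, and only the process labels, the 70/15/15 proportions, and the 140 samples per process come from the text.

```python
from itertools import permutations

PROCESSES = ["A", "B", "C", "D"]


def migration_tasks(processes=PROCESSES):
    """All ordered (source, target) pairs of distinct processes."""
    return [f"{src}-{tgt}" for src, tgt in permutations(processes, 2)]


def split_sizes(n_samples=140, train=0.70, test=0.15):
    """Sizes of the training, test, and validation sets for one domain."""
    n_train = round(n_samples * train)
    n_test = round(n_samples * test)
    n_val = n_samples - n_train - n_test  # the remainder is the validation set
    return n_train, n_test, n_val
```

Four processes give 4 × 3 = 12 ordered source-target pairs, and 140 samples per process split into 98, 21, and 21 samples.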
To evaluate the classification performance of the proposed model on the target domain dataset, we use a traditional evaluation metric derived from the confusion matrix: Accuracy (Acc), as defined below.
For each category, TPs (True Positives) refer to the number of normal samples that the model correctly classifies as normal, and TNs (True Negatives) refer to the number of fault samples that the model correctly classifies as faults. Conversely, FPs (False Positives) refer to the number of fault samples that the model incorrectly classifies as normal, while FNs (False Negatives) refer to the number of normal samples that the model incorrectly classifies as faulty.
In the experiments, we set the margin for TripletMarginLoss to 4. We employed a stochastic gradient descent (SGD) optimizer with a momentum of 0.8 and a momentum decay of 0.001. We set the initial learning rate to 0.01, allowing it to decay gradually. The batch size was set to 64. The kernel_mul parameter, used to compute the multi-kernel Gaussian kernel matrix, was set to 2.0. The number of training epochs was 100.
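The standard confusion-matrix definition of accuracy, assumed here to be the one intended, is:

```latex
\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN}
```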
Gaussian kernel selection of MK-MMD
In the cross-domain fault diagnosis model for industrial robots, a distribution difference often exists between the source and target domains, affecting the model’s generalization performance. MK-MMD can adapt to this distribution difference by using multiple kernel functions and thereby improve the model’s performance in the target domain. The number of Gaussian kernels in MK-MMD is an important parameter influencing the model’s performance and robustness. To select a suitable number of Gaussian kernels, a learning rate of 0.001 is set in the experiments. Considering both model performance and computational efficiency, the number of Gaussian kernels for MK-MMD is set from 1 to 7, as in Eq. (3), and grid search is used to tune this parameter. MK-MMD with each kernel number from 1 to 7 is applied to the proposed DCFD cross-domain fault diagnosis model, and the best Gaussian kernel number is selected based on the average fault classification accuracy over the 12 transfer tasks.
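The selection procedure can be sketched as a simple search over candidate kernel numbers. The callable `evaluate(task, kernel_num)` is hypothetical: it stands in for training and testing the model on one transfer task and returning the target domain accuracy, which is outside the scope of this sketch.

```python
def select_kernel_count(evaluate, tasks, candidates=range(1, 8)):
    """Pick the kernel number with the highest mean accuracy over all tasks.

    `evaluate(task, kernel_num)` is a caller-supplied function (hypothetical
    here) returning the accuracy obtained on `task` with `kernel_num` kernels.
    """
    mean_accuracy = {}
    for kernel_num in candidates:
        scores = [evaluate(task, kernel_num) for task in tasks]
        mean_accuracy[kernel_num] = sum(scores) / len(scores)
    best = max(mean_accuracy, key=mean_accuracy.get)
    return best, mean_accuracy
```

The function returns both the selected kernel number and the per-candidate mean accuracies, so the full accuracy curve can be plotted alongside the chosen value.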

Average classification accuracy of MK-MMD with different Gaussian kernels.
Figure 4 illustrates the average classification accuracy of MK-MMD across various numbers of Gaussian kernels. Intuitively, it is evident from the figure that the fault diagnosis model’s average classification accuracy exhibits a gradual increase as the number of Gaussian kernels rises. It peaks when the number of Gaussian kernels is set to 5. After reaching the peak, the classification accuracy starts to decline and stabilize. The results show that the cross-domain fault diagnosis model performs best when the number of MK-MMD kernels is 5. Therefore, the optimal MK-MMD Gaussian kernel number is determined to be 5, and this optimal model is used for experimental analysis in the following experiments.
In the above experiments, we identified the optimal number of kernels for MK-MMD and used it to construct a cross-domain fault diagnosis model for industrial robots. To further validate the use of Multi-Kernel Maximum Mean Discrepancy (MK-MMD) for measuring the distance between source and target domains, and to assess its performance in reducing distribution discrepancies for cross-domain fault diagnosis models, a comparative experiment is designed. The experiment compares three methods: MK-MMD, MMD, and a Domain-Free Adaptive Approach (DFA). We aim to evaluate their performance differences in fault diagnosis models. MK-MMD and MMD are separately applied to the domain adaptation layer of the proposed fault diagnosis model as loss functions to minimize the distance between the two domains. In the case of the DFA, no distance-based loss function is used to constrain the distance between the two domains. The experiment involves datasets from the 12 transfer tasks and compares the accuracy of target domain classification under the different distribution discrepancy loss functions. The experimental results are shown in Fig. 5.

Comparing the accuracy of different distribution differences.
The experimental results indicate that using MK-MMD as the loss function yields better classification of target domain data than MMD and the Domain-Free Adaptive Approach. Specifically, over the 12 transfer tasks, the MK-MMD method achieves an average accuracy of 98.1% in classifying target domain data, significantly higher than MMD (86.2%) and the Domain-Free Adaptive Approach (77.8%). This outcome demonstrates that MK-MMD, through a combination of multiple kernel functions, can more effectively capture the diverse characteristics of data from the source and target domains in industrial robots, thereby enhancing the classification capability of the cross-process fault diagnosis model. Furthermore, MK-MMD exhibits good scalability, as it can adapt to new domains or features by adjusting the combination of kernel functions. This adaptability enables the easy integration of new domains and features, thereby enhancing the generalization ability of the DCFD model across domains.
The accuracy of the proposed DCFD model on the 12 migration tasks is shown by the blue bars in Fig. 5. The classification accuracy of all 12 migration tasks exceeds 91%, and the highest classification accuracy reaches 99.9%, which means that the proposed model is reliable for cross-process migration fault diagnosis tasks. To demonstrate the effectiveness of the proposed method more intuitively, four typical migration tasks are selected in Table 1: the A-B migration task with adjacent axis faults, the B-C migration task with the same speed, and the C-D and D-A migration tasks with significant distribution differences. Figure 6 displays scatter plot visualizations for these four transfer tasks. For each task, the plots show the source domain features after classification and the target domain features before and after classification by the proposed cross-process fault diagnosis model. Before the target domain is classified, different fault characteristics are mingled together. After classification with the DCFD model, the distances between similar fault characteristics in the source and target domains narrow, while the distances between different categories widen. This further demonstrates that the proposed method can achieve good classification results when applied to cross-process industrial robot fault diagnosis.

Visual scatter plot of classification results for four groups of typical migration tasks. (a) Migration task A-B, (b) Migration task B-C, (c) Migration task C-D, and (d) Migration task D-A. (a1), (b1), (c1), and (d1) show the source domain after classification, (a2), (b2), (c2), and (d2) show the target domain before classification, and (a3), (b3), (c3), and (d3) show the target domain after classification.
To demonstrate the superiority of the DCFD, we compare it with three typical transfer learning methods: Transfer Component Analysis (TCA) [20], Joint Distribution Adaptation (JDA) [9], and Correlation Alignment (CORAL) [14].
TCA is a domain adaptation technique based on feature space transformation, which maps the source and target domain data into the same feature space and performs feature transformation by maximizing the similarity of the data distributions. The advantage of TCA is that it can handle the distance between multiple source domains and one target domain, and the model is simple and easy to implement. However, TCA also has some disadvantages, such as higher computational costs for high-dimensional data and large sample sizes, which may increase model complexity and running time.
JDA is a domain adaptation technique based on subspace transformation. It transforms features by mapping source and target domain data into a shared low-dimensional subspace and minimizing the distance between their marginal and conditional distributions. The advantages of JDA include its ability to enhance the model’s generalization performance by utilizing source domain labels together with pseudo-labels for the target domain. However, JDA also has some disadvantages. For example, with nonlinear data, feature information may be lost after the transformation.
CORAL is a domain adaptation technique based on covariance matrix alignment, which performs feature transformation by minimizing the difference between the covariance matrices of the source and target domains. The advantage of CORAL is that it is computationally inexpensive and can quickly handle large-scale datasets. However, CORAL also has some disadvantages. For example, with more complex data distributions, the transformed feature information may become unstable, and CORAL cannot use label information to improve the model’s generalization performance.
The classification results of the above three methods and the DCFD fault diagnosis method on 12 types of migration tasks are shown in Table 3. The best classification results for each method are marked in bold, and the average fault diagnosis accuracy is given in the last row of the table.
Comparison of accuracy of different experimental methods
As observed from Table 3, when the distribution difference between the source and target domains is small, as in the mutual transfer tasks between Processes A and B, the accuracies of TCA, JDA, and CORAL all exceed 90%. However, the transfer performance declines noticeably in tasks where the distribution difference between the two domains is large. For instance, in the mutual transfer tasks between A and D, none of the three methods exceeds 90% accuracy. This indicates that the generalization ability of traditional transfer learning methods in feature extraction is affected by process variations. The proposed DCFD method significantly outperforms the other three transfer learning methods in all transfer tasks and achieves high fault diagnosis accuracy regardless of whether the domain difference is large or small, with an average accuracy of 98.1%. This further demonstrates the effectiveness and superior classification performance of the proposed method in cross-domain transfer learning.

Radar diagram of the experimental comparison results of different methods.
Fig. 7 compares the experimental results of the different methods on the 12 transfer tasks using a radar chart. The results show that the DCFD model achieves the highest accuracy in all 12 cross-domain fault diagnosis transfer tasks, significantly outperforming the other three transfer learning methods. This indicates that the DCFD model extracts transferable features more effectively, thus reducing the domain shift between the source and target domains. The model demonstrates better generalization capability in cross-domain fault diagnosis and shows solid potential for practical application.
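The domain shift discussed here is what MK-MMD quantifies. As a minimal sketch, the squared MMD between two feature sets can be estimated with an equally weighted mixture of Gaussian kernels. The function name `mk_mmd2`, the bandwidth values, and the equal kernel weights are assumptions made for illustration; DCFD applies MK-MMD to several network layers during training, which this standalone example does not reproduce.

```python
import numpy as np

def mk_mmd2(source, target, bandwidths=(0.5, 1.0, 2.0, 4.0)):
    """Biased estimate of the squared MMD with a multi-kernel Gaussian mixture.

    source: (n_s, d) array, target: (n_t, d) array.
    bandwidths are assumed values; each kernel gets the same weight.
    """
    ns = len(source)
    Z = np.vstack([source, target])

    # Pairwise squared Euclidean distances between all samples.
    sq_norms = (Z ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * Z @ Z.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against rounding below zero

    # Average of Gaussian kernels with different bandwidths.
    K = sum(np.exp(-sq_dists / (2.0 * s ** 2)) for s in bandwidths)
    K = K / len(bandwidths)

    k_ss = K[:ns, :ns].mean()   # source-source similarity
    k_tt = K[ns:, ns:].mean()   # target-target similarity
    k_st = K[:ns, ns:].mean()   # cross-domain similarity
    return k_ss + k_tt - 2.0 * k_st
```

The value is close to zero when the two feature sets follow the same distribution and grows as they drift apart, so it can serve as a loss term that a feature extractor minimizes.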
Conclusion
In this paper, to investigate the application of Multi-Kernel Maximum Mean Discrepancy and multi-scale attention convolution to feature extraction across working conditions, we propose DCFD, a cross-process fault diagnosis model for industrial robots based on transfer learning. To improve the quality of feature extraction, DCFD uses the Triplet MS-SECNN network to extract features in the source domain, applies the MS-SECNN network as the feature extractor for the target domain, and uses Triplet Loss to improve the model’s accuracy. DCFD minimizes the MK-MMD between multiple layers of the two domains, adapting the representation learned in the source domain to the target domain. The effectiveness of DCFD is then verified on a dataset from a six-axis industrial robot by comparing DCFD with the typical transfer learning methods TCA, JDA, and CORAL. The experimental results demonstrate that DCFD performs better in cross-process fault diagnosis and achieves higher accuracy.
However, knowledge obtained from a single domain is insufficient. In actual industrial production, there may be situations with multiple source domains and multiple target domains, and acquiring knowledge from multiple source domains can better achieve cross-domain fault diagnosis. Therefore, cross-process fault diagnosis with few data samples in multiple source and target domains holds significant practical importance.
Acknowledgement
This study was funded partly by the National Key R&D Program Project under Grant 2021YFB3301802, the Guangdong Natural Science Fund Project under Grant 2021A1515011243, and the National Natural Science Foundation of China for Key Program under Grant 62237001.
Conflict of interest
The authors declare that they have no conflict of interest.
