A semi-supervised fault diagnosis method based on dynamic decay learning strategy and hypergraph attention network

Abstract

To address the difficulty of fault diagnosis in autonomous underwater vehicle (AUV) thrusters, a semi-supervised AUV fault diagnosis method based on a dynamic decay learning strategy and a hypergraph attention network (HGAN) is proposed. Firstly, an attention mechanism is introduced into the hypergraph convolutional network (HGCN) to construct the HGAN. Then, the HGAN and a graph convolutional network (GCN) are arranged in a parallel architecture to capture the dynamic and static features of the input graph signal simultaneously. Finally, a dynamic decay learning strategy is introduced to improve the training efficiency of the proposed model. The diagnostic accuracy of the proposed approach is verified through experimental analysis, and its superiority over related methods is verified through a comparison study.

Keywords

HGAN; HGCN; semi-supervised fault diagnosis; dynamic decay learning strategy; AUV

Introduction

AUVs are commonly used underwater tools, and because of the complexity of the underwater environment, a failure in the AUV propulsion system can cause incalculable economic and safety losses. In addition, driven by the demands of ocean exploration, AUV technology is developing towards greater range, greater depth, higher speed, and higher intelligence. Studying effective monitoring and diagnostic methods for AUV thrusters therefore has important safety and economic significance. Fault diagnosis based on vibration signal processing and fault diagnosis based on shallow machine learning are the two traditional kinds of methods. Common signal processing methods such as FFT,¹ empirical mode decomposition,² wavelet transform,³ cyclostationary analysis,⁴ sparse dictionary learning,⁵ variational mode decomposition,⁶ and Lempel–Ziv complexity⁷,⁸ have been applied widely in the field of fault diagnosis. However, complex environmental factors such as strong interference from underwater currents and noise inevitably increase the difficulty of fault diagnosis based on vibration signal processing. In addition, such methods require strong professional background knowledge and therefore lack directness.

In the field of fault diagnosis based on shallow machine learning, Yin et al. proposed a fault recognition method based on the fusion of time-frequency features and support vector data description (SVDD), which addressed the problem of low classification accuracy in AUV thruster fault recognition.⁹ Zhang et al. proposed an improved SVDD fault mode classification method to solve the difficulty of obtaining optimal kernel function parameters in traditional SVDD.¹⁰ Wang et al. improved dynamic recurrent neural networks and radial basis function networks, and obtained overall fault information by comparing the output values of the AUV motion state model with the measured values of actual velocity and angle.¹¹ Another study proposed an equivalent model based on a fuzzy neural network structure for system state monitoring, whose output is the state or fault degree of certain parts of the system.¹² Sun et al. proposed an improved dynamic recurrent neural network for AUV thruster fault diagnosis.¹³ Tang and Gang proposed an intelligent fusion system that combines a feedforward neural network with an artificial neural network for fault detection and fault-tolerant control of the AUV propulsion system.¹⁴ Shi and Zhang proposed fault diagnosis software based on Bayesian networks. Furthermore, they proposed a task-context-based irrelevant node segmentation method that exploits the characteristics of AUVs, which further simplified the Bayesian network, reduced the complexity of consequence calculation, and effectively improved the ability to diagnose faults in real time.¹⁵ DAS proposed a method that combines genetic algorithms with ensemble learning for fault detection of AUVs.¹⁶ Nascimento evaluated a data-driven fault diagnosis scheme based on recurrent neural networks using empirical data, in which the nominal dynamics of the thruster were modeled from control inputs, voltage, rotational speed, and current signals.¹⁷ Costa proposed a diagnostic method using wavelet coefficient energy and SVDD to address the dependence of Fourier diagnosis on the mother wavelet.¹⁸ However, the performance of fault diagnosis based on shallow machine learning often depends on the effectiveness of feature extraction, which limits its further application in AUV fault diagnosis.

Deep learning methods have developed rapidly in recent years and have been applied widely in fault diagnosis. Unlike shallow machine learning, deep learning methods do not require a separate feature extraction step and can produce end-to-end diagnostic results.¹⁹ Sang et al.²⁰ used collected AUV acoustic signals as inputs to an improved convolutional neural network (CNN) to monitor the health status of an in-situ thruster, and diagnosed the fault status of the thruster correctly. Xia et al.²¹ proposed a hierarchical attention multi-source data fusion method (HAMFD), which improved the accuracy of fault assessment through a multi-layer attention mechanism and achieved effective fault identification for the Qianlong-II AUV. Moghaddam et al. used a consciousness attention model based on dynamic core assumptions and combined it with CNN and transfer learning to improve fault diagnosis accuracy.²² SUBHA combined phase space reconstruction with ELM to predict the sensor outputs of an AUV.²³ Chaos et al. presented a fault-tolerant control scheme for AUVs in the presence of critical failures.²⁴

Usually, increasing the number of labeled training samples is the most effective way to enhance the generalization ability of the aforementioned artificial intelligence models. However, obtaining a large number of labeled training samples often faces many challenges in practical engineering applications.²⁵ To overcome this difficulty, researchers have explored various semi-supervised learning methods that mine the information contained in unlabeled samples with the help of a limited number of labeled samples. Liao et al.²⁶ developed a deep semi-supervised domain generalization network (DSDGN) for fault diagnosis of rotating machinery running at variable speed, and successfully generalized the model to fault diagnosis tasks at unknown speeds. Lin et al.²⁷ proposed an improved semi-supervised meta learning (ISSML) model, which can be adjusted flexibly according to different diagnostic scenarios. Tang et al.²⁸ proposed a multi-scale recursive autoencoder (MRAE) training framework, which achieves accurate feature extraction in environments with scarce labeled training samples.

In addition to the lack of sufficient labeled training samples, the marine environment contains noise interference caused by factors such as wind, rain, ships, ocean currents, marine organisms, and industrial activities.²⁹ This noise comes from both natural and anthropogenic sound sources.³⁰ AUVs are inevitably disturbed by such ocean environmental noise during operation. Faced with the dual challenges of scarce labeled samples and ocean noise interference, finding effective solutions has become particularly crucial. The introduction of HGCN has overcome the limitations of traditional graph neural networks in handling complex relationships and high-order interactions.³¹ In recent research, HGCN has also demonstrated excellent denoising ability. Zhang et al.³² developed a hyperspectral image denoising technique that uses hypergraph convolution to capture block-level high-order correlations. Lei et al.³³ encoded the sample signal into a hypergraph structure, which captured the high-order correlations in the sample effectively by introducing hyperedges, and then used the denoising characteristics of hypergraph convolution to improve signal quality and feature extraction accuracy. Similarly, Zhao et al.³⁴ designed a model-assisted multi-source fusion hypergraph convolutional neural network (MAMF-HGCN) aimed at solving the intelligent fault diagnosis problem of electro-hydraulic brakes with a small number of samples.

In summary, a semi-supervised AUV fault diagnosis method based on a dynamic decay learning strategy and HGAN is proposed to address the scarcity of labeled samples and the severe interference noise encountered in engineering applications of AUV thruster fault diagnosis. The specific innovations are as follows:

An attention mechanism is introduced into HGCN to construct HGAN, which reveals the intrinsic correlation between network nodes effectively.

A parallel hybrid network architecture based on HGAN and GCN is constructed, which captures the dynamic and static features of input graph signals synchronously.

A hybrid network training method based on the dynamic decay learning strategy is proposed to improve the effectiveness of model training.

The remainder of the paper is organized as follows: Section 2 presents the basic theory and the improvements related to the paper. Section 3 describes the proposed method and the specific diagnostic process. Section 4 covers the experimental verification and comparative study. The conclusion is presented in Section 5.

Related theories

HGAN

GCN

Graph neural network (GNN) is a special type of neural network model, which excels at utilizing the inherent relational information in graph structures for learning and processing graph data.³⁵ As an advanced form of GNN, GCN integrates distant contextual information by expanding the global receptive field of the model effectively, thereby improving the performance of the model.³⁶

The standard structure of GCN is shown in Figure 1(a). A typical graph can be represented as $G = G (X, A, E)$ , in which $X \in R^{n \times d}$ represents the node feature matrix, $E$ represents the edge set, $n$ represents the number of nodes, and $d$ represents the feature length. $A \in R^{n \times n}$ represents the adjacency matrix, whose entry $A_{ij}$ is nonzero only when $(v_{i}, v_{j}) \in E$ . The graph convolution of a graph signal $x \in R^{n}$ is defined as follows:

(x *_{G} g)_{θ} = U ((U^{T} x) ⊙ (U^{T} g)) = U g_{θ} U^{T} x

(1)

where $*_{G}$ represents the graph convolution operator and $g_{θ} = diag (θ)$ is the filter parameterized by $θ$ . ⊙ is the element-wise (Hadamard) product. $U$ is the matrix of eigenvectors of the normalized graph Laplacian $L = I_{n} - D^{- 1 / 2} A D^{- 1 / 2}$ , where $I_{n}$ is the identity matrix and $D$ is the degree matrix. To obtain a more efficient convolution operator, the filter is approximated with Chebyshev polynomials truncated at first order, which gives the graph convolution as follows:

h = θ_{0} x + θ_{1} (L - I_{n}) x = θ_{0} x - θ_{1} D^{- 1 / 2} A D^{- 1 / 2} x

(2)

Meanwhile, let $θ = θ_{0} = - θ_{1}$ , and the above equation could be revised as following:

h = θ (I_{n} + D^{- 1 / 2} A D^{- 1 / 2}) x

(3)

To mitigate the tendency toward gradient explosion in GCN, $I_{n} + D^{- 1 / 2} A D^{- 1 / 2}$ is further replaced by $D^{- 1 / 2} A D^{- 1 / 2}$ , and the final form of GCN can be expressed as follows:

H = LeakyReLU (D^{- 1 / 2} A D^{- 1 / 2} X Θ)

(4)

in which $Θ$ represents the learnable parameter matrix, $H$ is the convolutional signal matrix, and $LeakyReLU$ represents the nonlinear activation function. The adjacency matrix $A$ dictates the structural connectivity between nodes, determining how diagnostic information is routed among neighboring sensor signals. The degree matrix $D$ is a diagonal matrix used to normalize the adjacency matrix symmetrically as $D^{- 1 / 2} A D^{- 1 / 2}$ . This symmetric normalization is crucial because it prevents the scale of the node feature vectors from growing or vanishing uncontrollably after repeated graph convolution operations, thereby ensuring numerical stability and consistent message passing across the network.
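
As an illustration of equation (4), the following minimal sketch applies the symmetric normalization and one GCN layer to a toy graph using only the Python standard library. The three-node graph with self-loops, the feature values, the parameter matrix `theta`, and the LeakyReLU slope are assumptions chosen for illustration and are not taken from the experiments.

```python
import math


def leaky_relu(v, slope=0.01):
    # The slope value is an assumed default; the paper does not state it.
    return v if v > 0 else slope * v


def matmul(a, b):
    # Plain matrix product for small nested lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    # Computes D^{-1/2} A D^{-1/2}, where D holds the row sums of A.
    deg = [sum(row) for row in adj]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    n = len(adj)
    return [[inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]


def gcn_layer(adj, x, theta):
    # H = LeakyReLU(D^{-1/2} A D^{-1/2} X Theta), as in equation (4).
    z = matmul(matmul(normalize_adjacency(adj), x), theta)
    return [[leaky_relu(v) for v in row] for row in z]


# Toy path graph 0-1-2 with self-loops (assumed example data).
adj = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
theta = [[1.0], [-1.0]]
a_hat = normalize_adjacency(adj)
h = gcn_layer(adj, x, theta)
```

For a symmetric adjacency matrix with non-negative entries, the eigenvalues of the normalized matrix lie in $[-1, 1]$ , which is why repeated application does not inflate the feature scale.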

Figure 1.

Graph structures: (a) the structure of GCN and (b) the structure of HGAN.

HGAN

HGCN structures data into hypergraphs and incorporates hypergraph convolution operations to capture multidimensional, deep-level connections in the data more accurately,³⁷ thereby identifying complex high-order interactions between nodes more effectively. Similarly, suppose that $G = G (X, A, E)$ is a hypergraph with $n$ nodes and $m$ hyperedges. A positive weight $W_{ε ε}$ is assigned to each hyperedge $ε \in E$ , and all weights are stored in a diagonal matrix $W \in R^{m \times m}$ . The hypergraph $G$ is described by an incidence matrix $H \in R^{n \times m}$ , in which $H_{i ε} = 1$ when the hyperedge $ε \in E$ is associated with the node $v_{i} \in X$ and $H_{i ε} = 0$ otherwise. The node degree matrix is defined as follows:

D_{ii} = \sum_{ε = 1}^{m} W_{ε ε} H_{i ε}

(5)

The hyperedge degree matrix is as following:

B_{ε ε} = \sum_{i = 1}^{n} H_{i ε}

(6)

Subsequently, the hypergraph convolution is defined as following:

x_{i}^{(l + 1)} = LeakyReLU (\sum_{j = 1}^{n} \sum_{ε = 1}^{m} H_{i ε} H_{j ε} W_{ε ε} x_{j}^{(l)} P)

(7)

in which $x_{i}^{(l)}$ is the feature of the $i$th node in the $l$th layer, $LeakyReLU$ is the non-linear activation function, and $P \in R^{F^{(l)} \times F^{(l + 1)}}$ is the weight matrix between the $l$th and $(l + 1)$th layers. Furthermore, the matrix form of hypergraph convolution can be expressed as follows:

X^{(l + 1)} = LeakyReLU (HW H^{T} X^{(l)} P)

(8)

in which $X^{(l)} \in R^{n * F^{(l)}}$ and $X^{(l + 1)} \in R^{n * F^{(l + 1)}}$ are the inputs of the $(l th)$ and $(l + 1 th)$ layers, respectively.

However, the spectral radius of $H W H^{T}$ is not constrained, which means that the scale of $X^{(l)}$ may change from layer to layer. Stacking multiple hypergraph convolutional layers of the form in equation (8) can therefore lead to numerical instability and increase the risk of exploding or vanishing gradients while optimizing the neural network. The final formula of HGCN is thus obtained by applying symmetric normalization:

X^{(l + 1)} = LeakyReLU (D^{- 1 / 2} HW B^{- 1} H^{T} D^{- 1 / 2} X^{(l)} P)

(9)

where $D$ and $B$ are the degree matrices of nodes and hyperedges in the hypergraph, respectively.
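
Equations (5), (6), and (9) can be illustrated with the following minimal sketch, which builds the normalized operator $D^{- 1 / 2} H W B^{- 1} H^{T} D^{- 1 / 2}$ for a toy hypergraph and applies one layer. The four-node, two-hyperedge incidence matrix, the unit hyperedge weights, the feature values, and the matrix `p` are assumptions for illustration only.

```python
import math


def leaky_relu(v, slope=0.01):
    # The slope value is an assumed default.
    return v if v > 0 else slope * v


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def node_degrees(inc, w):
    # Equation (5): D_ii = sum over hyperedges of W_ee * H_ie.
    return [sum(w[e] * row[e] for e in range(len(w))) for row in inc]


def edge_degrees(inc):
    # Equation (6): B_ee = sum over nodes of H_ie.
    return [sum(col) for col in zip(*inc)]


def hypergraph_operator(inc, w):
    # Entry (i, j) of D^{-1/2} H W B^{-1} H^T D^{-1/2}.
    d, b = node_degrees(inc, w), edge_degrees(inc)
    n, m = len(inc), len(w)
    op = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            shared = sum(inc[i][e] * w[e] * inc[j][e] / b[e] for e in range(m) if b[e] > 0)
            if d[i] > 0 and d[j] > 0:
                op[i][j] = shared / math.sqrt(d[i] * d[j])
    return op


def hgcn_layer(inc, w, x, p):
    # Equation (9): X^{(l+1)} = LeakyReLU(op X^{(l)} P).
    z = matmul(matmul(hypergraph_operator(inc, w), x), p)
    return [[leaky_relu(v) for v in row] for row in z]


# Toy hypergraph: hyperedge 0 = {0, 1, 2}, hyperedge 1 = {2, 3} (assumed data).
inc = [[1, 0], [1, 0], [1, 1], [0, 1]]
w = [1.0, 1.0]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
p = [[1.0], [-1.0]]
op = hypergraph_operator(inc, w)
out = hgcn_layer(inc, w, x, p)
```

A useful check on the normalization is that the vector of square-rooted node degrees is mapped to itself by the operator, that is, it is an eigenvector with eigenvalue 1, in contrast to the unconstrained $H W H^{T}$ of equation (8).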

The schematic diagram of HGAN is shown in Figure 1(b). The main purpose of constructing HGAN is to learn a dynamic correlation matrix through an attention mechanism, thereby obtaining a dynamic transformation matrix that reveals the intrinsic relationships between nodes more effectively. The attention scores are obtained by applying $W$ after the hypergraph convolution. In addition, the attention mechanism is applied after the $LeakyReLU$ layer in order to focus dynamically on key features. This simple modification improves the dynamic feature extraction ability of HGCN significantly.

e_{ij} = F \cdot LeakyReLU (W • [h_{i} ‖ h_{j}])

(10)

where $e_{ij}$ represents the attention score, indicating the feature importance of node $h_{i}$ to $h_{j}$ , and ‖ represents concatenation. In addition, all attention scores are normalized with the following SoftMax function so that they are comparable across different nodes:

α_{ij} = SoftMax (e_{ij}) = \frac{\exp (e_{ij})}{\sum_{k \in N_{i}} \exp (e_{ik})}

(11)

where $α_{ij}$ refers to the normalized attention coefficient between nodes $h_{i}$ and $h_{j}$ , and $N_{i}$ is the set of neighbors of $x_{i}$ .
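
The following minimal sketch illustrates equations (10) and (11) for one node and its neighbors. It assumes that $W$ is a matrix acting on the concatenated feature pair and that $F$ is a vector producing a scalar score; these shapes, the toy feature values, and the LeakyReLU slope are assumptions, since the text does not specify them.

```python
import math


def leaky_relu(v, slope=0.2):
    # The slope value is an assumed default.
    return v if v > 0 else slope * v


def attention_score(f, w, h_i, h_j):
    # Equation (10): e_ij = F . LeakyReLU(W [h_i || h_j]).
    pair = h_i + h_j  # list concatenation plays the role of ||
    hidden = [leaky_relu(sum(a * b for a, b in zip(row, pair))) for row in w]
    return sum(a * b for a, b in zip(f, hidden))


def attention_coefficients(f, w, feats, i, neighbors):
    # Equation (11): softmax of e_ij over the neighbors of node i.
    scores = {j: attention_score(f, w, feats[i], feats[j]) for j in neighbors}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {j: math.exp(s - top) for j, s in scores.items()}
    total = sum(exps.values())
    return {j: v / total for j, v in exps.items()}


# Toy data (assumed): three nodes with two features each.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w = [[0.5, -0.5, 0.5, 0.5], [0.1, 0.2, -0.3, 0.4]]
f = [1.0, -1.0]
alpha = attention_coefficients(f, w, feats, 0, [1, 2])
```

The normalized coefficients are positive and sum to one over the neighborhood, so they can be compared directly between nodes with different numbers of neighbors.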

Dynamic decay learning strategy

Global cyclic dynamic decay learning strategy

This article proposes a global cyclic dynamic decay learning strategy³⁸,³⁹ that improves the traditional periodic learning rate while retaining the idea of global optimization, and combines it with the designed network to improve training effectiveness. Specifically, a learning rate decay strategy is integrated into the global cyclic learning strategy, and a dynamically changing exponential function is designed to better adapt to the needs of the network model. This coordinated strategy gives the network a more flexible mechanism through which the learning rate is adjusted dynamically.

In the design of the learning rate, the cycle length is set as $2 Δ T$ . Specifically, the first $Δ T$ iterations of each cycle are defined as the dynamic linear increasing stage, while the subsequent $Δ T$ iterations are defined as the dynamic linear decreasing stage. The index $m$ of the learning rate cycle reached after the $N$th iteration can therefore be expressed as follows:

m = ⌊ 1 + \frac{N}{2 Δ T} ⌋

(12)

where ⌊·⌋ denotes rounding down to the nearest integer, so that $m$ counts the learning rate cycles of the network.

The learning rate after its Nth iteration is as following:

α_{t} = α_{min}^{m} + \frac{1}{5^{(α_{max}^{m} - α_{min}^{m}) (max (0, 1 - b) * 0.001)}}

(13)

in which $Δ T$ is the length of each stage (half of the cycle length $2 Δ T$ ), $m$ is the index of the cycle, and $α_{max}^{m}$ and $α_{min}^{m}$ refer to the maximum and minimum values of the learning rate in the $m$th cycle. The quantity $b$ is calculated as $b = | \frac{N}{Δ T} - 2 m + 1 |$ , with $b \in [0, 1]$ .
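
To make the cycle bookkeeping concrete, the following minimal sketch evaluates equation (12) and the quantity $b$ for a few iterations. It only reproduces the cycle index and the triangular factor $max (0, 1 - b)$ that appears in equation (13), not the full learning rate; the stage length of 10 iterations is an assumed value for illustration.

```python
import math


def cycle_index(n_iter, delta_t):
    # Equation (12): m = floor(1 + N / (2 * delta_t)).
    return math.floor(1 + n_iter / (2 * delta_t))


def triangular_factor(n_iter, delta_t):
    # b = |N / delta_t - 2m + 1|; the factor max(0, 1 - b) rises linearly
    # from 0 to 1 over the first stage and falls back to 0 over the second.
    m = cycle_index(n_iter, delta_t)
    b = abs(n_iter / delta_t - 2 * m + 1)
    return max(0.0, 1.0 - b)


delta_t = 10  # assumed stage length, in iterations
schedule = [(n, cycle_index(n, delta_t), triangular_factor(n, delta_t)) for n in (0, 5, 10, 15, 20, 25)]
```

With these assumed values the factor is 0 at iteration 0, peaks at 1 at iteration 10, returns to 0 at iteration 20 (where the second cycle starts), and rises again afterwards.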

Starting from the set initial value, the learning rate of the model changes cyclically and decays gradually during the training process. The dynamic variation of the learning rate is presented in Figure 2. This variation helps the model suppress interference when it is subjected to strong noise, thereby improving the classification accuracy.

Figure 2.

Learning rate variation of global cyclic dynamic decay learning strategy.

Improved loss function

This paper deals mainly with a supervised learning classification problem, so an improved cross entropy loss function is used to optimize the model. The cross entropy loss function measures the difference between the probability distribution of the model's output and the actual label, and thereby evaluates the prediction accuracy of the model. In classification tasks, a smaller cross entropy loss generally corresponds to a higher classification accuracy. The formula is as follows:

loss = - \sum_{i} p (i) \log q (i)

(14)

where $i$ indexes the classes, $p (i)$ represents the probability of class $i$ under the true label, and $q (i)$ represents the predicted probability of class $i$ . This article constructs a new scaled loss function based on the cross entropy loss function, which uses smoothed target labels ( $Label smoothing = 0.1$ ) instead of traditional one-hot encoded labels, reducing the impact of label noise and uncertainty on the model. In addition, an L2 regularization penalty term ( $L 2 regularization = alpha = 0.001$ ) is integrated to penalize the size of the model parameters and prevent overfitting. Therefore, the newly constructed scaled loss function can be expressed as follows:

Improved loss = (loss + alpha * \sum_{i} i^{2}) * (1 - Label smoothing)

(15)

where $alpha$ is the coefficient of the regularization penalty term, $\sum_{i} i^{2}$ denotes the sum of the squared model parameters, and $Label smoothing$ is the parameter of the smoothed target label. The above formula scales the loss, adjusting the numerical range of the loss function so that it is more suitable for the current training task and network structure.
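
Equations (14) and (15) can be sketched as follows for a single sample. The way the smoothing mass is spread uniformly over all classes is an assumption (the text only states that smoothed labels replace one-hot labels), and the predicted probabilities and parameter values are toy numbers.

```python
import math


def smooth_labels(true_class, num_classes, smoothing=0.1):
    # Assumed scheme: spread `smoothing` uniformly over all classes and
    # give the remaining mass to the true class.
    off = smoothing / num_classes
    return [off + (1.0 - smoothing) if k == true_class else off for k in range(num_classes)]


def cross_entropy(p, q):
    # Equation (14): loss = -sum_i p(i) log q(i).
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))


def improved_loss(p, q, params, alpha=0.001, smoothing=0.1):
    # Equation (15): (loss + alpha * sum of squared parameters) * (1 - smoothing).
    penalty = alpha * sum(w * w for w in params)
    return (cross_entropy(p, q) + penalty) * (1.0 - smoothing)


# Toy example with five classes, matching the five fault types (values assumed).
target = smooth_labels(true_class=0, num_classes=5)
predicted = [0.7, 0.1, 0.1, 0.05, 0.05]
params = [0.5, -0.5]
value = improved_loss(target, predicted, params)
```

The factor $(1 - Label smoothing)$ only rescales the loss and therefore the gradient magnitude; it does not change the location of the minimum.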

A comparison of the two loss functions (Figure 3) shows that the cross entropy loss function becomes stable from the 55th round, where it reaches its optimal effect, whereas the newly proposed loss function begins to stabilize in the 19th round and shows a more significant improvement in performance. Furthermore, the new loss function is more stable in the first 50 iterations than the cross entropy loss function. These results show that the proposed loss function improves the training performance of the model and shortens the convergence period.

Figure 3.

Comparison of loss functions.

The proposed hybrid model and flow chart of the proposed method

The design of hybrid model

As shown in Figure 4, HGAN and GCN work together through a parallel architecture that captures the dynamic and static features of the input graph signal simultaneously. The HGAN–GCN model further refines the expression of node features by deeply mining the features of nodes and their neighborhoods. HGAN uses the attention mechanism to adjust the information flow dynamically, assigning higher weights to the nodes and edges that are more critical for improving fault diagnosis accuracy. This not only improves computational efficiency but also reduces the interference of redundant information effectively. GCN transmits and integrates node features across the network, enabling the model to learn and embed features from both dynamic and static dimensions. Here, ‘static features’ refers to the features extracted by the graph convolutional network, which relies mainly on the inherent properties of nodes and propagates these properties through the network structure without involving additional dynamic information. In contrast, ‘dynamic features’ refers to the evolving high-order correlations between nodes, which are updated continuously during the training iterations via the attention mechanism. This allows the model to adjust the information flow adaptively according to the changing relationships among sensor data. By fully utilizing the complementary advantages of HGAN and GCN, a more refined and efficient model is constructed. Therefore, when facing strong noise interference, the combination of HGAN and GCN can extract fault features more effectively and enhance the anti-interference ability of the diagnostic process.

Figure 4.

The architecture design of the constructed model.

Besides, in order to handle the frequency offset and amplitude variation of samples under interference, a batch normalization (BN) operation is introduced into each HGAN layer and GCN layer, followed by a non-linear layer. The formula for node update is therefore expressed as follows:

h_{i}^{'} = LeakyReLU (BN (\sum_{j \in N_{i}} H_{ij} h_{j}))

(16)

Then, the dynamic hyperedges are constructed with node-hyperedge attention. For the feature $H^{(l)}$ of the $l$th layer, the attention score of node $i$ and candidate hyperedge $e_{m}$ is calculated by using the following equation:

e_{im}^{(l)} = \frac{\exp (LeakyReLU (α^{T} [W h_{i}^{(l)} ‖ W h_{j_{m}}^{(l)}]))}{\sum_{e_{k} \in ε_{i}} \exp (LeakyReLU (α^{T} [W h_{i}^{(l)} ‖ W h_{j_{k}}^{(l)}]))}

(17)

in which $α \in R^{2 d_{l + 1}}$ and $W \in R^{d_{l + 1} \times d_{l}}$ represent the learnable parameters, and $j_{m}$ is the centroid node of the hyperedge $e_{m}$ . During the attention calculation process, the candidate hyperedge $e_{m}$ participates through an evaluation of the importance of its connection with node $i$ . The calculated attention score $e_{im}^{(l)}$ quantifies the high-order dependency between the centroid node of the hyperedge and its associated nodes. To reduce computational redundancy and filter out environmental noise, a dynamic retention strategy is applied: the hyperedge $e_{m}$ is retained in the updated matrix only if its attention score exceeds the predefined threshold $(τ = 0.15)$ . This mechanism ensures that only the most robust and informative high-order topological structures contribute to the fault diagnosis.
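
The retention rule can be illustrated with the following minimal sketch, which normalizes the scores of the candidate hyperedges of one node as in equation (17) and keeps those above $τ = 0.15$ . The raw scores stand in for the values $LeakyReLU (α^{T} [\cdot ‖ \cdot])$ and are assumed toy numbers; the function names are hypothetical.

```python
import math


def softmax(values):
    # Normalization of equation (17) over the candidate hyperedges of one node.
    top = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def retain_hyperedges(raw_scores, tau=0.15):
    # Keep (hyperedge index, attention score) pairs whose score exceeds tau.
    weights = softmax(raw_scores)
    return [(m, w) for m, w in enumerate(weights) if w > tau]


# Assumed pre-softmax scores for four candidate hyperedges of one node.
raw_scores = [2.0, 1.0, 0.1, -1.0]
kept = retain_hyperedges(raw_scores)
kept_indices = [m for m, _ in kept]
```

In this toy case the two weakest candidates fall below the threshold and are dropped, so only the first two hyperedges remain connected to the node.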

Subsequently, the dynamic and static features extracted simultaneously by HGAN and GCN are fused adaptively. The specific expression is as follows:

Q^{'} = Cat [HGAN, GCN]

(18)

In the above formula, $Cat$ denotes the concatenation operation that fuses the extracted dynamic and static feature outputs. After $Q^{'}$ is obtained, it is input into a fully connected (FC) layer to obtain a predicted label set $Z = \{z_{1}, \dots, z_{n}\}$ , which can be represented as follows:

Z = FC (Q^{'})

(19)
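
Equations (18) and (19) can be sketched as a feature concatenation followed by a single linear layer. The branch outputs, the weight matrix, and the bias below are assumed toy values, and taking the class with the largest output as the predicted label is an assumption about how $Z$ is read out.

```python
def concat_features(hgan_out, gcn_out):
    # Equation (18): Q' = Cat[HGAN, GCN], concatenated per node.
    return [a + b for a, b in zip(hgan_out, gcn_out)]


def fully_connected(q, weight, bias):
    # Equation (19): Z = FC(Q'), a linear map applied to every node.
    return [[sum(x * w for x, w in zip(row, col)) + b for col, b in zip(zip(*weight), bias)] for row in q]


def predict(logits):
    # Assumed readout: the index of the largest output is the predicted class.
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


# Toy branch outputs for two nodes (assumed values).
hgan_out = [[0.2, 0.9], [0.8, 0.1]]
gcn_out = [[0.1], [0.3]]
q = concat_features(hgan_out, gcn_out)
weight = [[1, 0, 0, 0, 0], [0, 1, 0, 0, 0], [0, 0, 1, 0, 0]]  # 3 features to 5 classes
bias = [0.0] * 5
labels = predict(fully_connected(q, weight, bias))
```

Concatenation keeps the two feature sets separate until the FC layer, so the weights of that layer determine how much each branch contributes to each class.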

The architectural design of the proposed HGAN–GCN framework is specifically tailored to address the unique vulnerabilities of AUV thrusters operating in harsh marine environments. By simultaneously extracting dynamic and static features, and integrating the consistency regularization, the network is inherently equipped to suppress complex, non-Gaussian marine interferences—such as texture noise caused by biological activities and impulse noise from ocean current disturbances. This targeted interference suppression capability prevents model degradation and ensures highly reliable fault diagnosis even when labeled training data is critically scarce.

Flow chart of the proposed method

The overall framework of the proposed method is shown in Figure 5, and details are as follows:

Collect vibration signals of AUV thrusters under various health conditions, including limited labeled samples and a large number of unlabeled samples. Then, the limited labeled samples are combined with some unlabeled samples as the training set, and the remaining unlabeled samples are used as the testing set.

Design an HGAN based on HGCN, and construct the HGAN-GCN model by parallel connection between HGAN and GCN.

Input the training set into the HGAN module and GCN module simultaneously, and extract both dynamic and static features of the input graph signal. Here, the training process is optimized by the dynamic decay learning strategy. The label set predicted by equation (19) can be obtained from two FC layers. Then, the loss is calculated using equation (15) and the model is updated using the Adam algorithm.

Input the unlabeled test set into the trained model to obtain diagnostic results, and compare them with ablation experimental models and other semi-supervised fault diagnosis methods based on graph structure models.

Figure 5.

Flow chart of the proposed method.

Experiment

The experimental dataset is collected from the submersible Qianlong-II, and the test rig is shown in Figure 6. The Qianlong-II is a 4500-meter class AUV that is 3.5 meters long, 1.5 meters high, and weighs 1.5 tons. It is equipped with four main thrusters and one lateral auxiliary thruster on the servo, which allow the AUV to perform exploration tasks flexibly. The proposed method was validated using monitoring data from the South China Sea trials conducted on Qianlong-II.⁴⁰ Five types of faults are considered in the experiment: Fault 1: interference from the depth altimeter (corresponding to data V4). Fault 2: stall of the left front motor (corresponding to data V10). Fault 3: incorrect feedback of the left bow motor speed (corresponding to data V11). Fault 4: CTD failure (corresponding to data V49). Fault 5: low battery level (corresponding to data V17).

Figure 6.

Qianlong-II AUV.

In the experiment, vibration signals were collected from the side thrusters. By artificially simulating various fault conditions, the DH5902N acquisition system captured these signal data at a sampling frequency of 12.8 kHz and a sampling duration of 10 s. By setting 1024 data points as the length of one sample, 204,800 data points were extracted to construct 200 samples. Among these, 80 samples were used for training and 120 for testing. In the training samples, a 10% labeling ratio was set, meaning 20 labeled samples and 180 unlabeled samples. During the training process, the remaining unlabeled samples serve strictly as unannotated data to compute the unsupervised consistency regularization loss and the hypergraph smoothness loss. It must be emphasized that while the true labels of the testing samples are used after training to evaluate the final diagnostic accuracy, these labels remain hidden and are inaccessible to the model throughout the training and validation phases. This ensures a rigorous and unbiased evaluation of the semi-supervised learning framework.
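
The sample construction described above can be sketched as follows, assuming that the 204,800 extracted points are cut into consecutive, non-overlapping windows of 1024 points. The placeholder signal of zeros and the function name are hypothetical; only the window length and the point count come from the text.

```python
def segment_signal(signal, sample_length=1024):
    # Cut the signal into consecutive, non-overlapping windows (assumed scheme);
    # any trailing remainder shorter than one window is discarded.
    count = len(signal) // sample_length
    return [signal[k * sample_length:(k + 1) * sample_length] for k in range(count)]


# Placeholder signal with the number of points stated in the text.
signal = [0.0] * 204_800
samples = segment_signal(signal)
```

With these numbers the segmentation yields exactly 200 samples of 1024 points each, which matches the sample count given in the text.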

The time-domain and frequency-domain waveforms corresponding to each fault state are shown in Figure 7, from which it is difficult to distinguish the five states. The reason is that, under the influence of strong background noise, conventional signal processing methods struggle to achieve good diagnostic results.

Figure 7.

Time domain waveforms of the five states with their corresponding frequency domain waveforms: (a) time domain waveform of state 1, (b) frequency domain waveform of state 1, (c) time domain waveform of state 2, (d) frequency domain waveform of state 2, (e) time domain waveform of state 3, (f) frequency domain waveform of state 3, (g) time domain waveform of state 4, (h) frequency domain waveform of state 4, (i) time domain waveform of state 5, and (j) frequency domain waveform of state 5.

The specific configuration of the experimental operating environment is as follows: the operating system is Windows 11, the central processor is an Intel Core i5-12500H, the programming environment is based on the PyTorch framework with Python 3.9, and the runtime is 57.2 s.

The parameter settings of the hybrid model are detailed in Table 1.

Table 1.

The parameter settings of the hybrid model.

Layer	Size
HypergraphConv1	Feature, 1024
BatchNorm1	1024
HypergraphConv2	1024, 1024
BatchNorm2	1024
HypergraphConv3	Feature, 1024
BatchNorm3	1024
HypergraphAttentionConv4	1024, 1024
BatchNorm4	1024
GraphConv1	1024, 1024
BatchNorm1	1024
GraphConv2	1024, 1024
BatchNorm2	1024
Epoch	400
Label smoothing	0.01
Weight decay	0.0005

Convergence speed analysis

The convergence speed of an intelligent diagnosis model is one of the important criteria for evaluating its quality. To verify the superiority of the proposed model, the accuracy curves (ACC) and loss curves (Loss) at a signal-to-noise level of 50 dB are compared visually. The results are shown in Figures 8 and 9, from which it can be seen that the proposed approach achieves the highest accuracy and the lowest loss after 60 iterations. It is also the first to reach a steady state, after 60 iterations, so its convergence speed is better than that of the other three methods (WDCNN, LSTM, and CNN).

Figure 8.

Diagnostic accuracy of various methods under 50 dB signal-to-noise level.

Figure 9.

Diagnostic loss of various methods at 50 dB signal-to-noise level.

Classification accuracy analysis

Figure 10 compares the diagnostic accuracy of the various models under different levels of marine environmental interference. The comparison shows that the proposed approach has a clear advantage in diagnostic accuracy over the other methods. Specifically, the proposed approach still attains a diagnostic accuracy of 95.668% under 2 dB ocean interference, and in the ideal interference-free state (None) its diagnostic accuracy is as high as 98.662%. Further comparative analysis shows that the proposed method improves performance significantly over the WDCNN, LSTM, and CNN models under the various levels of interference.

Figure 10.

Comparison of the accuracy of each model under different ocean disturbance levels.

Stability performance analysis

Figure 11 shows the stability of the various methods over multiple experiments at a signal-to-noise ratio of 50 dB. The results of the proposed method fluctuate only slightly across the repeated experiments, which demonstrates good stability, particularly in comparison with the other three methods.

Figure 11.

The stability of multiple experiments of various methods under 50 dB signal-to-noise level.

Visualization analysis

Figures 12 and 13 show the t-SNE feature visualization and the confusion matrices of the different methods in the 50 dB interference environment. The proposed method separates the feature points well and has the fewest misclassified feature points and samples, whereas the other methods show varying degrees of misclassification under the same conditions. These misclassifications not only reduce classification accuracy but also expose the limitations of the compared methods in complex marine disturbance environments. The result indicates directly that the proposed method can cope effectively with the challenges of ocean interference, providing a reliable solution for research and application in related fields. It also indicates indirectly that the compared methods may not be sufficiently effective for diagnostic tasks under ocean interference, possibly because of shortcomings in their feature extraction, classification algorithms, or model structures.

Figure 12.

Visualization analysis of various methods at 50 dB signal-to-noise level by t-SNE: (a) proposed method, (b) WDCNN, (c) LSTM, and (d) CNN.

Figure 13.

Confusion matrices of various methods at 50 dB signal-to-noise level: (a) proposed method, (b) WDCNN, (c) LSTM, and (d) CNN.

Comparison analysis with advanced methods

Finally, the proposed method was compared with state-of-the-art diagnostic methods in a noise-free environment. One of them, AM-FD,⁴¹ is an automatic feature engineering method based on bidirectional gated recurrent unit networks, which extracts temporal dynamic characteristics effectively and performs fault detection through multi-layer perceptrons. The other, HAMFD,²¹ is a hierarchical attention multi-source data fusion method designed specifically for fault diagnosis. The comparative results show that AM-FD achieves a diagnostic accuracy of over 90% for the different fault types, while HAMFD achieves over 96% under the same conditions. In comparison, the proposed method reaches an average diagnostic accuracy of 98.662% under the same testing conditions, significantly better than both of these advanced methods. In addition, three important indicators, Precision, Recall, and F1 Score, were adopted to evaluate the effectiveness and performance of the proposed method more comprehensively. These indicators reflect not only the accuracy of the model in fault diagnosis but also its reliability in practical applications. As shown in Figure 14, the proposed method can achieve fault diagnosis effectively under ocean interference.

Figure 14.

Precision, Recall, and F1 Score.
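For reference, the three indicators can be derived from a confusion matrix such as those in Figure 13. The sketch below uses the standard per-class definitions (Precision = TP/(TP+FP), Recall = TP/(TP+FN), F1 = harmonic mean of the two); the label lists are hypothetical and the averaging scheme used in the experiments is not stated, so only per-class values are computed.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_metrics(cm):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    n = len(cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, actually another class
        fn = sum(cm[k]) - tp                       # actually k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics


# Hypothetical labels for three fault classes; NOT data from the paper.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 1, 2, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
for k, (p, r, f1) in enumerate(per_class_metrics(cm)):
    print(f"class {k}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Precision penalizes false alarms for a fault class, Recall penalizes missed faults, and the F1 Score balances the two, which is why the three are reported together.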

Ablation experiment

Ablation experiments were conducted on the collected raw vibration signals without interference. The ablation models are listed in Table 2.

Table 2.

The ablation experiment models.

No.	Model name	Brief description
1	The proposed method
2	HGAN	The single HGAN
3	GCN	The single GCN
4	HoGCN	A high-order graph convolutional network variant of GCN; see Huang et al.⁴²
5	DGAT-LPS	A recent semi-supervised fault diagnosis method; see Yan et al.⁴³
6	GIN	The graph isomorphism network, a variant of GCN; see Xu et al.⁴⁴

Figure 15 shows a radar chart of the accuracy obtained over repeated tests in the ablation experiments without interference. The proposed method performs best: its diagnostic results are superior not only to those of the ablation models but also to those of recognized advanced methods in the field. Furthermore, across the repeated tests its diagnostic results fluctuate less than those of the other ablation models, showing that it combines high accuracy with strong stability.

Figure 15.

Radar chart comparison of accuracy in ablation experiments without interference.

Conclusion

To address the difficulty of collecting and labeling training samples for AUV thrusters during operation, and to overcome interference in complex marine environments, a semi-supervised AUV fault diagnosis method based on a dynamic decay learning strategy and a hypergraph attention network is proposed. The method can mine fault information in depth from limited labeled and training samples. The designed parallel strategy extracts signal features effectively under complex interference and captures both the dynamic and static features of the input graph signal simultaneously.

The effectiveness of the proposed method is verified through experimental analysis, and the following conclusions can be drawn:

HGAN–GCN can fully utilize the fault information in limited labeling and training samples.

The designed parallel strategy can extract both dynamic and static features from nodes more effectively in environments with varying degrees of interference.

The proposed dynamic decay learning strategy can improve the training performance of the hybrid model effectively and ensure its real-time application.

In addition, the following conclusions can be drawn from the comparison study:

Compared to the other two related advanced models, the proposed method demonstrates superior performance in terms of Precision, Recall, and F1 Score.

Ablation experiments demonstrate that each module plays an indispensable role in enhancing the diagnostic accuracy and overall performance of the proposed method.

The proposed method has great potential for fault diagnosis when labeled samples are scarce, and subsequent research will attempt to apply it to the fault diagnosis of rotating machinery such as gears and rolling shafts under the same condition. However, this study still has shortcomings: owing to the difficulty of data acquisition, the proposed scheme has not been fully validated in actual marine environments. In the future, we plan to apply the method in actual marine environments and to explore unsupervised learning techniques that adapt better to the working environment and actual interference conditions of AUVs.

Footnotes

Handling Editor: Aarthy Esakkiappan

Author contributions

Shuai Zheng conducted the theoretical research and wrote the paper; Hongchao Wang developed the programs used in the paper.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The paper is supported by Henan Province’s New Key Discipline-Machinery (grant no. 0203240011).

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

References

1. Sun, Yan. Application of Fourier transform and wavelet transform to signal fault diagnosis. J Liaoning Inst Technol 2005; 25(3): 155–160.

2. Yang, Deng, Kang. Early warning method of gearbox fault based on EEMD and broad learning algorithm. Comput Integr Manuf Syst 2022; 28(6): 1835–1843.

3. Yan, Gao, Chen. Wavelets for fault diagnosis of rotary machines: a review with applications. Signal Process 2014; 96: 1–15.

4. Sun, Yang, et al. Cyclostationary analysis of irregular statistical cyclicity and extraction of rotating speed for bearing diagnostics with speed fluctuations. IEEE Trans Instrum Meas 2021; 70: 3514011.

5. Guo, Zheng. A method of rolling bearing fault diagnose based on double sparse dictionary and deep belief network. IEEE Access 2020; 8: 116239–116253.

6. Zhao, Liu, et al. Research on a fault diagnosis method of rolling bearing using variation mode decomposition and deep belief network. J Mech Sci Technol 2019; 33(9): 4165–4172.

7. Wang. Interpolation multiscale amplitude sensitivity permutation Lempel–Ziv complexity and its application to mechanical fault diagnosis. IEEE Trans Instrum Meas 2026; 75: 6502112.

8. Yue, Zhao. Multiscale similarity fuzzy Lempel-Ziv complexity and its application in bearing fault diagnosis. IEEE Trans Instrum Meas 2026; 75: 6503613.

9. Yin, Lin, Tang, et al. Thruster fault identification for autonomous underwater vehicle based on time-domain energy and time-frequency entropy of fusion signal. Intell Robot Appl 2019; 11742: 264–275.

10. Zhang, Chu. Multi-fault diagnosis for autonomous underwater vehicle based on fuzzy weighted support vector domain description. China Ocean Eng Soc 2014; 28(5): 599–616.

11. Wang, Zhang, et al. Research of the fault diagnosis method for the thruster of AUV based on information fusion. In: The 3rd International conference on intelligent computing, Qingdao, China, 2007, pp. 1014–1023. IEEE.

12. Zhang. Study of fuzzy neural networks model for system condition monitoring of AUV. J Mar Sci Appl 2002; 1(2): 42–45.

13. Sun, Zhang, et al. Actuator fault diagnosis of autonomous underwater vehicle based on improved Elman neural network. J Cent South Univ 2016; 23(4): 808–816.

14. Tang, Gang. An intelligence fusion system for ship fault-tolerant control. IFAC Proc Vol 2004; 37(10): 83–87.

15. Shi, Zhang. Software fault diagnosis model of AUV based on Bayesian Networks and its simplified method. In: The 9th World Congress on intelligent control and automation, Taipei, Taiwan, 21–25 June 2011, pp. 122–132. New York: IEEE.

16. Das, Birant. GASEL: genetic algorithm-supported ensemble learning for fault detection in autonomous underwater vehicles. Ocean Eng 2023; 272: 113844.

17. Nascimento, Valdenegro-Toro. Modeling and soft-fault diagnosis of underwater thrusters with recurrent neural networks. IFAC-PapersOnLine 2018; 51(29): 80–85.

18. Costa. Fault-induced transient detection based on real-time analysis of the wavelet coefficient energy. IEEE Trans Power Deliv 2014; 29(1): 140–153.

19. Demetgui, Yildiz, Taskin, et al. Fault diagnosis on material handling system using feature selection and data mining techniques. Measurement 2014; 55: 15–24.

20. Sang, Woen, Suk, et al. Enhanced convolutional neural network for in situ AUV thruster health monitoring using acoustic signals. Sensors 2022; 22(18): 7073.

21. Xia, Zhou, Shi, et al. A fault diagnosis method with multi-source data fusion based on hierarchical attention for AUV. Ocean Eng 2022; 266: 112595.

22. Moghaddam, Chen, Deshmukh. A neuroinspired computational model for adaptive fault diagnosis. Expert Syst Appl 2020; 140: 112879.

23. Subha, Subash, Jane, et al. Autonomous under water vehicle based on extreme learning machine for sensor fault diagnostics. Mater Today Proc 2020; 24(2): 2394–2402.

24. Chaos, Moreno-Salinas, Aranda, et al. Fault-tolerant control for AUVs using a single thruster. IEEE Access 2022; 10: 22123–22139.

25. Shao, Lin, Min, et al. Improved semi-supervised prototype network for cross-domain fault diagnosis of gearbox under out-of-distribution interference samples. J Mech Eng 2024; 60(4): 212–221.

26. Liao, Huang, et al. Deep semisupervised domain generalization network for rotary machinery fault diagnosis under variable speed. IEEE Trans Instrum Meas 2020; 69(10): 8064–8075.

27. Lin, Shao, Min, et al. Cross-domain fault diagnosis of bearing using improved semi-supervised meta-learning towards interference of out-of-distribution samples. Knowl-Based Syst 2022; 252: 109493.

28. Tang, Wang, Zhou, et al. Multi-scale recursive semi-supervised deep learning fault diagnosis method with attention gate. Machines 2023; 11(2): 153.

29. Liu, Cui, Liu, et al. Research progress of underwater detection based on ocean ambient noise. Digit Ocean Underwater Warfare 2021; 5(6): 518–523.

30. Tian, Wang, Liu, et al. Thruster fault diagnostics and fault tolerant control for autonomous underwater vehicle with ocean currents. Machines 2022; 10(7): 582.

31. Feng, Zhang, Piao, et al. Traffic anomaly detection based on spatio-temporal hypergraph convolution neural networks. Phys A Stat Mech Appl 2024; 646: 129891.

32. Zhang, Tan, Wei. Exploring high-order correlation for hyperspectral image denoising with hypergraph convolutional network. Signal Process 2025; 227: 109718.

33. Lei, Chen, Luo, et al. AHFormer: hypergraph embedding coding transformer and adaptive aggregation network for intelligent fault diagnosis under noise interference. Adv Eng Inform 2024; 611: 102518.

34. Zhao, Zhu, Liu, et al. Model-assisted multi-source fusion hypergraph convolutional neural networks for intelligent few-shot fault diagnosis to electro-hydrostatic actuator. Inf Fusion 2024; 104: 102186.

35. Zhang, et al. Graph Batch Coarsening framework for scalable graph neural networks. Neural Netw 2025; 183: 106931.

36. Zhou, et al. The emerging graph neural networks for intelligent fault diagnostics and prognostics: a guideline and a benchmark study. Mech Syst Signal Process 2022; 168: 108653.

37. Zhang. HCGCCDA: prediction of circRNA-disease associations based on the combination of hypergraph convolution and graph convolution. J Comput Sci 2023; 74: 102176.

38. Yang, Liu, Sun, et al. PACL: piecewise arc cotangent decay learning rate for deep neural network training. IEEE Access 2020; 8: 112805–112813.

39. Wang, Xue. An adaptive model for time-varying speed fault diagnosis under strong noise interference. J Mech Sci Technol 2024; 38(6): 2831–2844.

40. Yang, Wang, et al. Fault diagnosis of AUV propulsion system based on multi-scale dilated convolutional neural network. In: 2024 Prognostics and system health management conference (PHM), Stockholm, Sweden, 28–31 May 2024, pp. 398–404. New York: IEEE.

41. Xia, Zhou, Shi, et al. A fault diagnosis method based on attention mechanism with application in Qianlong-2 autonomous underwater vehicle. Ocean Eng 2021; 233: 109049.

42. Huang, et al. Higher-order smoothness enhanced graph collaborative filtering. IEEE Trans Big Data 2024; 10(6): 731–741.

43. Yan, Shao, Xiao, et al. Semi-supervised fault diagnosis of machinery using LPS–DGAT under speed fluctuation and extremely low labeled rates. Adv Eng Inform 2022; 53: 101648.

44. Xu, Hu, Leskovec, et al. How powerful are graph neural networks? arXiv preprint arXiv:1810.00826, 2019.