Physical education teaching quality evaluation system integrating cluster analysis and neural network

Abstract

This study proposes a hybrid Physical Education (PE) teaching quality evaluation system that integrates a K-medoids clustering algorithm with an enhanced CNN-LSTM neural network. Traditional evaluation methods are often subjective and inconsistent, failing to capture the complex, time-varying nature of student behavior in PE classes. The proposed model preprocesses historical classroom data through feature correlation analysis and PCA-based dimensionality reduction, followed by K-medoids clustering to improve data structure and training efficiency. It then applies Ensemble Empirical Mode Decomposition (EEMD) to enhance the input representation of the LSTM model. Experimental results showed that the improved CNN-LSTM achieved an F1 value of 0.98 and an RMSE of 0.11 with 1000 training samples, significantly outperforming baseline models including CNN, LSTM, and GRNN. The model showed a peak accuracy of 97.6% during the 10:00–12:00 time slot, with an average recall of 90.4% across varied student states. User evaluation by PE teachers indicated an average satisfaction score of 93.7. The results indicate that the model handles nonlinear and non-stationary classroom data effectively and can achieve real-time, objective, and personalized evaluation. Future improvements include expanding the dataset with classroom sensor data and incorporating cognitive and emotional engagement indicators.

Keywords

big data; cluster analysis; teaching evaluation; LSTM; CNN

Introduction

The development of big data has led to the gradual application of technologies such as artificial intelligence in campus construction. In the education environment, utilizing and analyzing big data can significantly improve the quality of teaching.¹ Traditional Physical Education (PE) evaluation methods rely heavily on teachers’ subjective judgments and accumulated experience, often resulting in inconsistencies, low reliability, and inefficiencies. These methods typically lack standardized evaluation criteria and fail to account for the multi-dimensional aspects of students’ physical performance, learning engagement, and health conditions. Such limitations hinder the ability to deliver fair, timely, and data-driven feedback in PE settings.² Accordingly, it is crucial to develop an efficient and accurate Physical Education Teaching Quality Evaluation Model (PETQM) by combining modern data analysis techniques and machine learning algorithms. Some scholars have conducted research on teaching quality, but many have overlooked the impact of big data on teaching quality in their surveys and studies. The models used are also relatively simple and cannot accurately evaluate the state of students. To address these shortcomings, this study introduces an intelligent PETQM that integrates K-medoids clustering and an improved Long Short-Term Memory (LSTM) neural network. The clustering algorithm is used to filter and group large-scale historical classroom data, thereby enhancing the representation of key features. This data is then fed into a deep learning model that captures temporal patterns and nonlinear dependencies, improving prediction accuracy and evaluation fairness. The combination of unsupervised clustering and sequential modeling enables the system to adapt to individual student differences and dynamic classroom contexts. These capabilities are essential for building an objective, scalable, and accurate evaluation framework in modern PE environments.
To enhance the clarity of the proposed framework, the core technical components are briefly introduced. K-center clustering, especially K-medoid clustering, is an unsupervised learning algorithm used to group similar data points by selecting actual observations as clustering centers. In this study, it helps filter and organize large-scale PE data into meaningful patterns before prediction. LSTM is a recurrent neural network well-suited for processing sequential data, such as time-series performance or engagement trends in PE classes. It captures temporal dependencies and variations in students’ learning behaviors. CNNs are deep learning models typically used for image or feature extraction. CNNs are adapted to enhance the input feature representation before feeding into the LSTM, improving the model’s ability to detect complex patterns in multi-dimensional PE data. The contribution of the research lies in its ability to enable teaching managers, teachers, and students to timely and accurately grasp the real teaching situation and continuously improve teaching quality. The research content has four parts. The first provides an overview about the teaching quality. The second briefly describes the algorithm. Next is the results obtained through algorithm research, and the results are analyzed. The fourth section presents the research content and research direction.

Related works

The quality of teaching has crucial impacts on the classroom learning quality, which is not only limited to students’ concentration and understanding ability in the classroom, but also directly related to their learning outcomes and motivation. Wang proposed a distance-based IVIF-CODAS method based on the traditional CODAS to fill the new demand for English language professionals in response to economic globalization. This method objectively evaluated the English teaching quality in colleges. It indicated that this method could effectively reduce subjective randomness, improve evaluation accuracy, and provide strong support for cultivating English major composite talents that adapt to the economic and social development.^3–5 Hou proposed an improved gradient descent method based on BPNN to address the shortcomings of online education evaluation models in handling small-scale datasets. Meanwhile, a deep neural network model was proposed, which utilized support vector regression to process complex high-dimensional data. It performed well in evaluating large-scale data.⁶ Yuan et al. proposed a strategy based on facial feature recognition to objectively evaluate the teaching quality of online classrooms. The system utilized an improved multitasking Convolutional Neural Network (CNN) to recognize facial features of students. By combining the optical AlexNet classification of the Ghost module, eye and mouth states were detected. The fatigue analysis was performed using the PERCLOS index. The system could also estimate student posture and learning concentration. It could assess the teaching quality of online courses.⁷ Guo analyzed the PE quality evaluation. A universal PETQM based on expert systems, knowledge bases, and fuzzy set thinking was proposed. This method emphasized flexibility, fairness, and interactivity. The research results indicated that this evaluation system was of great significance for improving the PETQM.⁸ Liu et al. combined deep CNN and weighted Naive Bayes algorithm to overcome the defects of PETQM and improve the evaluation comprehensiveness and accuracy. The correlation probability of class attributes was used to estimate the weight of each evaluation feature, achieving objective, realistic, and comprehensive teaching quality evaluation. It could promote the standardization and comprehensibility of teaching evaluation, improving the quality of teaching.⁹ Zhou et al. proposed an intelligent PE classroom model based on the Internet of Things, cloud computing, and big data technology to improve the quality of PE teaching in universities. The research results indicated that this model effectively activated the atmosphere of PE classrooms, stimulated students’ learning interests and positive attitudes, helped them complete pre-class preparation, classroom teaching, and post class practice, and cultivated lifelong sports habits.¹⁰

Massive scholars have carried out detailed research on clustering algorithms. Ushakov et al. proposed a parallel distributed primal dual heuristic strategy to solve the k-medoids clustering problem on large-scale datasets. This algorithm combined effective parallel subgradient column generation and high-quality core selection techniques, significantly reducing computational burden and memory requirements by approximating dissimilarity matrices. Experimental results showed that this algorithm could approach the optimal solution and achieve almost linear parallel acceleration on large-scale datasets.¹¹ Chen et al. proposed a parallel adaptive reliability analysis strategy based on importance sampling and K-medoids clustering to overcome the challenges in Kriging-based adaptive structural reliability analysis. It evaluated candidate samples through global convergence conditions and optimal importance sampling functions, and implemented parallel operations using clustering algorithms. It could effectively improve computational efficiency, balance accuracy with reduced iteration times, and verify its effectiveness and robustness.¹² Zhang et al. proposed a K-medoids fast collaborative spectrum sensing method based on Riemannian distance to improve spectrum utilization efficiency in cognitive radio networks. The sensing data of single and multiple antenna secondary users was fused. This method could effectively identify the status of the main user. Experimental results showed that the proposed method reduced computational complexity while ensuring convergence. The effectiveness was validated under different conditions.¹³ Sado proposed a method based on the raw materials of different qualities and performance verification to develop high-quality and low-cost MgO-C refractory materials to cope with corrosion and reduce costs. 
By using principal component analysis and K-medoids algorithm to screen formulas, and testing them in a medium frequency induction furnace, MgO-C materials with similar corrosion resistance could be selected.¹⁴ To solve the poor communication quality, chain prone to interruption and low data transmission reliability caused by frequent changes in the topological structure of vehicular AD hoc networks, Shu et al. proposed an adaptive K-medoids algorithm based on greedy peripheral stateless routing in this paper for multi-path vehicle networking communication in urban scenarios. The research results showed that this algorithm could form high-quality link communication, improved the reliability of data transmission, and had good performance and strong applicability.¹⁵

To sum up, although numerous scholars have applied machine learning and clustering methods in educational evaluation, few have directly focused on PE teaching scenarios, which involve complex temporal, behavioral, and physiological data. Most existing studies emphasize general classroom environments, where facial expressions, text-based assessments, or survey data dominate. These approaches often lack sensitivity to the non-linear, sequential, and multi-dimensional nature of PE data. In addition, previous models tended to use clustering for data grouping or neural networks for prediction but rarely explored the synergistic integration of the two. For example, although models based on BPNN and CNN have improved their generalization ability to structured data, they cannot effectively capture sequence dependencies. In addition, pure LSTM models without prior feature screening may suffer from noise and redundancy in large datasets. Therefore, this study proposes a hybrid PETQM framework that first employs K-medoids clustering to reduce feature redundancy and enhance interpretability, and then applies an improved CNN-LSTM network for precise sequential prediction. This integration not only improves accuracy, but also enhances adaptability to PE-specific data such as motion records, exercise performance, and engagement states.

Design of PETQM integrating cluster analysis and neural network

The first section proposes a filtering model based on the K-medoids clustering algorithm, which can filter information in big data and optimize historical classroom data participating in training, thus preliminarily judging the teaching quality. The second section proposes an evaluation model based on LSTM and introduces CNN to improve the model.

Teaching data processing model based on cluster analysis

PE teaching has its particularity, which is not the same as teaching in other disciplines. Therefore, big data generated in PE is subjected to data processing. These data include students’ physical indicators, exercise performance, classroom performance, and health status. Processing and analyzing these data can better analyze the learning and physical conditions, thereby guiding teaching practice and improving teaching effectiveness. Correlation analysis is applied to determine the correlation among variables in a dataset. It is usually used to discover the interrelationships, patterns, or trends between variables to better make predictions or decisions. It includes two categories: correlation analysis and regression analysis. The former aims to measure the linear correlation between two or more variables. Correlation analysis can determine whether there is a correlation in variables, as well as the direction and strength of the correlation. Regression analysis aims to develop a mathematical relationship model in one or more independent variables and a dependent variable. This relationship can be linear or non-linear. Through regression analysis, independent variable information can be used to predict the dependent variable value, or to discuss the impact of independent variables on the dependent variable. The study uses correlation analysis to analyze the correlation between historical teaching data and influencing factors. The Pearson Correlation Coefficient (PCC) analysis method is used to screen the main factors affecting teaching quality. This method is a statistical method that measures the strength and direction of the linear relationship between two variables and is commonly used to explore the degree of correlation between two continuous variables. PCC is the quotient of covariance and standard deviation between two groups of variables. Its expression is shown in equation (1).

ρ_{X, Y} = \frac{cov (X, Y)}{σ_{X} σ_{Y}} = \frac{\sum (X - \bar{X}) (Y - \bar{Y})}{\sqrt{\sum {(X - \bar{X})}^{2} \sum {(Y - \bar{Y})}^{2}}}

(1)

In equation (1), X and Y represent random variables, σ_{X} and σ_{Y} stand for the standard deviations of X and Y, and cov (X, Y) represents the covariance between X and Y. The Pearson coefficient ranges from −1 to 1. A value close to 0 indicates a weak correlation, whereas a value close to −1 or 1 indicates a strong one. PCC analysis requires that the standard deviation of neither variable is 0; otherwise, the PCC is undefined.¹⁶ The variables being analyzed must also meet several conditions. First, they follow a unimodal normal distribution. Second, the two groups of variables have the same dimension, and their observations are independent of one another. Third, there is a continuous linear relationship between each group of variables. Many influencing factors are selected for PCC analysis, and the raw data is abundant. Feeding all of this data directly into the clustering model slows computation, lowers efficiency, and degrades the clustering results. Therefore, the original data is subjected to dimensionality reduction.¹⁷ The study uses Principal Component Analysis (PCA) to reduce the dimensionality of the original data. PCA is adopted in this study to address two challenges in processing PE data: high dimensionality and feature redundancy. The original teaching data contains many influencing factors, including students’ physiological indicators, behavioral performance, and engagement signals. These variables are often correlated or partially overlapping, which increases the computational complexity of clustering and affects the convergence of subsequent models. Therefore, PCA is introduced to reduce the dimensionality of the feature space and extract the most informative principal components. The application process of PCA includes the following steps. Firstly, the original feature data matrix is standardized to ensure that all indicators are on the same scale. Then, the covariance matrix of the standardized data is constructed to reflect the relationships between different variables. 
Eigenvalues and eigenvectors of the covariance matrix are calculated. Each eigenvalue represents the variance captured by its corresponding eigenvector. The principal components are sorted according to the magnitude of eigenvalues. The data is then projected onto the selected eigenvectors to form the reduced feature set, which is used as the input of the K-medoids clustering model. The principle of this method is to find a vector basis that maximizes the variance of the projected values of the data in the dataset on that vector basis. The process is shown in equation (2).

Y_{l} = \begin{cases} Y_{1} = μ_{11} x_{1} + μ_{12} x_{2} + μ_{13} x_{3} + \dots + μ_{1 n} x_{n} \\ Y_{2} = μ_{21} x_{1} + μ_{22} x_{2} + μ_{23} x_{3} + \dots + μ_{2 n} x_{n} \\ ⋮ \\ Y_{n} = μ_{n 1} x_{1} + μ_{n 2} x_{2} + μ_{n 3} x_{3} + \dots + μ_{n n} x_{n} \end{cases}

(2)

In equation (2), Y_{l} represents the composite variable. Y_{1}, Y_{2}, and Y_{n} represent the first, second, and n-th dominant factors of the original data. The correlation coefficient between any two dominant factors is 0. From Y_{1} to Y_{n}, the amount of original information carried by the corresponding composite variable decreases as the index increases. When the number of dominant factors equals the number of original variables, the dominant factors can be considered to reflect all the information in the initial data.¹ The specific steps for dimensionality reduction of raw data using the PCA method are shown in Figure 1.

Figure 1.

PCA dimensionality reduction process for data.
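The PCC screening and the PCA steps described above (standardization, covariance matrix, eigen-decomposition, sorting, projection) can be sketched roughly as follows. This is a minimal illustration rather than the study's implementation: the function names, the `threshold=0.3` screening cut-off, the `min_cumulative=0.85` retention target, and the synthetic data are all assumptions introduced here, and the sketch assumes no feature is constant.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient, following equation (1)."""
    dx = np.asarray(x, dtype=float) - np.mean(x)
    dy = np.asarray(y, dtype=float) - np.mean(y)
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

def screen_by_pcc(X, target, threshold=0.3):
    """Indices of columns whose |PCC| with the target reaches the assumed threshold."""
    return [j for j in range(X.shape[1]) if abs(pearson(X[:, j], target)) >= threshold]

def pca_reduce(X, min_cumulative=0.85):
    """PCA by eigen-decomposition; rows are samples, columns are features."""
    X = np.asarray(X, dtype=float)
    # Step 1: standardize each feature to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 2: covariance of the standardized data (equal to the correlation matrix).
    R = np.cov(Z, rowvar=False)
    # Step 3: eigen-decomposition; eigh is intended for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(R)
    # Step 4: sort components by decreasing eigenvalue.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Share of variance per component and its running total.
    contribution = eigvals / eigvals.sum()
    cumulative = np.cumsum(contribution)
    # Keep the smallest number of components that reaches the target share.
    m = min(int(np.searchsorted(cumulative, min_cumulative)) + 1, len(eigvals))
    # Step 5: project the standardized data onto the first m eigenvectors.
    scores = Z @ eigvecs[:, :m]
    return scores, contribution, m

# Synthetic data, made up purely for illustration.
rng = np.random.default_rng(0)
n = 200
base = rng.normal(size=(n, 2))
X = np.column_stack([
    base[:, 0],
    2.0 * base[:, 0] + 0.01 * rng.normal(size=n),  # nearly redundant with column 0
    base[:, 1],
    rng.normal(size=n),                             # unrelated to the target
])
target = base[:, 0] + base[:, 1] + 0.1 * rng.normal(size=n)

kept = screen_by_pcc(X, target)                     # weakly correlated columns are dropped
scores, contribution, m = pca_reduce(X[:, kept])    # redundant columns collapse into fewer components
```

In this toy setting the unrelated column is screened out by the PCC step, and the two nearly identical columns are merged by PCA, which is the kind of redundancy removal the text attributes to the two stages.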

From Figure 1, firstly, the PE teaching feature data is composed into the original data matrix and standardized. Then, the covariance matrix of the original data matrix is displayed in equation (3).¹⁸

\begin{cases} R = {(r_{i j})}_{n \times n} \\ r_{i j} = \frac{cov (x_{i}, x_{j})}{\sqrt{Var (x_{i}) Var (x_{j})}} \end{cases}

(3)

In equation (3), R signifies the covariance matrix of the standardized data. cov (x_{i}, x_{j}) signifies the covariance between variables x_{i} and x_{j} in the original matrix, and Var (x_{i}) represents the variance of each variable.¹⁹ The eigenvectors and eigenvalues of the covariance matrix are calculated. Next, the variance contribution rate of each principal component eigenvalue is given by equation (4).

a_{k} = \frac{λ_{k}}{\sum_{i = 1}^{n} λ_{i}}, k = 1, 2, \dots, n

(4)

In equation (4), a_{k} stands for the variance contribution rate of the k-th principal component, and λ_{k} represents the corresponding principal component eigenvalue. Then, the main factors are sorted based on their variance contribution rate. The cumulative contribution rate of the main factors is displayed in equation (5).

M = \sum_{i = 1}^{m} a_{i} = \frac{\sum_{i = 1}^{m} λ_{i}}{\sum_{i = 1}^{n} λ_{i}}

(5)

In equation (5), M represents the cumulative contribution rate, m stands for the number of dominant factors retained, and the summed terms are the variance contribution rates defined in equation (4). PCA data dimensionality reduction can improve computational efficiency with minimal loss of feature information in the original data. The final factor score is shown in equation (6).^20,21

Y_{k} = v_{k}^{'} X

(6)

In equation (6), Y_{k} stands for the score of the influencing factors on the k-th principal component, obtained by projecting the data X onto the corresponding eigenvector v_{k}. K-medoids is a clustering algorithm and a variant of K-means. Unlike K-means, K-medoids does not represent a cluster by the mean of its members, but instead selects actual observed data points as representatives, which are called medoids. K-medoids partitions data points by minimizing the sum of distances from the data points to their respective medoids. Compared with other clustering algorithms, K-medoids is particularly suitable for processing noisy, non-Gaussian, and high-dimensional educational data, as it selects actual data samples as cluster centers instead of relying on means. This is a key advantage over K-means, which is highly sensitive to outliers and assumes separation based on Euclidean centroids, making it less robust in sports datasets with uncommon features. K-medoids is selected due to its ability to preserve data integrity, reduce sensitivity to extreme values, and generate cluster centers that correspond to real student samples. These properties ensure more meaningful feature grouping before LSTM modeling, thereby enhancing prediction performance and model generalization in varied classroom contexts. The clustering process is displayed in Figure 2.

Figure 2.

Clustering of K-medoids.

In Figure 2, firstly, samples are selected as cluster centers in the dataset. Then, the distance between other data in the dataset and the current cluster center is obtained. Each sample point is distributed with its nearest cluster center. The calculation is shown in equation (7).

d (x, y) = \sqrt{{(x_{1} - y_{1})}^{2} + {(x_{2} - y_{2})}^{2} + \dots + {(x_{n} - y_{n})}^{2}} = \sqrt{\sum_{i = 1}^{n} {(x_{i} - y_{i})}^{2}}

(7)

In equation (7), x_{i} represents the i-th coordinate of the current cluster center, and y_{i} stands for the corresponding coordinate of the sample point. Within each cluster, every non-center sample point is then tried as the cluster center: the sum of its distances to the other points is computed, and the point with the smallest sum becomes the new cluster center. This is repeated until the cluster centers no longer change or the maximum number of iterations is reached, which determines the clustering result.²² In this study, raw classroom data includes a range of multi-dimensional features such as student physical fitness metrics, behavioral engagement indicators, and psychological states inferred from survey data. Before clustering, these heterogeneous data types are standardized to ensure numerical comparability. Next, a PCC analysis is applied to eliminate highly redundant or weakly correlated features. To further reduce computational complexity, PCA is used to project the high-dimensional features onto a lower-dimensional orthogonal space, preserving the majority of data variance with fewer components. The refined dataset is then clustered using the K-medoids algorithm, which selects actual student records as cluster centers rather than relying on mean values as K-means does. The algorithm initializes by randomly choosing k representative samples. Each data point is assigned to the nearest medoid based on Euclidean distance. The medoid is iteratively updated by minimizing the total intra-cluster distance until convergence or a preset iteration threshold is reached. This process effectively groups students with similar physical and behavioral traits, allowing the model to tailor subsequent learning predictions accordingly. Once clustering is complete, each sample is labeled with its cluster ID and associated dominant features. These cluster-augmented samples are then fed into the LSTM prediction network. 
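The assign-and-update loop described above can be sketched as follows. This is a minimal illustration of the alternating K-medoids variant (not the swap-based PAM search, and not necessarily the study's exact procedure); the function name `k_medoids`, its parameters, and the toy points are assumptions made here.

```python
import numpy as np

def k_medoids(X, k, max_iter=100, seed=0):
    """Alternating K-medoids: assign to nearest medoid, then re-pick each medoid."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances, equation (7).
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    rng = np.random.default_rng(seed)
    # Random initialization with k distinct samples as medoids.
    medoids = rng.choice(len(X), size=k, replace=False)
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest medoid.
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid if a cluster happens to be empty
            # Update step: the member with the smallest total distance
            # to the other members becomes the new medoid.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break  # medoids no longer change
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels

# Toy data: two well-separated groups of five points each (made up).
X = np.array([[i, i] for i in range(5)] + [[50 + i, 50 + i] for i in range(5)], dtype=float)
medoids, labels = k_medoids(X, k=2)
```

Because each medoid is a row index of `X`, every cluster center corresponds to an actual sample, which is the property the text relies on when it says cluster centers correspond to real student records.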
The K-medoids preprocessing ensures that the network receives noise-filtered, structure-enhanced input, which improves training stability, reduces over-fitting, and enhances generalization across student types and class contexts. In the K-medoids clustering model, the number of clusters significantly affects the quality and granularity of the classification results. A value that is too small may lead to overgeneralization, while a value that is too large can cause over-fitting or excessive fragmentation of meaningful patterns. Therefore, to obtain the optimal clustering, an upper bound on the number of clusters is set by an empirical formula, as shown in equation (8).

C_{\max} = 2 \ln (N)

(8)

In equation (8), N stands for the number of sample points in the dataset. After the range of cluster numbers is determined, the clustering algorithm is run to obtain the clustering results at convergence. 2²³ Once the upper bound is established, the model performs iterative clustering for each candidate number of clusters within a predefined range. The clustering quality is evaluated using the average silhouette coefficient, which measures how similar a sample is to its own cluster compared to other clusters. The number of clusters with the highest silhouette score is selected as the optimal number of clusters. This ensures that the final clustering configuration achieves both intra-cluster cohesion and inter-cluster separation, enhancing the interpretability and effectiveness of the data preprocessing stage. Then, the effective values of the clustering results are calculated. Based on the clustering analysis results, a prediction model is established and the effectiveness of clustering screening is verified. The study uses an LSTM to process data. The model structure is displayed in Figure 3.

Figure 3.

Architecture of the Km-LSTM-based preprocessing and prediction pipeline.

From Figure 3, the influencing factors are first determined through PCC analysis to obtain the teaching quality characteristics. These characteristics are then input into PCA for dimensionality reduction. Next, the clustering results are obtained through K-medoids clustering, and each result is input into the LSTM. The teaching data to be predicted is then input to obtain the processed data.
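The cluster-number selection described in this section (an upper bound from equation (8), then the candidate with the highest average silhouette coefficient) can be illustrated with the short sketch below. To keep it small, the candidate labelings are hard-coded stand-ins for K-medoids output; the function names, the truncation of equation (8) to an integer, and the toy points are assumptions made for illustration.

```python
import math
import numpy as np

def max_clusters(n_samples):
    """Upper bound on the cluster count, equation (8), truncated to an integer (assumed)."""
    return int(2 * math.log(n_samples))

def mean_silhouette(X, labels):
    """Average silhouette coefficient using Euclidean distance."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    clusters = np.unique(labels)
    scores = []
    for i in range(len(X)):
        own = labels == labels[i]
        if own.sum() == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the sample's own cluster.
        a = dist[i, own].sum() / (own.sum() - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Toy data: two well-separated groups of five points each (made up).
X = np.array([[i, i] for i in range(5)] + [[50 + i, 50 + i] for i in range(5)], dtype=float)

# Hypothetical labelings for each candidate k, standing in for clustering output.
candidates = {
    2: [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    3: [0, 0, 0, 1, 1, 2, 2, 2, 2, 2],
    4: [0, 0, 0, 1, 1, 2, 2, 2, 3, 3],
}

c_max = max_clusters(len(X))
scores = {k: mean_silhouette(X, lab) for k, lab in candidates.items() if k <= c_max}
best_k = max(scores, key=scores.get)
```

Splitting a compact group into sub-clusters lowers the silhouette of its members sharply, so the two-cluster labeling scores highest on this data.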

PETQM based on improved LSTM network

After processing teaching data using the Km-LSTM model, the teaching features are decomposed using Ensemble Empirical Mode Decomposition (EEMD) to generate new feature components as new training features.²⁴ EEMD is an extension of Empirical Mode Decomposition (EMD). It reduces the inherent mode-mixing problem in EMD. Modal mixing refers to the possibility that the Intrinsic Mode Function (IMF) obtained from empirical mode decomposition may be contaminated by the components of different oscillation modes. EEMD solves this problem by adding white noise to the signal and repeating the EMD process multiple times. EMD is an adaptive method that can automatically adjust the decomposition process based on local features of data. This makes the method very effective in handling nonlinear and non-stationary data. Unlike traditional frequency based decomposition methods, EMD does not require prior knowledge or assumptions about the frequency characteristics of the signal. Therefore, it is possible to better process various types of signals, including nonlinear, and non-stationary signals. EMD captures the local features and nonlinear components of a signal by decomposing it into multiple local feature components. Each component represents a scale or frequency range in the signal, better preserving the local structure of the signal. This generates a set of IMFs. The average value of these IMFs often reduces modal mixing.²⁵ Its expression is shown in equation (9).

s (t) = \sum_{i = 1}^{N} c_{i} (t) + r (t)

(9)

In equation (9), s (t) represents the original teaching data, r (t) stands for the residual component, and c_{i} (t) stands for the i-th intrinsic mode function. In the PE evaluation, the data collected, such as student engagement patterns, physical performance metrics, and classroom behavior sequences, are typically non-linear and non-stationary. These characteristics pose significant challenges to conventional predictive models, which often assume stable distributions or fixed temporal dynamics. Therefore, the study employs EEMD as a signal preprocessing technique prior to neural network modeling. EEMD is an advanced time-frequency analysis method designed to decompose complex signals into a series of IMFs. Compared with traditional decomposition methods, EEMD introduces white noise perturbation and multiple ensemble iterations to mitigate the mode-mixing problem inherent in classic EMD. This can extract more stable and distinguishable oscillation components from the original teaching data. The significance of EEMD lies in its ability to capture hidden temporal fluctuations, irregular learning behaviors, and micro-trends in PE learning sequences. This leads to improved model robustness, reduced residual noise, and enhanced generalization, particularly when applied to small or heterogeneous datasets in real-world PE scenarios. The process of EEMD decomposition is shown in Figure 4.

Figure 4.

EEMD process.
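The ensemble procedure summarized in Figure 4 is commonly written as below. The notation is an assumption introduced here and does not come from the study: M is the number of ensemble trials, ε the white-noise amplitude, w_m(t) the m-th white-noise realization, and c_{i,m}(t), r_m(t) the IMFs and residual obtained by applying EMD to the m-th noisy copy. The sketch also assumes each trial yields the same number N of IMFs.

```latex
\begin{aligned}
s_{m}(t) &= s(t) + \varepsilon\, w_{m}(t), \qquad m = 1, \dots, M \\
s_{m}(t) &= \sum_{i=1}^{N} c_{i,m}(t) + r_{m}(t) \\
c_{i}(t) &= \frac{1}{M} \sum_{m=1}^{M} c_{i,m}(t), \qquad
r(t) = \frac{1}{M} \sum_{m=1}^{M} r_{m}(t)
\end{aligned}
```

Averaging over trials cancels most of the added noise, so the averaged c_{i}(t) and r(t) play the roles of the components in equation (9).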

In Figure 4, firstly, the parameters of the EEMD are set: the threshold for the amplitude of white noise and the number of ensemble trials are initialized. White noise is added to the data sequence. The data with added white noise is subjected to EMD to obtain the intrinsic mode components. These steps are repeated until the preset number of trials is reached.^26,27 Finally, the mean of the IMF components over all trials is calculated and taken as the decomposition result of EEMD. The decomposed IMF and residual components are used as inputs for the prediction model and are then fed into the LSTM network structure for further prediction.^28,29 The LSTM is displayed in Figure 5.

Figure 5.

LSTM structure diagram.

From Figure 5, the LSTM neuron structure is based on the RNN with added gating, including three parts. Compared with the RNN, LSTM has an additional transfer state, giving two transfer states, namely, the long-term state and the short-term state.³⁰ Considering that LSTM can process continuous time series, while CNN can obtain feature information from high-level data, CNN is used to improve the LSTM. CNN extracts effective features from high-dimensional datasets to obtain teaching quality feature vectors, which are then input into the LSTM network to complete prediction. In LSTM, the update gate controls how much information is retained from the Hidden State (HS) of the previous time step to update the current time step’s HS. The output of the update gate is in [0,1], indicating how much information from the previous time step needs to be retained. The update gate is shown in equation (10).

z_{t} = σ (W_{z} \cdot [h_{t - 1}, x_{t}])

(10)

In equation (10), z_{t} stands for the update gate, σ denotes the sigmoid activation, W_{z} stands for the model parameters, h_{t - 1} represents the HS of the previous time step, and x_{t} represents the current time step’s input. The reset gate controls how the HS of the previous time step is combined with the current input to generate candidate values for the updated HS. The output of the reset gate is between 0 and 1, indicating how much of the previous HS enters the candidate value. The reset gate is shown in equation (11).¹⁷

r_{t} = σ (W_{r} \cdot [h_{t - 1}, x_{t}])

(11)

In equation (11), r_{t} stands for the reset gate, W_{r} stands for the model parameters, and x_{t} stands for the input of the current time step. The candidate HS is calculated by applying tanh to a linear combination of the reset-gated previous HS and the current input, as shown in equation (12).³¹

{\tilde{h}}_{t} = \tanh (W_{h} \cdot [r_{t} ⊙ h_{t - 1}, x_{t}])

(12)

In equation (12), W_{h} stands for the model parameters, and ⊙ stands for element-wise multiplication. The HS of the current time step, controlled by the update gate, is shown in equation (13).

h_{t} = (1 - z_{t}) ⊙ h_{t - 1} + z_{t} ⊙ {\tilde{h}}_{t}

(13)
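Equations (10)-(13) can be traced with a few lines of NumPy. The sketch below implements exactly those four equations, which share their form with the standard gated recurrent unit (GRU) update, rather than a full LSTM cell with a separate cell state. Bias terms are omitted as in the equations; the function names, layer sizes, and random weights are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gated_step(h_prev, x_t, W_z, W_r, W_h):
    """One recurrent step following equations (10)-(13), without bias terms."""
    concat = np.concatenate([h_prev, x_t])
    z_t = sigmoid(W_z @ concat)                                   # eq. (10): update gate
    r_t = sigmoid(W_r @ concat)                                   # eq. (11): reset gate
    h_cand = np.tanh(W_h @ np.concatenate([r_t * h_prev, x_t]))   # eq. (12): candidate HS
    return (1.0 - z_t) * h_prev + z_t * h_cand                    # eq. (13): new HS

# Hypothetical sizes and random weights, for illustration only.
hidden, inputs = 4, 3
rng = np.random.default_rng(0)
W_z, W_r, W_h = (rng.normal(scale=0.5, size=(hidden, hidden + inputs)) for _ in range(3))

h = np.zeros(hidden)
sequence = rng.normal(size=(6, inputs))   # six time steps of a made-up input series
for x_t in sequence:
    h = gated_step(h, x_t, W_z, W_r, W_h)
```

Since each new hidden state is a gate-weighted mix of the previous state and a tanh output, its entries stay within [-1, 1] when the initial state is zero.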

In equation (13), z_{t} represents the update gate, h_{t - 1} represents the HS of the previous time step, and {\tilde{h}}_{t} represents the candidate HS from equation (12). In the context of PE, many variables exhibit strong temporal dependencies. For instance, a student’s engagement level, physical performance, and class participation often fluctuate over time and are influenced by prior sessions. Capturing these trends is critical to understanding learning trajectories and predicting teaching effectiveness. To model such temporal patterns, this study employs LSTM networks, which are specifically designed to learn from sequential data. LSTM retains both short-term and long-term contextual information through its memory cells and gated mechanisms. This allows the model to account for how previous class states or behaviors impact current performance, making predictions more dynamic and responsive to change over time. When integrated within the CNN-LSTM structure, the LSTM layer receives high-level features extracted by CNN and processes them as time-ordered sequences. This can not only identify static behavior patterns but also track the progress and temporal changes across multiple classroom sessions or activity intervals. In the original training feature set, many of the screened influencing factors are only weakly correlated. Therefore, high correlation components are added to improve the model accuracy. The addition process is displayed in Figure 6.

Figure 6.

High correlation component selection process.

In Figure 6, the IMF components and the residual component are obtained through EEMD. Both kinds of components and the raw data are then subjected to PCC analysis, with the PCC used as the selection indicator. Components whose correlation coefficient exceeds the threshold are treated as high-correlation components and used for training, while the low-correlation components are filtered out.³²

In this study, CNN is introduced to enhance the feature representation capability of the LSTM model. LSTM networks are well-suited for capturing temporal dependencies and modeling time-series relationships in sequential data, such as students’ behavioral engagement over multiple class sessions. However, LSTM alone is limited in its ability to extract spatial correlations or local feature patterns from high-dimensional inputs, especially when the input includes multiple physical, behavioral, and psychological indicators. CNN excels at extracting localized and hierarchical features through convolution operations. By placing the CNN layer before the LSTM, the model first applies convolutional filters to capture salient patterns across input dimensions, such as co-occurring changes in physical performance and class participation. These feature maps are then flattened and passed to the LSTM layer, which models the sequential dynamics. The integration of CNN and LSTM allows the model to jointly exploit spatial structure and temporal evolution in PE data. CNN serves as a front-end encoder that transforms raw features into compact representations, while LSTM acts as a temporal decoder that learns long-term dependencies across time steps. This design improves model generalization, reduces training complexity, and enhances prediction accuracy in PE teaching evaluation.

Because EEMD is adaptive, the number of components it produces when decomposing a data sequence cannot be specified in advance. Therefore, a threshold needs to be set to filter the decomposed IMF components. The final model structure is shown in Figure 7.

Figure 7.

Overall prediction workflow of the PETQM framework integrating K-medoids, EEMD, and CNN-LSTM.

From Figure 7, the influencing factors to be predicted are first fed into the K-medoids clustering model for grouping. The clustered results are then fed into the LSTM prediction model to obtain preliminary predictions. These preliminary predictions are decomposed by EEMD into data components, which are screened to obtain the highly correlated feature components. Lastly, the selected features are fed into the CNN-LSTM to obtain the final prediction results.
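The high-correlation component selection described for Figure 6 can be sketched as a simple PCC filter. The example below does not run EEMD; it uses three hand-made signals as stand-ins for decomposed components, and both the function name `select_high_correlation` and the threshold of 0.3 are illustrative assumptions rather than values from this study.

```python
import numpy as np

def select_high_correlation(components, raw, threshold=0.3):
    """Return the indices of components whose |PCC| with the raw series
    exceeds the threshold, together with all PCC values."""
    components = np.asarray(components, dtype=float)
    raw = np.asarray(raw, dtype=float)
    pcc = np.array([np.corrcoef(c, raw)[0, 1] for c in components])
    keep = np.flatnonzero(np.abs(pcc) > threshold)
    return keep, pcc

# Stand-ins for EEMD output (assumed shapes, not real decomposition results).
t = np.linspace(0.0, 1.0, 200)
noise = 0.05 * np.sin(2 * np.pi * 45 * t)  # small high-frequency component
slow = np.sin(2 * np.pi * 2 * t)           # dominant oscillation
trend = 2.0 * t                            # residual-like trend
raw = noise + slow + trend

keep, pcc = select_high_correlation([noise, slow, trend], raw)
selected = np.stack([noise, slow, trend])[keep]  # components kept for training
```

In this toy case the small high-frequency component is filtered out, while the oscillation and the trend are kept as training inputs.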

Design of PETQM integrating cluster analysis and neural network

The first subsection compares the cluster-based data processing model with the General Regression Neural Network (GRNN), CNN, and LSTM in terms of F1 score and other metrics. The second subsection compares the PE teaching evaluation model with CNN and LSTM.

Performance analysis of teaching data processing model based on cluster analysis

The dataset used in this study is the Youth Behavioral Surveillance Survey (YBSS), a large-scale monitoring project led by the Centers for Disease Control and Prevention (CDC) in the United States. It aims to assess health-related behaviors among adolescents through nationwide questionnaire-based sampling. In this study, data related to PE and student health are extracted, covering the following dimensions: frequency of participation in PE courses and extracurricular physical activity; self-reported physical condition (e.g., sleep quality, body image, and exercise fatigue); engagement and satisfaction levels with school PE classes; and basic demographic information such as age, gender, and grade. The data consist of both ordinal and categorical variables, which are standardized before further processing. These indicators represent students’ behavior and health status in multiple dimensions, enabling a more comprehensive assessment of PE.

The experimental hardware consists of an Intel Core i5-8750H CPU and an NVIDIA GeForce GTX2080Ti GPU with 8 GB of graphics memory, together with 16 GB of system memory.

Evaluating the PETQM with the F1 score and Root Mean Square Error (RMSE) provides a more comprehensive assessment than basic accuracy alone. The F1 score reflects the model’s balance between precision and recall, which is particularly useful in educational evaluation, where both overestimation and underestimation of quality levels can mislead decision-making by administrators or teachers. A higher F1 score means that the system can not only correctly identify high-quality or low-quality sessions but also minimize false positives and false negatives in these judgments. Meanwhile, RMSE provides an interpretable indication of the typical prediction error magnitude, measured in the same units as the original quality score scale. In the context of PE teaching, a lower RMSE suggests that the model can reliably predict nuanced quality differences between instructional sessions. GRNN, CNN, and LSTM are introduced as comparison models. Figure 8 displays the results.
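The two metrics described above can be computed as in the following minimal sketch. The function names and the toy labels and scores are hypothetical and serve only to show the arithmetic; they are not data from this study, and the F1 score is shown for the binary case.

```python
import numpy as np

def f1_score_binary(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for binary labels."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = np.sum(y_true & y_pred)    # true positives
    fp = np.sum(~y_true & y_pred)   # false positives
    fn = np.sum(y_true & ~y_pred)   # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))

def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the scores."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy example: 1 = high-quality session, 0 = low-quality session.
f1 = f1_score_binary([1, 1, 1, 0, 0, 1], [1, 1, 0, 0, 1, 1])
# Toy quality scores on a hypothetical 0-100 scale.
err = rmse([80, 90, 70], [82, 87, 70])
```

Here three true positives, one false positive, and one false negative give precision and recall of 0.75 each, so the F1 score is 0.75; the score errors of -2, 3, and 0 give an RMSE of about 2.08.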

Figure 8.

Comparison of F1-score and RMSE values across different models as training set size increases.

Figure 8(a) shows the F1 values of each model, and Figure 8(b) shows the RMSE values of each model, as the size of the training set increases. In Figure 8(a), the F1 value of each model increased with the training set size. When the training set size was 1000, the F1 values of CNN, LSTM, GRNN, and Km-LSTM were 0.88, 0.90, 0.95, and 0.98, respectively. In Figure 8(b), the RMSE value of each model decreased as the training set increased. When the training set size was 1000, the RMSE values of CNN, LSTM, GRNN, and Km-LSTM were 0.27, 0.16, 0.13, and 0.11, respectively. The data are then divided into four datasets of increasing size, numbered from dataset 1 to dataset 4, and the processing time of each model is compared. The results are shown in Figure 9.

Figure 9.

Processing time comparison of four models across training and validation datasets.

Figure 9(a) shows the processing time of each model on the same validation set under different training set sizes. Figure 9(b) shows the processing time of each model on validation sets of different sizes under the same training set size. In Figure 9(a), the processing time of each model decreased as the training set size increased. On training set 4, the processing time of CNN, LSTM, GRNN, and Km-LSTM was 0.8 s, 1.0 s, 0.9 s, and 0.4 s, respectively. In Figure 9(b), the processing time of each model increased with the size of the validation set, and the proposed Km-LSTM showed the smallest increase among the four models. On validation set 4, the processing time of CNN, LSTM, GRNN, and Km-LSTM was 3.8 s, 2.5 s, 2.0 s, and 1.4 s, respectively. These results indicate that the Km-LSTM handles larger datasets efficiently. The performance at different iteration counts is then analyzed. The results are displayed in Figure 10.

Figure 10.

Accuracy and MAE comparison of four models across different iteration counts.

Figure 10(a) shows the accuracy, and Figure 10(b) shows the MAE values, of each algorithm at different iteration counts. In Figure 10(a), the accuracy of each model increased with the number of iterations. When the number of iterations reached 100, all models had converged, and the accuracy of CNN, LSTM, GRNN, and Km-LSTM was 0.76, 0.87, 0.92, and 0.98, respectively. In Figure 10(b), the MAE value of each algorithm decreased as the iterations increased. When the number of iterations was 100, the MAE values of CNN, LSTM, GRNN, and Km-LSTM were 0.55, 0.44, 0.34, and 0.21, respectively. Different student states are then selected from the dataset, and the comprehensive performance of the models is compared. Table 1 displays the results.

Table 1.

Comparative performance of four models under different student activity states.

Type	CNN			LSTM			GRNN			Km-LSTM		
	ACC (%)	FAR (%)	Time (s)	ACC (%)	FAR (%)	Time (s)	ACC (%)	FAR (%)	Time (s)	ACC (%)	FAR (%)	Time (s)
Conscientious	79.3	37.4	0.82	83.4	31.3	0.69	90.7	12.7	0.57	95.8	8.8	0.49
Speak	80.6	38.7	0.76	84.7	28.8	0.63	92.2	14.1	0.51	98.3	10.1	0.43
Discuss	77.5	34.4	0.88	80.3	24.3	0.75	87.5	9.5	0.63	94.4	5.6	0.51
Stare blankly	69.4	26.4	0.97	73.6	16.5	0.84	79.6	6.4	0.64	87.1	3.8	0.52
Doing exercises	80.7	37.6	1.24	84.4	27.3	0.85	90.4	11.2	0.65	96.2	5.4	0.53

In Table 1, ACC represents the recognition accuracy. FAR represents the false alarm rate of the model. According to Table 1, the CNN showed poor performance. The feature recognition accuracy for different student states was 79.3%, 80.6%, 77.5%, 69.4%, and 80.7%, respectively. It showed a high false alarm rate. The Km-LSTM had a high recognition accuracy and exhibited a low false alarm rate. The feature recognition accuracy of the Km-LSTM for different student states was 95.8%, 98.3%, 94.4%, 87.1%, and 96.2%, respectively, with false alarm rates of 8.8%, 10.1%, 5.6%, 3.8%, and 5.4%. The Km-LSTM had better performance than other models. Another important consideration is the scalability of the proposed model in real-world applications. Experimental results in this study demonstrate that the Km-LSTM and CNN-LSTM structures maintain stable accuracy and low error rates as the training set expands up to 1000 samples. Specifically, the F1-score and RMSE results indicate that the model continues to perform robustly with increasing data volume, and its processing time remains acceptable due to the dimensionality reduction and clustering mechanisms applied in preprocessing. This suggests that the model can be extended to larger PE datasets, such as those collected across multiple schools, semesters, or even districts. In addition, the modular design of the system features decoupled clustering and prediction stages, enabling it to adapt to different teaching environments, whether in secondary schools, universities, or professional training centers. For broader deployments, the framework can be integrated into cloud-based or edge-computing systems, where clustering and LSTM inference can be parallelized or distributed. This provides an opportunity for real-time and scalable evaluation of PE teaching in different educational backgrounds, while maintaining high prediction accuracy and processing efficiency.

In terms of computational efficiency, the proposed PETQM framework is designed to balance prediction accuracy with real-time applicability in PE scenarios. The framework consists of several stages: data preprocessing, clustering using K-medoids, signal decomposition through EEMD, and sequential prediction via CNN-LSTM. Each of these modules has its own computational footprint. The PCA component operates efficiently on medium-scale datasets and significantly reduces feature dimensionality, thereby speeding up downstream processing. Although K-medoids is computationally more intensive than simpler clustering algorithms like K-means, it offers better robustness to outliers and works well given that the number of clusters is relatively small and controlled. The EEMD module, while computationally demanding due to its ensemble nature, benefits from high parallelizability and can be executed efficiently on multi-core systems. The CNN-LSTM model, which integrates spatial and temporal feature extraction, adds modest computational overhead but ensures high-quality predictive performance. The empirical results demonstrate that the full system remains highly efficient in practice. The average processing time per sample stays well below one second, even when scaling up training iterations or data volume. This indicates that PETQM is not only accurate but also computationally feasible for deployment in real-time or near-real-time PE evaluation applications.
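The robustness of K-medoids to outliers mentioned above follows from the fact that each cluster centre must be an actual data point. The sketch below uses the simple alternating (assign, then update) variant of K-medoids rather than the full PAM swap search; the function name, the random initialization, and the toy data are assumptions for illustration and do not reproduce the clustering configuration of this study.

```python
import numpy as np

def k_medoids(X, k, n_iter=100, seed=0):
    """Alternating K-medoids: assign points to the nearest medoid, then move
    each medoid to the member with the smallest total distance to its cluster."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distance matrix
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    medoids = rng.choice(len(X), size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue  # keep the old medoid for an empty cluster
            costs = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[j] = members[np.argmin(costs)]
        if np.array_equal(new_medoids, medoids):
            break  # converged
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels

# Two well-separated groups of hypothetical 2-D feature vectors.
X2 = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
               [10, 10], [10, 11], [11, 10], [11, 11]])
medoids2, labels2 = k_medoids(X2, k=2)

# One cluster with an extreme value: the medoid stays on a typical point,
# whereas the mean (the K-means centre) is dragged toward the outlier.
X1 = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])
medoid1, _ = k_medoids(X1, k=1)
medoid_value = float(X1[medoid1[0], 0])
mean_value = float(X1.mean())
```

In the single-cluster example the medoid is the point 3.0, while the mean is 22.0, which illustrates why a medoid-based centre is less sensitive to outliers. The pairwise distance matrix costs memory quadratic in the number of samples, which is one reason the method is more expensive than K-means.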

Performance analysis of teaching quality evaluation model based on improved LSTM

The YBSS dataset is applied as the training set and the TIMSS dataset as the validation set to analyze the PETQM based on the improved LSTM. The results are shown in Figure 11.

Figure 11.

Performance of CNN-LSTM, LSTM, and CNN under different training dataset sizes.

Figure 11(a) displays the accuracy, and Figure 11(b) shows the F1 values. In Figure 11(a), the accuracy of all three models increased as the training set grew, and the proposed CNN-LSTM exceeded the other two models. When the training set size was 500, the accuracy of the CNN-LSTM, LSTM, and CNN was 0.96, 0.83, and 0.79, respectively. In Figure 11(b), the F1 value also continued to increase as the training set grew. When the dataset size was 500, the F1 values of the CNN-LSTM, LSTM, and CNN were 0.97, 0.90, and 0.76, respectively. The proposed CNN-LSTM achieved the highest accuracy and F1 value among the three models, showing that it can perform well on smaller datasets. The efficiency of the models in practical applications is displayed in Figure 12.

Figure 12.

Execution time of different models under increasing iterations and PE data volume.

Figure 12(a) displays the processing time of the three algorithms at various iteration counts. Figure 12(b) displays the processing time under various teaching data sizes. In Figure 12(a), the processing time of the CNN-LSTM at 100, 300, 500, and 700 iterations was 723 s, 654 s, 591 s, and 576 s, respectively, the lowest among the three algorithms. In Figure 12(b), the processing time of the CNN-LSTM for PE teaching data sizes of 100, 200, 300, and 400 was 341 s, 389 s, 412 s, and 445 s, respectively. The CNN-LSTM therefore remained efficient across different iteration counts and task quantities. Taking the PE curriculum in a certain region as an example, the quality of PE teaching in different periods is displayed in Table 2.

Table 2.

Evaluation accuracy and recall of different models during distinct PE class time periods.

Time	Accuracy (%)			Recall rate (%)		
	CNN-LSTM	LSTM	CNN	CNN-LSTM	LSTM	CNN
8:00–10:00	97.6	55.4	52.1	89.2	44.6	44.6
10:00–12:00	97.6	80.4	77.3	89.7	64.5	64.6
14:00–16:00	64.3	55.4	52.5	91.6	46.7	45.9
16:00–18:00	70.8	30.4	27.3	91.5	87.1	76.9
AVG	82.6	55.4	52.3	90.4	60.7	58.0

According to Table 2, the CNN-LSTM reached its highest accuracy, 97.6%, in the 8:00–10:00 and 10:00–12:00 periods, followed by the 16:00–18:00 period. The accuracy of the CNN-LSTM in the four periods was 97.6%, 97.6%, 64.3%, and 70.8%, with an average accuracy of 82.6%. The recall rate was 89.2%, 89.7%, 91.6%, and 91.5%, with an average recall rate of 90.4%. The accuracy and recall of the other two models were lower than those of the CNN-LSTM in every period, so the proposed CNN-LSTM also outperformed the other models in practical applications. Fifty PE teachers were divided into five groups and, after practical application, rated the models, as displayed in Table 3.

Table 3.

User evaluation form.

Model	Group 1	Group 2	Group 3	Group 4	Group 5	AVG
CNN-LSTM	94.2	93.4	99.8	90.9	90.1	93.7
LSTM	82.1	81.3	85.0	83.9	86.9	83.8
CNN	76.6	77.7	82.2	81.9	84.2	80.5

From Table 3, the scores given to the CNN-LSTM by the five groups were 94.2, 93.4, 99.8, 90.9, and 90.1, respectively, with an average score of 93.7, higher than the averages of 83.8 for the LSTM and 80.5 for the CNN. The CNN-LSTM was thus rated most highly by users. Ablation experiments are then conducted to analyze each part of the model. The results are shown in Table 4.

Table 4.

Ablation experiment analysis table.

Model Variant	Accuracy (%)	F1 score	RMSE	Inference time (s)
Full model	95	0.944	0.032	0.49
w/o PCA	91.7	0.906	0.047	0.52
w/o clustering (no K-medoids)	89.6	0.881	0.053	0.48
w/o CNN	87.5	0.855	0.06	0.45
w/o LSTM	84.2	0.802	0.072	0.41

To assess the contribution of each component, an ablation study is conducted by removing one module at a time from the full CNN-LSTM framework. As shown in Table 4, excluding PCA led to a noticeable drop in accuracy (from 95% to 91.7%) and an increase in RMSE (from 0.032 to 0.047), indicating its role in reducing noise and decorrelating inputs. When clustering was removed, accuracy fell further to 89.6%, as the lack of structured pre-grouping degraded temporal learning. Removing CNN caused the model to rely solely on LSTM, weakening its spatial feature extraction, while omitting LSTM produced the largest decline (84.2% accuracy) because the model could no longer learn sequential dependencies. This confirms that each module contributes uniquely to the model’s final performance. The synergy between spatial, structural, and temporal modeling is essential for high-quality PE teaching evaluation.

Conclusion

With the advancement of big data in education, objective, accurate, and scalable models for evaluating PE quality have become increasingly necessary. This study proposes a novel evaluation framework integrating K-medoids clustering with an improved CNN-LSTM model, enhanced further by EEMD-based decomposition. The model filtered and compressed noisy classroom data, captured spatiotemporal features, and delivered high prediction accuracy. The processed data was then input into the CNN-LSTM for teaching quality evaluation. The experimental results showed that when the training set size was 1,000, the F1 of CNN, LSTM, GRNN, and Km-LSTM was 0.88, 0.90, 0.95, and 0.98, respectively. The RMSE values were 0.27, 0.16, 0.13, and 0.11, respectively. In training set 4, the processing time of CNN, LSTM, GRNN, and Km-LSTM was 0.8 s, 1.0 s, 0.9 s, and 0.4 s, respectively. When the iterations reached 100, all models converged. The accuracy of CNN, LSTM, GRNN, and Km-LSTM was 0.76, 0.87, 0.92, and 0.98, respectively. The MAE values were 0.55, 0.44, 0.34, and 0.21, respectively. The feature recognition accuracy of the Km-LSTM for different student states was 95.8%, 98.3%, 94.4%, 87.1%, and 96.2%, respectively, with false alarm rates of 8.8%, 10.1%, 5.6%, 3.8%, and 5.4%. Among the evaluation models, the proposed CNN-LSTM performed best: when the dataset size was 500, the accuracy of the CNN-LSTM, LSTM, and CNN was 0.96, 0.83, and 0.79, respectively, showing that it achieved good performance on smaller datasets.

However, there are still shortcomings in the current research. A significant limitation lies in the composition of the dataset used. Although the YBSS and TIMSS datasets provide valuable large-scale survey information, a portion of the data originates from non-classroom environments, such as general physical health records or extracurricular activity logs. These records may not accurately reflect students’ real-time behavior, engagement, or instructional context within PE classes.
This discrepancy may have introduced semantic noise or domain mismatch, which in turn could affect the prediction accuracy of the neural network. For example, certain features, such as physical health indicators, might exhibit strong correlations in general health assessments but show weak predictive value in assessing moment-to-moment teaching quality. To overcome this issue, future research should consider constructing or utilizing specialized, high-resolution classroom datasets tailored specifically for PE instruction. These datasets should include synchronized data streams from actual PE settings, such as in-class motion tracking, wearable sensor readings, real-time participation logs, and instructor evaluations. Additionally, improved data annotation protocols and multi-modal sensor integration (e.g., video, posture, and audio cues) can help capture richer contextual signals. These enhancements will allow future models to better align with the realistic dynamics of PE instruction, reduce generalization bias, and improve the interpretability and reliability of the evaluation outcomes.

Beyond theoretical performance, the proposed PETQM model also holds practical significance for teaching managers and PE instructors. By integrating this evaluation system into classroom environments, teachers can obtain real-time feedback on students’ physical engagement, behavioral performance, and learning outcomes. This enables instructors to identify at-risk students, adjust teaching strategies dynamically, and allocate instructional time more effectively across different learner groups. For teaching administrators, the model facilitates data-informed decision-making regarding curriculum planning, teacher performance reviews, and overall quality control in PE programs. The clustering-based preprocessing enables the system to adapt to diverse student populations, while the neural network component delivers consistent evaluation results without the variability caused by subjective observation. In practical deployment, the model can be embedded into smart classroom platforms or mobile PE monitoring systems, continuously collecting and evaluating multi-modal data such as movement patterns, test scores, and participation logs. Over time, this could lead to a shift from traditional one-size-fits-all assessment approaches toward personalized and adaptive PE instruction, supporting more inclusive and scientifically grounded PE environments.

Footnotes

ORCID iD

Long Zhang

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

1. et al. Research on particle swarm optimization in LSTM neural networks for rainfall-runoff simulation. J Hydrol X 2022; 608(14): 171–183.

2. Che, Zuo, Zhang. Application of three-layer stacked LSTM model assisted by educational robots in children's learning. Int J Hum Robot 2022; 19(3): 89–107.

3. Wang. Research on teaching quality evaluation of college English based on the CODAS method under interval-valued intuitionistic fuzzy information. J Intell Fuzzy Syst 2021; 41(1): 1–10.

4. Luo, Chen, Guo, et al. Teaching reform of machine learning course based on the integration mode of PBL and LBL. Adv Ind Eng Manag 2023; 12(2): 74–78.

5. The use of situational teaching in English teaching from the perspective of socio-cultural theory. Adv Ind Eng Manag 2023; 12(2): 104–109.

6. Hou. Online teaching quality evaluation model based on support vector machine and decision tree. J Intell Fuzzy Syst 2021; 40(2): 2193–2203.

7. Yuan, Nie. Online classroom teaching quality evaluation system based on facial feature recognition. Sci Program 2021; 24(14): 7374846.1–7374846.10.

8. Guo. Research on the construction of the quality evaluation model system for the teaching reform of physical education students in colleges and universities under the background of artificial intelligence. Sci Program 2022; 32(11): 6556631.1–6556631.9.

9. Liu, Ning. Deep convolutional neural network and weighted Bayesian model for evaluation of college foreign language multimedia teaching. Wirel Commun Mob Comput 2021; 2021(3): 1–7.

10. Zhou, Tang. Construction of a six-pronge intelligent physical education classroom model in colleges and universities. SciPy 2022; 22(11): 9003864.1–9003864.11.

11. Ushakov, Vasilyev. Near-optimal large-scale k-medoids clustering. Inf Sci 2021; 545(3): 344–362.

12. Chen, et al. A new parallel adaptive structural reliability analysis method based on importance sampling and K-medoids clustering. Reliab Eng Syst Saf 2022; 218(2): 108124.1–108124.14.

13. Zhang, Wang, Zhang, et al. Riemannian distance-based fast K-medoids clustering algorithm for cooperative spectrum sensing. IEEE Syst J 2021; 32(99): 1–11.

14. Sado. Method of raw materials selection for production of the MgO‐C bricks of comparable properties using PCA and K-medoids. Int J Appl Ceram Technol 2024; 21(2): 1242–1258.

15. Shu, Nie. AK-GPSR: an adaptive K-medoids-based greedy perimeter stateless routing algorithm for multi-channel vehicular network communication. IEEE Trans Intell Transp Syst 2024; 25(11): 19100–19109.

16. Wang, Xiao, Zhu, et al. Multi-view fuzzy clustering of deep random walk and sparse low-rank embedding. Spatial Inf 2022; 586(21): 224–238.

17. Bhosle, Musande. Evaluation of deep learning CNN model for recognition of Devanagari digit. Artif Intell Appl 2023; 1(2): 114–118.

18. Xia, Han, et al. Infrared and visible image fusion using a shallow CNN and structural similarity constraint. IET Image Process 2020; 14(3): 3562–3571.

19. Zhao, Zhang, Hai, et al. Intelligent fault diagnosis of rolling bearings based on normalized CNN considering data imbalance and variable working conditions. Knowl Base Syst 2020; 199(8): 321–336.

20. Fang, Zhuo, Yan, et al. Free-LSTM: an error distribution free deep learning for short-term traffic flow forecasting. Neurocomputing 2023; 526(14): 180–190.

21. Zheng, Wang, Tian, et al. A real-time transformer discharge pattern recognition method based on CNN-LSTM driven by few-shot learning. Elec Power Syst Res 2023; 219: 1–12.

22. Ren, Zhang, Chen, et al. Exploiting spectrum access ability for cooperative spectrum harvesting. IEEE Trans Commun 2019; 67(3): 1845–1857.

23. Lam, Sheng, et al. Agent-based spectrum management scheme in satellite communication systems. IEEE Trans Veh Technol 2021; 70(3): 2877–2881.

24. Rao, Dhillon, Marojevic, et al. Underlay radar-massive MIMO spectrum sharing: modeling fundamentals and performance analysis. IEEE Trans Wirel Commun 2021; 20(11): 7213–7229.

25. Lindsey. Transmission of classical information over noisy quantum channels–A spectrum approach. IEEE J Sel Areas Commun 2020; 38(3): 427–438.

26. Guo. Multi-agent deep reinforcement learning based spectrum allocation for D2D underlay communications. IEEE Trans Veh Technol 2020; 69(2): 1828–1840.

27. Liu, Sun, et al. Big-data-based intelligent spectrum sensing for heterogeneous spectrum communications in 5G. IEEE Wirel Commun 2021; 27(5): 67–73.

28. Gao, Xing, Cheng, et al. Spectrum prediction for supporting IoT applications over 5G. IEEE Wirel Commun 2021; 27(5): 10–15.

29. Zhang, et al. Radar detector in uncoordinated communication interference plus partially homogeneous clutter. IEEE Commun Lett 2021; 25(6): 1999–2003.

30. Liu, Huang, Chang, et al. Generalized complementary coded scrambling multiple access for MIMO communications. IEEE Trans Veh Technol 2021; 70(12): 13047–13061.

31. Liang. Value analysis and realization of artistic intervention in rural revitalization based on the fuzzy clustering algorithm. Sci Program 2022; 3(1): 1–9.

32. Zhao, Zhang, Xue, et al. Encryption transmission verification method of IT operation and maintenance data based on fuzzy clustering analysis. Mob Netw Appl 2022; 27(4): 1386–1399.