Performance of long short-term memory networks in predicting athlete injury risk

Abstract

Conventional approaches to forecasting athlete injury risk extract a narrow set of features and often fail to account for temporal dependencies and long-term memory effects. This paper improves the Long Short-Term Memory (LSTM) network and tailors it to temporal athlete data in order to raise the accuracy and effectiveness of injury risk prediction. The network structure of the LSTM model was improved, and the collected data were converted into the temporal form required as LSTM input. Historical data labeled with injury outcomes were then used to train the improved LSTM model, with iterative gradient descent optimization adjusting its parameters. The improved LSTM network model was compared with traditional athlete injury risk prediction models in terms of performance. Applying an improved LSTM network to temporal athlete data has the potential to substantially enhance the accuracy and effectiveness of athlete injury risk prediction and contributes to a deeper understanding of the temporal dynamics influencing injuries in sports.

Keywords

Athlete injury risk prediction, long short-term memory network, performance analysis, temporal dependence

1. Introduction

Sports have developed rapidly, and athletes face various injury risks during training and competition due to factors such as excessive exercise, improper technique, and venue conditions. An injury can have a profound impact on an athlete’s competitive performance and career [1, 2]: training is interrupted, and long-term rehabilitation may be required. Predictive analysis of injury risk is therefore needed to reduce sports injuries and safeguard athletes’ professional development. Traditional methods for predicting athlete injury risk do not capture the dynamic relationships between different sports data well, and their accuracy and reliability still need to be improved. LSTM networks can capture temporal dependencies and effectively utilize long-term memory effects. Employing these networks enables a comprehensive analysis of key factors, including athletes’ historical data, current training conditions, and physiological metrics. This approach supports more precise predictions of injury risk and offers decision-making support to athletes, coaches, and medical staff, who can then implement strategies to mitigate the likelihood of injuries and enhance the overall safety and well-being of athletes.

The health and training status of athletes are gradually receiving attention, and the issue of athlete injuries has become an urgent problem to be solved in the sports industry. Many people predict injury risk by establishing regression models, using injury risk as the dependent variable. To enhance the prediction performance of bone stress injury, Kraus Emily used a mixed effect Poisson regression model to conduct accurate prediction and analysis of bone stress injury in long-distance runners [3]. Machine learning has been widely applied in the field of data prediction and analysis, and many people have used traditional machine learning methods for athlete injury risk prediction and analysis. In order to effectively evaluate, predict, and detect sports injuries, Meng Linsheng proposed a new dual feature fusion neural network model for athlete injury assessment, which used 1 $\times$ 1 convolution and hyperlinks to form a dual fusion structure and had excellent sports injury prediction performance [4]. In order to predict and analyze hamstring tendon injuries in football players, Ruddy Joshua D used machine learning technology for injury risk prediction, collecting hamstring strength, injury history data, etc., at the beginning of the preseason to accurately predict and analyze hamstring tendon injuries in football players [5]. In order to predict the injury risk of young football players, Ma Guoshuai applied a decision tree (DT) classifier for data-driven sports injury prediction, which could accurately predict about half of the sports injuries and effectively reduce the injuries of young football players [6]. In order to predict the knee joint injury after long-distance running, Mei Qichang used OpenSim personalized musculoskeletal modeling to analyze the kinematics of amateur runners’ lower limb joints after a 5-kilometer run. He conducted knee joint injury prediction analysis by analyzing changes in foot posture and related muscle strength data [7]. 
Conventional machine learning approaches for athlete injury risk prediction often struggle to account for the dynamic nature of temporal data and the intricate interplay among various influencing factors.

LSTM networks have an excellent ability to capture temporal information and long-term dependencies, and LSTM network models are used to predict temporal data. To enhance the performance of time series data prediction, Hua Yuxiu applied the LSTM network model to telecommunications networks, accurately predicting and analyzing traffic and user mobility [8]. To predict changes in the financial market, Fischer Thomas used an LSTM network model for sequence learning and obtained better predictive performance than the random forest (RF) algorithm [9]. To predict the indoor climate of a greenhouse, Liu Yuwen constructed a greenhouse climate prediction model based on LSTM, using the model to capture the correlation between historical climate data and accurately predict the greenhouse climate for various vegetables [10]. To further increase the accuracy of short-term traffic flow prediction, Ma Changxi proposed a model based on traffic flow time series analysis and improved LSTM networks, applying smoothing and normalization to the traffic flow data to obtain a stable time series as model input [11]. So far, LSTM network models have seen only limited application in forecasting athlete injury risk. However, by using LSTM models to capture temporal exercise data, the precision and dependability of athlete injury risk prediction can be significantly enhanced.

To tackle the incomplete consideration of temporal data in athlete injury risk prediction, this study enhanced the LSTM network model. The approach partitioned lengthy sequences into multiple subsequences using sliding windows and introduced Dropout to reduce inter-neuronal dependencies. The physiological data, training status data, and sports environment data of basketball players were collected from March 1, 2022 to August 1, 2022. The collected data were processed for missing values and outliers, and the feature vector sequence was organized into a three-dimensional (3D) tensor suitable for LSTM model input. A 5-fold cross validation method was utilized to divide the training and testing sets. The findings indicated that: (1) the improved LSTM network model had excellent accuracy in predicting athlete injury risk. (2) When the test samples were divided into 10 equal parts, the improved LSTM network model showed high stability in injury risk prediction. (3) Compared with RF, decision tree (DT), support vector machine (SVM), and LSTM network models, the improved LSTM network model had higher prediction accuracy and generalization ability. (4) The improved LSTM network model could handle temporal dependencies well and had excellent temporal performance.

2. Methods for predicting athlete injury risk

2.1 Data collection and preprocessing

Athletes often suffer injuries during sports activities, and the occurrence of sports injuries poses various hazards to athletes. Athlete injuries cause serious physical health problems, which can have negative psychological effects on athletes, cause financial pressure on them, and affect their physical training and competition [12, 13].

Predicting injury risk for athletes can help reduce the probability of injury occurrence and maintain high-level performance [14, 15]. For effective and precise prediction of athlete injury risks, it is imperative to comprehensively gather athlete-related data, encompassing physiological metrics, training status information, and data related to the sporting environment.

The data of 200 basketball players from a certain university in China is collected. Among the 200 basketball players collected, 50 have sports injuries, and the other 150 have no sports injuries. Sports data collection is conducted every morning at 8:30 am. Physiological data measurement includes heart rate measurement, blood pressure measurement, and body temperature measurement. Training status data includes maximum exercise speed, training frequency, and training duration. Environmental data includes temperature data, humidity data, and altitude data. The process of data collection is illustrated in Fig. 1.

Figure 1.

Process of data collection.

The chest strap heart rate monitor is correctly worn on the athlete’s chest to ensure that it closely fits the athlete’s body, and real-time heart rate data during exercise is recorded. Automatic inflatable cuff sphygmomanometers are used in a quiet state to record athletes’ diastolic and systolic blood pressure. The forehead temperature gun is used to record the body temperature of athletes during rest.

Sports trackers are employed for capturing data on athletes’ maximum movement speed, training duration, and weekly training frequency. Additionally, meteorological stations are utilized to measure crucial environmental factors, including temperature, humidity, and altitude within sports environments.

During the data collection process, the accuracy and stability of the collection equipment is ensured and the timestamps of the data are recorded to maintain the timing information. The time period for collecting athlete injury related data is from March 1st, 2022 to August 1st, 2022, and the collection time is 8:30 am. Data collection is conducted every day. The partial data of a certain athlete is described in Table 1.

Table 1

Partial data table of an athlete

Date	2022-3-1	2022-4-1	2022-5-1	2022-6-1	2022-7-1	2022-8-1
Time	8:30	8:30	8:30	8:30	8:30	8:30
Heart rate (bpm)	75	88	92	78	77	79
Systolic blood pressure (mmHg)	120	122	120	118	119	125
Diastolic blood pressure (mmHg)	80	82	85	84	78	79
Body temperature (degrees Celsius)	37.8	37.6	37.8	38.1	38.3	37.9
Maximum speed (m/s)	7.2	7.8	6.6	5.4	7.8	7.1
Training duration (min)	62	78	36	48	72	64
Training frequency this week	5	4	3	4	5	4
Environmental temperature (degrees Celsius)	28	26	32	22	26	25
Environmental humidity (%)	60	55	70	72	45	55
Altitude (meters)	50	120	288	67	52	54

Table 1 lists part of the data collected for one athlete. All records are timestamped and cover the athlete’s data between March 1st, 2022 and August 1st, 2022. Each column represents one collection record, including the collection time, heart rate, blood pressure, body temperature, and so on. Comprehensive collection and analysis of athletes’ physiological data, training status, and relevant environmental factors can help predict their injury risk.

Upon gathering data from all athletes, the collected dataset undergoes a crucial preprocessing phase. This data preprocessing encompasses data cleaning, with the primary goal of ensuring the quality of the collected data and addressing missing or anomalous data points. The data cleaning process includes a thorough examination, processing, and rectification of physiological data, training status records, and potential environmental variables.

Missing and abnormal values in the collected athlete data are checked. During data collection, data loss may occur due to equipment or human factors. Missing values are filled with the average value. Abnormally large or small values are verified and, where appropriate, corrected manually. If an outlier cannot be explained or is unreasonable, it is excluded.
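The cleaning step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the plausibility range of 30 to 220 bpm are assumptions, and the heart rate series reuses the Table 1 values with one reading blanked out for demonstration.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a series with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def flag_out_of_range(values, low, high):
    """Return indices of values outside [low, high] so they can be reviewed manually."""
    return [i for i, v in enumerate(values)
            if v is not None and not (low <= v <= high)]


# Heart rate (bpm) from Table 1 with the third reading removed for illustration.
heart_rate = [75, 88, None, 78, 77, 79]
filled = fill_missing_with_mean(heart_rate)

# The 30-220 bpm range is an assumed plausibility bound, not a value from the paper.
suspect = flag_out_of_range([75, 88, 250, 78], low=30, high=220)
```

Flagged indices are only candidates for review; whether a value is corrected or excluded remains a manual decision, as the text describes.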

Preprocessing the athlete data that has been collected plays a pivotal role in preserving data integrity and quality. It establishes a robust data foundation for subsequent athlete injury risk prediction, mitigating the risks of errors stemming from inaccurate or inconsistent data.

2.2 Construction of LSTM network model

The LSTM neural network is a variant of the recurrent neural network suited to processing time series data, with built-in memory for long-term dependencies [16, 17]. The smallest unit of an LSTM neural network is the LSTM neuron, which can remember information over a period of time and capture long-term dependencies in data, thus effectively overcoming the long-term dependency problem of conventional recurrent networks [18, 19].

The cell state of an LSTM neuron is controlled by three gates. The input gate determines which input information is added to the cell state and controls the weight of the input by using the sigmoid function [20, 21]. The forgetting gate determines which information is removed from the cell state, controlling the forgetting and retention of information. The output gate determines which information is extracted from the cell state as the output of the LSTM model.

Cell state is the main memory part of LSTM, responsible for transmitting information to different time steps. The structure of LSTM neurons is illustrated in Fig. 2.

Figure 2.

Structure of LSTM neurons.

In Fig. 2, the structure of LSTM neurons is described. Firstly, through the forgetting gate, the discarded information is determined. In the second step, the information added to the cell state is determined, and the old cell information is updated to the new cell information. The third step is to update the cell status and output it.

Within the athlete data collected over an extended period, the neural memory state accumulates a substantial volume of information. Some of this information may become obsolete or irrelevant over time. By incorporating forgetting gates, it becomes possible to filter out less important data, preserving clarity in the memory state.

The working principle of the forgetting gate is as follows:

$\displaystyle f_{t}=s({W_{f}\cdot[{h_{t-1},x_{t}}]+b_{f}})$ (1)

In Eq. (1), $s$ is the sigmoid function.

The formula for the sigmoid function is:

$\displaystyle{s}({x})=1/(1+{e}^{-x})$ (2)

The step of adding new information to the cell state is divided into two parts, including obtaining a candidate cell information and updating new cell information.

The calculation process of candidate cell information is shown below.

$\displaystyle i_{t}=s({W_{i}\cdot[{h_{t-1},x_{t}}]+b_{i}})$ (3)

$\displaystyle k_{t}=\textit{tanh}({W_{c}\cdot[{h_{t-1},x_{t}}]+b_{c}})$ (4)

In Eq. (4), using the tanh function, $h_{t-1}$ and $x_{t}$ are utilized to obtain candidate cell information $k_{t}$ .

Old cell information is updated:

$\displaystyle c_{t}=f_{t}\ast c_{t-1}+i_{t}\ast k_{t}$ (5)

New cell information is output:

$\displaystyle u_{t}=s({W_{u}\cdot[{h_{t-1},x_{t}}]+b_{u}})$ (6)

$\displaystyle h_{t}=u_{t}\ast\textit{tanh}({c_{t}})$ (7)

The output gate uses the sigmoid function for judgment, passing the cell state through the tanh layer to obtain a vector of values between $-$ 1 and 1.
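Eqs. (1) and (3) to (7) can be combined into a single time step of an LSTM neuron. The sketch below is a plain-Python illustration under two assumptions: $s$ is the standard logistic sigmoid $1/(1+e^{-x})$, and $\ast$ denotes element-wise multiplication. The function names and the dictionary layout of the parameters are hypothetical.

```python
import math


def sigmoid(x):
    """Standard logistic sigmoid, 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, b, v):
    """Compute W . v + b, where W is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns the new hidden state h_t and cell state c_t."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(params["W_f"], params["b_f"], z)]    # Eq. (1)
    i_t = [sigmoid(a) for a in affine(params["W_i"], params["b_i"], z)]    # Eq. (3)
    k_t = [math.tanh(a) for a in affine(params["W_c"], params["b_c"], z)]  # Eq. (4)
    c_t = [f * c + i * k for f, c, i, k in zip(f_t, c_prev, i_t, k_t)]     # Eq. (5)
    u_t = [sigmoid(a) for a in affine(params["W_u"], params["b_u"], z)]    # Eq. (6)
    h_t = [u * math.tanh(c) for u, c in zip(u_t, c_t)]                     # Eq. (7)
    return h_t, c_t


# Toy check with all-zero weights: every gate equals 0.5 and the candidate is 0,
# so the cell state is simply halved.
hidden, inputs = 2, 3
zeros_W = [[0.0] * (hidden + inputs) for _ in range(hidden)]
zeros_b = [0.0] * hidden
params = {name: zeros_W for name in ("W_f", "W_i", "W_c", "W_u")}
params.update({name: zeros_b for name in ("b_f", "b_i", "b_c", "b_u")})
h_t, c_t = lstm_step([75.0, 120.0, 37.8], [0.0, 0.0], [1.0, -2.0], params)
```

With zero weights the forgetting gate passes half of the previous cell state and nothing new is written, which makes the roles of $f_{t}$ and $i_{t}$ in Eq. (5) easy to verify by hand.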

LSTM networks can analyze athletes’ injury risk based on temporal dependencies and historical data. However, when capturing long-term dependencies, the large time span involved often results in excessive computational burden [22, 23]. The LSTM network is improved by using sliding windows to divide long sequences into overlapping subsequences. On each subsequence, LSTM is run and the results are merged or aggregated to obtain the final prediction.

The process of using a sliding window for data capture is shown in Fig. 3.

Figure 3.

Process of data capture using sliding windows.

In Fig. 3, the process of using a sliding window for data capture is described. First, the window size and movement direction are initialized. The window then slides from the starting position of the long sequence, advancing L time steps each time, and at each position the subsequence inside the current window is extracted.

In injury risk analysis on long athlete data sequences, traditional LSTM networks lack a regularization step and therefore carry a high risk of overfitting. To reduce this risk and increase the generalization ability of the model, a regularization method is added to the traditional LSTM network.

Dropout regularization is a random inactivation technique that randomly sets the output of a portion of neurons to zero during data training, thereby reducing dependencies between neurons and reducing overfitting [24]. In the LSTM network model, Dropout is introduced and placed between time steps within the LSTM layer. The probability of Dropout is selected as 0.3. During the data training process, Dropout is applied to reduce the dependency between neurons. During the data testing phase, Dropout is turned off.
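The behavior of Dropout during training and testing can be illustrated with a short sketch. The rate of 0.3 comes from the text; the rescaling of surviving activations by 1/(1 - rate), often called inverted dropout, is an assumption made here because it is the common convention, and the paper does not state which variant was used.

```python
import random


def dropout(activations, rate=0.3, training=True, rng=None):
    """Zero each activation with probability `rate` during training.

    Surviving activations are divided by (1 - rate) so the expected value is
    unchanged (assumed inverted-dropout convention). At test time the input is
    returned unchanged.
    """
    if not training or rate == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


train_out = dropout([1.0] * 10000, rate=0.3, rng=random.Random(0))
test_out = dropout([1.0, 2.0], rate=0.3, training=False)
zero_fraction = train_out.count(0.0) / len(train_out)  # roughly 0.3
```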

The improved LSTM network model is illustrated in Fig. 4.

Figure 4.

Improved LSTM network model.

In Fig. 4, an improved LSTM network model is described. Sliding windows are used to shorten long sequence data and help the model capture long-term dependencies. During model training, Dropout regularization techniques are introduced to reduce the dependency between neurons and enhance the model’s generalization ability.

2.3 Feature representation

Injury risk prediction relies on time series information, so the physiological, training status, and environmental data of athletes are all timestamped. Prediction is carried out on feature vectors whose components cover physiological characteristics, training status characteristics, and environmental characteristics, each carrying time information.

The size of the sliding window is set to 7 days, and the window advances in steps of 2 days. The feature vector sequence consists of the feature values at each time step. To facilitate subsequent supervised learning, each sequence is assigned a binary label that indicates whether the athlete suffers an injury within the next time interval: 1 if the athlete experiences a sports injury, and 0 otherwise.
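A minimal sketch of this windowing and labeling scheme is given below. The window of 7 days and the step of 2 days follow the text; the length of the "next time interval" is not specified in the paper, so the `horizon` parameter (7 days here) is an assumption, as are the function and variable names.

```python
def make_windows(daily_features, injured, window=7, stride=2, horizon=7):
    """Cut a daily sequence into overlapping windows with binary labels.

    daily_features: one feature vector per day.
    injured: one boolean per day, True if an injury occurred that day.
    A window is labeled 1 if any injury occurs in the `horizon` days that
    follow it (assumed definition of the next time interval), else 0.
    """
    samples, labels = [], []
    last_start = len(daily_features) - window - horizon
    for start in range(0, last_start + 1, stride):
        end = start + window
        samples.append(daily_features[start:end])
        labels.append(1 if any(injured[end:end + horizon]) else 0)
    return samples, labels


# Toy data: 20 days, one feature per day, a single injury on day 15 (0-indexed).
days = 20
features = [[float(d)] for d in range(days)]
injured = [d == 15 for d in range(days)]
X, y = make_windows(features, injured)
# X has shape (samples, time steps, features) = (4, 7, 1); y == [0, 1, 1, 1]
```

Stacking the windows of all athletes in this way yields the samples-by-time-steps-by-features layout that LSTM inputs expect.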

The feature vector sequence is organized into a three-dimensional tensor suitable for LSTM model input. The three dimensions of the tensor are the number of samples, the number of time steps, and the number of features. The three-dimensional tensor representation of athlete injury risk is:

[
 [[f1, f2, f3, …, fn],    ← feature vector of time step 1
  [f1, f2, f3, …, fn],    ← feature vector of time step 2
  …
  [f1, f2, f3, …, fn]],   ← feature vector of the last time step (end of the first time series sample)
 [[f1, f2, f3, …, fn],    ← feature vector of time step 1
  [f1, f2, f3, …, fn],    ← feature vector of time step 2
  …
  [f1, f2, f3, …, fn]],   ← feature vector of the last time step (end of the second time series sample)
 …
 [[f1, f2, f3, …, fn],    ← feature vector of time step 1
  [f1, f2, f3, …, fn],    ← feature vector of time step 2
  …
  [f1, f2, f3, …, fn]]    ← feature vector of the last time step (end of the m-th time series sample)
]

In the three-dimensional tensor of athlete injury risk, the long athlete injury data sequence is divided into multiple subsequences and arranged according to a certain time sequence, enabling the LSTM network model to fully capture temporal dependencies and utilize long-term memory effect data.

2.4 Model training and optimization

Athletes often experience sports injuries during their physical activity. One of the main goals in predicting the risk of injury for athletes is to predict in advance which athletes may be injured, in order to take preventive measures to reduce the risk of injury [25, 26]. Predicting the risk of athlete injury effectively reduces the risk of sports injury, thereby improving sports performance.

To accurately predict the risk of athlete injury, the improved LSTM network model is fully trained. Timestamped temporal data is used for training, binary cross entropy is used as the loss function, and a stochastic gradient descent optimizer is used to minimize the loss.

First, the weights and biases of the improved LSTM network model are initialized, and the training data is fed into the model. Training involves two passes: forward propagation of the data and backpropagation of the errors. The predicted results are compared with the athletes’ actual injury outcomes, and the loss value reflects the quality of the injury risk prediction.

The prediction error is used to compute the contribution of each model parameter to the loss, that is, the gradient, and the parameters are adjusted by gradient descent to optimize the athlete injury risk prediction model. Through repeated iterations, the parameters of the improved LSTM network model are adjusted and the loss value is recomputed until the loss converges to an acceptable range, which completes the training.
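The training loop described above (forward pass, binary cross-entropy loss, gradient step, repeat) can be illustrated on a deliberately small stand-in model. The sketch below trains a single sigmoid unit rather than an LSTM, so it shows only the optimization pattern; the toy data, the learning rate of 0.1, and all names are assumptions chosen for readability, not values from the experiment.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y, p, eps=1e-12):
    """Loss for one sample with true label y (0 or 1) and predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def train_sgd(data, lr=0.1, epochs=50, seed=0):
    """Fit p = sigmoid(w * x + b) with per-sample stochastic gradient descent."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    history = []  # mean running loss per epoch
    for _ in range(epochs):
        order = data[:]
        rng.shuffle(order)
        total = 0.0
        for x, y in order:
            p = sigmoid(w * x + b)      # forward propagation
            total += binary_cross_entropy(y, p)
            grad_z = p - y              # gradient of the loss w.r.t. the pre-activation
            w -= lr * grad_z * x        # gradient descent update
            b -= lr * grad_z
        history.append(total / len(order))
    return w, b, history


toy = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]  # hypothetical (feature, label) pairs
w, b, history = train_sgd(toy)
```

The recorded loss falls from epoch to epoch on this separable toy set, mirroring the convergence criterion described in the text.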

To prevent overfitting of the improved LSTM network model during training, Dropout regularization is applied.

3. Performance evaluation of athlete injury risk

3.1 Data partitioning

The prediction of athlete injury risk is closely related to the historical data of athletes. The experiment collects data from 200 basketball players from March 1st, 2022 to August 1st, 2022, with a sliding window size of 7 days and a sliding window time step of 2 days.

The collected athlete related data is divided into multiple subsequences based on a 7-day time interval, with each subsequence containing data points collected during that time interval, totaling 34800 subsequences. To comprehensively train and test the collected data, a 5-fold cross validation method is utilized for dataset partitioning in the experiment.

First, the 34800 subsequences are randomly divided into 5 subsets of the same size. The subsequences of one subset are selected as the test set, while those of the other four subsets are utilized as the training set. A total of 5 iterations are conducted, with a different subset held out in each iteration, so that every subsequence is used once for testing and four times for training.
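The 5-fold partitioning can be sketched with index lists. The sample count of 34800 and the number of folds come from the text; the helper name, the fixed seed, and the interleaved assignment of shuffled indices to folds are assumptions for illustration.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)         # random assignment to folds
    folds = [indices[i::k] for i in range(k)]    # k folds of (almost) equal size
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


splits = k_fold_indices(34800, k=5)
train_idx, test_idx = splits[0]
# Each fold holds 6960 test subsequences and 27840 training subsequences.
```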

3.2 Model parameters and experimental environment

To effectively analyze the performance of athlete injury risk prediction, relevant experimental environments are constructed and model parameters are set. The improved LSTM network model is a deep learning method, and the Python programming language is utilized to construct the LSTM network model.

The experimental environment is presented in Table 2.

Table 2
Experimental environment table

Types	Item	Parameter
Hardware environment	Memory	16 GB
	Central processing unit	12th Gen Intel(R) Core(TM) i9-12900H 2.50 GHz
	Graphics card	NVIDIA GeForce RTX 3060 Laptop GPU
Software environment	Programming language	Python
	Programming software	PyCharm
	Deep learning framework	TensorFlow
	Operating system	Linux

In Table 2, the experimental environment is described. In Python, the TensorFlow library is imported. The LSTM network model is constructed and the graphics processing unit is used to accelerate model training.

Because the dataset contains a large amount of athlete injury data, it is impractical to feed all the data into the model at once, so training proceeds in batches. The batch size is set to 12, the learning rate is set to 0.0001, and the number of training epochs is set to 50. The learning rate determines the step size of each parameter update.

3.3 Assessment of injury risk performance

The model used in the experiment is a refinement of the LSTM network model. Sliding windows are used to partition long sequences into overlapping subsequences, and Dropout regularization is introduced. To analyze how well the improved LSTM network predicts athlete injury risk, its prediction performance is evaluated.

After the model has been sufficiently trained on the collected dataset, the data from the test set is used for validation. The predicted injury risk is compared with the athletes’ actual injury outcomes to reflect the performance of injury risk prediction. There are four possible combinations of predicted and actual outcomes.

TP indicates that the prediction of athlete injury risk is a positive case and the actual one is also a positive case; FP indicates that the prediction of injury risk for athletes is positive, but it is actually negative; FN indicates that the predicted injury risk for athletes is negative, but it is actually positive; TN indicates that the prediction of injury risk for athletes is negative, and it is also a negative case in reality.

The accuracy, recall, precision, and F1 values of athlete injury risk prediction are reflected through the results of TP, FP, FN, and TN.

The formula for accuracy is expressed as:

$\displaystyle A=({TP}+{TN})/({TP}+{FP}+{FN}+{TN})$ (8)

The accuracy is the proportion of correctly predicted samples to the total sample size in athlete injury risk prediction.

The formula for precision is expressed as:

$\displaystyle{P}=TP/({TP}+{FP})$ (9)

The precision is the proportion of samples predicted as positive cases that are actually positive cases.

The formula for recall rate is expressed as:

$\displaystyle{R}=TP/({TP}+{FN})$ (10)

The F1 value is calculated based on the results of recall and precision.

$\displaystyle F1=2({P}\times{R})/(P+R)$ (11)
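Eqs. (8) to (11) translate directly into code. The sketch below computes the four metrics from confusion-matrix counts; the counts in the example are hypothetical and are not results from the experiment, and returning 0.0 when a denominator is zero is an assumed convention.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)                 # Eq. (8)
    precision = tp / (tp + fp) if (tp + fp) else 0.0           # Eq. (9)
    recall = tp / (tp + fn) if (tp + fn) else 0.0              # Eq. (10)
    f1 = (2 * precision * recall / (precision + recall)        # Eq. (11)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=140)
```

For these counts the accuracy is 0.9 while precision, recall, and F1 are all 0.8, which shows how accuracy can look better than the positive-class metrics when uninjured samples dominate.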

To reflect the predictive performance of athlete injury risk more intuitively, receiver operating characteristic (ROC) curves are drawn [27, 28]. The shape of the ROC curve and the AUC (Area under Curve) value are used to assess the model’s performance [29, 30].
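An ROC curve is obtained by sweeping the decision threshold over the predicted scores and recording the false positive rate and true positive rate at each threshold; the AUC is the area under the resulting curve. The following sketch computes both in plain Python using the trapezoidal rule. The labels and scores are made-up illustrative values, and the function names are assumptions.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points as the threshold sweeps from high to low scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("ROC needs at least one positive and one negative sample")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all samples tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve via the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


labels = [1, 1, 0, 1, 0, 0]               # hypothetical injury labels
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]   # hypothetical predicted risks
points = roc_points(labels, scores)
area = auc(points)  # 8/9, since 8 of the 9 positive-negative pairs are ranked correctly
```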

To comprehensively analyze the performance of the improved LSTM network in predicting athlete injury risk, the experiment compares the improved LSTM network model with RF, DT, SVM, and LSTM network models. Model generalization ability refers to the performance of a model on unseen data, which is assessed by comparing the injury risk prediction performance of the model on the training and testing sets. The smaller the difference between the two, the better the generalization ability of the model. Temporal performance is also evaluated: past time steps are utilized as inputs to predict future time steps, and the model’s predictions are checked for time lag.

To better analyze the time lag of the model’s predictions, dates and times are converted into timestamps, from which the date and time of an athlete’s injury occurrence can be recovered.

To analyze the stability of the model, its test performance is observed under different sample conditions. The samples in the test set are randomly divided into 10 equal parts, and the injury risk prediction performance is evaluated on each part.
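The stability check can be sketched as a random split of the test set into 10 parts followed by a per-part accuracy computation. The example below uses synthetic labels and predictions, and the helper name and seed are assumptions; the spread between the best and worst part plays the role of the deviation reported in the results.

```python
import random


def accuracy_by_partition(y_true, y_pred, parts=10, seed=0):
    """Randomly split the test samples into `parts` groups and return each group's accuracy."""
    indices = list(range(len(y_true)))
    random.Random(seed).shuffle(indices)
    chunks = [indices[i::parts] for i in range(parts)]
    return [sum(y_true[j] == y_pred[j] for j in chunk) / len(chunk)
            for chunk in chunks]


# Synthetic test set: 100 samples, the first 10 predictions are wrong.
y_true = [0, 1] * 50
y_pred = [1 - v for v in y_true[:10]] + y_true[10:]
accuracies = accuracy_by_partition(y_true, y_pred)
spread = max(accuracies) - min(accuracies)  # smaller spread means more stable predictions
```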

4. Results

4.1 Accuracy and loss of improved models

The accuracy of a model is used to measure its predictive accuracy on test data, and loss analysis is used to measure its performance on training data. On the test set, an improved LSTM network was used for accuracy and loss analysis. The analysis results of accuracy and loss values are illustrated in Fig. 5.

Figure 5.

Accuracy and loss values.

In Fig. 5, the model accuracy and loss analysis results are described, and the horizontal axis represents the number of iterations in model training. The left vertical axis represents the accuracy of athlete injury risk prediction in the test set, and the right vertical axis represents the loss value of the model. It can be learned that the accuracy of the improved LSTM network model was greater than 80.0% at epochs $=$ 2 and thereafter. The accuracy of the model ultimately stabilized at around 91.0%. The loss value of the improved LSTM network model was 0.86 when epochs $=$ 1, and 0.05 when epochs $=$ 50. Therefore, the application of an improved LSTM network model for predicting athlete injury risk has excellent accuracy and low loss values.

4.2 Stability of the improved model

In actual athlete injury risk prediction analysis, the number of samples analyzed is often different, which requires the model to have strong stability. To test the predictive performance of different numbers of samples, the samples in the test set were divided into 10 small portions. The stability results of the improved model under different test sample conditions are shown in Fig. 6.

Figure 6.

Stability of the improved model.

In Fig. 6, the stability analysis results of the improved model are described. The horizontal axis represents the different portions of the test samples, and the vertical axis represents the accuracy and precision of the improved model’s injury risk prediction. If the accuracy and precision of the model do not change significantly across the portions, the model is relatively stable. Of the 10 test portions, 5 had an accuracy of 89.4% for injury risk prediction and 5 had an accuracy of 89.5%, so the accuracy deviation across sample conditions was 0.1%. For precision, 3 portions reached 89.9%, one reached 90.0%, and 6 reached 90.1%, so the precision deviation across sample conditions was 0.2%. The improved LSTM network model therefore maintained very high stability in predicting athlete injury risk across different test sample conditions.

4.3 Performance of injury risk prediction for different models

To comprehensively analyze the performance of the improved LSTM network for athlete injury risk prediction, the improved LSTM network model was compared in performance with a variety of other models. The experiment selected RF, DT, SVM, and LSTM network models as comparative models. The accuracy, precision, recall, and F1 value results of injury risk prediction for different models are presented in Table 3.

Table 3
Performance of injury risk prediction for different models

Model	Accuracy (%)	Precision (%)	Recall (%)	F1 value (%)
RF	75.4	76.2	78.3	77.2
DT	76.5	78.4	78.9	78.6
SVM	82.2	80.2	81.2	80.7
LSTM	86.2	85.1	84.2	84.6
Improved LSTM	89.4	91.2	91.9	91.5

Table 3 presents the average prediction accuracy, precision, recall, and F1 values of the five models. Ranked from best to worst, the models are the improved LSTM network model, the LSTM network model, SVM, DT, and RF, with average prediction accuracies of 89.4%, 86.2%, 82.2%, 76.5%, and 75.4%, respectively. The improved LSTM network model outperformed the LSTM network model in predicting injury risk. This is mainly because the improved model uses sliding windows to divide long sequence data into subsequences, which is beneficial for capturing long-term dependencies, and introduces Dropout regularization to reduce overfitting. Therefore, the improved LSTM network model achieves high performance in predicting athlete injury risk.

4.4 ROC curve and AUC value

In order to more intuitively analyze the performance of athlete injury risk prediction using different models, ROC curves were utilized to evaluate the performance of injury risk prediction. By observing the trend of ROC curves and AUC values, the performance of different models could be compared very intuitively. The ROC curves of different models are illustrated in Fig. 7.

Figure 7.

ROC curves of different models. A. ROC curve of the RF model. B. ROC curve of the DT model. C. ROC curve of the SVM model. D. ROC Curve of the LSTM network model. E. ROC curve of the improved LSTM network model.

Fig. 7 shows the ROC curves of the different models, together with the AUC value calculated for each curve. The horizontal axis of the ROC curve is the false positive rate, and the vertical axis is the true positive rate. The closer the curve is to the point (0, 1), the better the corresponding model performs in predicting athlete injury risk. The area under the ROC curve is the AUC value, which ranges from 0 to 1. The closer the AUC value is to 1, the better the model’s performance; a value close to 0.5 corresponds to random guessing, and a value below 0.5 indicates performance worse than random guessing. The AUC values of RF, DT, SVM, LSTM network model, and improved LSTM network model were 0.74, 0.78, 0.84, 0.86, and 0.91, respectively. The ROC curve of the improved LSTM network model was closest to the point (0, 1). Improving the LSTM network model and using sliding windows to partition long sequences effectively improved the performance of athlete injury risk prediction.
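The relationship between the ROC curve and the AUC value can be made concrete with a small pure-Python sketch. It sweeps a decision threshold over the predicted scores, records the (false positive rate, true positive rate) pairs, and integrates them with the trapezoidal rule. The function names and the toy labels and scores are hypothetical and are not taken from the paper's data.

```python
def roc_points(labels, scores):
    """Return ROC points (FPR, TPR), starting at (0, 0) and ending at (1, 1).

    labels are 0 (no injury) or 1 (injury); scores are predicted risks.
    Assumes both classes are present in labels.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step so more samples are predicted positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data (illustrative only): two uninjured and two injured samples.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(roc_points(toy_labels, toy_scores))
print(f"AUC = {toy_auc:.2f}")
```

In practice a library routine would normally be used instead; the sketch only shows what the reported AUC values measure.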

4.5 Generalization ability of different models

The generalization ability of a model is also a key indicator for evaluating athlete injury risk prediction performance. It was assessed by comparing the injury risk prediction performance of the different models on the training and testing sets. The generalization ability of different models is shown in Fig. 8.

Figure 8.

Generalization ability of different models. A. The generalization ability of the random forest model. B. The generalization ability of the decision tree model. C. The generalization ability of the support vector machine model. D. The generalization ability of the LSTM network model. E. The generalization ability of the improved LSTM network model.

Figure 9.

Temporal performance.

Fig. 8 shows the generalization ability of the different models. The horizontal axis represents the four evaluation indicators: accuracy, precision, recall, and F1 value. For each model, the performance of athlete injury risk prediction on the training and testing sets was compared. For all five models, performance on the training set was generally higher than on the test set, which may be due to the models overfitting the training data and losing some generalization ability on unseen data. The improved LSTM network model showed the smallest performance difference between the training set and the test set: the accuracy difference was 0.4%, the precision difference was 1.0%, the recall difference was 0.1%, and the F1 value difference was 0.3%. This suggests that introducing Dropout regularization effectively reduces overfitting on the training data in the improved LSTM network model, thereby improving the generalization ability of the athlete injury risk prediction model.
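The train/test comparison described above amounts to a per-metric difference, which can be sketched as follows. The test-set values are assumed to be those reported for the improved LSTM in Table 3, and the training-set values are not stated in the text; they are reconstructed here as test value plus reported difference, purely for illustration. The name `generalization_gap` is hypothetical.

```python
def generalization_gap(train_metrics, test_metrics):
    """Per-metric difference (train - test), rounded to 0.1 percentage points."""
    return {
        name: round(train_metrics[name] - test_metrics[name], 1)
        for name in test_metrics
    }


# Assumed test-set metrics (%) for the improved LSTM, taken from Table 3.
test_metrics = {"accuracy": 89.4, "precision": 91.2, "recall": 91.9, "f1": 91.5}

# ASSUMPTION: training-set metrics are not given in the text; these values are
# reconstructed as test value + reported difference, for illustration only.
train_metrics = {"accuracy": 89.8, "precision": 92.2, "recall": 92.0, "f1": 91.8}

gaps = generalization_gap(train_metrics, test_metrics)
print(gaps)
```

A smaller gap on every metric indicates less overfitting, which is the comparison made across the five models.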

4.6 Temporal performance evaluation

Athlete injury risk prediction must also be timely: accurately predicting when an injury will occur from past athlete data helps protect athletes. Temporal performance evaluation mainly analyzes the lag of injury risk prediction, since a prediction that lags behind the actual injury leaves the athlete exposed to increased risk. In the collected dataset, one athlete’s actual injuries occurred on April 5, May 7, May 9, June 7, and July 4, 2022. The temporal performance of athlete injury risk prediction using the LSTM network model and the improved LSTM network model is illustrated in Fig. 9.

Fig. 9 shows the temporal performance analysis results of athlete injury risk prediction. The horizontal axis represents the 5 injuries that the athlete experienced, and the vertical axis represents the timeline, with the data displayed as timestamps. The timestamps of the 5 actual injuries are April 5th, May 7th, May 9th, June 7th, and July 4th, 2022. The timestamps of the five injuries predicted by the LSTM network model are April 8th, May 10th, May 12th, June 10th, and July 6th, 2022, respectively. The timestamps predicted by the improved LSTM network model are April 5th, May 6th, May 9th, June 7th, and July 5th, 2022, respectively. The injury times predicted by the improved LSTM network model were clearly closer to the actual injury times, and its predictions for the first, third, and fourth injuries fell on the same day as the actual injuries. Therefore, the improved LSTM network shows excellent temporal performance in predicting athlete injury risk.
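The lag comparison can be quantified directly from the dates given in the text. The sketch below computes the signed lag in days for each injury (positive means the prediction came after the actual injury) and the mean absolute lag per model. The dates are those reported above; the function names and the choice of mean absolute lag as a summary statistic are assumptions, since the paper does not name a lag metric.

```python
from datetime import date

actual = [date(2022, 4, 5), date(2022, 5, 7), date(2022, 5, 9),
          date(2022, 6, 7), date(2022, 7, 4)]
lstm_pred = [date(2022, 4, 8), date(2022, 5, 10), date(2022, 5, 12),
             date(2022, 6, 10), date(2022, 7, 6)]
improved_pred = [date(2022, 4, 5), date(2022, 5, 6), date(2022, 5, 9),
                 date(2022, 6, 7), date(2022, 7, 5)]


def lags_in_days(predicted, observed):
    """Signed lag per injury: predicted date minus actual date, in days."""
    return [(p - a).days for p, a in zip(predicted, observed)]


def mean_absolute_lag(lags):
    """Average size of the lag, ignoring whether it is early or late."""
    return sum(abs(lag) for lag in lags) / len(lags)


lstm_lags = lags_in_days(lstm_pred, actual)
improved_lags = lags_in_days(improved_pred, actual)

print("LSTM lags (days):         ", lstm_lags, mean_absolute_lag(lstm_lags))
print("Improved LSTM lags (days):", improved_lags, mean_absolute_lag(improved_lags))
```

On these dates the LSTM predictions are 2 to 3 days late for every injury, while the improved model is off by at most one day.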

5. Conclusions

Accurately predicting injury risk in athletes necessitates an understanding of temporal dependencies and the utilization of long-term memory effects. This is due to the dynamic nature of athletes’ physical conditions and the evolving risk of injuries over time, where historical data plays a crucial role in forecasting future risks. In this study, we compiled and time-stamped past data of basketball players. Using the Long Short-Term Memory (LSTM) network model as a foundation, we incorporated Dropout and employed sliding windows to segment long sequential data. The enhanced LSTM network model was then trained and optimized to improve the precision and reliability of injury risk predictions. The experimental findings indicated that the improved LSTM model had very high accuracy in predicting athlete injury risk and maintained stable predictive ability in different test sample environments. Compared with the LSTM network model, SVM, DT, and RF, the improved LSTM model had higher accuracy in predicting injury risk, higher generalization ability, and a smaller time lag in injury risk prediction. Applying an improved LSTM model to injury risk prediction therefore helps to protect athletes’ physical health. However, this article has only analyzed data from basketball players, so the range of sports covered is narrow. Extending injury risk prediction and analysis to a variety of sports is a direction for future research.

Footnotes

Funding

This work was supported by: (1) the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJZD-K202302901), project name: Optimization of key technology of visual object detection based on complex moving image and its application in teaching and training; and (2) the Chongqing Preschool Education College High-level Talent Research Workstation Project: Children’s physical health and sports ability promotion workstation (Grant No. 2023GZZ-001).

References

1. Emery, Pasanen. Current trends in sport injury prevention. Best Practice & Research Clinical Rheumatology. 2019; 33(1): 3-15.

2. Bolling, Mechelen, Pasman, Verhagen. Context matters: Revisiting the first step of the ‘sequence of prevention’ of sports injuries. Sports Medicine. 2018; 48(10): 2227-2234.

3. Kraus, Tenforde, Nattiv, Sainani, Kussman, Deakins-Roche, Singh, Kim, Barrack, Fredericson. Bone stress injuries in male distance runners: Higher modified Female Athlete Triad cumulative risk assessment scores predict increased rates of injury. British Journal of Sports Medicine. 2019; 53(4): 237-242.

4. Meng, Qiao. Analysis and design of dual-feature fusion neural network for sports injury estimation model. Neural Computing and Applications. 2023; 35(20): 14627-14639.

5. Ruddy, Shield, Maniar, Williams, Duhig, Timmins, Hickey, Bourne, Opar. Predictive modeling of hamstring strain injuries in elite Australian footballers. Medicine & Science in Sports & Exercise. 2018; 50(5): 906-914.

6. Research on the design of a sports injury prediction model for teenage football players. Automation Technology and Applications. 2018; 37(7): 141-144.

7. Mei, Xiang, Sun. Increasing the medial contact force of the knee joint with a “foot valgus” posture after long-distance running: A study based on OpenSim muscle bone modeling and machine learning prediction. Sports Science. 2019; 39(9): 51-59.

8. Hua, Zhao, Chen, Liu, Zhang. Deep learning with long short-term memory for time series prediction. IEEE Communications Magazine. 2019; 57(6): 114-119.

9. Fischer, Krauss. Deep learning with long short-term memory networks for financial market predictions. European Journal of Operational Research. 2018; 270(2): 654-669.

10. Liu, Wan, Wang, Dou. A long short-term memory-based model for greenhouse climate prediction. International Journal of Intelligent Systems. 2022; 37(1): 135-151.

11. Dai, Zhou. Short-term traffic flow prediction for urban road sections based on time series analysis and LSTM_BILSTM method. IEEE Transactions on Intelligent Transportation Systems. 2021; 23(6): 5615-5624.

12. Lin, Casey, Herman, Katz, Tenforde. Sex differences in common sports injuries. PM&R. 2018; 10(10): 1073-1082.

13. Donaldson, Callaghan, Bizzini, Jowett, Keyzer, Nicholson. A concept mapping approach to identifying the barriers to implementing an evidence-based sports injury prevention programme. Injury Prevention. 2019; 25(4): 244-251.

14. Bond, Dorman, Odney, Roggenbuck, Young, Munce. Evaluation of the functional movement screen and a novel basketball mobility test as an injury prediction tool for collegiate basketball players. The Journal of Strength & Conditioning Research. 2019; 33(6): 1589-1600.

15. Gledhill, Forsdyke, Murray. Psychological interventions used to reduce sports injuries: A systematic review of real-world effectiveness. British Journal of Sports Medicine. 2018; 52(15): 967-971.

16. Kratzert, Klotz, Brenner, Schulz, Herrnegger. Rainfall-runoff modelling using long short-term memory (LSTM) networks. Hydrology and Earth System Sciences. 2018; 22(11): 6005-6022.

17. Sahoo, Jha, Singh, Kumar. Long short-term memory (LSTM) recurrent neural network for low-flow hydrological time series forecasting. Acta Geophysica. 2019; 67(5): 1471-1481.

18. Sang, Pierro. Improving trading technical analysis with TensorFlow long short-term memory (LSTM) neural network. The Journal of Finance and Data Science. 2019; 5(1): 1-11.

19. Andre, Sousa, Clarke, Couto. BO-LSTM: Classifying relations via long short-term memory networks along biomedical ontologies. BMC Bioinformatics. 2019; 20(1): 1-12.

20. Yang, Kim. An accident diagnosis algorithm using long short-term memory. Nuclear Engineering and Technology. 2018; 50(4): 582-588.

21. Lees, Buechel, Anderson, Slater, Reece, Coxon, Dadson. Benchmarking data-driven rainfall-runoff models in Great Britain: A comparison of long short-term memory (LSTM)-based models with four lumped conceptual models. Hydrology and Earth System Sciences. 2021; 25(10): 5517-5534.

22. Wunsch, Liesch, Broda. Groundwater level forecasting with artificial neural networks: A comparison of long short-term memory (LSTM), convolutional neural networks (CNNs), and non-linear autoregressive networks with exogenous input (NARX). Hydrology and Earth System Sciences. 2021; 25(3): 1671-1687.

23. ArunKumar, Kalaga, Kumar, CMS, Kawaji, Brenza. Comparative analysis of gated recurrent units (GRU), long short-term memory (LSTM) cells, autoregressive integrated moving average (ARIMA), seasonal autoregressive integrated moving average (SARIMA) for forecasting COVID-19 trends. Alexandria Engineering Journal. 2020; 61(10): 7585-7603.

24. Luo, Zhou. Advancing non-negative latent factorization of tensors with diversified regularization schemes. IEEE Transactions on Services Computing. 2020; 15(3): 1334-1344.

25. Chalmers, Debenedictis, Zacharia, Townsley, Gleeson, Lynagh, Townsley, Fuller. Asymmetry during functional movement screening and injury risk in junior football players: A replication study. Scandinavian Journal of Medicine & Science in Sports. 2018; 28(3): 1281-1287.

26. Shi, Dai, Zhang, Song, Liao, et al. Research on the validity of functional movement screen (FMS™) in predicting sports injury of high level short track speed skating athletes. Journal of Chengdu Institute of Physical Education. 2019; 45(2): 103-109.

27. Nahm. Receiver operating characteristic curve: Overview and practical use for clinicians. Korean Journal of Anesthesiology. 2022; 75(1): 25-36.

28. Gao, Wang. Analysis of k-partite ranking algorithm in area under the receiver operating characteristic curve criterion. International Journal of Computer Mathematics. 2018; 95(8): 1527-1547.

29. Bowers, Zhou. Receiver operating characteristic (ROC) area under the curve (AUC): A diagnostic measure for evaluating the accuracy of predictors of education outcomes. Journal of Education for Students Placed at Risk (JESPAR). 2019; 24(1): 20-46.

30. Narkhede. Understanding AUC-ROC curve. Towards Data Science. 2018; 26(1): 220-227.
