Abstract
Fault prediction and diagnosis for electrochemical energy storage batteries can provide timely feedback and accurate judgment to the battery management system (BMS), enabling appropriate measures to be taken in time to rectify faults and thereby ensuring the long-term operation and high efficiency of the energy storage battery system. Following a data-driven approach, this paper applies the Long Short-Term Memory (LSTM) algorithm to establish a fault prediction model for energy storage batteries. The model predicts the voltage difference over-limit fault from the operating data of the energy storage battery, and the difference between maximum voltage and minimum voltage (DMM) at the cluster level is introduced to determine quantitatively whether a battery cluster has a fault. The method provides guidance and an effective means for the safe and stable operation of electrochemical energy storage power stations.
Introduction
Electrochemical energy storage batteries are crucial in modern energy systems, and lithium-ion battery storage is the largest and fastest-growing segment of electrochemical energy storage. Because of the unpredictable nature of renewable energy sources such as solar and wind power, energy storage technologies have emerged as a vital means of counterbalancing the instability of energy supply. An electrochemical energy storage system can not only balance the load of the grid, but also provide functions such as peak shaving and valley filling, backup power, and grid frequency regulation. In distributed energy systems, energy storage batteries can also provide power backup and regulation to ensure system stability and security. However, over the long-term operation of energy storage batteries, various types of failure may occur, such as decreased battery capacity, increased internal resistance, and oxidation. These failures can significantly affect the performance and lifespan of the battery, and thereby the functioning of the power system. Predicting and diagnosing the failure of energy storage batteries is therefore crucial. Fault prediction and diagnosis can provide timely feedback and accurate judgment for the battery management system, so that corresponding measures can be taken in time to repair the fault, thereby ensuring the long-term operation and high efficiency of the energy storage battery system. The data-driven approach has great advantages in solving these problems because it can extract features from a large amount of data and build predictive models for more accurate and reliable fault prediction and diagnosis.
Traditional energy storage battery failure prediction methods are usually based on physical modeling and rule reasoning. Physical modeling methods usually model the battery as a circuit equivalent model to predict physical parameters such as current, voltage, and temperature of the battery. The rule reasoning method is based on an expert system or a rule base, and diagnoses battery faults through rule reasoning.
Zhan [3] applied long-term trend analysis, taking the cell voltage as the eigenvector and the mileage as the independent variable, to reflect the failure of electric vehicle batteries that have been in use for a long time. Lu [4] used a knowledge-based method to determine the trend of parameters such as battery internal resistance, and an experience-based method to predict potential battery faults and assess battery health. Because abnormal internal parameters of a battery cell are reflected, to a certain extent, in the cell's range curve during charging, Chen [5] suggested using a similarity algorithm to diagnose consistency differences between battery cells by analyzing the range curves during the charging process. The variant based on cosine similarity in particular demonstrates outstanding diagnostic capability. Through experimentation, He [7] used the Arrhenius empirical formula, commonly used in chemical kinetics to describe the relationship between reaction activation energy, temperature, and the reaction rate constant, to develop a long-term degradation trend model for batteries. Li [8] evaluated the health condition of lithium batteries using a physical model of battery failure and predicted the remaining charge and discharge cycles of the batteries, offering users helpful maintenance recommendations. You [12] presented the state of nonlinear aging (SoNA) parameter to measure the degree of battery capacity degradation and devised an early warning method for identifying capacity degradation failures using this parameter.
However, these traditional methods have the following limitations and defects: The physical modeling method requires a large amount of prior knowledge and experimental data to establish an accurate battery model, and it is difficult to model complex nonlinear systems. The rule-based reasoning method relies on expert knowledge or a rule base, which is prone to incomplete knowledge or rule contradictions. Traditional methods usually ignore the complex nonlinear relationship inside the battery, making it difficult to predict the occurrence and evolution of battery failure. The fault prediction accuracy and reliability of traditional methods are limited by the quality of the model and rule base, and it is difficult to adapt to the diversity and complexity of battery faults.
Therefore, data-driven methods have important advantages in energy storage battery failure prediction. A data-driven method learns and builds its model directly from a large amount of battery operating data, and can adapt to the diversity and complexity of battery failures. In addition, a data-driven method can discover nonlinear relationships and hidden patterns inside the battery, thereby improving the accuracy and reliability of failure prediction.
To summarize, the utilization of data-driven approaches is crucial in forecasting battery failures in energy storage systems. These approaches have the potential to fill the gaps and address the shortcomings of traditional methods, thereby enhancing the efficiency and accuracy of failure prediction.
Liu [1] studied an automatic fault identification method that combines the Ensemble Empirical Mode Decomposition (EEMD) algorithm with a Support Vector Machine (SVM). To improve the accuracy of power battery fault diagnosis, the fault characteristics are analyzed and extracted from the voltage signal of the lithium-ion power battery. Chen [5] realized low-SOC fault diagnosis and prediction based on ARIMA and CNN-LSTM. He [7] employed PCA dimension reduction and K-means clustering as preprocessing methods to process and analyze voltage data from a single cell, with the goal of uncovering and diagnosing different faults in the single battery. Additionally, the total voltage of the electric vehicle’s battery pack was used as the learning sample: using the basic principle of the least squares support vector machine regression algorithm (LS-SVR), a fault prediction model based on the total voltage of the battery pack was established, and prediction of overvoltage and undervoltage faults of the total voltage was realized. Wang [9] employed a battery physical model to assess the internal temperature and SOC of the battery, and used the deep learning algorithm LSTM to forecast the internal and external temperature of the battery. By establishing a threshold, the occurrence of battery thermal runaway failure was predicted, and the causes were analyzed. Pang [10] applied the fuzzy entropy algorithm from the field of machine learning to realize online fault diagnosis of the battery. By introducing the fuzzy membership function, the parameters of the fault diagnosis algorithm can be set more flexibly, so that fault prediction can adapt to the various working conditions of the battery. Samadi [11] applied a particle filter algorithm to monitor the health of the battery, and realized fault detection by estimating the states of the battery.
These are common methods for estimating battery conditions. What they have in common is that they can estimate and predict the system states and are based on Bayesian reasoning, and they can be used in energy storage battery failure prediction.
Artificial intelligence algorithms play a crucial role in data-driven failure prediction for energy storage batteries. Commonly used algorithms include neural networks, support vector machines, decision trees, and random forests. These algorithms can adapt to the diversity and complexity of battery faults and discover hidden relationships and patterns in the data, thereby enhancing the precision and dependability of fault forecasting. Artificial intelligence algorithms can also classify and diagnose battery faults, helping users find and resolve battery faults quickly and accurately.
This paper aims to propose a data-driven method that uses the LSTM algorithm to establish a voltage difference over-limit fault prediction model, based on the difference between maximum voltage and minimum voltage (DMM) at the cluster level of energy storage batteries, in order to improve the reliability and service life of energy storage batteries. The DMM’s formula is as follows:
$$\mathrm{DMM} = U_{\mathrm{Max}} - U_{\mathrm{Min}}$$
where $U_{\mathrm{Max}}$ is the maximum voltage of one energy storage battery cluster and $U_{\mathrm{Min}}$ is the minimum voltage of one energy storage battery cluster.
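As a minimal sketch of how the DMM could be computed from logged data, the snippet below assumes that per-cell voltages of one cluster are available as a 2-D array (one row per timestamp). The function name `cluster_dmm` and the array layout are hypothetical; a BMS that already reports the cluster maximum and minimum voltage would only need the final subtraction.

```python
import numpy as np

def cluster_dmm(cell_voltages):
    """Return the DMM (max cell voltage minus min cell voltage) per timestamp.

    cell_voltages: array-like of shape (n_timestamps, n_cells), in volts.
    The layout is an assumption made for this illustration.
    """
    v = np.asarray(cell_voltages, dtype=float)
    # Max and min are taken across the cells of the cluster at each timestamp.
    return v.max(axis=-1) - v.min(axis=-1)

# Two timestamps of a hypothetical three-cell cluster.
example_dmm = cluster_dmm([[3.30, 3.31, 3.29],
                           [3.30, 3.40, 3.20]])
```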
The key contributions of this paper are as follows. Firstly, the data used in this study can be gathered by battery management systems (BMS) commonly available on the market. These systems acquire parameters such as current, voltage, and temperature from energy storage batteries through sensors. The impact of noise and abnormal data is effectively reduced by data filtering and cleaning. Secondly, this paper proposes an effective feature extraction method. By setting a variance threshold and using random forest feature screening, representative features can be extracted from a large amount of original data, which provides strong support for subsequent model training. Thirdly, this paper uses the Long Short-Term Memory (LSTM) algorithm to model the fault prediction and diagnosis of energy storage batteries. LSTM is a commonly used recurrent neural network model. Thanks to its nonlinear nature, LSTM can serve as a sophisticated nonlinear unit for building more extensive deep neural networks [2].
The rest of this article is organized as follows. Section II elaborates on the specific methods of data processing, feature extraction, model building, model training, and model evaluation for failure prediction of energy storage batteries. Section III uses two case studies to verify the validity and feasibility of the data-driven failure prediction method proposed in this paper. Finally, Section IV summarizes the paper and looks ahead to future research directions and development trends.
Data preparation and feature extraction
There are up to 565 columns (variables) in the dataset. If all of these columns were used to predict the DMM, the model would be too large and complicated, and enormous time and computational effort would be needed to train it. Moreover, some of the 565 columns, such as “time,” “date,” and “serial number,” are useless for predicting the DMM. Feeding these data directly to the LSTM model would produce useless predictions and prolong the training time in vain. Even after these useless columns are removed, the amount of data is still too large. Therefore, an algorithm is used to select the few most informative columns (variables) from the data for model training.
In some columns (variables), the values are almost always the same. These columns are detrimental to the model’s training process because they can cause overfitting, diminish prediction accuracy, and slow down training. Therefore, a variance threshold is applied during data processing to screen out the low-variance features. Even with the variance threshold (set to 0.01), a large number of candidate features remains in the dataset.
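The variance screening described above can be sketched as follows. This is an illustrative version written with NumPy only; it keeps a column when its population variance is strictly greater than the threshold, which is an assumption about how the boundary case is handled. The function name and the toy column names are hypothetical.

```python
import numpy as np

def drop_low_variance(X, names, threshold=0.01):
    """Keep only the columns whose variance exceeds `threshold`.

    X: array of shape (n_samples, n_features); names: one label per column.
    Population variance (ddof=0) is used; this choice is an assumption.
    """
    X = np.asarray(X, dtype=float)
    keep = X.var(axis=0) > threshold
    kept_names = [name for name, flag in zip(names, keep) if flag]
    return X[:, keep], kept_names

# Toy data: a constant column, a nearly constant column, and a varying column.
X_toy = np.array([[1.0, 5.00, 0.0],
                  [1.0, 5.05, 1.0],
                  [1.0, 4.95, 2.0],
                  [1.0, 5.00, 3.0]])
X_kept, kept = drop_low_variance(X_toy, ["constant", "almost_constant", "varying"])
```

Only the third column survives here, because the variances of the first two (0 and 0.00125) fall below 0.01.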
To further determine which features are most suitable for training the model, a random forest was used for additional feature selection. The idea behind the random forest is to determine the importance of each feature with respect to the target by calculating its contribution in each decision tree; the final result is the average of the importance values over all decision trees. Random forest uses out-of-bag (OOB) data and the Gini coefficient to measure the contribution of each feature. OOB data are the samples that were not drawn during the bootstrap process. These samples are used as test sets for the decision trees in the random forest to check the loss between predicted and target values, and the random forest model calculates the out-of-bag error from them. The Gini coefficient is the probability that a data point is classified into the wrong group, and it allows the random forest model to judge the quality of a partition. The contribution of a feature is measured by how much the Gini coefficient decreases: the larger the decrease, the more significant the contribution of the feature and the stronger its correlation with the target feature. In the random-forest-based feature selection of this model, the default value of 100 is used for the number of decision trees to save time and prevent overfitting.
To enhance the training process and minimize overfitting, the model selected the top four features in the results generated by the random forest model for training. These characteristics are “Total current”, “Total Voltage”, “Total power”, “Temperature”. In addition, the maximum and minimum voltage difference is added as the target feature required for model training.
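A hedged sketch of this ranking step is given below using scikit-learn. It assumes a binary label (DMM over the threshold or not) as the target so that the Gini criterion applies; whether the original work ranked features against a binary label or against the continuous DMM is not stated, so this is an assumption. The data are synthetic and the feature names `f0` to `f5` are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def rank_features(X, y, names, n_trees=100, seed=0):
    """Rank features by mean decrease in Gini impurity, highest first."""
    forest = RandomForestClassifier(
        n_estimators=n_trees,   # 100 trees, matching the default mentioned in the text
        criterion="gini",
        oob_score=True,         # evaluate on the out-of-bag samples
        random_state=seed,
    )
    forest.fit(X, y)
    importances = forest.feature_importances_
    order = np.argsort(importances)[::-1]
    ranked = [(names[i], float(importances[i])) for i in order]
    return ranked, forest.oob_score_

# Synthetic stand-in data: only column 2 actually drives the label.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(400, 6))
y_demo = (X_demo[:, 2] + 0.1 * rng.normal(size=400) > 0).astype(int)
names_demo = [f"f{i}" for i in range(6)]

ranked_demo, oob_demo = rank_features(X_demo, y_demo, names_demo)
top_four = [name for name, _ in ranked_demo[:4]]  # keep the four highest-ranked features
```

Note that scikit-learn reports the OOB score as an accuracy, so the out-of-bag error corresponds to one minus that value.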
Once the necessary features have been chosen, the dataset is processed. The data used to train the model are from Project1 battery cluster No. 1, and the data used to test the model are from Project1 battery cluster No. 2.
During training, two data sets were prepared. The first comprises the unprocessed data, in which one data point is collected every five seconds. The second consists of processed data derived from the raw data: one point is taken from every 12 points in the original data, giving a data set with a frequency of one data point per minute. Continuous values are selected using the median selection method, while discrete values are selected using a random selection method.
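One possible reading of this 5 s to 1 min reduction is sketched below: consecutive blocks of 12 samples are collapsed to one value, using the block median for a continuous signal and a randomly chosen member of the block for a discrete signal. The function name, the seed handling, and the use of `np.median` (which averages the two middle values of an even-sized block rather than returning an observed sample) are assumptions made for illustration.

```python
import numpy as np

def downsample_blocks(continuous, discrete, block=12, seed=0):
    """Collapse every `block` consecutive samples into a single sample.

    continuous: 1-D numeric signal, reduced with the block median.
    discrete:   1-D categorical/flag signal, reduced by a random pick per block.
    Trailing samples that do not fill a whole block are discarded.
    """
    continuous = np.asarray(continuous, dtype=float)
    discrete = np.asarray(discrete)
    n_blocks = len(continuous) // block
    c = continuous[: n_blocks * block].reshape(n_blocks, block)
    d = discrete[: n_blocks * block].reshape(n_blocks, block)
    rng = np.random.default_rng(seed)
    picks = rng.integers(0, block, size=n_blocks)  # one random index per block
    return np.median(c, axis=1), d[np.arange(n_blocks), picks]

# 24 samples at 5 s spacing become 2 samples at 1 min spacing.
cont_1min, disc_1min = downsample_blocks(np.arange(24), [0] * 12 + [1] * 12)
```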
In the training process of this model, sklearn’s MinMaxScaler is used to normalize the data, mapping it to the range [0, 1]. The normalized data is then split in a ratio of 8:2: 80% of the data set is used for training, and the remaining 20% is used to verify the loss of the model. The evaluation metrics used to assess model accuracy include the Mean Absolute Error (MSE) and the accuracy index.
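The normalization and the 8:2 split can be illustrated with the short sketch below. It reimplements per-column min-max scaling in NumPy, which mirrors what MinMaxScaler does with its default [0, 1] range, and it assumes the split is chronological (first 80% for training) because the data are a time series; the text does not say whether the split was chronological or shuffled. Function names are hypothetical.

```python
import numpy as np

def min_max_scale(x):
    """Scale each column of x to [0, 1]; constant columns map to 0."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(axis=0), x.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (x - lo) / span

def chronological_split(x, train_fraction=0.8):
    """Split along the time axis without shuffling (an assumption)."""
    cut = int(len(x) * train_fraction)
    return x[:cut], x[cut:]

# Ten rows with two hypothetical features on different scales.
raw = np.column_stack([np.arange(10.0), 100.0 + 10.0 * np.arange(10.0)])
scaled = min_max_scale(raw)
train_part, val_part = chronological_split(scaled)
```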
In this paper, the original plan was to use confidence intervals to obtain the maximum safety threshold of the DMM. However, calculating a confidence interval is difficult because the data are not normally distributed. Instead, calculation shows that 99.04% of the DMM values in the Project1 battery cluster No. 1 dataset are within 50 mV. Therefore, 50 mV is taken by default as the maximum safe threshold of normal voltage for Project1 battery cluster No. 2.
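The empirical coverage check behind this threshold can be sketched as below: it simply measures the fraction of DMM samples at or below a candidate threshold, which requires no distributional assumption. The function name and the toy values are hypothetical and are not the paper's data.

```python
import numpy as np

def fraction_within(dmm_values, threshold_v=0.05):
    """Fraction of DMM samples (in volts) that do not exceed the threshold."""
    d = np.asarray(dmm_values, dtype=float)
    return float(np.mean(d <= threshold_v))

# Toy values only: three of the four samples lie within 50 mV.
coverage = fraction_within([0.01, 0.02, 0.03, 0.20])
```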

Feature extraction process.

Architecture of an LSTM cell.
A conventional RNN tends to forget data learned in the distant past. To overcome this problem, the LSTM algorithm was employed to build the model. Figure 4 [9] illustrates the architecture of an LSTM cell. LSTM addresses the long-term dependence problem of RNNs by incorporating gate mechanisms that regulate the flow and attenuation of features.
The main equations of LSTM algorithm are as follows:
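For reference, a standard formulation of the LSTM cell is sketched below. The symbols are assumptions made for illustration and may differ from the notation used in the paper: $x_t$ is the input at time step $t$, $h_t$ the hidden state, $c_t$ the cell state, $f_t$, $i_t$, $o_t$ the forget, input, and output gates, $W$ and $b$ the learned weights and biases, $\sigma$ the logistic sigmoid, and $\odot$ the element-wise product.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

The forget gate $f_t$ controls how much of the previous cell state is retained, which is the mechanism that lets the cell carry information over long time spans.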

Fault prediction model structure.
The model has five layers in total, and the distribution of neural units is [12, 16, 32, 64]. The first layer, which serves as the input layer, is an LSTM layer containing 64 neural units. In this LSTM layer, the return_sequences parameter is set to True, which means that the layer outputs a hidden state value for each time step, and these hidden state values are passed to the next LSTM layer. The second and third layers have settings similar to the first layer; the only difference is that the third layer has only 32 neural units. When multiple hidden layers are used, multiple gradients are multiplied together during backpropagation, and the gradient of tanh can become too small or too large. The
In addition, three dropout layers have been added to the model. The first dropout layer is between the input layer and the first hidden layer, and two other dropout layers are between the three LSTM hidden layers. It is important to note that these three dropout layers are layer-to-layer dropout.
To accelerate the training process, the batch size of the model is set to 16. The initial learning rate is set to lr = 0.01, and callbacks adjust the learning rate between epochs. If the loss on the test set does not decrease after an epoch is completed, the learning rate is automatically reduced to 10% of its current value. If the loss on the test set still does not decrease in the next epoch, the program terminates the training of the model. In addition, if a gradient explosion causes the predicted value to become NaN, the program stops automatically. When the program stops automatically, the model with the best prediction is saved. Model training is set to run for a total of five epochs.
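The training-control rules in this paragraph can be traced with a small framework-independent simulation. The sketch below takes a list of per-epoch loss values and replays the rules: reduce the learning rate to 10% on the first epoch without improvement, stop if the following epoch does not improve either, stop on NaN, and remember the best epoch. Measuring "improvement" against the best loss so far is an assumption, and the function name is hypothetical.

```python
import math

def simulate_training_control(losses, initial_lr=0.01, factor=0.1):
    """Replay the learning-rate and stopping rules on a sequence of losses.

    Returns (best_epoch, final_lr, history), where history holds one
    (epoch, lr, event) tuple per processed epoch.
    """
    lr = initial_lr
    best_loss, best_epoch = math.inf, None
    stalled = 0          # consecutive epochs without improvement
    history = []
    for epoch, loss in enumerate(losses, start=1):
        if math.isnan(loss):
            history.append((epoch, lr, "stop: NaN loss"))
            break
        if loss < best_loss:
            best_loss, best_epoch, stalled = loss, epoch, 0
            history.append((epoch, lr, "improved: best model saved"))
        else:
            stalled += 1
            if stalled == 1:
                lr *= factor   # reduce the learning rate to 10%
                history.append((epoch, lr, "no improvement: lr reduced"))
            else:
                history.append((epoch, lr, "stop: still no improvement"))
                break
    return best_epoch, lr, history

# Five planned epochs; the loss stalls at epochs 3 and 4, so training stops early.
best_epoch, final_lr, log = simulate_training_control([0.5, 0.4, 0.45, 0.46, 0.3])
```

In this example the best model is the one from epoch 2, and the fifth epoch is never run.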
Model evaluation
Based on the value of the difference between the maximum voltage and the minimum voltage (DMM) predicted by the model, the program groups the predictions into two categories. In the first category, the DMM meets or exceeds the maximum safety threshold (True); in the second, the DMM falls below the maximum safety threshold (False). In this way, the regression problem is transformed into a classification problem. The program can then create a confusion matrix to analyze the results generated by the model.
True Positive (TP): The actual DMM is greater than or equal to the maximum safety threshold, and the DMM predicted by the model is also greater than or equal to the maximum safety threshold.
False Positive (FP): The actual DMM is less than the maximum safety threshold, but the DMM predicted by the model is greater than or equal to the maximum safety threshold.
False Negative(FN): The actual result is that the DMM is greater than or equal to the maximum safety threshold. The predicted outcome of the model is less than the maximum safety threshold.
True Negative(TN): The actual result is that the DMM is less than the maximum safety threshold. The predicted outcome of the model is also less than the maximum safety threshold.
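The four cases above can be counted directly from the actual and predicted DMM series. The sketch below is illustrative: the function name is hypothetical, values are in volts, and the 0.05 V default reflects the 50 mV threshold used in this paper.

```python
import numpy as np

def confusion_counts(actual_dmm, predicted_dmm, threshold_v=0.05):
    """Threshold both series and count TP, FP, FN, TN.

    A sample is positive (True) when its DMM is >= threshold_v.
    """
    actual = np.asarray(actual_dmm, dtype=float) >= threshold_v
    predicted = np.asarray(predicted_dmm, dtype=float) >= threshold_v
    return {
        "TP": int(np.sum(actual & predicted)),
        "FP": int(np.sum(~actual & predicted)),
        "FN": int(np.sum(actual & ~predicted)),
        "TN": int(np.sum(~actual & ~predicted)),
    }

# Toy example with one sample in each cell of the confusion matrix.
counts = confusion_counts(actual_dmm=[0.06, 0.02, 0.07, 0.01],
                          predicted_dmm=[0.08, 0.06, 0.03, 0.02])
```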
According to the results generated by the confusion matrix, some evaluation indicators such as recall, precision and accuracy can be obtained.
Recall: Recall refers to the proportion of samples whose DMM is greater than or equal to the maximum safety threshold that are found by the model. The formula is TP/(TP + FN).
Precision: Precision refers to the percentage of correct predictions among the samples whose DMM is predicted by the model to be greater than or equal to the maximum safety threshold. The formula is TP/(TP + FP).
Accuracy: Accuracy refers to the percentage of correct predictions among all predictions of the model. The formula is (TP + TN)/Total, where Total = TP + TN + FP + FN.
F1 score: The F1 score is the harmonic mean of precision and recall. The formula is 2 * (Precision * Recall)/(Precision + Recall).
Support refers to the number of samples in each class within the dataset. False refers to the number of samples for which the DMM is less than the maximum safety threshold. True refers to the number of samples for which the DMM is greater than or equal to the maximum safety threshold.
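As a worked illustration of these formulas, the sketch below computes the indicators from the four confusion-matrix counts. The counts in the example are made up for the arithmetic and are not results from this paper; returning 0.0 when a denominator is zero is a convention chosen here.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute recall, precision, accuracy, and F1 from confusion counts."""
    total = tp + fp + fn + tn
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "recall": recall,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
        "support_true": tp + fn,    # samples whose actual DMM is over the threshold
        "support_false": fp + tn,   # samples whose actual DMM is under the threshold
    }

# Hypothetical counts: recall = 8/10, precision = 8/10, accuracy = 96/100.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=88)
```

With these counts, high accuracy (0.96) coexists with a lower recall (0.8), which is why recall and precision are reported alongside accuracy when over-limit samples are rare.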
Results
In this part, two case studies are used to verify the fault prediction model proposed in this paper, namely Project1 with collection frequencies of 5 s and 1 min. The Project1 data set is real data generated during the actual operation of an energy storage station that uses lithium-ion batteries. Battery cluster No. 1 is selected for model training and battery cluster No. 2 for testing model accuracy. The reason for this selection is that battery cluster No. 2 did experience a voltage difference over-limit fault during actual operation, while no fault was found in battery cluster No. 1, which is adjacent to battery cluster No. 2 and operates under similar conditions.
Project1 has 6 months of real operating data, in which the difference between maximum voltage and minimum voltage(DMM) of the normal battery cluster ranges from 0 to 0.05 V, while the DMM of the faulty battery cluster even reaches 1.959 V.
Case Study 1: Analysis of Prediction Results with a Collection Frequency of 5 s for Project1
Prediction results of Project1 cluster No.2 predicting the first point (the fifth second) in future with collection frequency 5s
The accuracy of predicted results for Project1 cluster No.2 predicting the first point (the fifth second) in future with collection frequency 5 s is 0.98.

Location of battery clusters in a battery pack.

Prediction results using the Project1 cluster No.2 as test set with the collection frequency is 5 s.
The accuracy of predicted results for Project1 cluster No.2 predicting the twelfth point (the sixtieth second) in future with collection frequency 5 s is 0.96.
Case Study 2: Analysis of Prediction Results with a Collection Frequency of 1 min for Project1
Prediction results of Project1 cluster No.2 predicting the first point (the first minute) in future with collection frequency 1min
The accuracy of predicted results for Project1 cluster No.2 predicting the first point (the first minute) in future with collection frequency 1 min is 0.95.
Prediction results of Project1 cluster No.2 predicting the tenth point (the tenth minute) in future with collection frequency 1 min
The accuracy of predicted results for Project1 cluster No.2 predicting the tenth point (the tenth minute) in future with collection frequency 1 min is 0.92.
According to the results generated by the model, this model is still some way from accurately predicting the value of the DMM. However, the model can provide a very accurate estimate of whether the predicted future DMM will exceed the safe value. The model can predict the result for the first point (the fifth second in the future) with 0.99 accuracy. At the twelfth point (the sixtieth second in the future), the predicted results also reach an accuracy of 0.96.
Prediction of Project1 cluster No.2 predicting the thirtieth point (the thirtieth minute) in future with collection frequency 1 min
The accuracy of predicted results for Project1 cluster No.2 predicting the thirtieth point (the thirtieth minute) in future with collection frequency 1 min is 0.86.
To ensure accurate predictions, the previous model was trained to forecast the next 12 steps, so the models that predict 30 points ahead were redesigned and retrained. Increasing the amount of data fed into the model during training raises the likelihood of vanishing gradients, and two measures were taken to address this problem. First, the number of LSTM layers in the model was reduced from four to two. Second, the look-back timestep was decreased from 64 to 30. After these two measures, training improved noticeably. Even so, the prediction accuracy for the 30th minute is only 0.86, which is still lower than that for the 10th minute.
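The roles of the look-back window and the prediction horizon can be made concrete with a small windowing sketch. It assumes that each training sample pairs the previous `look_back` rows of features with the single target value `horizon` steps after the window; the original models may instead output every step up to the horizon, so this pairing is an assumption. The function name is hypothetical.

```python
import numpy as np

def make_windows(features, target, look_back, horizon):
    """Build (X, y) pairs for sequence-to-one forecasting.

    X[k] holds rows k .. k+look_back-1 of `features`;
    y[k] is the target `horizon` steps after the last row of that window.
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    X, y = [], []
    last_start = len(features) - look_back - horizon
    for start in range(last_start + 1):
        end = start + look_back
        X.append(features[start:end])
        y.append(target[end + horizon - 1])
    return np.array(X), np.array(y)

# 100 one-minute samples, look back 30 steps, predict the 30th minute ahead.
series = np.arange(100, dtype=float)
X_win, y_win = make_windows(series.reshape(-1, 1), series, look_back=30, horizon=30)
```

With a 1 min collection frequency, `horizon=30` corresponds to the thirtieth minute in the future; with a 5 s frequency, `horizon=12` corresponds to the sixtieth second.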

Prediction results using the Project1 cluster No.2 as test set with the collection frequency is 1 min.
The accuracy of predicted results for Project1 cluster No.2 predicting the sixtieth point (the sixtieth minute) in future with collection frequency 1 min is 0.78.
Confusion matrix for Project1 cluster No.2 predicting the first point (the fifth second) in future
Like the models that predict thirty steps ahead, the models that predict sixty steps ahead were redesigned and retrained. As before, the number of LSTM layers was reduced from four to two, but the number of look-back steps was left unchanged at 64. Compared with the other predictions, the model’s prediction of the 60th step is poor, with an accuracy of only 0.78. This model is likely already at its limit.
Evaluation indexes for Project1 cluster No.2 predicting the first point (the fifth second) in future
Confusion matrix for Project1 cluster No.2 predicting the twelfth point (the sixtieth second) in future
Evaluation indexes for Project1 cluster No.2 predicting the twelfth point (the sixtieth second) in future
Confusion matrix for Project1 cluster No.2 predicting the first point (the first minute) in future
Evaluation indexes for Project1 cluster No.2 predicting the first point (the first minute) in future
Confusion matrix for Project1 cluster No.2 predicting the tenth point (the tenth minute) in future
Evaluation indexes for Project1 cluster No.2 predicting the tenth point (the tenth minute) in future
Confusion matrix for Project1 cluster No.2 predicting the thirtieth point (the thirtieth minute) in future
Evaluation indexes for Project1 cluster No.2 predicting the thirtieth point (the thirtieth minute) in future
Confusion matrix for Project1 cluster No.2 predicting the sixtieth point (the sixtieth minute) in future
Evaluation indexes for Project1 cluster No.2 predicting the sixtieth point (the sixtieth minute) in future
This paper proposes a data-driven fault prediction method for energy storage batteries, which predicts the voltage difference over-limit fault based on the operating data of the energy storage battery and the difference between maximum voltage and minimum voltage (DMM) at the cluster level. The main conclusions and results of this paper are summarized as follows. First, a model evaluation system and a set of evaluation indexes were designed for the effective evaluation of the fault prediction model. The two case studies indicate that the evaluation system and indexes are reasonable and can be extended to other data-driven models. Second, the LSTM-based DMM prediction model has a high accuracy in predicting whether the DMM of the battery cluster will exceed the preset safety threshold, but it is not yet effective enough in predicting the specific value of the DMM. Third, by adjusting the sliding window and the acquisition frequency of the data fed to the model, the future time interval covered by the prediction can be adjusted. Fourth, the model parameters, model structure, and activation function of the LSTM algorithm were optimized, and the model was verified through two case studies. When the data collection frequency is 5 s, the model can accurately predict whether a voltage over-limit fault will occur within the next 60 s, and when the data collection frequency is 1 min, it can accurately predict whether a voltage over-limit fault will occur within the next 10 minutes.
The fault prediction model proposed in this paper was verified by two case studies, which show that LSTM can effectively predict whether a voltage difference over-limit fault, as measured by the DMM at the battery cluster level, will occur. However, this model still has shortcomings. In future work, further research can be carried out on the precise numerical prediction of the DMM and on extending the time horizon of the prediction results.
