Abstract
Accurately evaluating the technological improvement effects of wind turbines is crucial for wind farm operators. To this end, this paper proposes an innovative approach that employs a wind power regression model which leverages external environmental information to predict the output power of wind turbines. The effectiveness of technological improvements can be evaluated by comparing the predicted output power with the measured output power. In this paper, a model called stacked LSTM networks with attention mechanisms is designed. In the proposed model, the stacked LSTM networks are used to enhance the nonlinear fitting ability and capture deeper features of the input sequence. Furthermore, temporal attention mechanisms are employed to make the model focus on important time-series information of the data. In addition, a hierarchical attention mechanism is designed to explore the correlation among the outputs of the stacked LSTM networks and enrich the model’s output information. The experiments on the data from a wind farm show that the proposed method outperforms various wind power prediction benchmarks, achieving lower RMSE, MAE, and MAPE values of 142.82, 104.2, and 4.85%, respectively.
Introduction
In recent years, the problems of energy crisis and environmental pollution have become more prominent, leading to the wide development and utilization of new energy sources [1]. Given its advantages, such as huge reserves, renewability, and the absence of pollution, wind energy is regarded as an extremely valuable clean energy source and has been widely used in power supply systems. According to statistics, by the end of 2022, China will have more than 50,000 wind turbines in service that have been in operation for more than ten years. In this situation, to fully exploit the advantages of wind power generation and further meet the growing demand for power, it is urgent to achieve technological improvements for in-service wind turbines [2]. Although a great deal of work has been done to realize the technological improvements of wind turbines, how to effectively evaluate the effects of these improvements remains a crucial issue.
Based on the above problem, researchers have proposed various evaluation methods, including the expert evaluation method, the comparative analysis method, the logical framework method, etc. [3]. Due to the clear and concise characteristics of the comparative analysis method, it has become the most widely applied method to evaluate the effects of technological improvements in practice. This method simply compares the improvement degree of the wind power curves before and after the technological improvements to complete the evaluation, ignoring the influence of multiple factors in complex wind farms on the output power of wind turbines, such as wind direction, pressure, and humidity. Moreover, these factors can cause the target wind turbine to have uncertain power values even under the same wind speed. Obviously, this evaluation method that only considers the influence of wind speed on output power cannot be used as an effective evaluation method for the technological improvements of wind turbines.
Inspired by the achievements of deep learning technology, in this paper, a reasonable method is designed for evaluating the effectiveness of technological improvements of wind turbines based on deep learning. The core idea of this method is to achieve the regression prediction of the output power of the wind turbine that needs technological improvements (target wind turbine) based on its external environment data by utilizing deep learning technology. In this way, the output power of the target wind turbine after the technological improvements is obtained through actual measurement, and the environmental data collected at the same time is fed into the deep learning model to predict the output power of the target wind turbine before the technological improvements. Finally, the performance improvement of the target wind turbine is determined by comparing and analyzing the power output values before and after technological improvements.
Compared with the aforementioned methods for evaluating the effectiveness of technological improvements for wind turbines, the approach based on deep learning can model the relationship between the output power of wind turbines and multiple factors in the complex environment of wind farms. After an appropriate number of iterations, the model can accurately predict the output power value of the target wind turbine before technological improvements, which leads to a more reliable evaluation of the technological improvements of wind turbines.
This paper proposes a novel power regression prediction method that uses deep learning to evaluate the technological improvement effects of wind turbines, the principle of which is shown in Fig. 1. Specifically, the method first collects the historical data of the target wind turbine and its adjacent wind turbines before technological improvements through the SCADA (Supervisory Control and Data Acquisition) system. Then, the deep learning model designed in this paper is trained on the collected data. After technological improvements, the external environment data of the target wind turbine is fed to the trained power prediction model to generate the output power prediction, which can be regarded as the power value of the target wind turbine before technological improvements. Finally, the predicted power value is compared with the measured power value of the target turbine collected by the SCADA system after technological improvements to evaluate the technological improvement effects.

Illustration of the mechanism for evaluating the technological improvement effects of a wind turbine.
Although there exists some work on wind power regression prediction, most of the existing power prediction methods focus on
1. This paper proposes a
2. A power regression prediction model, named
3. Experiments conducted on the data from a real-world wind farm demonstrate the effectiveness of the proposed method.
Studies on wind power prediction can be roughly divided into two main categories, i.e., traditional machine learning methods and deep learning methods. Table 1 provides an overview of related literature referenced in this paper, which is either related to the background of this study or has inspired the design of the method in this paper. For each study, the application domain, the key technology, and their features are listed. Additional details of these studies are also provided below.
A summary of the existing methods
Previously, the performance of traditional power prediction methods was limited by the non-linearity and non-stationarity of the data. Because machine learning methods have great advantages in dealing with nonlinear problems, researchers have tried to use machine learning to solve the problem of wind power prediction. For example, Bhaskar et al. [4] used the adaptive wavelet neural network (AWNN) to predict wind speed and then adopted the feedforward neural network (FFNN) to further predict the wind power based on the predicted wind speed. Experimental results show that the AWNN improved the fitting ability of the prediction model. Shan et al. [5] presented an improved nuclear extreme learning machine. This method uses the bee colony algorithm to optimize the parameters of the extreme learning machine, so as to achieve more accurate prediction. Naik et al. [6] used non-iterative mixed empirical mode decomposition (EMD) to decompose the wind speed and wind power time series data, and then used the kernel ridge regression (KRR) model to predict wind power. Wang et al. [7] designed a short-term wind power prediction model based on EMD and radial basis function neural networks (RBFNN).
These works mentioned in the preceding paragraph have utilized machine learning techniques for predicting wind power. However, due to the large size and high dimensionality of the dataset, conventional machine learning methods were not employed in this study. Instead, a deep learning approach was adopted to achieve greater precision in predictions.
In addition, methods of combining different machine learning models have been proposed to further improve the accuracy of wind power prediction. For instance, Liu et al. [8] developed a wind power ramp prediction model called Orthogonal Test and Support Vector Machine (OT-SVM), which achieved better forecasting accuracy than SVM-related models. Qu et al. [9] combined a long short-term memory (LSTM) model with principal component analysis (PCA) to improve wind power prediction. Compared with BP neural network and support vector machine (SVM) models, the PCA-LSTM power prediction model has achieved higher accuracy.
The literature cited in the preceding paragraph employs ensemble models to improve prediction performance. This paper adopts similar principles, such as using a three-layer stacked LSTM network as the framework for the proposed method.
With the rapid development of deep learning, deep learning-based methods have surpassed traditional machine learning methods in many fields. At present, deep learning has been gradually applied to the field of wind power prediction. For instance, Lin et al. [10] implemented feature engineering on the data of wind turbines. They built a deep learning neural network to determine the nonlinear correlation between features and wind power, which maintained high accuracy at a low computational cost to predict wind power generation. Wang et al. [11] proposed an advanced point forecasting method based on wavelet transform and convolutional neural networks for probabilistic wind power prediction. The residual network based on the idea of residual learning has been applied to the field of wind power prediction. Ceyhun et al. [12] proposed an improved residual-based deep convolutional neural network (CNN) to predict wind power.
The studies referenced in the preceding paragraph employ deep learning techniques for wind power prediction, which are consistent with the approach taken in this paper. However, instead of using conventional convolutional neural networks, this study explores the potential of deep recurrent neural networks (stacked LSTMs) for wind power prediction.
In deep learning-based methods, the attention mechanism helps models achieve better results in wind power prediction. Peng et al. [13] developed a neural-network prediction model called EALSTM-QR for wind power prediction based on the deep learning technique. Their method extracts the features from the input data by an encoder structure with a multi-head attention layer, and the results show that utilizing the attention mechanism can reduce the MAPE value by 3.63% in the next 1-hour wind power prediction. Cheng et al. [14] proposed the VMD-AM-WGAN model, which combines variational modal decomposition, attention mechanism, and LSTM as generators and uses convolutional neural networks as discriminators, effectively improving the performance of the model. Niu et al. [15] proposed a wind power forecasting model that uses GRU to embed the correlation among different forecasting steps. In addition, an attention mechanism is designed to identify the most important input variables. Similarly, Xiong et al. [16] proposed a model called AMC-LSTM that uses an attention mechanism to dynamically assign the weight of input data.
The studies mentioned in the preceding paragraph introduced attention mechanisms for wind power prediction, which have been found to effectively enhance the predictive capability of the models. Similarly, the approach presented in this paper also incorporates attention mechanisms. However, unlike the previous approach, the attention mechanisms utilized in this paper are designed to enable the LSTM models to focus on critical time steps and establish connections between different layers of the LSTM network.
There is also some literature related to multi-view learning that provides important inspiration for the design of the proposed method in this paper, although their applications are not for wind power prediction. For example, for gender recognition via gesture analysis on mobile devices, Guarino et al. [17] adopted two approaches, i.e., single-view and multi-view learning. Extensive experiments prove the feasibility of the scheme using LOUO-CV (leave-one-user-out cross validation). Similarly, based on touch gestures, Zaccagnino et al. [18] proposed an architecture based on a multi-view learning strategy to classify the users of mobiles into underages and adults, achieving the best ROC AUC (0.92) and accuracy (88%) scores. Additionally, Seeland et al. [19] proposed an image classification scheme that fuses visual information from multiple perspectives of the same object by using convolutional neural networks, effectively improving classification accuracy.
The literature mentioned in the preceding paragraph has utilized multi-view learning methods to accomplish the respective tasks, and these methods have also inspired the stacked LSTM network in this paper. Different layers of LSTMs are used to capture different aspects of the data, similar to the concept of multi-view learning.
In this study, a deep learning-based model is proposed to evaluate the technological improvement effects of wind turbines by predicting their power output based on external environment information. Specifically, the predicted power output is utilized as a reference for the target wind turbine prior to technological improvements and compared with the actual measured power output after technological improvements to assess the effectiveness of the improvements.
To this end, the data collected from a target wind turbine is initially subjected to feature engineering techniques, including data concatenation, feature selection, data cleaning, and data standardization, to enhance the reliability of the input data features and subsequently improve the model’s accuracy. Then, the preprocessed data is fed into a three-layer stacked LSTM network using a sliding window approach, which enables the model to capture more abstract, higher-level information from the input data. To further enhance the model’s performance, a temporal attention mechanism is introduced to the stacked LSTM networks, allowing the model to selectively attend to the most significant information. Moreover, a hierarchical attention mechanism is designed to further process the output of the stacked LSTM networks and give prominence to more salient information. In the final stage, the output of the hierarchical attention mechanism is mapped by a fully connected layer to the predicted wind power, which approximates the output of the target wind turbine prior to technological improvements. Subsequently, the predicted value is compared to the actual measured power output of the target wind turbine after the technological improvements, thereby assessing the impact of the technological improvements. The schematic diagram of the proposed methodology is presented in Fig. 2. The ensuing sections describe the individual modules mentioned above.

Block diagram of the proposed method.
In order to assess the technological improvement effects on the target wind turbine, historical data is initially gathered from the target wind turbine and two adjacent wind turbines. The data collected comprises various external environmental features, such as wind speed, wind direction, and temperature of the target turbine, as well as wind speed, current, voltage, and power of the adjacent wind turbines.
To prepare the input data for training the proposed network, a preprocessing step is necessary. Firstly, data from multiple wind turbines are concatenated, and the alignment of their timestamps is ensured. Subsequently, a feature selection process is performed by employing Spearman’s rank correlation coefficient [20] to compute the correlation between each feature and the target wind turbine’s output power, resulting in a subset of the most relevant features. The formula for calculating Spearman’s rank correlation coefficient is presented in Equation (1).
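The feature selection step can be illustrated with a minimal sketch. It computes Spearman's rank correlation as the Pearson correlation of average ranks, which is the standard definition and handles ties. The function names and the 0.5 threshold are assumptions made for illustration only; the selection threshold used in this study is not stated in the text.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no rank information
    return cov / (var_x * var_y) ** 0.5


def select_features(features, power, threshold=0.5):
    """Keep feature names whose |rho| with power reaches the assumed threshold."""
    return [name for name, column in features.items()
            if abs(spearman(column, power)) >= threshold]
```

Because the coefficient depends only on ranks, a feature that is monotonically but nonlinearly related to power (for example, wind speed) still receives a coefficient close to 1.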
Data cleaning is a critical preprocessing step that plays a significant role in model training. To ensure effective training of the model, this study eliminates obvious outliers, including negative power data and data during power rationing. In addition, to address the issue of dimensional inconsistency among features, the Z-score normalization method is employed. The Z-score method transforms the raw data into a standard distribution by scaling each feature to have a mean of 0 and a standard deviation of 1. The mathematical formula for the Z-score method is expressed as Equation (2).
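The cleaning and standardization steps can be sketched as follows. The field names `power` and `rationed` are hypothetical, and the use of the population standard deviation is an assumption; the text only specifies that each feature is scaled to zero mean and unit standard deviation.

```python
from statistics import mean, pstdev


def clean(samples, power_key="power", rationed_key="rationed"):
    """Drop samples with negative power or recorded during power rationing."""
    return [s for s in samples
            if s[power_key] >= 0 and not s.get(rationed_key, False)]


def fit_zscore(column):
    """Return the mean and (population) standard deviation of one feature."""
    return mean(column), pstdev(column)


def apply_zscore(column, mu, sigma):
    """Standardize a feature column with previously fitted statistics."""
    if sigma == 0:
        return [0.0 for _ in column]  # constant feature: nothing to scale
    return [(value - mu) / sigma for value in column]
```

Fitting the statistics on the training set and reusing them on the test set is one reasonable design choice here, since it prevents information from the test period from leaking into training.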
Let the dimension of the input sequence samples be denoted as H. After the raw data is processed by feature engineering, in order to realize the regression prediction of the output power of the target wind turbine, a sliding window approach is employed with a window size of W × H to capture the temporal correlation between the input features. Specifically, at each time step, W samples are fed to the model, which includes the current sample and the W - 1 preceding samples, to predict the output power for the current sample.
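The windowing described above can be sketched in a few lines. The function name and the list-of-rows data layout are illustrative assumptions; each window holds W consecutive samples of H features and is paired with the power of its last (current) sample.

```python
def sliding_windows(features, targets, window):
    """Pair each block of `window` consecutive samples with the target of its last sample.

    features: list of samples, each a list of H feature values.
    targets:  list of output power values aligned with `features`.
    """
    if window < 1 or len(features) != len(targets):
        raise ValueError("window must be >= 1 and inputs must be aligned")
    samples = []
    for end in range(window - 1, len(features)):
        block = features[end - window + 1:end + 1]  # W rows of H features
        samples.append((block, targets[end]))
    return samples
```

With N cleaned samples this yields N - W + 1 training pairs; the first W - 1 samples have no complete history and therefore appear only as context.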
The LSTM network has the capability to capture intricate temporal patterns in input sequences. However, as each sample in the input sequence includes numerous features from the target wind turbine and its neighboring turbines, and there exist complex nonlinear relationships among these features, a single LSTM model may not effectively extract all pertinent information due to its limited feature extraction capacity, leading to diminished prediction accuracy. To overcome this challenge, stacked LSTM networks augmented with temporal attention mechanisms have been developed to provide stronger nonlinear fitting capabilities and enhance the ability to focus on significant time steps. This architecture is composed of two parts: the stacked LSTM networks and the temporal attention mechanisms.
As depicted in Fig. 2, the input sequence samples are initially subjected to processing by a linear projection layer, followed by the feeding of the output of the linear projection to an LSTM network. The resulting output is then refined through a temporal attention mechanism. To obtain the final output for wind power prediction, the structure consisting of the LSTM and attention mechanism is repeated three times.
The Long Short-Term Memory (LSTM) network, proposed by Hochreiter et al. [21], introduces gated structures that effectively address the problems of gradient vanishing and exploding in traditional Recurrent Neural Networks (RNNs). The structure of the LSTM network unit is shown in Fig. 3. The LSTM network unit comprises a forget gate, an input gate, an output gate, and a cell with a self-recurrent connection, which can update the information to the block state. The specific internal process of the LSTM network unit is as follows: the forget gate Zf controls the information to be discarded from the previous cell state, the input gate Zi determines the information to be updated to the current cell state, and the output gate Zo regulates the output based on the current cell state.

LSTM network unit.
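For reference, the gate computations can be written in the commonly used formulation below. This is the standard LSTM parameterization rather than a reproduction of Fig. 3; the weight matrices W, the biases b, the candidate state Z, and the symbols x_t, h_t, and c_t (input, hidden state, and cell state at time step t) are notational assumptions, with Z_f, Z_i, and Z_o denoting the forget, input, and output gates.

```latex
\begin{aligned}
Z_f &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
Z_i &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \\
Z   &= \tanh\left(W\,[h_{t-1}, x_t] + b\right) \\
Z_o &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \\
c_t &= Z_f \odot c_{t-1} + Z_i \odot Z \\
h_t &= Z_o \odot \tanh(c_t)
\end{aligned}
```

Here σ is the logistic sigmoid and ⊙ denotes element-wise multiplication. The additive update of c_t is what allows gradients to propagate over long sequences.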
Hermans et al. [22] have demonstrated that a Deep Recurrent Neural Network (DRNN) can extract deeper features by increasing the depth of the model. To enhance the representation ability of the model, a three-layer stacked LSTM structure is designed, which is also inspired by the multi-view learning strategy [17–19]. This approach enables the model to learn different aspects of the sequence through different layers of the LSTM, thereby improving its overall performance.
Additionally, inspired by Hu et al. [23], a temporal attention mechanism is designed to weight the feature vectors corresponding to different time steps of each LSTM network output, thereby enabling the model to focus more on the significant and critical time-series information. The functioning of the attention mechanism can be described as follows.
To begin, let the shape of the output matrix of the LSTM network be D × T, where T represents the number of time steps and D represents the feature dimension. The first step of the attention mechanism is to perform a squeeze operation on the output matrix along the feature dimension D, which is expressed as Equation (3). This operation squeezes the matrix to a vector with the shape of 1 × T, i.e., one value per time step.
Then, as shown in Fig. 4, the obtained vector is passed through a fully connected layer without changing its dimension. The output of the fully connected layer is then transformed into a probability distribution using the Softmax function. Finally, the resulting weight vector is multiplied by the original matrix to obtain weighted feature vectors for the different time steps, as shown in Fig. 5.

Calculation of the weight for each time step.

Illustration of the temporal attention mechanism.
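The squeeze, fully connected, Softmax, and re-weighting steps can be sketched without any deep learning framework. The sketch assumes that the squeeze is an average over the feature dimension (as in squeeze-and-excitation style attention) and uses hypothetical weights `fc_weight` and `fc_bias` for the fully connected layer; in the trained model these would be learned parameters.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(output, fc_weight, fc_bias):
    """Re-weight the time steps of a D x T LSTM output.

    output:    D rows, each holding T values (feature-major, as in the text).
    fc_weight: T x T matrix of the fully connected layer (hypothetical values).
    fc_bias:   length-T bias vector (hypothetical values).
    """
    d, t = len(output), len(output[0])
    # Squeeze (assumed to be a mean): one summary value per time step.
    squeezed = [sum(output[i][j] for i in range(d)) / d for j in range(t)]
    # Fully connected layer that keeps the length T unchanged.
    scores = [sum(fc_weight[j][k] * squeezed[k] for k in range(t)) + fc_bias[j]
              for j in range(t)]
    weights = softmax(scores)
    # Scale every time-step column by its attention weight.
    weighted = [[output[i][j] * weights[j] for j in range(t)] for i in range(d)]
    return weighted, weights
```

The returned weights sum to 1, so time steps compete with each other: emphasizing one step necessarily de-emphasizes the others.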
Different LSTM layers can learn information at varying levels of abstraction, and there are certain correlations among them. Therefore, in this study, a hierarchical attention mechanism is designed to effectively exploit these interrelationships and augment the information learned by the model.
Specifically, let Ik1, Ik2 and Ik3 be the outputs of the first, second, and third layers of the stacked LSTM networks. These outputs share the same dimensions. The architecture of the hierarchical attention mechanism is graphically depicted in Fig. 6. The hierarchical attention mechanism is mathematically expressed as Equation (4).

Hierarchical attention mechanism.
Firstly, the dot product of Ik1 and Ik2 is computed, followed by the Softmax function, which is applied to the resulting product between Ik1 and Ik2 to derive a weight matrix. This weight matrix is then multiplied with Ik3 to obtain the output. The hierarchical attention mechanism facilitates the interaction between the outputs of the first two layers of the stacked LSTM networks, enabling the weighting of the output from the third LSTM layer. As a result, different levels of information captured by the LSTM networks in different layers are integrated, resulting in a richer and more informative output and improved model performance. In addition, to realize the power regression prediction of the target wind turbine, a fully connected layer is employed after the hierarchical attention mechanism to get the final prediction.
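The computation described above can be sketched as follows. The sketch assumes that each layer output is a T × D matrix, that the dot product is taken as Ik1 · Ik2ᵀ, and that the Softmax is applied row-wise; these shape conventions are assumptions, since only the order of operations is stated in the text.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def row_softmax(m):
    """Apply a numerically stable softmax to every row of a matrix."""
    result = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def hierarchical_attention(i1, i2, i3):
    """Weight the third-layer output by the interaction of the first two layers."""
    scores = matmul(i1, transpose(i2))  # T x T interaction of layers 1 and 2
    weights = row_softmax(scores)       # each row sums to 1
    return matmul(weights, i3)          # T x D re-weighted third-layer output
```

Under these assumptions, every output row is a convex combination of the rows of the third-layer output, so the first two layers decide how information from the third layer is mixed across time steps.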
Scene introduction
In this paper, the dataset used for experiments is collected from 10 in-service wind turbines (numbered 1-10) at an offshore wind farm located in China. These ten wind turbines are spatially arranged in a row, as illustrated in Fig. 7. This study employs the data collected by the SCADA systems of wind turbines No.3, No.4 and No.5 in 2019 for research. Among the selected wind turbines, No.4 is designated as the target wind turbine, while No.3 and No.5 are its adjacent wind turbines. No.4 needs to be technologically improved to produce greater output power under the same external conditions, such as wind speed and temperature.

The wind farm used in this paper.
There are three steps to evaluate the effect of technological improvements. Firstly, the proposed model is trained using historical data obtained from the target wind turbine (No.4) and adjacent wind turbines (No.3 and No.5). Secondly, after technological improvement measures, such as lengthening the blades or improving their aerodynamic performance, are implemented on the No.4 wind turbine, the newly measured data is fed into the model to estimate the output power that the No.4 turbine would have produced before the technological improvement under the current environmental conditions. Finally, the predicted power is compared with the measured output power to determine the efficacy of the technological intervention.
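The final comparison step can be made concrete with a small sketch. The aggregate used here, the relative gain of total measured power over total predicted baseline power, is an assumption chosen for illustration; the text does not fix a particular comparison statistic, and the function name is hypothetical.

```python
def improvement_gain(measured_after, predicted_before):
    """Relative gain of measured post-improvement power over the predicted baseline.

    measured_after:   power measured by SCADA after the improvement (kW).
    predicted_before: model estimate of pre-improvement power for the same
                      time stamps and environmental conditions (kW).
    """
    if len(measured_after) != len(predicted_before):
        raise ValueError("series must cover the same time stamps")
    baseline = sum(predicted_before)
    if baseline <= 0:
        raise ValueError("baseline power must be positive")
    return (sum(measured_after) - baseline) / baseline
```

For example, measured values of 1050 kW and 2100 kW against predicted baselines of 1000 kW and 2000 kW give a gain of 0.05, i.e., 5%. Because the samples are equally spaced in time, this ratio of power sums equals the ratio of generated energy over the evaluation period.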
Each wind turbine was sampled every minute by the SCADA system, yielding features with 222 dimensions that contain both external and internal information of the wind turbine. The external information includes wind speed, wind direction, pressure, and humidity detected by sensors, while the internal information includes parameters of the internal structure of the wind turbine, such as current, voltage, power, and component temperature. In this study, the internal features of the target wind turbine are not allowed to be used as the input of the deep learning model because the technological improvements involve the internal structural transformations of wind turbines.
To identify the features strongly associated with the output power of the target wind turbine, Spearman analysis is employed, resulting in the selection of 123 out of the original 222 dimensions. After data cleaning, 200,000 samples are used to train and test the model. The training set includes data collected from January to November, while the test set comprises data from December. The output power of the test set of the target wind turbine is depicted in Fig. 8, showing that the data cover all operational phases, from the start-up of the wind turbine to its operation at the rated power of 5000 kW, indicating the dataset’s representativeness.

The output power of the target wind turbine in the test set.
In this study, the precision of the proposed model is assessed by employing three performance metrics, namely MAE (mean absolute error), RMSE (root-mean-square error), and MAPE (mean absolute percentage error). These metrics are expressed by Eqs. (5), (6), and (7), respectively.
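The three metrics follow their standard definitions and can be sketched as below. Excluding samples whose actual power is zero from MAPE is an assumption made so that the percentage error stays defined; the text does not say how such samples were handled.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root-mean-square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (zero actual values skipped)."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)
```

MAE and RMSE are expressed in the unit of the output power (kW), whereas MAPE is dimensionless, which makes it more sensitive to errors at low power levels.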
The model was implemented based on PyTorch and was trained and tested on an NVIDIA GTX 1660Ti GPU. During both the training and testing phases, an appropriate window size was crucial for feeding data to the model through the sliding window approach. Let the window size be W × H, where H represents the number of selected features and W is the number of samples covered by the sliding window each time. Through an analysis of the results presented in Table 2, it was determined that the minimum loss value of the training set is achieved when W is set to 4. Therefore, W = 4 was selected as the optimal window size for the proposed model.
Loss values of the training set for different window sizes
In addition, the study conducted an ablation experiment to evaluate the performance of different components of the model, specifically the temporal attention mechanism and the hierarchical attention mechanism. The results of the experiment are presented in Table 3, which demonstrates that the model composed of stacked LSTM networks has a lower prediction error compared to the single-layer LSTM network. Furthermore, the proposed model incorporating stacked LSTMs with both temporal and hierarchical attention mechanisms significantly enhances accuracy.
Results of the models with different configurations
Fig. 9 and Fig. 10 present the results of the proposed method on the test set. Fig. 9 displays the prediction results of all samples in the test set. A visual comparison with the actual values suggests that the model has generally achieved high prediction accuracy. In Fig. 10, the scattered points represent the actual and predicted values of the target wind turbine’s output power. Moreover, the curve fitting toolbox in MATLAB R2021a is utilized to fit these points. The two curves fitting the actual and predicted values are nearly identical, with an error of only 0.86%, thereby providing an accurate reference against which the output power of the target wind turbine after technological improvements can be compared. These results validate the proposed method’s effectiveness.

The prediction results of the proposed model.

The fitting results of the proposed model.
To further evaluate the effectiveness of the proposed model, comparisons are made with several other models, namely, SVR, Random Forest, LightGBM, XGBoost, BP, LSTM, 1DCNN-LSTM and 1DCNN-BiLSTM. The results are presented in Table 4, which indicates that the proposed method outperforms the other models in terms of predictive accuracy. The output power of wind turbines is influenced by various factors such as wind turbine area, air density, and wind speed. Among them, wind speed is the most critical factor as it directly relates to the power output, which is proportional to the cube of the wind speed. To facilitate more accurate comparisons between the proposed model and traditional models, this study compares their performance in different wind speed intervals, including low wind speed (<7m/s), medium wind speed (7-10m/s), and high wind speed (>10m/s). Table 5 presents the prediction results of the classical and proposed models under these three wind speed intervals. As indicated in Table 5, the proposed method demonstrates higher accuracy than the traditional models across all three wind speed intervals, further confirming the superior performance of the proposed method.
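The per-interval comparison can be sketched as a simple grouping of absolute errors by wind speed. Assigning the boundary values 7 m/s and 10 m/s to the medium interval is an assumption, as are the function names; MAE is used here only as a representative metric.

```python
def speed_interval(wind_speed):
    """Map a wind speed in m/s to the low, medium, or high interval."""
    if wind_speed < 7:
        return "low"
    if wind_speed <= 10:
        return "medium"
    return "high"


def mae_by_interval(wind_speed, actual, predicted):
    """Mean absolute error of the power prediction within each wind speed interval."""
    errors = {"low": [], "medium": [], "high": []}
    for v, a, p in zip(wind_speed, actual, predicted):
        errors[speed_interval(v)].append(abs(a - p))
    # Intervals without any samples are omitted from the result.
    return {name: sum(e) / len(e) for name, e in errors.items() if e}
```

Reporting errors per interval prevents the large number of samples in one operating regime from masking weaker performance in another.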
Comparison with other models about prediction errors
Comparisons of different models in different wind speed intervals
This study proposes a novel approach based on wind power regression prediction to evaluate the technological improvement effects. Firstly, a three-layer stacked LSTM network is designed to improve the capacity to capture diverse aspects of the input sequence, resulting in enhanced capability to learn complex representations. Then, by using a temporal attention mechanism, the model can assign different weights to the output of each LSTM network at different time steps, which allows the model to better capture important time-series information. Additionally, the hierarchical attention mechanism in the model establishes correlations among the information of different layers of the stacked LSTM networks, further enhancing the model’s suitability for wind power prediction.
Experiments conducted on data obtained from a wind farm demonstrate the effectiveness of the proposed method. Compared to traditional evaluation methods for determining the effects of technological improvements on wind turbines, the proposed method offers greater determinacy and accuracy. As a result, after technological improvements, wind farm operators can more accurately assess the effectiveness or ineffectiveness of such improvements, thereby providing useful guidance for subsequent decision-making.
Although the proposed method demonstrates superior performance to some classic wind power prediction methods, the experimental results indicate that it has lower prediction accuracy in the low wind speed stage. This limitation could potentially lead to deviations in evaluating the technological improvement effect of wind turbines under low wind speed conditions. Furthermore, the lack of interpretability in deep learning-based power prediction methods makes it difficult to identify the impact of each feature on the model. In future work, a more targeted approach could be designed to process data in the low wind speed stage, and other power prediction methods with simpler working principles could be explored to improve both prediction accuracy and interpretability.
