Abstract
The inherent variability of wind energy necessitates precise forecasting for its effective grid integration. This study investigates machine learning techniques for wind speed forecasting, applying Random Forest, AdaBoost, and Support Vector Regression (SVR), alongside a novel two-layer stacking ensemble model developed to leverage their combined strengths. The models were trained and validated using meteorological data from the National School of Electronics and Telecommunications of Sfax (ENET'Com) for October 2024 and May 2025. The ensemble model consistently outperformed the base models, achieving a Mean Absolute Error (MAE) of 0.255, a Root Mean Square Error (RMSE) of 0.334, and an R2 of 0.801 for October. High accuracy was maintained in May, with an MAE of 0.314, an RMSE of 0.429, and an R2 of 0.745. These findings validate the efficacy of advanced machine learning, particularly ensemble methods, for enhancing wind energy’s predictability and reliability.
Introduction
Given the growing global demand for electricity, national and international policies are moving towards renewable energies, notably wind energy. Reliable predictions of renewable generation, such as wind, are becoming an integral part of power system and distribution infrastructure operation. The major challenge for transmission and distribution operators is to balance supply and demand in the market. Wind energy is considered a sustainable source that has undergone remarkable development owing to its efficiency, affordability, and pollution-free power (Perry Sadorsky, 2021). Moreover, the use of wind turbines to generate electricity has attracted significant attention thanks to their green power output and potential contribution to decarbonization efforts. Given its instability and stochastic nature, generated wind energy requires improved management before being injected into the electrical grid (Gilles Notton et al., 2018). Efficient management in turn requires wind power prediction in order to avoid energy loss and to plan energy exchange with the electric grid (Brahmi and Chaabene, 2025).
While efficiency improvements in wind generation devices are expected to moderate the increasing energy demand, researchers warn that these gains alone may not be enough to offset the continuous electrification of society (Hamed Rahmani et al., 2023). Driven by factors like the electrification of transportation, including electric vehicles, and the emergence of new electronic devices, global energy demand is projected to keep rising year after year (Michaelides, 2021; Tamor and Stechel, 2022). This trend underscores the urgency of developing new energy sources, especially considering estimates from the World Energy Forum that non-renewable resources like coal, gas, and oil will be depleted within 100 years (Xihui Haviour Chen et al., 2023). With fossil fuels currently accounting for 79% of the world’s primary energy consumption (Ruoso et al., 2024), renewable energy sources like wind power offer a promising solution (Cristea et al., 2022; Paraschiv and Paraschiv, 2023).
Over the coming decades, the transition from fossil-based generation to sustainable energy systems will rely not only on the deployment of clean energy technologies but also on the implementation of responsible energy consumption practices (Paraschiv Spiru, 2023). Among renewable resources, wind energy has emerged as a reliable and abundant option, contributing approximately 743 GW of capacity to global electricity generation and accounting for more than 6% of the world’s total electricity supply in 2020 (Cristea et al., 2021). The continued development of large-scale wind farms is expected to play a pivotal role in addressing future energy demand (Paraschiv et al., 2020). Nevertheless, the inherent variability and intermittency of wind power pose significant challenges for its integration into existing power grids. Consequently, accurate wind speed forecasting has become essential for enhancing grid integration, improving energy dispatch strategies, and mitigating the adverse impacts of wind power fluctuations on system reliability and stability (Cristea et al., 2021). As large-scale wind farms are connected to the grid, the inherent variability of their energy output becomes magnified (Zhou & Wang, 2021). This intermittency, characterized by significant fluctuations in wind speed, raises grid stability issues (Li and Liu, 2022; Nabiha Brahmi et al., 2017). To ensure smooth integration and minimize the impact of these unpredictable shifts, accurate wind speed prediction and proactive risk management are crucial (Liu and Zhang, 2022; Wang and Chen, 2023). By quantifying the inherent uncertainty of wind power, we can better anticipate and mitigate disruptions, paving the way for seamless grid integration of this clean energy source.
The wind energy produced depends not only on the geographic location but also on the number and size of the installed wind turbines. Moreover, the non-stationarity of wind energy output calls for accurate prediction to ease its adoption in energy markets. Wind speed is influenced by a variety of factors such as temperature, pressure, and weather conditions, which are difficult to model using conventional mathematical approaches. Furthermore, wind speed exhibits non-linear and non-stationary behavior, making it particularly challenging to capture its patterns and fluctuations accurately.
Conventionally, meteorological forecasting models have relied on physical and statistical approaches. Physical models, such as numerical weather prediction models, attempt to simulate the complex interactions between different weather variables based on the laws of physics. However, these models present limitations related to the representation of atmospheric processes and the availability of accurate initial conditions. Statistical models, on the other hand, exploit historical data patterns to forecast future outcomes. While statistical models have been widely used in wind speed prediction, they often struggle to capture the non-linear relationships and temporal dependencies in the data, leading to limited forecasting performance.
Accurate wind speed prediction is fundamental for estimating wind energy potential, despite the inherent challenges posed by its chaotic and stochastic behavior (Nabiha Brahmi et al., 2023). Based on the prediction horizon, wind speed forecasts can be classified into short-term (minutes to hours), medium-term (hours to a week), and long-term (weeks to years). The available forecasting methods fall into three main categories: physical models such as Weather Research and Forecasting (WRF) (Li and Liu, 2022; Qin and Li, 2015), statistical models such as autoregressive moving average (ARMA) (Zhang and Wang, 2020), and machine learning (ML) models such as support vector regression (SVR).
Given recent advancements in artificial intelligence, the pursuit of enhanced forecasting accuracy has evolved into a complex and demanding challenge. Nevertheless, accurate forecasting remains a fundamental component in ensuring autonomous operation, operational stability, predictive maintenance, and overall efficiency of wind energy systems. Machine learning techniques play a crucial role in forecasting by reducing irregularities and improving precision. These approaches offer advantages such as pattern recognition, flexibility, ensemble learning, feature selection, scalability, and continuous learning. By recognizing complex patterns and non-linear dependencies in wind speed data, machine learning models can enhance prediction accuracy. Ensemble learning methods, such as AdaBoost, combine multiple models to reduce errors and improve overall accuracy. Furthermore, machine learning algorithms can select relevant features, suppress noise, and handle large datasets efficiently, capturing spatial dependencies and enhancing precision. Continuous learning capabilities allow models to adapt to new data, keeping predictions up to date and accurate. Leveraging these capabilities can support better decision-making in renewable energy planning and operational strategies. The aim of this study is to present a comparative analysis of machine learning algorithms used in wind speed forecasting. Validation is based on data collected from ENET’COM Sfax, Tunisia. This research employs a comprehensive dataset covering the period from July 2024 to May 2025.
Primarily, the significance of precise wind speed forecasting is emphasized, and the limitations of traditional empirical and numerical models are discussed, positioning machine learning, particularly ensemble learning, as a promising alternative to enhance accuracy. Subsequently, the evolution of wind forecasting is reviewed, comparing conventional methods to modern data-driven approaches, and clearly defining the study’s motivation, objectives, and scope. Thereafter, the proposed ensemble learning-based methodology is detailed, which employs a two-layer stacking framework: SVR, AdaBoost, and Random Forest as base learners, with a Random Forest meta-learner for optimal prediction integration. The model is trained using real meteorological data from ENET’COM Sfax, Tunisia. Following this, the selected models are explained, including their theoretical principles, relevance to wind forecasting, and optimized hyperparameters. Then, experimental results are presented using MAE, RMSE, and R² metrics, demonstrating that the ensemble model consistently outperforms standalone methods, validating the effectiveness of the stacking approach. Ultimately, key findings are summarized, the advantages of ensemble learning in wind energy forecasting are reinforced, and future research directions are suggested to further improve predictive performance.
Advancements in wind forecasting: From conventional methods to machine learning approaches
Accurate wind speed and power generation forecasts are crucial for various sectors, including energy planning, grid management, aviation safety, and disaster response. This study explores the impact of day-night variations and probabilistic forecasting models on improving forecast accuracy. The research also highlights the benefits of using wind farm variability data and offshore wind data for more reliable predictions.
Conventional methods of wind forecasting
Conventional methods of wind forecasting are based on meteorological techniques and models that have been established over many years. This paper considers empirical models and numerical weather prediction (NWP) models.
Empirical models
Various models and algorithms have been employed to enhance wind speed prediction accuracy and wind power forecasts including autoregressive moving average (ARMA), autoregressive and GARCH models, automatic learning and retrieval of weather information, and wind speed and wind power dynamics models. The fundamental-based models proposed for wind power forecasts are more accurate than the statistical-based models. However, the statistical models are the most commercially used forecasting methods for wind power due to their simplicity and high speed. Since the phenomena related to wind speed and wind power are random processes as their models are non-linear and non-Gaussian distributions, the time series methods could not capture the complete complexities of wind speed series. On the contrary, due to the complex, non-stationary, and turbulent nature of the wind over time and space, the physics-based models that have been used in NWP systems could not derive the entire information of the wind data.
Wind speed and wind power forecasts are the most important components of wind energy management and grid power scheduling and dispatch (Liu et al., 2023). Due to the inherent uncertainty and intermittency of wind, power systems with high wind penetration require larger reserve capacities, to the point where the frequency and intensity of use of regulating services (e.g., frequency regulation and plant-level reserves) make grid operation less economical (Zhang and Wang, 2020).
Since reserve capacities are associated with the security of the grid, operational reserve amount must be estimated correctly. In this regard, reducing the uncertainty of wind power forecasts is necessary. In other words, power system operation with higher penetrations of variable power increases the market value of metrics that characterizes the certainty of the forecast (Wang and Chen, 2023). Long duration variations in wind speed are typically forecasted using Numerical Weather Prediction (NWP) models. These models solve non-linear differential equations that simulate the evolving atmosphere and are the most sophisticated and widely recognized meteorological tools for weather forecasting. However, NWP forecast models are characterized by significant numerical uncertainties because of chaos in the atmosphere and limitations in computational power and grid size (Bauer et al., 2015). Consequently, estimating the value for the wind speed at specific locations and times from NWP output fields often results in large biases and poorly defined variances about the mean forecasts. Post processing, a numerical forecast, or an analyzed perturbed ensemble of forecasts, with observed data to create a high-quality estimate of the true wind speed under given conditions in the atmosphere, is necessary.
Wind forecasting is extremely important for efficient operation of wind energy projects. Forecasting of wind energy resources has evolved into its role from its initial need for short-term power scheduling to filling imbalances between supply and demand and providing reserve power (Li and Liu, 2022). Due to the inherent unpredictability of power generation from intermittent resources, accurate forecasts bring a clear challenge to grid and system operators, energy traders, and marketers. Historically, generated wind power has been unpredictable and wind power plants could only be accommodated up to a fraction of the power system load (Zhang and Wang, 2020). However, with the advent of wind power forecasting, large-scale wind-providing capacity has been integrated to power systems.
Numerical weather forecasting models
The situation for NWP parameterization is increasingly rare: a 40-year-old computational kernel that we still expect to take us for at least another decade. The area of operational numerical weather prediction is minor compared to the full picture of potential horizontal scientific and engineering development, but we may expect that the process of intensifying development already seen in the machine learning and predictive science communities will continue in the NWP modeling community. Because there is increasing recognition that operational numerical weather prediction is important outside of the NWP modeling community, it may be that a future era of parameterization development will be associated with the creation of at least one, probably several, organizations dedicated to their ESM for weather prediction and historical reanalysis.
One subsequent development has been to generalize Lorenz’s notion of a convective scale model and a planetary scale model. This separation into the dynamical core and physical parameterization is increasingly less clear and it is recognized that different scales are increasingly interdependent (Bauer et al., 2015). The physical parameterizations mix slow and fast processes and increasingly inflexible partitioning of the system into these two types of processes adds to the costs of NWP. A flexible, deep, model of the type that has been proposed and studied in recent years is one potential solution.
Weather prediction has improved significantly over the past few decades, particularly on short to medium length timescales. The advent of powerful supercomputers and major developments in mathematical and physical techniques in the early 21st century has meant that the current horizons for weather prediction are much longer than when Bjerknes theorized the numerical weather prediction (NWP) problem in the 1910s (Palmer and Hagedorn, 2006). The most important development over the last decade has been machine learning (ML) techniques applied to NWP model output data: the physical data-intensive predictions made by these models seem perfectly suited to ML and a significant high-profile paper substantiated this belief in 2015 (Shi et al., 2015).
Advanced machine learning techniques for wind speed forecasting
Sustained and relatively steady wind resources make a compelling case for wind-based power generation. Wind energy is now among the most reliable power supply resources, with sustained growth around the globe, particularly in Korea. The electric power generated from wind has grown dramatically as part of efforts to limit further climate change and reduce emissions from electricity generation; however, wind speed is highly unpredictable and non-linear (Li and Liu, 2022). The annual performance factor, the daily efficiency, and the peak power output all depend strongly on meteorological conditions, especially wind speed (Zhang and Wang, 2020). It is therefore important to develop more trustworthy and efficient methods for estimating rapidly changing wind power. Fast and precise wind speed forecasting can have significant consequences for the amount of wind power generated, helping to optimize costs and enhance reliability.
Wind power forecasting is regarded as an important research topic for commercial and applied purposes. In general terms, it can be divided into short-term and long-term forecasting. Short-term wind speed forecasting models produce forecasts more than 6 hours in advance and support the operational decisions with the largest economic stakes. As a result, precise short-term wind speed forecasting is of great importance and urgency for energy systems, given the uncertainties of future networks and the pressing needs of wind power generators. More fundamentally, wind speed is a vital parameter whose forecasting beyond horizons of one or two hours is used in several industries and affects numerous economic and societal operations.
This paper investigates wind speed (WS) forecasting. Machine learning techniques are employed to capture the non-linearity of wind speed and improve forecasting accuracy. Accurate WS prediction is crucial for integrating wind energy into power grids and optimizing electricity generation.
Single model forecasting approaches
Single-model approaches rely on training one supervised learning algorithm to map input features directly to wind speed. Commonly used algorithms include: • • •
While each of these single models can achieve good accuracy under certain conditions, their performance may degrade when faced with noisy data, strong seasonal variability, or when trained on limited datasets. No single algorithm consistently dominates all geographic locations and temporal horizons.
Ensemble learning techniques in renewable energy forecasting
Ensemble learning methods improve predictive performance by combining multiple base learners, thereby compensating for individual model weaknesses and enhancing overall robustness. Key techniques include: • •
In the context of wind speed forecasting for renewable energy systems, ensemble methods have proven especially effective at handling heterogeneous data sources and highly variable climatic patterns. By reducing variance (bagging), focusing on difficult cases (boosting), or optimally blending complementary models (stacking), ensembles deliver more accurate and reliable forecasts—critical for the design, operation, and economic planning of hybrid renewable energy installation.
Proposed methodology
The investigation consists of forecasting wind speed by applying three machine learning algorithms (Figure 1). The algorithm that offers the best accuracy will be used in various wind energy applications, including sizing, management, and control of wind conversion systems. The wind data were acquired from a database provided by the real-time acquisition chain installed at ENET’COM, Sfax, Tunisia.
Figure 1. Proposed approach.
The dataset consists of several meteorological parameters, including wind speed, radiation, rain rate, relative humidity, and wind direction. These input variables serve as key factors influencing the accuracy of the prediction models.
Machine learning models undergo a selection process in which their effectiveness in predicting wind speed is evaluated. To enhance forecasting accuracy, an ensemble learning approach is implemented. This involves a multi-layer structure (Layer 1 and Layer 2) where different algorithms contribute to the final prediction through a voting mechanism.
Finally, the model (or combination of models) that offers the highest accuracy is selected to generate the final forecasted wind speed, ensuring optimal performance for wind energy applications.
Data processing is a critical aspect of machine learning (ML) workflows, involving the transformation and manipulation of raw data to make it suitable for analysis and modeling. Proper data processing is essential for ensuring the quality and reliability of the input data, which directly affects the performance of machine learning models. Algorithms are evaluated on a real-time database from the acquisition chain installed at ENET’COM.
The data processing steps are shown in Figure 2.
Figure 2. Data processing steps.
Overview of methodology
This section outlines the proposed methodology for wind speed forecasting using an ensemble learning approach. The strategy is based on a two-layer stacking model, which aims to improve prediction accuracy by combining the strengths of multiple machine learning models. The methodology begins with data preprocessing and feature extraction, followed by training of three base learners: Support Vector Regression (SVR), AdaBoost, and Random Forest (RF). These models generate preliminary predictions that are then passed to a second-layer meta-learner, which integrates their outputs to produce the final forecast. This architecture is designed to capture both linear and non-linear patterns in the wind speed data.
Ensemble learning strategy: Two-layer stacking model
Stacking is a powerful ensemble learning technique that combines multiple predictive models in a layered architecture. The first layer consists of base learners that independently learn from the training data. The outputs of these models serve as input features for a subsequent model layer, referred to as the meta-learner. This hierarchical structure helps to reduce individual model bias and variance, leading to improved generalization performance.
Layer 1: Base learners (SVR, AdaBoost, Random Forest)
In the first layer, three distinct regression models are used to capture different aspects of the wind speed data:
• Support Vector Regression (SVR) is effective at handling high-dimensional data and capturing complex relationships by using kernel functions.
• AdaBoost improves prediction by focusing on difficult instances, sequentially combining weak learners to form a strong regressor.
• Random Forest (RF) provides robustness through the aggregation of multiple decision trees trained on different subsets of data and features.
Each model is trained independently on the same training dataset and generates a separate wind speed prediction.
Layer 2: Meta-learner (Random Forest integration)
The second layer of the stack employs Random Forest as a meta-learner. It takes the outputs of the base learners as input features and learns to weight and combine them optimally. The choice of Random Forest as the meta-model is motivated by its ability to handle non-linear interactions, prevent overfitting through ensemble averaging, and maintain high predictive performance. This integration layer enhances the overall forecast by correcting potential errors from individual base models and synthesizing complementary patterns learned by each.
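As an illustration of the two-layer architecture described above, the following is a minimal sketch using scikit-learn's StackingRegressor. It is not the authors' implementation: the synthetic data, the variable names, and all hyperparameter values are placeholders chosen for the example, not the optimized settings of this study. Note also that StackingRegressor trains the meta-learner on cross-validated (out-of-fold) predictions of the base learners, which is an assumption of this sketch rather than a detail stated in the text.

```python
import numpy as np
from sklearn.ensemble import (
    AdaBoostRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Hypothetical stand-in data: 4 input features and a noisy non-linear target.
# Real meteorological inputs would replace X and y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 + np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Chronological-style split: first 150 samples for training, last 50 for testing.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Layer 1: the three base learners (SVR is scaled because it is scale-sensitive).
base_learners = [
    ("svr", make_pipeline(StandardScaler(), SVR(kernel="rbf"))),
    ("ada", AdaBoostRegressor(n_estimators=50, random_state=0)),
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
]

# Layer 2: a Random Forest meta-learner fitted on the base learners' predictions.
stack = StackingRegressor(
    estimators=base_learners,
    final_estimator=RandomForestRegressor(n_estimators=50, random_state=0),
    cv=5,
)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)
```

Fitting the meta-learner on out-of-fold predictions is a common way to keep it from simply memorizing base-learner outputs on data those learners have already seen.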
Model evaluation metrics (MAE, RMSE, R2)
To evaluate the performance of the proposed stacking model and its individual components, three commonly used regression evaluation metrics are employed: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the Coefficient of Determination (R2). These metrics offer complementary insights into the accuracy and robustness of predictions.
These metrics together deliver a comprehensive evaluation of the predictive accuracy, distribution error, and explanatory power of the models under study.
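For reference, the three metrics can be computed as in the short sketch below, which uses their standard textbook definitions. The function names and the four sample values are illustrative only and are not taken from the study's data.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the forecast errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: penalizes large errors more than MAE."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (m/s), not measurements from the dataset.
observed = [2.0, 3.0, 4.0, 5.0]
forecast = [2.5, 3.0, 3.5, 5.5]

mae_value = mae(observed, forecast)    # 0.375
rmse_value = rmse(observed, forecast)  # sqrt(0.1875), about 0.433
r2_value = r2(observed, forecast)      # 0.85
```

MAE and RMSE are expressed in the units of the target (here m/s), whereas R2 is dimensionless, which is why the three are usually reported together.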
Machine learning models for wind speed prediction
This research investigates three widely used machine learning algorithms, namely Random Forest, AdaBoost, and Support Vector Regression (SVR), for wind speed prediction. Furthermore, a two-layer stacking ensemble framework is implemented to integrate these base learners and leverage their complementary strengths to improve predictive performance and robustness.
Support Vector Regression (SVR) algorithm
Support Vector Regression (SVR) is a robust machine learning technique derived from the principles of Support Vector Machines (SVM) and designed for regression tasks. In the context of wind speed prediction, SVR is particularly advantageous due to its ability to handle non-linear relationships and high-dimensional data. It is widely used in time series analysis and environmental forecasting because of its generalization capability and resistance to overfitting, especially in the presence of noise.
The SVR algorithm operates under the principle of finding a function that approximates the relationship between the input features and the target variable (wind speed) while minimizing prediction errors, constrained by a specified margin of tolerance.
The key steps involved in SVR-based wind speed forecasting are as follows: (1) (2) (3) (4)
SVR is especially useful in applications where the dataset is not large but contains complex, non-linear relationships. Its controlled margin and robust generalization ability make it an effective base learner in ensemble architectures, such as stacking models, for enhancing the accuracy and reliability of wind speed forecasting.
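The "margin of tolerance" mentioned above is commonly formalized as the epsilon-insensitive loss, in which errors smaller than a threshold epsilon are ignored and larger errors are penalized linearly. The sketch below illustrates that standard loss only; the function name, the epsilon value, and the sample numbers are assumptions made for the example and are not settings reported in this study.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample SVR loss: zero inside the epsilon tube, linear outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Illustrative wind speeds (m/s); epsilon = 0.1 is an arbitrary tolerance.
observed = [4.0, 5.0, 6.0]
forecast = [4.05, 5.5, 5.0]

losses = epsilon_insensitive_loss(observed, forecast, epsilon=0.1)
# First error (0.05) lies inside the tube, so it contributes no loss;
# the other two errors (0.5 and 1.0) are reduced by epsilon.
```

Because small deviations cost nothing, the fitted function is determined only by the samples on or outside the tube (the support vectors), which is one reason SVR tends to be tolerant of measurement noise.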
AdaBoost regressor algorithm
The AdaBoost Regressor (Adaptive Boosting for Regression) is an ensemble learning method designed to enhance the performance of weak regression models by focusing on difficult-to-forecast data points. In wind speed forecasting, it is particularly effective in handling non-linear patterns and temporal fluctuations by sequentially combining multiple base regressors to produce a more accurate and resilient forecast.
The forecasting process using AdaBoost Regressor involves the following steps: (1) (2) (3) (4) (5)
AdaBoost is particularly well-suited for wind speed time-series forecasting in environments where patterns are complex and the error distribution is heterogeneous. Its ability to iteratively focus on difficult cases makes it a powerful tool when used alongside other models in a hybrid or stacked ensemble.
Random Forest Regressor Algorithm
Random Forest is a specialized ensemble learning algorithm that enhances predictive accuracy and robustness by combining multiple models. Ensemble learning, a powerful machine learning paradigm, is effective for both classification and regression tasks, with this study focusing on regression using time-series data to forecast future values. By training diverse models with different features, algorithms, or hyperparameters, ensemble methods aggregate predictions through techniques like averaging or weighted voting. Common approaches include bagging, which trains models on bootstrapped data; boosting, which iteratively corrects errors from prior models (Figure 3); and stacking, where a meta-model learns to optimally combine base-model predictions. This work employs bagging, adaptive boosting, gradient boosting, extreme gradient boosting (XGBoost), and random forest regressors, with detailed derivations available in Ref. (Asbai and Amrouche, 2017). For comparison, standalone models such as decision trees, LSTMs, and ARIMA are also evaluated to benchmark performance against the ensemble-based approach.
Figure 3. Random Forest Regressor algorithm.
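The bagging idea underlying Random Forest (train each model on a bootstrap resample, then average the predictions) can be sketched in a few lines. To keep the example self-contained, a least-squares line is used as a stand-in base learner; an actual Random Forest uses decision trees and additionally samples random feature subsets. All names and data below are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares line (slope, intercept); constant fit if all x are equal."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0.0:
        return 0.0, mean_y
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def bagging_predict(xs, ys, x_new, n_models=25, seed=0):
    """Average the predictions of models fitted on bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    for _ in range(n_models):
        # Bootstrap: draw n indices with replacement from the training set.
        idx = [rng.randrange(n) for _ in range(n)]
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        predictions.append(slope * x_new + intercept)
    return sum(predictions) / n_models


# Hypothetical noise-free data following y = 2x + 1.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
estimate = bagging_predict(xs, ys, x_new=3.0)
```

With noisy data, the individual bootstrap fits would differ from one another, and averaging them is what reduces the variance of the final prediction.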
Experiments, results, and discussion
This section outlines the experimental setup, performance evaluation, and analysis of the results obtained from the proposed two-layer stacking model for wind speed prediction. The results of the base learners (SVR, AdaBoost, and Random Forest) as well as the meta-learner (stacked Random Forest) are compared using standard evaluation metrics: MAE, RMSE, and R2.
Experimental setup and data description
The dataset used in this study was collected using a real-time acquisition chain located at ENET’COM, Sfax, Tunisia. The acquisition chain was specifically designed for renewable energy research and provides reliable, high-frequency meteorological measurements.
The recorded variables include:
• Wind speed (m/s), the primary forecast target, measured at a height of 10 m,
• Air temperature (°C),
• Solar irradiation (W/m2),
• Relative humidity (%),
• Timestamp (hourly resolution).
The dataset features two representative months, October 2024 (characterizing the transition to the cold season) and May 2025 (signaling the shift to the hot season), to illustrate seasonal variability. These data were acquired under real-world operational conditions and reflect local climatic patterns in the Sfax region. The acquisition system provides precise and consistent readings, serving as a testbed for developing and validating intelligent forecasting and energy management systems within renewable energy applications.
Individual model performance
Comparative performance of single and ensemble models for wind speed forecasting (MAE, RMSE, R2).
Performance of the ensemble learning model
Figure 4 presents a comparison between actual wind speed measurements and the predictions generated by four different models (Random Forest, AdaBoost, Support Vector Regression (SVR), and the ensemble learning approach) over the month of October 2024. The actual wind speed is depicted as a continuous reference line, while each model’s prediction is represented by a distinct dashed line. Among the models, the ensemble learning approach demonstrates the closest alignment with the actual wind speed values, effectively capturing both the amplitude and the temporal dynamics of the observed fluctuations. AdaBoost also exhibits satisfactory performance, particularly in regions of moderate wind speed variation. In contrast, SVR shows more pronounced deviations, frequently overestimating or underestimating peak values. The Random Forest model tends to underestimate wind speed during periods of high variability, suggesting potential underfitting. Overall, the ensemble model achieves superior predictive accuracy and temporal consistency, supporting its effectiveness in modeling complex wind speed patterns through the integration of multiple base learners.
Figure 4. Model-based forecasting of wind speed in October 2024.
Figure 5 shows a zoomed-in window of Figure 4 to highlight the varying predictive capabilities of the different machine learning algorithms for wind speed forecasting. While the ensemble learning model appears to offer the most robust performance for much of the observed period, all models face challenges in accurately predicting sudden and drastic changes in wind speed, as evidenced by the significant overestimation during the sharp decline on October 19th. This highlights an area for potential future research, focusing on improving model robustness to extreme or atypical meteorological events.
Figure 5. Detailed wind speed forecast for October 2024 (zoomed view).
Figure 6 offers a broader view across a large portion of May 2025 and shows that all implemented models track the actual wind speed fluctuations reasonably well. Notably, the Ensemble Learning approach consistently exhibits superior predictive accuracy, evidenced by its close alignment with the observed data throughout this period. By contrast, Figure 7 provides a granular examination of model performance during a particularly volatile 6-day period within May 2025. This zoomed-in perspective reveals a limitation shared by all models: difficulty in predicting the magnitude and timing of abrupt, substantial decreases in wind speed to minimal values, as exemplified by the significant overestimation during the sharp trough on May 18th. These visualizations underline the robust capabilities of ensemble-based methods in general wind speed forecasting, while highlighting the persistent challenge of forecasting extreme and rapid meteorological shifts, which warrants further research into model resilience and outlier detection.
Model-based forecasting of wind speed in May 2025.
Detailed wind speed forecast for May 2025 (zoomed view).

The combined analysis summarized in Figure 8 shows that Ensemble Learning consistently outperforms the individual models (Random Forest, AdaBoost, SVR) in wind speed forecasting for both October 2024 and May 2025, as evidenced by its lower MAE and RMSE and higher R2 values. However, a recurring limitation across all models and timeframes is their difficulty in predicting rapid and extreme drops in wind speed. This suggests an area for future research and model improvement, potentially through enhanced anomaly detection or the incorporation of more granular meteorological data.
Model performance summary.
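One way to quantify the limitation described above is to score forecast errors separately for time steps that follow a sharp drop in the observed wind speed. The sketch below is illustrative only: the function name `split_errors_by_drop`, the threshold value, and the sample series are assumptions and do not come from the study.

```python
def mean_or_nan(values):
    """Return the arithmetic mean, or NaN for an empty list."""
    return sum(values) / len(values) if values else float("nan")


def split_errors_by_drop(actual, predicted, drop_threshold):
    """Return (MAE on sharp-drop steps, MAE on all other steps).

    A step t counts as a sharp drop when the observed wind speed falls
    by at least `drop_threshold` between t-1 and t (assumed criterion).
    """
    drop_errors, other_errors = [], []
    for t in range(1, len(actual)):
        error = abs(actual[t] - predicted[t])
        if actual[t - 1] - actual[t] >= drop_threshold:
            drop_errors.append(error)
        else:
            other_errors.append(error)
    return mean_or_nan(drop_errors), mean_or_nan(other_errors)


# Hypothetical observed and forecast values, for illustration only.
actual = [5.0, 5.2, 5.1, 2.0, 2.2, 2.1]
predicted = [5.1, 5.1, 5.0, 4.0, 2.5, 2.0]

drop_mae, other_mae = split_errors_by_drop(actual, predicted, drop_threshold=2.0)
print(f"MAE on sharp drops: {drop_mae:.2f}, MAE elsewhere: {other_mae:.2f}")
```

Reporting the two values side by side would make the gap between ordinary conditions and abrupt drops explicit, rather than leaving it to visual inspection of the zoomed figures.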
Comparative analysis: Single models versus ensemble model
This study provides a comprehensive evaluation of three individual machine learning models—Random Forest (RF), AdaBoost (Ad), and Support Vector Regression (SVR)—alongside an Ensemble Stacking approach for wind speed forecasting. The comparative analysis reveals distinct performance characteristics across different seasonal conditions, measured through key metrics including Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and coefficient of determination (R2).
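As a rough illustration of how a two-layer stacking ensemble over these three base learners can be assembled, the sketch below uses scikit-learn's `StackingRegressor`. It is not the authors' implementation: the Ridge meta-learner, the hyperparameters, the feature scaling for SVR, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import (
    AdaBoostRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for meteorological features and wind speed (assumed data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 4.0 + 0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.2, size=200)

# Chronological split: the first 150 samples train, the last 50 test.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Layer 1: the three base learners named in the text.
base_learners = [
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("ada", AdaBoostRegressor(n_estimators=50, random_state=0)),
    ("svr", make_pipeline(StandardScaler(), SVR(kernel="rbf"))),
]

# Layer 2: a meta-learner fitted on cross-validated base predictions.
# Ridge is an assumed choice; the text does not specify the meta-learner here.
stack = StackingRegressor(estimators=base_learners, final_estimator=Ridge(), cv=5)
stack.fit(X_train, y_train)

y_pred = stack.predict(X_test)
print(y_pred[:5])
```

Fitting the meta-learner on out-of-fold predictions keeps it from simply rewarding whichever base model overfits the training data most.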
The comparison of single and ensemble models demonstrates the advantage of an integrated approach. As shown in Table 1, for October 2024 the Ensemble Learning model outperformed each of its individual components (Random Forest, AdaBoost, and SVR), achieving the lowest MAE (0.255), the lowest RMSE (0.334), and the highest R2 (0.801). The same ranking held for May 2025, where the Ensemble model again led with an MAE of 0.314, an RMSE of 0.429, and an R2 of 0.745.
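For reference, the three metrics can be computed from paired observations and forecasts as in the minimal sketch below. It uses the standard definitions of MAE, RMSE, and R2; the sample values are made up for illustration and are not taken from Table 1.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error: penalizes large errors more than MAE does."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def r2(actual, predicted):
    """Coefficient of determination: 1 minus residual over total variance."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and forecast wind speeds, for illustration only.
actual = [3.0, 4.0, 5.0, 6.0]
predicted = [2.5, 4.5, 5.0, 7.0]

print(f"MAE={mae(actual, predicted):.3f}")
print(f"RMSE={rmse(actual, predicted):.3f}")
print(f"R2={r2(actual, predicted):.3f}")
```

Lower MAE and RMSE indicate smaller errors, while an R2 closer to 1 indicates that the forecast explains more of the variance in the observed series.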
However, the ensemble approach is not without limitations. Its training process demands considerably more computational resources than the individual models, and its performance may vary when applied to regions whose climatic conditions differ from those of the training data. Despite these challenges, the Ensemble Learning model delivers a consistent improvement over the standalone methods on all three metrics in both months, offering enhanced reliability and accuracy for wind speed forecasting. This makes it particularly valuable for applications in wind energy systems, where precise predictions are critical for operational efficiency and grid stability. Future research could explore dynamic weighting mechanisms to further optimize the ensemble's real-time performance and adaptability to diverse meteorological conditions.
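The dynamic weighting idea is left open in the text. One possible interpretation, sketched below, weights each base model in inverse proportion to its recent mean absolute error. The function names, the error windows, and the forecasts are all hypothetical, and this scheme was not evaluated in the study.

```python
def dynamic_weights(recent_errors, eps=1e-9):
    """Return normalized weights inversely proportional to recent mean error.

    `recent_errors` maps a model name to its absolute errors over a
    recent window; `eps` guards against division by zero.
    """
    inverse = {
        name: 1.0 / (sum(errors) / len(errors) + eps)
        for name, errors in recent_errors.items()
    }
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def combine(predictions, weights):
    """Weighted average of the base-model forecasts for one time step."""
    return sum(weights[name] * predictions[name] for name in predictions)


# Hypothetical recent absolute errors for each base model.
recent_errors = {"rf": [0.2, 0.2], "ada": [0.4, 0.4], "svr": [0.4, 0.4]}
weights = dynamic_weights(recent_errors)

# Hypothetical forecasts for the next time step.
forecast = combine({"rf": 5.0, "ada": 6.0, "svr": 4.0}, weights)
print(weights, forecast)
```

Recomputing the weights as the window slides lets the combination shift toward whichever model has been most accurate lately, at the cost of reacting late when conditions change abruptly.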
Conclusion and perspectives
This study confirmed the potential of machine learning techniques to improve the accuracy and reliability of wind speed forecasting, using real-world meteorological data from Sfax, Tunisia. By comparing three widely used models, namely Support Vector Regression (SVR), AdaBoost, and Random Forest, and developing a two-layer stacking ensemble, the study demonstrated that combining diverse algorithms can lead to more precise predictions. Among the evaluated methods, Random Forest consistently delivered the highest accuracy during stable weather conditions, while the ensemble model showed strong overall performance, especially during periods of increased atmospheric variability.
Although the ensemble stacking model did not outperform Random Forest in every instance, it offered a more balanced and adaptable forecasting solution across seasons. These results highlight the practical added value of ensemble learning strategies in renewable energy forecasting, especially when managing the uncertainties associated with wind power integration into the electrical grid. However, the added complexity and computational demand of ensemble models remain important considerations, particularly for real-time applications or deployment in regions with different climatic patterns.
Future research should explore ways to make ensemble models more adaptive to seasonal and real-time changes, potentially through dynamic weighting or online learning techniques. Integrating machine learning with traditional physical models, such as numerical weather prediction, could also create hybrid systems that better capture both data-driven and physics-based insights. Additionally, testing these approaches across diverse geographical settings and incorporating a broader range of environmental features could improve generalizability and robustness. Ultimately, the goal is to enable more intelligent, accurate, and scalable forecasting systems that support the efficient operation of wind energy infrastructure and contribute to a more sustainable energy future.
Footnotes
Author contributions
Nabiha Brahmi: Conceptualization, Methodology, Data Analysis, Writing – Original Draft, and Editing. Leila Hadj Mefteh: Writing. Maher Chaabene: Supervision, Project Administration.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The data supporting the findings of this study are available on a secured website with authentication access.
