Abstract
This study assesses the efficacy of particle swarm optimization (PSO) in estimating the scale (c) and shape (k) parameters of the Weibull distribution model for wind energy forecasting at two key wind farm sites in Morocco, Tarfaya in the south and Tangier in the north, using real wind data from 2022. The study employs a novel square frequency error objective function to enhance parameter accuracy and adopts a two-stage training approach involving recursive least-square estimation and PSO fine-tuning. Validation with artificial data underscores PSO’s effectiveness under diverse wind conditions. Parameter sensitivity analysis identifies four optimal PSO configurations, with the PSO-4 model exhibiting superior performance. Comparative analysis against traditional and heuristic optimization methods shows that PSO-4 consistently achieves the lowest root-mean-square error (RMSE) and mean absolute error (MAE), high coefficients of determination (R²), and the shortest computation time. The research highlights the PSO-4 model as a precise and efficient tool for Weibull distribution parameter estimation in wind energy forecasting, with robust convergence at both wind farm sites.
Introduction
In recent years, there has been a notable upswing in attention toward wind energy as a promising and sustainable solution to meet the escalating global demand for clean and renewable power sources. This heightened interest is underscored by the urgent imperative to shift toward cleaner energy alternatives. With the growing momentum of wind power projects, the precise assessment and thorough analysis of wind resources have become indispensable in unlocking their full potential (Rao, 2019).
Wind resource analysis forms the backbone of wind farm development, planning, and operational efficiency. The successful realization of wind energy projects hinges significantly on accurate production forecasts (Yakoub et al., 2022). The critical importance of minimizing errors in these forecasts cannot be overstressed, as they directly impact the financial viability of investments (Sun et al., 2019). Inaccurate forecasts, in severe cases, can result in substantial financial losses for investors (Keynia and Memarzadeh, 2022; Sun et al., 2019). Conversely, underestimations may subject wind turbines to excessive loads, potentially leading to overload and compromising the overall efficiency and safety of the system (Barthelmie and Jensen, 2010).
A central element of wind resource analysis involves characterizing wind regimes, which govern the intricate behavior of wind patterns (Peng et al., 2010). At the core of this analysis lies the fundamental statistical approach of probability distribution, with the Weibull Distribution serving as a pivotal tool (Bowden et al., 1983; Carta et al., 2009). Wind energy experts rely on the Weibull distribution to gain valuable insights into wind speed frequency distributions over specific time intervals (Tuller and Brett, 1984).
However, accurately estimating parameters shape (k) and scale (c) to represent the wind regime of a region using the Weibull distribution remains a substantial challenge, with the aim of achieving the smallest possible error in adjustment (Ozay and Celiktas, 2016; Wais, 2017). To address this challenge, various methods have been proposed to determine Weibull parameters across diverse global locations.
Deterministic methods have played a foundational role in this research domain. Akdağ and Dinler (2009) introduced the power density (PD) method for estimating Weibull distribution parameters in wind energy applications and compared it with the conventional graphical, maximum likelihood, and moment methods; goodness-of-fit tests across diverse geographical locations demonstrated that the PD method provides accurate parameter estimates. Chang (2011) conducted a comprehensive analysis of six numerical methods frequently employed for Weibull parameter estimation: the moment method (MM), empirical method (EM), graphical method (GM), maximum likelihood method (MLM), modified maximum likelihood method (MMLM), and energy pattern factor method (EPFM). Through Monte Carlo simulations and analysis of real wind speed data, the study shed light on the strengths and weaknesses of each method under varying conditions. Ozay and Celiktas (2016) used a two-parameter Weibull distribution to analyze wind characteristics within specific regions, calculating the wind speed frequency distribution and the k and c parameters, and complemented their findings with a wind rose graph and wind direction trends that offer insight into the wind energy potential of the studied areas. Usta (2016) introduced the probability weighted moments based on the power density method (PWMBP) and compared it with six common methods, namely MLM, MMLM, GM, MM, PD, and the probability-weighted moments method (PWMM); the results pointed to the superior accuracy and efficiency of PWMBP. Yang et al. (2023) presented an iterative least-squares method for estimating the parameters of the three-parameter Weibull distribution, which iterates until the parameter estimates stabilize and ultimately leads to more accurate lifetime calculations for wind energy systems.
Katinas et al. (2017) focused on the application of Weibull probability distribution methodologies in determining wind power density. Analyzing eight different methods, they assessed reliability with various statistical measures and emphasized the importance of selecting the most suitable method according to geographical factors and wind power density. Chaurasiya et al. (2018) explored the accuracy of various methods for computing wind power density in specific regions, evaluating them with goodness-of-fit tests that included the root mean square error (RMSE), coefficient of determination (R²), mean absolute percentage error (MAPE), and chi-square (χ²) test; they also examined the accuracy of SODAR measurements compared with cup anemometer data. Xie et al. (2023) introduced a parameter estimation method for the three-parameter Weibull distribution based on transforming the cumulative distribution function. This approach minimizes discrepancies in scale parameter values by accounting for errors in the shape and location parameters, and case studies demonstrated its superiority over traditional methods. Dorvlo (2002) employed the chi-square method to determine the Weibull parameters at four locations in Oman. Costa Rocha et al. (2012) analyzed and compared seven numerical methods for determining Weibull distribution parameters, using wind data from the cities of Camocim and Paracuru in the northeastern region of Brazil. Similarly, Andrade et al. (2014), also working in the Brazilian northeast, compared the GM, MM, EPFM, MLM, EM, and equivalent energy method (EEM), focusing on how efficiently these methods predict the available power relative to measured values.
While deterministic methods have predominantly influenced this field, a recent trend has emerged toward the adoption of heuristic methods. These heuristic approaches have shown promise in addressing the challenges associated with parameter estimation for the Weibull distribution. For instance, Abou El-Ela et al. (2023), have conducted a thorough investigation of optimal Weibull parameter estimation using both analytical and heuristic methods. Their study has compared various algorithms, including particle swarm optimization (PSO), crow search algorithm (CSA), aquila optimizer (AO), and bald eagle search optimizers (BES), to determine the most accurate probability density function for wind data. Kılıç et al. (2021), have presented a genetic algorithm (GA) that incorporates an adaptive search space based on the expectation–maximization algorithm for maximum likelihood estimation of Weibull parameters. Their research has showcased the efficiency of this approach through simulation studies and real data examples. Han et al. (2023) performed a reliability analysis on wind turbine subassemblies, utilizing the 3-P Weibull distribution model and maximum likelihood estimation. They introduced the ergodic artificial bee colony algorithm (ErgoABC), enhanced with chaos search theory and a Lévy flights strategy, for efficient parameter estimation. Validation through simulation demonstrated the algorithm’s effectiveness in high-dimensional optimization. The study underscores the efficacy of the 3-P Weibull model in evaluating the lifetime distribution of critical wind turbine components. Additionally, Alrashidi et al. (2020), have presented a framework for evaluating the performance of different probability density functions (PDFs) in fitting wind speeds. They have introduced a novel metaheuristic optimization algorithm called social spider optimization (SSO) for wind characterization purposes, showcasing its efficiency in estimating PDF parameters. 
Grounded in empirical data encompassing variables like wind speed and ambient temperature, the technique proved effective in accurately estimating wind power, achieving a notable low MAPE of 3.513%. In their research, Wang et al. (2016), have employed both cuckoo search optimization (CSO) and ACO methods to assess wind potential and forecast wind speed in four locations within China. The comparison of results has shown that these combined wind energy assessment and speed forecasting techniques have provided promising assessments and predictions, outperforming the individual assessment and forecasting components. Guedes et al. (2020), applied four metaheuristic optimization algorithms, namely migrating bird’s optimization (MBO), Imperialist competitive algorithm (ICA), harmony search (HS), and CSO, to model wind speed data and estimate Weibull parameters in two regions of Brazil. The findings suggest that MBO and ICA outperformed the conventional maximum likelihood estimation method.
In this landscape of heuristic approaches, several studies have leveraged particle swarm optimization (PSO) for various aspects of statistical modeling. Carneiro et al. (2016), have employed the PSO approach to estimate Weibull parameters for wind resources in the Brazilian northeast region. Their research has demonstrated the superior performance of PSO in characterizing the unique wind conditions of the region. Furthermore, in a study conducted by Patidar et al. (2023), the estimation of Weibull distribution parameters was investigated at three locations in India (Kayathar, Jafrabad, Gulf of Khambhat) using five numerical methods and three metaheuristic optimization algorithms (SSO, PSO, GA). The research, comparing method accuracy through statistical analysis, concludes that the Gulf of Khambhat exhibits the highest wind power density. Wind atlas analysis and application program (WAsP) stands out among the numerical methods, offering the best fit to the wind speed histogram. Additionally, in terms of accuracy and convergence to the optimal solution, the metaheuristic algorithms SSO and PSO outperform GA. Rahmani et al. (2013) exploration involved the integration of ant colony optimization (ACO) and PSO to forecast energy output in the Binaloud wind farm, Iran, resulting in a model that demonstrated improved accuracy and faster convergence. Algamal and Basheer (2021) and Mahmood and Algamal (2021) harness the power of PSO to fine-tune parameter estimation for the three-parameter Weibull and gamma distributions, respectively. Through rigorous analysis of real-world data, both studies demonstrate remarkable improvements in reliability and hazard function estimations compared to conventional methods. These findings underscore the efficacy of PSO in optimizing distribution parameters, thereby enhancing the accuracy of statistical models. 
Shifting focus to beta regression modeling, Algamal (2019) introduces PSO as a superior method for variable selection, particularly in models with varying dispersion. Through comprehensive simulations and real data applications, the study highlights PSO’s prowess in identifying relevant variables, thereby bolstering the predictive performance of beta regression models. This research emphasizes the importance of efficient variable selection techniques in statistical modeling. Al-Thanoon et al. (2019) venture into Quantitative Structure-Activity Relationship (QSAR) classification, proposing a hybrid PSO algorithm for descriptor selection. By surpassing existing methods in classification accuracy and descriptor selection across benchmark datasets, this innovative approach showcases PSO’s effectiveness in enhancing QSAR modeling. The study underscores the significance of algorithmic advancements in improving predictive modeling in chemical and pharmaceutical research. In a different vein, Qasim and Algamal (2018) emphasizes feature selection in classification tasks, introducing a PSO-based approach alongside logistic regression models. By leveraging the Bayesian Information Criterion (BIC), the method significantly enhances classification performance across diverse datasets, underscoring its utility in feature selection. This research highlights the synergy between optimization techniques and statistical modeling in tackling high-dimensional data analysis challenges. Additionally, Sadik et al. (2023) address parameter tuning in bridge penalty regression models, employing PSO for concurrent optimization of shrinkage and tuning parameters. While showcasing effectiveness compared to conventional methods, the study acknowledges the influence of dataset complexity and application context on performance variations. This research underscores the importance of algorithmic adaptability in addressing diverse optimization objectives in statistical modeling.
This study explores the effectiveness of PSO in estimating Weibull distribution parameters for wind energy forecasting at Tarfaya and Tangier wind farm locations in Morocco, utilizing 2022 wind data. A novel objective function based on square frequency error enhances parameter estimation accuracy. The PSO-4 model, validated with artificial data, outperforms other models, demonstrating superior accuracy and computational efficiency. A comparative analysis against conventional methods, including graphical (GM), moment (MM), maximum likelihood (MLM), modified maximum likelihood (MMLM), power density (PDM), Lysen’s empirical (LEM), Justus’ empirical (JEM), least square (LSM), and alternative maximum likelihood (AMLM), as well as heuristic optimization techniques such as harmony search (HS), cuckoo search optimization (CSO), and ant colony optimization (ACO) proposed by Freitas de Andrade et al. (2019), reveals that PSO-4 consistently achieves the lowest RMSE and MAE values, high R2 values, and the shortest computation time. The algorithm exhibits robust convergence, suggesting its potential as a valuable tool for precise Weibull distribution parameter estimation in wind energy forecasting.
The document is structured into five sections. Section “Introduction” introduces the research goals and explores related studies. Section “Review of methods for Weibull parameter estimation” reviews traditional Weibull parameter estimation methods. Section “Weibull parameter optimization method” introduces a novel objective function and validates the PSO-4 model using artificial datasets. Section “Results and Discussion” presents a comparative analysis of PSO-4 against traditional methods and three heuristic optimization techniques at two prominent wind farm locations in Morocco (Tarfaya and Tangier), using wind data collected in 2022. Lastly, Section “Conclusion” presents the research’s conclusion.
Review of methods for Weibull parameter estimation
The analysis of wind speed distribution and the evaluation of wind energy potential within the field of wind energy research commonly rely on the extensively employed two-parameter Weibull distribution. This distribution is defined by a probability density function (PDF) denoted as f(v) and a cumulative distribution function (CDF) denoted as F(v). Both these functions hold substantial importance in delineating wind speed behavior and forecasting wind power generation (Jamil et al., 1995).
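$$f(v) = \frac{k}{c}\left(\frac{v}{c}\right)^{k-1}\exp\left[-\left(\frac{v}{c}\right)^{k}\right] \qquad (1)$$
and
$$F(v) = 1 - \exp\left[-\left(\frac{v}{c}\right)^{k}\right] \qquad (2)$$
where c represents the scale factor with units of m/s, k is the dimensionless shape factor, and F(v) signifies the probability of velocities less than or equal to v.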
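A minimal Python sketch of the standard two-parameter Weibull PDF and CDF is given below. The function names are illustrative and not taken from the study, and returning zero density at v = 0 is a simplifying assumption that is exact only for k > 1.

```python
import math

def weibull_pdf(v, k, c):
    """Two-parameter Weibull density f(v); k is the shape, c the scale (m/s)."""
    if v <= 0:
        # Assumption: density taken as 0 at v <= 0 (exact for k > 1).
        return 0.0
    return (k / c) * (v / c) ** (k - 1) * math.exp(-((v / c) ** k))

def weibull_cdf(v, k, c):
    """Cumulative probability F(v) of wind speeds less than or equal to v."""
    if v <= 0:
        return 0.0
    return 1.0 - math.exp(-((v / c) ** k))
```

At v = c the CDF equals 1 - exp(-1) for any k, which is a convenient sanity check on an implementation.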
The estimation of the scale (c) and shape (k) parameters in the Weibull distribution involves various methods, each with its own strengths and limitations (Akdağ and Dinler, 2009; Basumatary et al., 2005). A brief overview of each estimation method follows.
Graphical method (GM)
The graphical method involves plotting the empirical CDF of the data on a special Weibull probability paper. The shape and scale parameters can be estimated by finding the best-fitting straight line on the plot. This method is simple to use, especially for quick estimations, but it may lack accuracy compared to other methods.
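The straight-line fit can be sketched as follows, assuming the usual linearization ln[-ln(1 - F(v))] = k ln v - k ln c, so that the slope gives k and the intercept gives -k ln c. The function name and the use of ordinary least squares for the best-fitting line are assumptions, not details from the study.

```python
import math

def graphical_fit(speeds, cum_freqs):
    """Estimate (k, c) from a least-squares line through the Weibull plot.

    speeds    -- positive wind speeds v_i
    cum_freqs -- empirical cumulative frequencies F(v_i), strictly in (0, 1)
    """
    xs = [math.log(v) for v in speeds]
    ys = [math.log(-math.log(1.0 - F)) for F in cum_freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    k = slope                      # slope of the line is the shape factor
    c = math.exp(-intercept / k)   # intercept equals -k ln c
    return k, c
```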
Moment method (MM)
The moment method estimates the parameters based on the sample moments of the data. The first moment (mean) and the second moment (variance) are used to derive the formulas for c and k. The moment method is straightforward and easy to compute, but it may not be as accurate as other advanced methods.
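$$\bar{v} = c\,\Gamma\left(1+\frac{1}{k}\right) \qquad (3)$$
and
$$\sigma = c\left[\Gamma\left(1+\frac{2}{k}\right) - \Gamma^{2}\left(1+\frac{1}{k}\right)\right]^{1/2} \qquad (4)$$
with
$$\Gamma(x) = \int_{0}^{\infty} t^{x-1}e^{-t}\,dt \qquad (5)$$
where $\bar{v}$ and $\sigma$ are the mean and standard deviation of the wind speed.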
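One way to apply the moment method numerically is sketched here, under the assumption that the sample mean and standard deviation are matched to their Weibull counterparts. The ratio of standard deviation to mean depends only on k and decreases as k grows, so k can be found by bisection and c then follows from the mean. The helper names and the bracketing interval are illustrative.

```python
import math

def cv_from_k(k):
    """Coefficient of variation (std / mean) of a Weibull distribution."""
    g1 = math.gamma(1.0 + 1.0 / k)
    g2 = math.gamma(1.0 + 2.0 / k)
    return math.sqrt(g2 / g1 ** 2 - 1.0)

def moment_fit(speeds, lo=0.5, hi=20.0, tol=1e-10):
    """Estimate (k, c) by matching the sample mean and standard deviation."""
    n = len(speeds)
    mean = sum(speeds) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in speeds) / (n - 1))
    target = std / mean
    # cv_from_k decreases monotonically in k, so bisection is sufficient.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cv_from_k(mid) > target:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    c = mean / math.gamma(1.0 + 1.0 / k)
    return k, c
```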
Maximum likelihood method (MLM)
The maximum likelihood method stands out as a widely employed statistical approach for parameter estimation in a distribution. This method revolves around identifying the values of c and k that optimize the likelihood function, portraying the probability of witnessing the provided data. Its broad usage is attributed to the provision of estimators that are both consistent and efficient.
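$$k = \left[\frac{\sum_{i=1}^{n} v_i^{k}\ln(v_i)}{\sum_{i=1}^{n} v_i^{k}} - \frac{\sum_{i=1}^{n}\ln(v_i)}{n}\right]^{-1} \qquad (6)$$
and
$$c = \left(\frac{1}{n}\sum_{i=1}^{n} v_i^{k}\right)^{1/k} \qquad (7)$$
where n represents the number of nonzero data points, and $v_i$ is the wind speed in time step i. Equation (6) is implicit in k and is therefore solved iteratively.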
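The maximum likelihood estimate can be sketched as below, assuming the standard Weibull likelihood equations. Instead of a fixed-point iteration, this sketch solves the shape equation by bisection, which is a choice made here for robustness rather than the procedure of the study. The function name and the bracketing interval are hypothetical.

```python
import math

def mle_fit(speeds, lo=0.1, hi=50.0, tol=1e-10):
    """Estimate (k, c) by maximum likelihood using only nonzero speeds."""
    vs = [v for v in speeds if v > 0]   # zero (calm) readings are excluded
    n = len(vs)
    mean_log = sum(math.log(v) for v in vs) / n

    def shape_residual(k):
        # Zero of this function is the maximum likelihood shape factor.
        num = sum(v ** k * math.log(v) for v in vs)
        den = sum(v ** k for v in vs)
        return num / den - 1.0 / k - mean_log

    # The residual increases monotonically with k, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if shape_residual(mid) < 0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    c = (sum(v ** k for v in vs) / n) ** (1.0 / k)
    return k, c
```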
Modified maximum likelihood method (MMLM)
The modified maximum likelihood method is a derivation of the traditional MLM. It incorporates a modification to the likelihood function to address potential bias in situations involving small sample sizes. This approach is particularly well-suited for instances where there is a scarcity of data points.
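$$k = \left[\frac{\sum_{i=1}^{n} v_i^{k}\ln(v_i)\,f(v_i)}{\sum_{i=1}^{n} v_i^{k}\,f(v_i)} - \frac{\sum_{i=1}^{n}\ln(v_i)\,f(v_i)}{f(v \ge 0)}\right]^{-1} \qquad (8)$$
and
$$c = \left[\frac{1}{f(v \ge 0)}\sum_{i=1}^{n} v_i^{k}\,f(v_i)\right]^{1/k} \qquad (9)$$
where n is here the number of bins, $f(v_i)$ represents the frequency of wind speed within bin i, and $f(v \ge 0)$ is the probability of a wind speed equal to or exceeding zero.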
Power density method (PDM)
The power density method relies on aligning the power curve of wind turbines with wind speed data. Through this fitting process, it becomes possible to estimate the shape and scale parameters of the Weibull distribution. This approach is widely employed in the wind energy sector for conducting resource assessments.
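$$k = 1 + \frac{3.69}{E_{pf}^{2}} \qquad (10)$$
and
$$c = \frac{\bar{v}}{\Gamma\left(1+\frac{1}{k}\right)} \qquad (11)$$
where $E_{pf}$ is the energy pattern factor,
$$E_{pf} = \frac{\overline{v^{3}}}{\bar{v}^{3}} \qquad (12)$$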
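A short sketch of the power density method follows, assuming the energy pattern factor form introduced by Akdağ and Dinler (2009), in which k = 1 + 3.69 / E_pf^2 and c is obtained from the mean wind speed. The function name is illustrative.

```python
import math

def power_density_fit(speeds):
    """Estimate (k, c) from the energy pattern factor of the sample."""
    n = len(speeds)
    mean = sum(speeds) / n
    mean_cube = sum(v ** 3 for v in speeds) / n
    epf = mean_cube / mean ** 3          # energy pattern factor
    k = 1.0 + 3.69 / epf ** 2
    c = mean / math.gamma(1.0 + 1.0 / k)
    return k, c
```

No iteration is needed, which is the main practical appeal of this method.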
Lysen’s empirical method (LEM)
Lysen’s empirical method entails employing nonlinear regression to fit the Weibull distribution to wind speed data. This approach revolves around minimizing the sum of the squares of discrepancies between the observed and estimated cumulative probabilities.
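$$k = \left(\frac{\sigma}{\bar{v}}\right)^{-1.086} \qquad (13)$$
and
$$c = \bar{v}\left(0.568 + \frac{0.433}{k}\right)^{-1/k} \qquad (14)$$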
Justus’ empirical method (JEM)
Justus’ empirical method bears similarities to the LEM but employs a distinct estimation technique. It also entails fitting the Weibull distribution to the cumulative probabilities derived from wind speed data.
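The closed-form expressions commonly attributed to Justus and Lysen in the wind energy literature can be sketched as below. They are assumed here to be the forms intended by the two empirical methods; both share the same shape estimate and differ only in how the scale factor is obtained. Function names are hypothetical.

```python
import math

def empirical_shape(mean, std):
    """Shape factor from the coefficient of variation (empirical power law)."""
    return (std / mean) ** -1.086

def justus_fit(mean, std):
    """Justus-type estimate: scale factor from the gamma function."""
    k = empirical_shape(mean, std)
    c = mean / math.gamma(1.0 + 1.0 / k)
    return k, c

def lysen_fit(mean, std):
    """Lysen-type estimate: scale factor from an algebraic approximation."""
    k = empirical_shape(mean, std)
    c = mean * (0.568 + 0.433 / k) ** (-1.0 / k)
    return k, c
```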
Least square method (LSM)
The least squares method estimates the parameters by minimizing the sum of squared differences between the observed data and the fitted Weibull distribution. This technique is widely used for parameter estimation.
and
Alternative maximum likelihood method (AMLM)
The alternative maximum likelihood method is another variation of the MLM. It uses a different parameterization of the Weibull distribution and involves solving a set of nonlinear equations to estimate the parameters.
and
Each method has its own strengths and weaknesses, and the choice of the best method depends on the characteristics of the data and the specific requirements of the application. In practice, it is often advisable to compare the results obtained from different methods and select the one that provides the best fit to the data. Additionally, the choice of the estimation method should be justified based on the sample size, data quality, and the desired level of accuracy.
Figure 1 offers a visual comparison of the PDFs estimated by the different methods for the wind speed data. Each curve shows how well the corresponding method fits the data and captures the characteristics of the distribution, and the variations among the curves in the subplots reflect differences in how each method handles the data. This comparison allows the strengths and weaknesses of each method to be evaluated and supports the selection of the estimation method that aligns best with the observed wind speed data and the context of the application.

Figure 1. Comparison of Weibull distribution estimation methods for wind speed data.
Weibull parameter optimization method
When there is limited mathematical understanding of an algorithm’s behavior, it is categorized as a heuristic method. Such algorithms aim to solve complex problems without providing guarantees, and may use substantial resources, particularly time, to discover high-quality solutions. Many heuristic approaches rely on probabilistic decisions during their execution. The main distinction from pure random search is that heuristic algorithms employ randomness intelligently and with bias (Korf, 1990). All optimization procedures seek the best outcome for a given scenario, which requires defining an objective function.
In this research, an objective function was assessed, which had not been previously documented in the literature for this specific problem class. This equation, referred to as the square frequency error, revolves around minimizing the sum of squared errors between the values derived from the approach-adjusted curve and the observed frequency of occurrences in the data distribution.
This objective function seeks results that emphasize intermediate values between the reduction of the difference between the Weibull curve and the data histogram and the deviation of production.
In this context, n represents the number of velocity intervals in the histogram, while $f_{adj}$ and $f_{obs}$ denote the frequencies associated with the adjusted curve and the observed occurrences in the histogram, respectively. $WPD_{adj}$ and $WPD_{obs}$ are the values used for the deviation of production between the adjusted curve and the histogram of the data.
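The frequency part of this objective can be sketched as follows. This is an illustration under stated assumptions, not the study's exact objective function: the adjusted frequency of each bin is taken as the difference of the Weibull CDF at the bin edges, and the production deviation term is omitted because its form cannot be recovered from the text. All names are hypothetical.

```python
import math

def weibull_cdf(v, k, c):
    """Two-parameter Weibull CDF, zero for non-positive speeds."""
    return 1.0 - math.exp(-((v / c) ** k)) if v > 0 else 0.0

def square_frequency_error(k, c, bin_edges, f_obs):
    """Sum over bins of (f_adj - f_obs)^2 for a candidate (k, c).

    bin_edges -- n + 1 increasing speed values delimiting the n bins
    f_obs     -- n observed relative frequencies from the histogram
    """
    total = 0.0
    for i, observed in enumerate(f_obs):
        # Assumption: adjusted frequency = probability mass of the bin.
        f_adj = weibull_cdf(bin_edges[i + 1], k, c) - weibull_cdf(bin_edges[i], k, c)
        total += (f_adj - observed) ** 2
    return total
```

A candidate pair (k, c) that reproduces the histogram exactly gives an error of zero, and any mismatch increases it.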
The optimization of the Weibull distribution model with the PSO algorithm involves a two-phase training process. Initially, 135 particles, represented by the parameter vectors $\theta_1$ through $\theta_{135}$, are initialized. Within each vector $\theta_p$, the parameters are given initial values drawn at random within their specified ranges. These $\theta_p$ vectors serve as candidate solutions for the Weibull distribution model.
In the first training phase, $T_s$ input-output data samples are used to estimate the linear parameters in Φ with the recursive least-square (RLS) method, after which the objective function is computed from equation (20). This estimation and objective function evaluation are carried out for each $\theta_p$ vector.
The second training phase uses PSO to fine-tune the parameters of all $\theta_p$ vectors according to their individual objective functions. Phases 1 and 2 are executed iteratively until the specified maximum number of iterations is reached. Upon completion of training, the $\theta_p$ vector yielding the highest value of the objective function is selected as the final specification of the Weibull distribution model. This process is intended to identify the model that aligns most closely with the provided data and distribution characteristics.
First phase: RLS estimation for Φ matrix
In the initial phase, with knowledge of all $T_s$ input data samples and the initialized parameters in the $\theta_p$ vector, the linear parameters in Φ are estimated recursively by the RLS method using equations (21) and (22).
Here, $\Psi_n$ refers to the vector associated with the nth row of Ψ, and $S_n$ represents the covariance matrix. Initially, $S_0$ is set to σf(v), with σ = 4, a value found effective by other researchers (Jang, 1993). The optimization starts by updating the vector Φ with the first set of output samples through equations (21) and (22). The procedure is repeated for each successive output sample set, generating a new Φ vector each time, until n reaches the total number of sample sets. The final solution for Φ is the one that minimizes the objective function of equation (20).
Subsequently, the model’s performance is analyzed using equations (30)–(32), with the parameters within $\theta_p$ used to gauge the model’s compactness and interpretability. The overall objective function is then determined according to equation (20). The complete procedure for assessing the Weibull distribution model is applied to all 135 particles, yielding their individual objective function values and thus a systematic evaluation of every candidate parameter set.
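A generic recursive least-squares update of the kind described above can be sketched as follows. The update form follows the standard RLS recursion, and initializing the covariance as sigma times the identity matrix is an assumption made for this sketch; the function and variable names are hypothetical and the linear model y = psi . phi is only a stand-in for the study's Φ parameters.

```python
def rls_fit(rows, outputs, sigma=4.0):
    """Recursive least squares for y = psi . phi, one sample at a time.

    rows    -- list of regressor vectors psi_n (one per sample)
    outputs -- list of scalar outputs y_n
    sigma   -- scale of the initial covariance S_0 = sigma * I (assumption)
    """
    m = len(rows[0])
    phi = [0.0] * m
    S = [[sigma if i == j else 0.0 for j in range(m)] for i in range(m)]
    for psi, y in zip(rows, outputs):
        S_psi = [sum(S[i][j] * psi[j] for j in range(m)) for i in range(m)]
        denom = 1.0 + sum(psi[i] * S_psi[i] for i in range(m))
        # Covariance update: S <- S - (S psi)(S psi)^T / (1 + psi^T S psi)
        S = [[S[i][j] - S_psi[i] * S_psi[j] / denom for j in range(m)]
             for i in range(m)]
        # Parameter update uses the new covariance as the gain.
        error = y - sum(psi[i] * phi[i] for i in range(m))
        gain = [sum(S[i][j] * psi[j] for j in range(m)) for i in range(m)]
        phi = [phi[i] + gain[i] * error for i in range(m)]
    return phi
```

A small sigma acts as a regularizer that pulls the estimate toward its initial value, so with few samples the result differs from the ordinary least-squares solution; a large sigma makes the two nearly coincide.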
Second phase: fine-tuning model parameters with PSO algorithm
The performance of the model is improved by applying the PSO algorithm to the objective function values obtained for the 135 particles. Specifically, a particle’s personal best position, denoted $pbest_p$, is updated with the current values of $\theta_p$ if its current objective function, $OF_p[k]$, surpasses the objective function of its previous personal best position, $OF_p^{b}[k-1]$. During the kth iteration, the personal best position and its associated objective function are therefore updated for each particle as follows:
$$pbest_p[k] = \begin{cases}\theta_p[k], & OF_p[k] > OF_p^{b}[k-1] \\ pbest_p[k-1], & \text{otherwise}\end{cases} \qquad (23)$$
and
$$OF_p^{b}[k] = \begin{cases}OF_p[k], & OF_p[k] > OF_p^{b}[k-1] \\ OF_p^{b}[k-1], & \text{otherwise}\end{cases} \qquad (24)$$
During the kth iteration, the particle with the highest objective function value is chosen as the overall best. Its objective function, $OF_0[k]$, is compared with $OF_g^{b}[k-1]$, the objective function of the global best position from the preceding iteration. If $OF_0[k]$ surpasses $OF_g^{b}[k-1]$, the position of the current overall best particle, $\theta_{p_0}[k]$, becomes the new global best position. The global best position of the swarm, $gbest$, and its objective function are therefore updated during the kth iteration as follows:
$$gbest[k] = \begin{cases}\theta_{p_0}[k], & OF_0[k] > OF_g^{b}[k-1] \\ gbest[k-1], & \text{otherwise}\end{cases} \qquad (25)$$
and
$$OF_g^{b}[k] = \begin{cases}OF_0[k], & OF_0[k] > OF_g^{b}[k-1] \\ OF_g^{b}[k-1], & \text{otherwise}\end{cases} \qquad (26)$$
Using these best positions, the $\theta_p$ of each particle is adapted through the PSO update equation:
$$\theta_p(k+1) = \theta_p(k) + V_p(k+1) \qquad (27)$$
The parameter update $V_p(k+1)$ is obtained from:
$$V_p(k+1) = \Theta(k)\,V_p(k) + \eta_1\,rand_1\,\bigl(pbest_p[k] - \theta_p(k)\bigr) + \eta_2\,rand_2\,\bigl(gbest[k] - \theta_p(k)\bigr) \qquad (28)$$
The acceleration coefficients $\eta_1$ and $\eta_2$, initially set to 2 following the standard PSO application, can be tuned for the Weibull distribution model, so their values are allowed to change. The variables $rand_1$ and $rand_2$ are random values drawn from a uniform distribution. The search is further enhanced by the term $V_p(k)$, which introduces momentum from the previous update. This momentum is governed by the inertia factor Θ, which decreases linearly with each iteration:
$$\Theta(k) = \Theta(0) - \frac{\Theta(0) - \Theta(i_k)}{i_k}\,k \qquad (29)$$
The algorithm runs for a maximum of $i_k$ iterations, with $\Theta(0)$ the initial inertia weight and $\Theta(i_k)$ the final inertia weight. The inertia weights are deliberately set such that $\Theta(0)$ is greater than $\Theta(i_k)$, which helps the training avoid getting stuck in sub-optimal regions during the early tuning phase and lets the model parameters be fine-tuned more precisely as the process advances.
Four PSO parameters need proper selection in this training algorithm: $\Theta(0)$, $\Theta(i_k)$, $\eta_1$, and $\eta_2$. Accurate and computationally efficient model training depends on appropriate choices for these parameters. Once all the model parameters have been fine-tuned, the linear parameters must be recalculated and updated using equations (21) and (22) to generate a new set of objective function values. The training algorithm is executed iteratively, generating a sequence of particles with improved solutions until the maximum number of iterations is reached. Ultimately, the particle yielding the highest objective function value at the end of this process is used to establish the final structure of the Weibull distribution model.
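The second-phase loop can be sketched as a plain PSO with a linearly decreasing inertia weight. This is a simplified illustration, not the study's implementation: it treats the objective as an error to be minimized (the comparison should be flipped if the objective is defined as a score to maximize), it omits the RLS phase, and clamping positions to the search range is an added assumption. Default values mirror the 135 particles and 70 iterations mentioned in the text; all names are hypothetical.

```python
import random

def inertia(it, n_iter, w_start, w_end):
    """Linearly decreasing inertia weight from w_start to w_end."""
    return w_start - (w_start - w_end) * it / n_iter

def pso_minimize(objective, bounds, n_particles=135, n_iter=70,
                 eta1=2.0, eta2=2.0, w_start=1.0, w_end=0.5, seed=0):
    """Minimize objective(position) inside the box given by bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for it in range(n_iter):
        w = inertia(it, n_iter, w_start, w_end)
        for p in range(n_particles):
            for d in range(dim):
                lo, hi = bounds[d]
                vel[p][d] = (w * vel[p][d]
                             + eta1 * rng.random() * (pbest[p][d] - pos[p][d])
                             + eta2 * rng.random() * (gbest[d] - pos[p][d]))
                # Assumption: clamp positions so particles stay in range.
                pos[p][d] = min(hi, max(lo, pos[p][d] + vel[p][d]))
            val = objective(pos[p])
            if val < pbest_val[p]:            # personal best update
                pbest[p], pbest_val[p] = pos[p][:], val
                if val < gbest_val:           # global best update
                    gbest, gbest_val = pos[p][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history
```

For Weibull fitting, the objective would take a position [k, c] and return the fitting error for those parameters, with bounds chosen to cover plausible shape and scale values.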
Predicted parameters k and c: Optimizing PSO parameter selection
After conducting an initial simulation study, we identified four distinct configurations of the PSO parameters $\eta_1$, $\eta_2$, $\Theta(0)$, and $\Theta(i_k)$. Initially, all these parameters are set according to the algorithmic design proposed by Kennedy and Eberhart: $\eta_1$ and $\eta_2$ are both assigned a value of 2, while $\Theta(0) = \Theta(i_k) = 1$. Using these parameter configurations, we generate ten Weibull distribution models individually, each derived from a different set of 135 particles, any of which may yield the optimal solution. The number of iterations for fine-tuning each model is fixed at 70. At each iteration, we record the correlation between the sampled parameters k and c and those predicted by the overall best-performing particle, which gives insight into the convergence and accuracy of the optimization process.
We proceed to select four models that exhibit the lowest error values. Throughout the tuning process, we illustrate their correlation between measured and predicted parameters k and c of Weibull distribution model as shown in Figure 2.

Figure 2. Correlation between measured and predicted parameters k and c: (a) PSO-1 model, (b) PSO-2 model, (c) PSO-3 model, and (d) PSO-4 model.
Table 1 presents a comparative analysis of four PSO models, namely PSO-1, PSO-2, PSO-3, and PSO-4, used to predict the k and c parameters of the Weibull distribution. Each model is characterized by a specific parameter configuration that influences its optimization behavior. PSO-1 employs uniform acceleration coefficients ($\eta_1 = \eta_2 = 2$) and maintains constant inertia weights ($\Theta(0) = \Theta(i_k) = 1$) throughout the optimization process. PSO-2 shares the same acceleration coefficients but lets the inertia weight decrease as iterations progress ($\Theta(0) = 1$, $\Theta(i_k) = 0$). PSO-3 also combines uniform acceleration coefficients with a decreasing inertia weight ($\Theta(0) = 0.5$, $\Theta(i_k) = 0$) to enhance adaptability during optimization. Lastly, PSO-4 follows a similar inertia strategy to PSO-3 but with different inertia weights ($\Theta(0) = 1$, $\Theta(i_k) = 0.5$).
Performance comparison of PSO tuning parameter sets on convergence and accuracy.
Performance is evaluated using the R2 and RMSE (%) metrics. Accuracy and computational efficiency improve systematically from PSO-1 to PSO-4. PSO-1 reaches an R2 of 0.99443 with an RMSE of 3.9%, converging in 93 iterations over 98 seconds. PSO-2 improves on PSO-1, with a higher R2 of 0.99628, a slightly reduced RMSE of 3.8%, and convergence in 78 iterations within 74 seconds. PSO-3 improves further, with an R2 of 0.99821, a lower RMSE of 3.5%, and convergence in 54 iterations within 66 seconds. PSO-4 is the top performer, with the highest R2 of 0.99923, the lowest RMSE of 3.4%, and the fastest convergence at 21 iterations within 53 seconds. PSO-4 is therefore the preferred configuration for predicting the k and c parameters of the Weibull distribution, combining the best accuracy with the lowest computational cost.
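The role of the inertia-weight schedule can be illustrated with a minimal Python sketch of a PSO fit using the PSO-4 settings quoted above (η1 = η2 = 2, Θ(0) = 1, Θ(i_k) = 0.5, 135 particles, 70 iterations). Several details are assumptions made for illustration and are not taken from the study: the inertia weight is interpolated linearly between its two end values, velocities and positions are clamped to a hypothetical search box, a plain sum of squared frequency errors stands in for the objective function, and the names `weibull_pdf`, `squared_frequency_error`, and `pso_fit` are invented here.

```python
import math
import random


def weibull_pdf(v, k, c):
    """Two-parameter Weibull probability density at wind speed v."""
    return (k / c) * (v / c) ** (k - 1) * math.exp(-((v / c) ** k))


def squared_frequency_error(params, speeds, freqs):
    """Sum of squared gaps between the Weibull curve and the histogram (assumed objective)."""
    k, c = params
    return sum((weibull_pdf(v, k, c) - f) ** 2 for v, f in zip(speeds, freqs))


def pso_fit(speeds, freqs, theta_start=1.0, theta_end=0.5, eta1=2.0, eta2=2.0,
            n_particles=135, n_iter=70, bounds=((1.0, 10.0), (1.0, 20.0)), seed=0):
    """Return ((k, c), error) for the best particle found."""
    rng = random.Random(seed)
    lo = [b[0] for b in bounds]
    hi = [b[1] for b in bounds]
    # Velocity clamp at 20% of each search range (assumption, keeps the swarm bounded).
    vmax = [0.2 * (h - l) for l, h in zip(lo, hi)]
    pos = [[rng.uniform(l, h) for l, h in zip(lo, hi)] for _ in range(n_particles)]
    vel = [[0.0, 0.0] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [squared_frequency_error(p, speeds, freqs) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for it in range(n_iter):
        # Inertia weight moves linearly from theta_start to theta_end (assumption).
        theta = theta_start + (theta_end - theta_start) * it / (n_iter - 1)
        for i in range(n_particles):
            for d in range(2):
                r1, r2 = rng.random(), rng.random()
                v = (theta * vel[i][d]
                     + eta1 * r1 * (pbest[i][d] - pos[i][d])
                     + eta2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-vmax[d], min(vmax[d], v))
                pos[i][d] = max(lo[d], min(hi[d], pos[i][d] + vel[i][d]))
            val = squared_frequency_error(pos[i], speeds, freqs)
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return (gbest[0], gbest[1]), gbest_val


# Usage: recover (k, c) from an idealized histogram sampled at 1 m/s bin centers.
bin_centers = [i + 0.5 for i in range(25)]
target_freqs = [weibull_pdf(v, 3.0, 8.0) for v in bin_centers]
(k_est, c_est), fit_error = pso_fit(bin_centers, target_freqs)
print(k_est, c_est, fit_error)
```

Changing `theta_start` and `theta_end` to (1, 1), (1, 0), or (0.5, 0) gives the corresponding PSO-1 to PSO-3 schedules under the same assumptions.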
The performance of the implemented methods was assessed with the following tests.
Root-mean-square error (RMSE):
where T_v is the total number of validation data samples.
Mean absolute error (MAE):
Determination coefficient (R2):
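As a point of reference, the three tests can be sketched in Python using their standard textbook definitions. This is an assumption about the exact forms used in the study; in particular, R2 is taken here as one minus the ratio of the residual to the total sum of squares, and the function names are illustrative.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error over the validation samples."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error over the validation samples."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

Here `observed` would hold the relative frequencies of the data histogram and `predicted` the Weibull values at the same wind speeds, with `len(observed)` playing the role of T_v.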
The primary goal of adjusting the Weibull parameters for wind energy is to assess and estimate the energy potential in the studied region. From a prospecting perspective, the analysis of wind production deviation (WPD) holds significant value in determining the available energy. On the other hand, from the standpoint of design and operation, understanding how errors occur across different wind speed ranges can assist in selecting the appropriate type of machinery and ensuring its availability under specific circumstances.
Moreover, the production deviation percentage between the acquired curve and the data histogram was evaluated utilizing equations (33)–(35).
In this context, ρ denotes the specific mass of the air, v represents the wind speed, γ stands for the gamma function, and k and c are the estimated parameters of the Weibull curve.
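A minimal Python sketch of such a deviation calculation follows. It assumes the usual closed form for the mean power density of a Weibull distribution, 0.5·ρ·c³·Γ(1 + 3/k), and a histogram estimate built from bin speeds and relative frequencies. The air density value, the percentage form of the deviation, and the function names are assumptions for illustration, not the study's equations (33)–(35).

```python
import math

AIR_DENSITY = 1.225  # kg/m^3, standard sea-level value (assumption)


def weibull_power_density(k, c, rho=AIR_DENSITY):
    """Mean power density (W/m^2) implied by Weibull parameters k and c."""
    return 0.5 * rho * c ** 3 * math.gamma(1.0 + 3.0 / k)


def histogram_power_density(speeds, rel_freqs, rho=AIR_DENSITY):
    """Mean power density (W/m^2) from bin speeds and their relative frequencies."""
    return 0.5 * rho * sum(f * v ** 3 for v, f in zip(speeds, rel_freqs))


def production_deviation_percent(k, c, speeds, rel_freqs, rho=AIR_DENSITY):
    """Absolute deviation of the Weibull estimate from the histogram value, in percent."""
    p_weibull = weibull_power_density(k, c, rho)
    p_hist = histogram_power_density(speeds, rel_freqs, rho)
    return abs(p_weibull - p_hist) / p_hist * 100.0
```

Because the air density appears in both estimates, it cancels in the percentage deviation; it only matters when the absolute power densities are reported.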
PSO-4 algorithm validation
To validate the suitability of the proposed method, two artificial random data series were created, each representing different wind conditions commonly observed in Morocco. The first series corresponds to the wind conditions typically found in the southern regions, characterized by a shape factor (k) of 5 and a scale factor (c) of 13. The second series represents wind conditions prevalent along the northern coast with a shape factor (k) of 3 and a scale factor (c) of 8. The Weibull curves were then generated using the PSO-4 algorithm model and evaluated using the RMSE, MAE, and R2.
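Artificial series of this kind can be drawn with a few lines of Python. The sketch below is illustrative only: the random generator, the seeds, and the helper names are assumptions, and the study does not state how its series were produced.

```python
import math
import random


def synthetic_wind_series(k, c, n, seed=0):
    """Draw n wind speeds from a Weibull distribution with shape k and scale c."""
    rng = random.Random(seed)
    # random.weibullvariate takes the scale first, then the shape.
    return [rng.weibullvariate(c, k) for _ in range(n)]


def weibull_mean(k, c):
    """Theoretical mean of the Weibull distribution, c * Gamma(1 + 1/k)."""
    return c * math.gamma(1.0 + 1.0 / k)


# One series per wind regime, using the (k, c) pairs quoted in the text.
south = synthetic_wind_series(k=5.0, c=13.0, n=117_842, seed=1)
north = synthetic_wind_series(k=3.0, c=8.0, n=117_842, seed=2)
print(sum(south) / len(south), weibull_mean(5.0, 13.0))
print(sum(north) / len(north), weibull_mean(3.0, 8.0))
```

Comparing the sample mean with the theoretical mean is a quick sanity check that the shape and scale arguments were passed in the right order.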
Each wind data series is generated from one (k, c) pair and comprises a total of 117,842 wind speed values. The number of values was determined based on the guidelines of IEC 61400 Part 12-1, which stipulates the use of 1 year of integrated data, collected at 10-minute intervals, as the minimum period for characterizing the wind, applied here to Morocco's southern and northern regions. The results were evaluated using the Weibull curves obtained through the adjustment process together with the logarithmic inversion approach. This approach transforms the velocity data into an upward curve by applying the logarithm twice to the function, Log[-Log(F(v))], which makes it easier to judge how well the curve fits the data points.
Figure 3(a) and (b) show the measured wind speed data for the pair (k = 3, c = 8). In Figure 3(a), the solid blue line depicts the measured Weibull PDF for the specified parameters, while the dashed red line represents the Weibull PDF estimated using the PSO-4 algorithm. The close alignment between the measured and estimated PDFs indicates the accuracy of the PSO-4 method in capturing the underlying distribution features. The logarithmic inversion curve in the right subplot (Figure 3(b)) offers an alternative perspective on the wind speed distribution, complementing the traditional PDF representation with a view of the inverse cumulative distribution function.
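The double-logarithm transform can be sketched as follows. Assuming F(v) denotes the cumulative distribution function, the usual linearizing form is ln(−ln(1 − F(v))) = k·ln(v) − k·ln(c), so a Weibull sample plotted this way falls on a straight line of slope k. The plotting position (rank − 0.5)/n and the helper names are assumptions chosen for illustration.

```python
import math
import random


def log_inversion_points(speeds):
    """Return (ln v, ln(-ln(1 - F(v)))) pairs from the empirical CDF."""
    ordered = sorted(v for v in speeds if v > 0)
    n = len(ordered)
    points = []
    for rank, v in enumerate(ordered, start=1):
        cdf = (rank - 0.5) / n  # keeps F strictly inside (0, 1)
        points.append((math.log(v), math.log(-math.log(1.0 - cdf))))
    return points


def fit_line(points):
    """Ordinary least-squares slope and intercept through the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Usage: the slope approximates k and exp(-intercept / slope) approximates c.
rng = random.Random(42)
sample = [rng.weibullvariate(8.0, 3.0) for _ in range(20000)]
slope, intercept = fit_line(log_inversion_points(sample))
print(slope, math.exp(-intercept / slope))
```

How closely the points follow a straight line gives a visual check of the Weibull fit that is easier to read than overlaid PDFs, especially in the tails.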

Comparison of measured and estimated wind speed distributions using PSO-4 for different shape and scale factors: (a), (c) Weibull curve; (b), (d) logarithmic inversion curve.
Figure 3(c) and (d) present the same comparison for a different set of parameters (k = 5, c = 13). The measured and estimated PDFs continue to align closely, demonstrating the robustness of the PSO-4 algorithm across diverse parameter values. The histograms again portray the distribution characteristics, while the logarithmic inversion curve (Figure 3(d)) shows the inverse cumulative distribution function specific to this pair. The statistical test results applied in the validation of the generated data are presented in Table 2.
Statistical analysis of pairs (k = 3, c = 8) and (k = 5, c = 13) using PSO-4 model.
Table 2 summarizes the outcomes of employing the PSO-4 algorithm to estimate the Weibull distribution parameters for the two artificial random data series, represented by the pairs (k = 3, c = 8) and (k = 5, c = 13).
The performance metrics used to evaluate the effectiveness of the PSO-4 algorithm in parameter estimation include RMSE, MAE, R2, and WPD. These metrics collectively offer insights into the accuracy and reliability of the estimated parameters.
The results indicate highly accurate parameter estimation by the PSO-4 algorithm, as reflected in the remarkably small values for RMSE and MAE (on the order of 0.0001). The proximity of these metrics to zero suggests a minimal difference between predicted and observed values, demonstrating the algorithm’s precision. Furthermore, the high R2 values (0.9936 and 0.9938) signify a strong fit of the model to the data, indicating that the PSO-4 method successfully captures the underlying patterns in the artificial random data series.
The WPD metric, which specifically measures the deviation of predicted wind production from actual values, further supports the effectiveness of the PSO-4 algorithm. The relatively low WPD values (0.5294 and 0.7402) suggest a reasonable level of accuracy in predicting wind production, reinforcing the algorithm’s suitability for parameter estimation in this context.
The table’s outcomes affirm the proficiency of the proposed PSO-4 method in accurately estimating Weibull distribution parameters for the given artificial random data series. The combination of small RMSE and MAE values, high R2 values, and reasonable WPD values collectively underscores the reliability and suitability of the PSO-4 algorithm for this specific application in the context of wind production data.
Results and discussion
Data preprocessing for wind site analysis
In this research, the Tarfaya and Tangier wind farm locations have been identified as potential sites for wind energy generation. The Tarfaya wind farm is situated 20 km southeast of Tarfaya town in southern Morocco (Latitude: 27°56′8″ Longitude: −12°55′7.3″) at an elevation of 80 m above ground level. The Tangier wind farm was developed across two zones in northern Morocco (Latitude: 35°38′50.2″ Longitude: −5°36′29.9″): the first zone is in Dhar Saadane, located 22 km southeast of Tangier, while the second zone is positioned 12 km east of Tangier, also at an elevation of 80 m above ground level. Table 3 lists the energy characteristics of the wind farms and the operational data of the wind turbines considered in this study for both sites. The hourly wind speed time series for the year 2022 at the Tarfaya and Tangier wind farm sites were obtained from a meteorological station. Descriptive statistics for these data can be found in Table 4.
Wind farm characteristics.
Statistics analysis of wind speed data of sites.
The data from each location were divided into intervals of 1 m/s. For each interval, the wind velocity was required to be higher than the lower value and less than or equal to the upper value, except for the first interval, which included velocities between 0 and 1 m/s. The number of data points within each interval was then counted and divided by the total number of data points, giving a relative frequency value for each interval. The Tarfaya and Tangier weather stations validate the data without altering the databases: rather than identifying and eliminating data considered invalid during the process, they only flag data as suspicious, leaving the user to decide whether to use them. Initially, the total data collected for the Tarfaya wind farm and the Tangier wind farm were 117,993 and 117,842 values, respectively. After the treatment, the total number of data points considered was reduced to 117,820 and 117,706, resulting in a utilization rate of 99.91% and 99.76%, respectively.
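The binning rule described above can be sketched in Python as follows. The function name and the handling of empty input are assumptions for illustration, and wind speeds are assumed to be non-negative.

```python
import math


def relative_frequencies(speeds, bin_width=1.0):
    """Relative frequency per (lower, upper] interval; the first interval also takes v == 0."""
    if not speeds:
        return []
    counts = {}
    for v in speeds:
        # ceil(v / width) - 1 maps (0, 1] to bin 0, (1, 2] to bin 1, and so on.
        index = max(0, math.ceil(v / bin_width) - 1)
        counts[index] = counts.get(index, 0) + 1
    n_bins = max(counts) + 1
    total = len(speeds)
    return [counts.get(i, 0) / total for i in range(n_bins)]


# Usage: six readings spread over four 1 m/s intervals.
print(relative_frequencies([0.0, 0.5, 1.0, 1.2, 2.0, 3.7]))
```

A reading of exactly 1.0 m/s falls in the first interval and a reading of exactly 2.0 m/s in the second, matching the "less than or equal to the upper value" rule.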
PSO-4 compared to traditional and heuristic methods
To assess the robustness of our PSO-4 algorithm for estimating Weibull parameters, we conducted a comparative analysis with three heuristic methods developed by Freitas de Andrade et al. (2019): harmony search (HS), cuckoo search optimization (CSO), and ant colony optimization (ACO). These methods depend on specific parameter sets, and adjusting these parameters is crucial to reducing computation time and achieving convergence to optimal values. The parameters used in this study, detailed in Table 5, were taken from the paper in which these methods were originally employed.
Parameters applied to the heuristic approaches.
The parameters extracted from the paper (Freitas de Andrade et al., 2019) were used to select the objective function that exhibited the best performance for each method (see equation (20)). Subsequently, the PSO-4 method was compared with the CSO, ACO, and HS methods to determine the most effective optimization approach in the two regions under study. Fitting performance was evaluated using the RMSE, MAE, and R2 tests, alongside an examination of the deviation in the WPD forecast.
Figures 4 and 5 display the Weibull curves (a) and logarithmic inversion curves (b) generated by the heuristic methods CSO, ACO, and HS of Freitas de Andrade et al. (2019), along with our PSO-4 model. The curves are plotted against the probability function f(v) obtained from the hourly time series data of the Tarfaya and Tangier wind farms in 2022. The observed outcomes collectively support the superiority of heuristic methods over their analytical counterparts, as validated by statistical tests.

Tarfaya wind farm: (a) Weibull curve and (b) logarithmic inversion curve.

Tangier wind farm: (a) Weibull curve and (b) logarithmic inversion curve.
Furthermore, the analysis emphasizes the strong performance of the PSO-4 algorithm. It stands out for its accuracy in estimating Weibull parameters and for minimizing square errors, yielding the most favorable outcomes among the heuristic methods compared. This dual proficiency underscores the robustness of the PSO-4 algorithm as a tool for precise Weibull parameter estimation. In essence, the study advocates a preference for heuristic methods, with the PSO-4 algorithm emerging as a particularly promising approach. The corresponding results from these tests are provided in Tables 6 and 7.
Conventional and heuristic method results for Tarfaya wind farm.
Table 6 provides an in-depth analysis of various conventional and heuristic methods applied to estimate the k and c parameters in the Weibull distribution, using wind data from the Tarfaya wind farm. Each approach undergoes a thorough evaluation using essential statistical metrics, including RMSE, MAE, R2, WPD, and computational time.
The GM demonstrates relatively modest accuracy in estimating k and c. While its RMSE (0.018045), MAE (0.014590), and R2 (0.949850) suggest a strong correlation with the actual data, the relatively high WPD (2.156045) indicates a noticeable deviation in the predicted wind production. GM may therefore capture the general trend while struggling to predict wind production accurately.
MM, MLM, and MMLM exhibit comparable performances, featuring low RMSE and MAE values and robust correlations (R2) between predicted and actual data. Significantly, MM stands out as the most efficient with a computational time of 97 seconds. PDM and empirical approaches (LEM, JEM, LSM) deliver moderate accuracy, each yielding unique parameter estimates. Competitive results in RMSE, MAE, and R2 are observed, but computational times vary, with LEM having the longest at 131 seconds. This suggests that while these approaches provide reasonable accuracy, they might be computationally more intensive. AMLM falls within the mid-range for accuracy, displaying slightly higher RMSE and MAE values compared to some empirical methods. Its longer computational time of 120 seconds distinguishes it in this evaluation. AMLM’s accuracy might be compromised to some extent for the sake of a more elaborate computational process.
Optimization algorithms, including ACO, HS, and CSO (Freitas de Andrade et al., 2019), along with our novel PSO-4 model, demonstrate exceptional accuracy with remarkably low RMSE and MAE values. PSO-4 stands out as particularly efficient, achieving high accuracy with a short computational time of 76 seconds. While HS, CSO, and ACO (Freitas de Andrade et al., 2019) also exhibit strong performances, their computation times vary. The choice between these optimization algorithms could depend on the trade-off between accuracy and computational efficiency. Among all approaches, PSO-4 emerges as the most promising, offering the highest accuracy with the lowest RMSE and MAE values, a robust correlation (R2 value of 0.993745), and the shortest computation time at just 76 seconds. This underscores the effectiveness of PSO-4 in accurately estimating Weibull distribution parameters for the Tarfaya wind farm while maintaining computational efficiency.
Table 7 presents a comprehensive comparison of the conventional and heuristic methods employed to estimate the parameters (k and c) of the Weibull distribution applied to wind speed data from the Tangier wind farm. Each approach is assessed based on the estimated parameters, RMSE, MAE, R2 value, WPD, and computational time in seconds.
Conventional and heuristic method results for Tangier wind farm.
The PSO-4 method emerges as a standout performer, with the lowest RMSE at 0.002936 and the smallest MAE at 0.000881, underscoring its accuracy and precision in parameter estimation. Furthermore, PSO-4 exhibits the highest R2 value of 0.992071, signifying a very close fit to the wind speed data. Its low WPD of 0.119773 indicates a small deviation between predicted and observed wind production. Importantly, PSO-4 achieves this performance with the lowest computational time, 76 seconds.
In comparison, alternative optimization algorithms such as ACO, HS, and CSO (Freitas de Andrade et al., 2019) exhibit competitive performances across various metrics, highlighting their robustness in parameter estimation. ACO, in particular, stands out for its exceptionally low RMSE, indicating its effectiveness in accurately estimating Weibull distribution parameters. HS and CSO demonstrate comparable performance, underscoring their viability as alternatives for parameter estimation.
On the other hand, traditional methods such as the GM, MM, MLM, MMLM, PDM, LEM, JEM, LSM, and AMLM demonstrate lower effectiveness, particularly in terms of accuracy and fit. The detailed assessment of these traditional methods highlights their limitations compared to optimization algorithms in capturing the intricacies of wind speed data.
The analysis emphasizes the trade-off between accuracy and computational efficiency, ultimately highlighting PSO-4 as the optimal method for precise and efficient parameter estimation in the Weibull distribution for wind speed data at the Tangier wind farm.
Tables 8 and 9 present statistical results for the convergence characteristics of different optimization algorithms applied to the Tarfaya and Tangier wind farms in 2022. The convergence characteristics are assessed using standard deviation (STD), standard error (STE), best, worst, and mean values. The algorithms under consideration include CSO, HS, ACO, and PSO-4.
Convergence characteristics statistical results: Tarfaya wind farm, 2022.
For the Tarfaya wind farm (Table 8), the CSO algorithm exhibits a moderate STD of 5.14E-08 and STE of 6.38E-09, indicating reasonably stable convergence. HS demonstrates slightly better stability with a lower STD of 3.71E-08 and STE of 4.17E-09. ACO stands out with remarkably low STD (1.88E-10) and STE (5.21E-11), indicating highly stable convergence. PSO-4, however, outperforms all with an exceptionally low STD of 1.25E-13 and STE of 4.77E-14, showcasing unparalleled stability. In terms of best, worst, and mean values, PSO-4 consistently achieves the best convergence.
For the Tangier wind farm (Table 9), CSO exhibits a moderate STD of 4.38E-08 and STE of 5.23E-09, indicating stable convergence. HS shows slightly better stability with lower STD (2.41E-08) and STE (3.58E-09). ACO again demonstrates exceptional stability with remarkably low STD (1.15E-10) and STE (4.87E-11). PSO-4, as observed in Tarfaya, excels with extremely low STD (1.04E-13) and STE (4.12E-14). In terms of best, worst, and mean values, PSO-4 consistently achieves the best convergence.
Convergence characteristics statistical results: Tangier wind farm, 2022.
Across both wind farm locations, PSO-4 consistently emerges as the most effective algorithm, demonstrating superior stability and efficiency with the lowest STD, STE, and mean convergence values.
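Convergence statistics of this kind can be computed from the final objective values of repeated runs, as in the Python sketch below. It assumes the sample standard deviation (n − 1 denominator) and STE = STD/√n, which are common conventions but are not stated in the text. The function name and the input values are placeholders.

```python
import math
import statistics


def convergence_summary(final_errors):
    """Summarize the final objective values of repeated optimization runs."""
    n = len(final_errors)
    std = statistics.stdev(final_errors)  # sample standard deviation (assumption)
    return {
        "best": min(final_errors),
        "worst": max(final_errors),
        "mean": statistics.mean(final_errors),
        "std": std,
        "ste": std / math.sqrt(n),  # standard error of the mean (assumption)
    }


# Usage with placeholder values, not results from the study.
print(convergence_summary([1.0, 2.0, 3.0, 4.0]))
```

A lower STD across runs means the algorithm reaches nearly the same objective value regardless of its random initialization, which is the sense of stability used in the comparison.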
Conclusion
This research has provided a thorough assessment of the PSO-4 algorithm for estimating the scale and shape factor parameters of the Weibull distribution model in the context of wind energy forecasting. The study focuses on real wind data obtained from the Tarfaya and Tangier wind farms in Morocco, utilizing hourly time series data from the year 2022. This investigation utilizes a novel square frequency error objective function to minimize errors between the Weibull curve and observed frequency in a data histogram. It employs a two-stage training process that combines RLS estimation and PSO fine-tuning to enhance parameter accuracy. Parameter sensitivity analysis identifies optimal PSO configurations, with the PSO-4 model exhibiting superior performance.
The effectiveness of the proposed PSO-4 method was rigorously validated using artificial data series, confirming its accuracy in estimating Weibull parameters, especially with specified parameter pairs (k = 3, c = 8) and (k = 5, c = 13). A comparison between measured and estimated PDFs demonstrates a close alignment, affirming the algorithm’s ability to capture underlying distribution features. Statistical metrics such as RMSE, MAE, R2, and WPD further support the effectiveness of the PSO-4 model in accurately estimating Weibull distribution parameters (k and c) under varying wind conditions.
Moreover, a comparative analysis utilizing real hourly time series data from the year 2022 with conventional methods (GM, MM, MLM, MMLM, PDM, LEM, JEM, LSM, AMLM), and three heuristic optimization methods, including HS, CSO, and ACO as presented in Freitas de Andrade et al. (2019), highlights the superior performance of the PSO-4 algorithm in terms of accuracy and computational efficiency. The results from the Tarfaya and Tangier wind farms consistently show that PSO-4 outperforms other optimization algorithms, providing the most accurate parameter estimates with the lowest RMSE and MAE values and a high R2 value.
The convergence characteristics analysis reinforces the stability and efficiency of the PSO-4 algorithm, with exceptionally low STD and STE values. In both wind farm locations, PSO-4 consistently achieves the best convergence with the lowest STD, STE, and mean values.
The comprehensive evaluation of the PSO-4 algorithm in this study underscores its reliability, accuracy, and efficiency in estimating Weibull distribution parameters for wind speed data for Tarfaya and Tangier wind farms in Morocco. The results suggest that the PSO-4 algorithm is a promising and effective tool for applications in the field of wind energy production, offering a valuable contribution to the accurate modeling of wind conditions in specific regions.
Footnotes
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
