Hybrid principal component analysis and artificial neural networks method to improve the seismic response prediction

Abstract

Seismic vulnerability assessment is crucial for ensuring the structural safety of buildings, particularly in earthquake-prone regions. While Nonlinear Time History Analysis (NLTHA) provides high accuracy, its computational demands make it impractical for rapid assessments. Machine learning (ML) models, especially Artificial Neural Networks (ANN), offer an efficient alternative but often require large datasets for reliable predictions. This study introduces a hybrid Principal Component Analysis-Artificial Neural Network (PCA-ANN) model to enhance seismic response prediction by reducing input dimensionality while preserving critical information. A dataset of over one million seismic responses was generated using NLTHA on randomly generated reinforced concrete (RC) frame buildings subjected to various ground motions, and three RC frame buildings were used as case studies. Comparative analysis between PCA-ANN and conventional ANN models reveals that PCA-ANN significantly improves both predictive accuracy and computational efficiency. The PCA-ANN model achieved a correlation coefficient (R²) of 99.1% and reduced Mean Squared Error (MSE) by 87% compared to the standalone ANN. Additionally, PCA-ANN maintained robust performance with limited dataset sizes, achieving an R² above 75% using only 25% of the dataset, whereas ANN failed under similar conditions. Further validation through Incremental Dynamic Analysis (IDA) and fragility curves shows that PCA-ANN exhibits discrepancies below 2% compared to NLTHA. The model also achieves the lowest Root Mean Square Error to Standard Deviation Ratio (RSR) (18%, 21%, and 24% for low-, mid-, and high-rise buildings, respectively) and the lowest Percentage Bias (PBias) (1.7%, 0.5%, and 0.3% for the same building types) when utilizing the full dataset. These results highlight PCA-ANN’s superior reliability across varying structural heights and dataset sizes. This study demonstrates that PCA-ANN is an efficient and accurate tool for seismic risk assessment, reducing computational costs while maintaining predictive reliability.

Keywords

nonlinear time history analysis; artificial neural networks; principal component analysis; incremental dynamic analysis

Introduction

Performance-based design is a sophisticated approach to seismic design. It is based on estimating the behavior of the structure under seismic loads, which makes it possible to predict the potential structural damage resulting from a seismic event (Benazouz et al., 2017; Dukes et al., 2018; Zhang et al., 2022). It entails comparing the seismic demand to the building’s capacity and assuring its functionality during and after an earthquake. The accuracy of the estimation depends on the approach used. Nonlinear Time History Analysis (NLTHA) is one of the many techniques used to analyze structural response to ground motions (GMs). It is based on the analytical and numerical solution of the differential equations of motion (Sun et al., 2023). Compared to Nonlinear Static Procedures (NSP) (Mebarki et al., 2024), NLTHA is a complicated procedure requiring extensive calculations.

Machine learning (ML) has emerged as a promising tool for rapid seismic response prediction. Once trained, these models can capture complex relationships between structural and seismic parameters, allowing for near-instantaneous predictions.

Lagaros and Fragiadakis (Lagaros and Fragiadakis, 2007) proposed an artificial neural network methodology to quickly assess the seismic demand of steel buildings from GM records. Giovanis et al. (Giovanis et al., 2016) developed an artificial neural network model to quickly generate the median IDA curve using a Monte Carlo approach for a case study involving a 9-story steel frame. They demonstrated that using more dataset samples during the training process can reduce the error between the actual and predicted values from 10% to 2.2%. Khojastehfar et al. (Khojastehfar et al., 2014) introduced an ANN-based method to construct fragility curves under a collapse damage state for a steel frame building. Mitropoulou and Papadrakakis (Mitropoulou and Papadrakakis, 2011) proposed a soft computing framework to estimate the fragility curves by predicting the seismic demand for four limit states, and used three 3D RC frame buildings as a case study. Benbokhari et al. (Benbokhari et al., 2023) used an ANN model to predict the seismic response of an equivalent single degree of freedom (ESDOF) system to enhance the target displacement prediction. That study compared the proposed model with existing approaches adopted by codes such as FEMA-356 (2000), FEMA-440 (FEMA-440, 2009), and ATC-40 (Aabbas and Jarallah, 2021). The results demonstrated the capability of the proposed model relative to the existing approaches for seismic response prediction, with a mean absolute error (MAE) below 0.005.

This type of ML can be used in several fields. For example, Mangalathu et al. (Mangalathu et al., 2018) estimated the vulnerability of skewed concrete bridges in California using the ANN, and Rachedi et al. (Rachedi et al., 2021) assessed the seismic risk of existing bridges considering the soil interaction using the ANN. Asgarkhani et al. (Asgarkhani et al., 2024) used a machine learning model to predict the maximum inter-story drift ratio (MIDR) and the roof drift ratio (RDR) for steel-braced frames, achieving performance levels of 98.7% and 93.5%, respectively. Kazemi et al. (Kazemi et al., 2023) proposed an ML model to predict the seismic MIDR of reinforced concrete buildings; 94,400 samples of RC buildings were used to train the model, and the best ML model achieved a performance of 96.3%. Harirchian et al. (Harirchian et al., 2020) used a classification ML model to assess the seismic hazard safety of RC buildings, and the best ML model reached a performance of 68% in classifying the damage class. Tang et al. (Tang et al., 2022) proposed an ML classification model to rapidly assess the seismic damage classes of RC buildings, achieving 96.8% performance.

In addition, several other works demonstrate that the ANN can be used as an alternative to current numerical approaches, especially in earthquake engineering (Abdellatif et al., 2024; Barkhordari and Jawdhari, 2023; Derakhshani and Foruzan, 2019; Hou et al., 2024; Kazemi and Jankowski, 2023; Li et al., 2022; Liao et al., 2023; Noureldin et al., 2022; Rojas-Mercedes et al., 2022; Sajan et al., 2023; Wen et al., 2022; Zhang et al., 2024).

However, the training time of ANNs can be a significant issue, especially when analysts are working with large datasets and during the tuning of hyperparameters. Additionally, the size of the dataset can impact the complexity of the ANN, potentially leading to decreased performance (Khan et al., 2021; Lu et al., 2021; Qi et al., 2022). Principal component analysis (PCA) is one solution for reducing the dimensionality of the input features of an ANN. Various research studies have been conducted using this hybrid approach, which combines supervised and unsupervised learning. Asencio-Cortés et al. (Asencio-Cortés et al., 2015) utilized PCA to enhance earthquake prediction in Chile. The study found that PCA significantly improved machine learning performance, increasing accuracy from 57%, 71%, and 65.80% to 69%, 74%, and 75%, respectively. Abbasi et al. (Abbasi et al., 2023) estimated the settlement of dams caused by earthquakes using a combination of ANN, PCA, and wavelet-artificial neural networks, and they reduced the dimensionality from eight to five using PCA.

Existing literature demonstrates that hybrid learning significantly influences machine learning (ML) models in terms of accuracy, neural network complexity, and training efficiency. However, there is a critical gap in understanding its impact on seismic response prediction. While previous studies have explored ML applications in earthquake engineering, they have not comprehensively examined how hybrid learning techniques, particularly the integration of Principal Component Analysis (PCA) with Artificial Neural Networks (ANN), can enhance seismic response predictions.

This study addresses this gap by investigating the effect of hybrid learning on predicting seismic responses, particularly how it improves accuracy while reducing the required dataset size. The research aims to determine whether hybrid learning can achieve reliable seismic response predictions with a limited dataset, addressing challenges related to data availability and computational cost in earthquake engineering applications.

To achieve this, a dataset will be generated using OpenSees software through nonlinear time-history analyses (NLTHA) on randomly selected reinforced concrete (RC) framed structures subjected to various ground motions (GMs). The dataset will consist of one million samples, which will be used to train two ML models for predicting the maximum inter-story drift ratio (MIDR). Additionally, the study will evaluate the influence of dataset size by training the models with 25%, 50%, 75%, and 100% of the original dataset. This analysis will provide insights into the effectiveness of hybrid learning in seismic response prediction and its potential to optimize data efficiency without compromising accuracy.

Methodology

The proposed hybrid learning approach integrates both unsupervised and supervised learning to reduce the input features. As illustrated in Figure 1, a dataset will be generated using the Finite Element Method (FEM) in OpenSees software.

Figure 1.

The proposed methodology for the structural seismic response prediction.

Initially, a nonlinear time history analysis (NLTHA) will be conducted on randomly selected reinforced concrete (RC) framed structures. The input features, including structural and ground motion (GM) parameters and the output (Maximum Inter-story Drift Ratio, MIDR), will be stored in a single file. Following this, Principal Component Analysis (PCA) will be performed to align the dataset along principal component axes. These new principal components (PCs) will serve as the input layer for the artificial neural network (ANN) model.

For the comparison study, a separate ANN model will be trained using the original input features (structural and GM parameters). The comparison will include analyses of the Incremental Dynamic Analysis (IDA) curves, fragility curves, and the effects of dataset size on the results.

Dataset generation

The performance of any ML algorithm depends directly on the collected or generated dataset. Its quality and size directly affect the predictive capability of the model. For this work, OpenSees is used to perform more than 1 million NLTHA runs using 80 GMs. The structures are generated randomly from a selection range of structural geometric characteristics, as shown in Table 1 and illustrated in Figure 2. The random selection must also be practical, that is, consistent with engineering design concepts:

• The height of the first story (He) is always greater than or equal to the height of the subsequent stories (Hs).

• The depth (h1) of the beam is always greater than or equal to its width (b1).

Table 1.

Geometric parameters and interval values for each input.

Symbol	Parameter	Range
N_b	Number of bays	1–9
N_s	Number of stories	1–20
L_b	Bay length (m)	3.5, 4, 4.5, 5, 5.5, 6
H_s	Story height (m)	3, 3.2, 3.4, 3.6, 3.8, 4
H_e	First-story height (m)	3, 3.2, 3.4, 3.6, 3.8, 4
h, b	Length and width of columns	[30 cm–100 cm]; step = 5 cm
h₁, b₁	Depth and width of beams	[25 cm–70 cm]; step = 5 cm
As_column	Reinforcement area of columns	0.8%–6% of gross area
As_beam	Reinforcement area of beams	0.3%–2% of gross area

Figure 2.

The structural geometry and material models of a RC-framed building.

The heatmap shown in Figure 3 presents the correlation matrix of the dataset, illustrating the relationships between the variables. The diagonal values are all 1, indicating a perfect correlation of each variable with itself. There are strong correlations observed between PGV and PGD (0.94), Ecum and Is (1.00), and CAV and Is (0.96), suggesting redundancy among these variables. Moderate correlations are noted between the output and Ns (0.55), Sa(T1,%) (0.54), and PGA (0.39), indicating their potential influence on the target variable. Moreover, Sa(T1,%) exhibits a strong correlation with PGA (0.72), which may imply a dependency. Most other variables display weak correlations with the output, suggesting they may not significantly impact predictive modeling. The high correlations among certain inputs could lead to multicollinearity, which may affect the performance of regression models. Therefore, identifying and potentially removing redundant variables could enhance model efficiency.

Figure 3.

Correlation heatmap between input and output features.

It is important to note that the values of the input features were sampled randomly and uniformly so that all values had an equal probability of being selected.
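The sampling described above can be sketched as follows. This is a minimal illustration, not the generator used in the study: the discrete ranges are taken from Table 1, while the function name `sample_frame`, the dictionary keys, and the use of rejection sampling to enforce the two design rules are assumptions made for the example.

```python
import random

# Discrete ranges taken from Table 1 (lengths in metres, section sizes in cm).
BAY_LENGTHS = [3.5, 4.0, 4.5, 5.0, 5.5, 6.0]
STORY_HEIGHTS = [3.0, 3.2, 3.4, 3.6, 3.8, 4.0]
COLUMN_SIZES = list(range(30, 101, 5))  # 30 cm to 100 cm, step 5 cm
BEAM_SIZES = list(range(25, 71, 5))     # 25 cm to 70 cm, step 5 cm


def sample_frame(rng):
    """Draw one candidate RC frame and redraw until both design rules hold."""
    while True:
        frame = {
            "N_b": rng.randint(1, 9),    # number of bays (inclusive bounds)
            "N_s": rng.randint(1, 20),   # number of stories
            "L_b": rng.choice(BAY_LENGTHS),
            "H_s": rng.choice(STORY_HEIGHTS),
            "H_e": rng.choice(STORY_HEIGHTS),
            "h": rng.choice(COLUMN_SIZES),
            "b": rng.choice(COLUMN_SIZES),
            "h1": rng.choice(BEAM_SIZES),
            "b1": rng.choice(BEAM_SIZES),
            # Reinforcement expressed as a fraction of the gross section area.
            "rho_column": rng.uniform(0.008, 0.06),
            "rho_beam": rng.uniform(0.003, 0.02),
        }
        # Rule 1: first-story height >= height of the other stories.
        # Rule 2: beam depth >= beam width.
        if frame["H_e"] >= frame["H_s"] and frame["h1"] >= frame["b1"]:
            return frame


rng = random.Random(42)  # fixed seed only to make the example repeatable
frames = [sample_frame(rng) for _ in range(1000)]
```

Rejecting and redrawing infeasible candidates keeps the draw uniform over the feasible combinations; sampling the constrained variables conditionally would also satisfy the rules but would weight the combinations differently.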

Selection of ground motions

The selection of ground motions in the IDA approach is a crucial step, as it provides the simulated motion representative of actual earthquake events (Wasti and Özcebe, 2003). The selection process is influenced by several factors, including intensity, frequency content, duration, amplitudes, and target response spectra. An appropriate selection helps the analyst avoid errors that may arise from insufficient stimulation of the ground motions (GMs) (Chen and Yi, 2015). Additionally, an ANN requires a diverse set of ground motions for effective training. By capturing patterns within the data, the ANN learns from various examples, which enhances its ability to develop robust models and respond to complex inputs. A wider range of data increases the likelihood that the ANN will generalize its findings and accurately represent different types of ground motion. In this case, 80 ground motion records were selected and matched from the PEER database (Center, 2013); they are listed in Table A1. Figure 4 presents the selected ground motions from the PEER database. Figure 4(a) illustrates the target seismic response spectrum along with the 16th, 50th, and 84th percentiles of the response spectra for the selected ground motions. Figures 4(b)–(d) show the relationships between earthquake magnitude and rupture distance, magnitude and shear wave velocity, and shear wave velocity and rupture distance, respectively. These figures highlight the variability of the ground motions chosen to train the machine learning model.

Figure 4.

Ground motion selection from the PEER database: (a) selected response spectra scaled to a target response spectrum, (b) magnitude (Mw) versus the closest distance to the rupture (R_rup), (c) magnitude versus shear wave velocity (V_s30), and (d) R_rup versus V_s30.

Seven characteristics are used to describe these ground motions, allowing the ANN to differentiate between the effects of each motion, even if they share the same peak ground acceleration. Table 2 summarizes the investigated intensity measures (IMs) and their definitions.

Table 2.

The selected GM parameters.

IM name	Symbol	Equation
Peak ground acceleration	$PGA$	$= \max[\mathrm{acceleration}(t)]$
Peak ground velocity	$PGV$	$= \max[\mathrm{velocity}(t)]$
Peak ground displacement	$PGD$	$= \max[\mathrm{displacement}(t)]$
The cumulative energy of a ground motion	$E_{cum}$	$= \int_{0}^{t_{end}} a(t)^{2}\,dt$ [30]
Arias intensity	$I_A$	$= \frac{\pi}{2g} \int_{0}^{T_d} a(t)^{2}\,dt$ [31]
Cumulative absolute velocity	$CAV$	$= \int_{0}^{T_{max}} |a(t)|\,dt$ (Campbell and Bozorgnia, 2023; DIF and Stambouli, 2023)
Spectral acceleration response of the first mode	$S_a(T_1, \xi = 5\%)$	= Maximum spectral acceleration at the first-mode period of vibration of the structure ($T_1$) for a damping ratio $\xi = 5\%$
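The record-based intensity measures of Table 2 can be evaluated numerically from a sampled accelerogram. The sketch below is illustrative only: the function names, the trapezoidal integration, the use of absolute values for the peaks, the assumption that all three integration limits span the full record, and the unit system (m/s² with g = 9.81 m/s²) are choices made for the example. The spectral acceleration S_a(T₁, ξ = 5%) is omitted because it requires the structural period and an SDOF response solver.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed unit system)


def cumulative_trapezoid(values, dt):
    """Running trapezoidal integral of a uniformly sampled signal, starting at 0."""
    out = [0.0]
    for prev, curr in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * (prev + curr) * dt)
    return out


def intensity_measures(acc, dt):
    """Compute six record-based intensity measures from an acceleration history.

    acc is a list of accelerations in m/s^2 sampled every dt seconds.
    """
    vel = cumulative_trapezoid(acc, dt)   # velocity history
    disp = cumulative_trapezoid(vel, dt)  # displacement history
    e_cum = cumulative_trapezoid([a * a for a in acc], dt)[-1]
    return {
        "PGA": max(abs(a) for a in acc),
        "PGV": max(abs(v) for v in vel),
        "PGD": max(abs(d) for d in disp),
        "E_cum": e_cum,
        "I_A": math.pi / (2.0 * G) * e_cum,
        "CAV": cumulative_trapezoid([abs(a) for a in acc], dt)[-1],
    }


# Synthetic one-second sine pulse, used only to exercise the function.
dt = 0.01
acc = [2.0 * math.sin(2.0 * math.pi * i * dt) for i in range(101)]
ims = intensity_measures(acc, dt)
```

Two records with the same PGA generally differ in E_cum, I_A, and CAV because these measures integrate over the whole duration, which is what lets the network tell such records apart.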

Principal component analysis (PCA)

The PCA is an unsupervised machine learning algorithm used for dimensionality reduction, feature selection, and data visualization. It is a statistical procedure that is based on converting a set of correlated data into linearly uncorrelated variables called principal components (PC) (Maćkiewicz and Ratajczak, 1993).

The number of principal components is less than or equal to the number of input features. The first PC has the largest possible variance, and each succeeding component has the highest variance possible under the constraint that it is orthogonal to the preceding components.

The primary objective of PCA is to identify the axes that exhibit the largest variances and provide the most informative features of the data. This process involves transforming the original dataset into a new dataset with fewer dimensions. Figure 5 illustrates the reorientation of the dataset from its original axes to the new principal component axes. By multiplying the standardized matrix by the matrix of eigenvectors, the new dataset is generated, with its size depending on the number of principal components selected.

Figure 5.

Dataset reorientation from the original axes using the Eigenvectors matrix.

Figure 6 shows the eigenvalue (EV) of each principal component together with the cumulative variability (CV). PC1 has the highest EV (4.861) and therefore the highest share of variability (27%). If a fixed threshold of 90% is applied, the dataset can be reduced to 11 dimensions (CV = 91.261%) while retaining most of the variability.
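The link between eigenvalues, variability shares, and the number of retained components can be illustrated with a short sketch. Only the first eigenvalue (4.861) comes from the text; the remaining 17 values are hypothetical placeholders, chosen so that they sum to 18 (the number of standardized input features) and reproduce the figures quoted above. The function names are likewise assumptions.

```python
def variability(eigenvalues):
    """Return the share (%) and cumulative share (%) explained by each PC."""
    total = sum(eigenvalues)
    shares = [100.0 * ev / total for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for share in shares:
        running += share
        cumulative.append(running)
    return shares, cumulative


def components_for_threshold(eigenvalues, threshold=90.0):
    """Smallest number of leading PCs whose cumulative share reaches the threshold."""
    _, cumulative = variability(eigenvalues)
    for k, cv in enumerate(cumulative, start=1):
        if cv >= threshold:
            return k
    return len(eigenvalues)


# First value from the text; all others are HYPOTHETICAL placeholders.
EIGENVALUES = [
    4.861, 2.6, 2.0, 1.5, 1.2, 1.0, 0.85, 0.75, 0.65, 0.55, 0.466,
    0.4, 0.35, 0.3, 0.25, 0.15, 0.09, 0.033,
]

shares, cumulative = variability(EIGENVALUES)
n_components = components_for_threshold(EIGENVALUES, threshold=90.0)
```

For standardized data the eigenvalues sum to the number of features, so the share of PC1 is 4.861 / 18, or about 27%, which matches the value reported for Figure 6.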

Figure 6.

The principal components versus the eigenvalues of each PC. The red line represents the cumulative variability (%) as a function of the number of PCs.

The difference between the ANN and the hybrid PCA-ANN lies in the inputs. The ANN uses the original input features as they are (18 input features), so its input dimension depends on the variables used. The PCA-ANN uses the eigenvectors to transform the original data and reduce its dimensionality, so the number of ANN inputs is less than or equal to the number of original variables (≤ 18).

Artificial neural networks for seismic response prediction

ANNs have been increasingly applied in civil and earthquake engineering to predict events and quantify seismic risk (Khan et al., 2021; Lu et al., 2021). Their ability to find the relation between inputs and outputs has made regression and classification tasks easier and more efficient in terms of performance (Chen et al., 2023; Shivani and Rooban, 2021).

In this section, two ANN models are used. The first is an ANN model trained using the generated dataset with 18 input features, including the structural characteristics and earthquake parameters. The second model uses PCA to reorient the generated dataset and takes the principal components as input features, as depicted in Figure 7. The performance of both models is investigated after optimizing the hyperparameters.

Figure 7.

PCA-ANN architecture for seismic response prediction.

The ANN is constructed through a series of sequential procedures, as shown in Figure 7. The first phase involves dataset preprocessing, which encompasses data-cleaning procedures such as the elimination of missing, duplicate, and infinite values. The input features are then normalized so that all inputs contribute on a comparable scale; here they are scaled between −1 and +1 (Al Shalabi and Shaaban, 2006).

The data will be divided into three distinct sets: training, testing, and validation, with proportions of 80%, 10%, and 10%, respectively. Cross-validation is essential at this stage to determine the average performance of the trained model, taking into account the effects of random shuffling and the selection of the training, testing, and validation data.
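The scaling and splitting steps can be sketched as follows. This is a minimal illustration under stated assumptions: min-max scaling is used as one common way to map a feature to the range −1 to +1, and the function names, the seed handling, and the rounding of the subset sizes are choices made for the example rather than details given in the text.

```python
import random


def scale_to_unit_range(column):
    """Min-max scale one feature column to the interval [-1, +1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant feature carries no information; map it to 0.
        return [0.0 for _ in column]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in column]


def split_indices(n_samples, seed=0, fractions=(0.8, 0.1, 0.1)):
    """Shuffle sample indices and split them into train, test, and validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(round(fractions[0] * n_samples))
    n_test = int(round(fractions[1] * n_samples))
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]
    return train, test, validation


scaled = scale_to_unit_range([3.0, 4.0, 6.0])
train, test, validation = split_indices(1000, seed=1)
```

Repeating `split_indices` with different seeds and averaging the resulting scores is one simple way to account for the effect of random shuffling mentioned above. In practice the scaling bounds are usually computed on the training subset only, so that no information from the test and validation subsets leaks into training.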

To determine the optimal hyperparameters for the ANN and PCA-ANN models, a series of training sessions were conducted where the correlation coefficient (R²) and Mean Squared Error (MSE) were calculated after each trial. The study explored the number of neurons (NN), the number of hidden layers (HL), and the activation functions for both models. Figure 8 illustrates the best combinations of [NN: HL] in terms of R² and MSE for each model. It is important to note that the most effective activation functions identified were the ReLU function for the hidden layers and the linear function for the output layer.

Figure 8.

The hyperparameter investigation for the ANN and PCA-ANN models: (a) Number of neurons and (b) number of hidden layers.

According to Figure 8, the best [NN: HL] combinations for the ANN and PCA-ANN models are [90:7] and [70:4], respectively. They correspond to the highest R² (91.9% and 99.26%) and the lowest MSE (4e-3 and 3e-4). Furthermore, the ANNs are trained with the Adam optimizer combined with the backpropagation (BP) algorithm, which proceeds in three phases: forward pass, backward pass, and weight update. Figure 9 illustrates the performance of the PCA-ANN approach after the training phase for the three subsets (training, testing, and validation), as well as the MSE for each iteration.

Figure 9.

The performance of the PCA-ANN model to predict the building’s response in terms of R² (a) testing, (b) training, (c) validation, and (d) The mean square error of each iteration.

Effect of hybrid learning on the prediction performance

The use of PCA for dimension reduction can have both positive and negative effects on the performance of ANNs. By capturing the principal components of a dataset, PCA can simplify the data, potentially speeding up the training process and improving performance in certain cases. Chen et al. (Chen et al., 2020) found that combining PCA and ANN improved the accuracy of aircraft cost estimation compared to using ANN alone. The previous section is consistent with these findings and indicated that hybrid learning can reduce the required network size (number of neurons and hidden layers), ultimately decreasing the training time from 5 minutes to 2 minutes per operation.

This section explores how the number of principal components (PC) can influence the performance of the ANN in terms of the R² and mean squared error (MSE). Figure 10 compares the mean R²—representing the average correlation coefficient for training, testing, and validation—as well as the MSE for each number of PC, against the ANN model that does not utilize PCA.

Figure 10.

The number of used principal components, along with the corresponding mean correlation coefficient R² and MSE.

Figure 10 illustrates that as the number of PCs increases, the R² value rises from 38% to 99.26%, while the MSE decreases from 0.47 to 0.000543. The best R² achieved with the ANN is 91.9%. Additionally, by using smaller hyperparameter values (NN: HL), the performance of the ANN can be enhanced, which leads to a significant reduction in training time. Furthermore, a higher performance than that of the ANN can be achieved with just 13 PCs, whereas the ANN requires substantially more time and more hidden layer units.
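The difference in network size between the two configurations can be made concrete by counting trainable parameters. The sketch below assumes that [NN: HL] means NN neurons in each of HL fully connected hidden layers, with one bias per neuron and a single output (MIDR); the text does not state these details. It also assumes 13 PCs as the PCA-ANN input size, taken from the sentence above, since the input size paired with [70:4] is not given explicitly.

```python
def count_parameters(n_inputs, neurons_per_layer, n_hidden_layers, n_outputs=1):
    """Count weights and biases of a fully connected feed-forward network."""
    sizes = [n_inputs] + [neurons_per_layer] * n_hidden_layers + [n_outputs]
    # Each layer has fan_in weights plus one bias per output neuron.
    return sum((fan_in + 1) * fan_out for fan_in, fan_out in zip(sizes, sizes[1:]))


ann_params = count_parameters(n_inputs=18, neurons_per_layer=90, n_hidden_layers=7)
pca_ann_params = count_parameters(n_inputs=13, neurons_per_layer=70, n_hidden_layers=4)

print(ann_params)      # 50941
print(pca_ann_params)  # 15961
```

Under these assumptions the PCA-ANN has roughly one third of the parameters of the standalone ANN, which is consistent with the shorter training time reported.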

Case study

Three RC frame buildings (low-, mid-, and high-rise) are selected as case studies to compare the hybrid learning (PCA-ANN) and ANN performance in terms of IDA, fragility assessment, and data size effect. The IDA and fragility curves obtained from PCA-ANN are compared to the numerical solutions (NLTHA). The accuracy is estimated using three statistical criteria: the correlation coefficient (R²), as written in equation (1) (Benesty et al., 2009); the Root Mean Square Error to Standard Deviation Ratio (RSR), as written in equation (2) (Alouache et al., 2019); and the percentage bias (PBIAS), as written in equation (3) (Moriasi et al., 2007).

R² is used to assess the degree of correlation between the actual data (NLTHA) and the predicted data. Higher R² values indicate a stronger correlation between the two sets of data. RSR measures the dispersion between the predicted values and the actual values. An RSR value of 0 signifies a perfect simulation with the lowest variability. PBias is used to evaluate the relationship between the predicted results and the actual values. A PBias value of 0 indicates a perfect match between the ANN (Artificial Neural Network) predictions and the NLTHA values.

$R^{2} = \left[\frac{\sum_{i=1}^{n} \left(MIDR_{ANN,i} - \overline{MIDR}_{ANN}\right)\left(MIDR_{NLTHA,i} - \overline{MIDR}_{NLTHA}\right)}{\sqrt{\sum_{i=1}^{n} \left(MIDR_{ANN,i} - \overline{MIDR}_{ANN}\right)^{2}} \times \sqrt{\sum_{i=1}^{n} \left(MIDR_{NLTHA,i} - \overline{MIDR}_{NLTHA}\right)^{2}}}\right]^{2}$

(1)

$RSR\,(\%) = 100 \times \left(\frac{\sqrt{\sum_{i=1}^{n} \left(MIDR_{ANN,i} - MIDR_{NLTHA,i}\right)^{2}}}{\sqrt{\sum_{i=1}^{n} \left(MIDR_{ANN,i} - \overline{MIDR}_{NLTHA}\right)^{2}}}\right)$

(2)

$PBIAS\,(\%) = \frac{\sqrt{\sum_{i=1}^{n} \left(MIDR_{ANN,i} - MIDR_{NLTHA,i}\right)^{2}}}{\sum_{i=1}^{n} MIDR_{NLTHA,i}} \times 100$

(3)

Where:

• $MIDR_{ANN}$: the MIDR predicted by the ANN model.

• $\overline{MIDR}_{ANN}$: the average value of the MIDR predicted by the ANN model.

• $MIDR_{NLTHA}$: the exact value of the MIDR obtained with OpenSees.

• $\overline{MIDR}_{NLTHA}$: the average of the exact MIDR values obtained with OpenSees.
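The three criteria can be evaluated with a few lines of code. The sketch below follows equations (1) to (3) as they are written here, with R² returned as a fraction and RSR and PBIAS in percent; the function names are assumptions. Note that Moriasi et al. (2007) define the RSR denominator from the observed values and PBIAS from a signed sum of differences, so adopting that convention would require small changes to the last two functions.

```python
import math


def mean(values):
    return sum(values) / len(values)


def r_squared(pred, obs):
    """Squared correlation between predictions and reference values, eq. (1)."""
    mp, mo = mean(pred), mean(obs)
    num = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    den = math.sqrt(sum((p - mp) ** 2 for p in pred)) * math.sqrt(
        sum((o - mo) ** 2 for o in obs)
    )
    return (num / den) ** 2


def rsr_percent(pred, obs):
    """RSR in percent, with the denominator as written in eq. (2)."""
    mo = mean(obs)
    num = math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)))
    den = math.sqrt(sum((p - mo) ** 2 for p in pred))
    return 100.0 * num / den


def pbias_percent(pred, obs):
    """PBIAS in percent, as written in eq. (3)."""
    num = math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)))
    return 100.0 * num / sum(obs)


# Illustrative MIDR values (%), not taken from the study.
midr_nltha = [0.5, 1.0, 1.5, 2.0]
midr_ann = [0.6, 1.0, 1.5, 2.0]
scores = (
    r_squared(midr_ann, midr_nltha),
    rsr_percent(midr_ann, midr_nltha),
    pbias_percent(midr_ann, midr_nltha),
)
```

The functions assume non-constant inputs; a constant prediction or reference vector would make a denominator zero and would need separate handling.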

Table 3 presents the statistical criteria for the performance evaluation of the selected parameters.

Table 3.

Statistical criteria for the performance evaluation (Annad and Lefkir, 2022).

Performance rating	R² (%)	PBIAS (%)	RSR (%)
Very good	75 < R² < 100	|PBIAS| < 10	0 < RSR < 50
Good	65 < R² < 75	10 < |PBIAS| < 15	50 < RSR < 60
Satisfactory	50 < R² < 65	15 < |PBIAS| < 25	60 < RSR < 70
Unsatisfactory	0 < R² < 50	|PBIAS| > 25	RSR > 70
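The rating thresholds of Table 3 translate directly into a lookup. In the sketch below all inputs are in percent; the function name, the returned labels, and the treatment of values that fall exactly on a boundary (assigned here to the better class, since the table uses strict inequalities on both sides) are assumptions.

```python
def rate_performance(r2, pbias, rsr):
    """Rate a model on each Table 3 criterion; all inputs are in percent."""
    labels = ["Very good", "Good", "Satisfactory", "Unsatisfactory"]

    def rate_higher_is_better(value, limits):
        # limits are descending lower bounds for the first three classes
        for label, limit in zip(labels, limits):
            if value >= limit:
                return label
        return labels[-1]

    def rate_lower_is_better(value, limits):
        # limits are ascending upper bounds for the first three classes
        for label, limit in zip(labels, limits):
            if value <= limit:
                return label
        return labels[-1]

    return {
        "R2": rate_higher_is_better(r2, (75.0, 65.0, 50.0)),
        "PBIAS": rate_lower_is_better(abs(pbias), (10.0, 15.0, 25.0)),
        "RSR": rate_lower_is_better(rsr, (50.0, 60.0, 70.0)),
    }


rating = rate_performance(r2=96.0, pbias=1.7, rsr=18.0)
```

With R² = 96%, PBIAS = 1.7%, and RSR = 18%, each criterion is rated "Very good".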

The characteristics of the buildings

To check the predictive capability of the ANN, three RC frame buildings are selected to perform the IDA using the NLTHA and the ANN method. Figure 11 and Table 4 show the elevation views and characteristics of these buildings, respectively.

Figure 11.

The geometrical characteristics of the studied buildings: (a) low-rise, (b) mid-rise, and (c) high-rise.

Table 4.

Characteristics of the Case Study Buildings (Low-, Mid-and High-rise RC Frame Buildings).

	Low-rise	Mid-rise	High-rise
N_S	3	6	12
N_B	4	4	4
H_S (m)	3.5	3.5	3.5
L_B (m)	5.2	5.2	5.2
b_column × h_column (cm²)	40 × 40	50 × 50	75 × 75
b_beam × h_beam (cm²)	40 × 30	40 × 30	50 × 30
A_s^Column (cm²)	18.375	39.25	112.5
A_s^Beam (cm²)	12.58	18	15

Impact of dataset size on machine learning performance

As noted by Alwosheel et al. (Alwosheel et al., 2018), the dataset size should be optimal; simply having more data does not always lead to better performance. The relationship between size and accuracy is complex. An increase in data can lead to overfitting or slower training if the data includes irrelevant or noisy features. Additionally, there are no established guidelines for determining the optimal dataset size for achieving the best performance.

This subsection explores how dataset size impacts the performance of Artificial Neural Networks (ANN) and PCA-ANN models. It examines four distinct datasets and builds four ANN models using 25%, 50%, 75%, and 100% of the generated dataset, respectively. The performance of both PCA-ANN and ANN will be evaluated using statistical criteria such as R², PBias, and RSR. Furthermore, the relative error—calculated using equation (4)—and the mean relative error between the predicted and median Incremental Dynamic Analysis (IDA) curves will also be assessed.

$\mathrm{Relative\ error}\,(\%) = 100 \times \left| \frac{MIDR_{NLTHA} - MIDR_{ANN}}{MIDR_{NLTHA}} \right|$

(4)
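Equation (4) and the mean relative error along a median IDA curve can be computed as in the following sketch. The function names and the sample values are assumptions made for illustration; the curves are assumed to be sampled at the same intensity levels and to contain no zero reference values.

```python
def relative_error_percent(midr_nltha, midr_ann):
    """Relative error of one prediction with respect to NLTHA, eq. (4)."""
    return 100.0 * abs((midr_nltha - midr_ann) / midr_nltha)


def mean_relative_error_percent(nltha_curve, ann_curve):
    """Average of eq. (4) over the points of two median IDA curves."""
    errors = [
        relative_error_percent(ref, pred)
        for ref, pred in zip(nltha_curve, ann_curve)
    ]
    return sum(errors) / len(errors)


# Illustrative median MIDR values (%) at increasing intensity levels.
median_nltha = [0.20, 0.45, 0.80, 1.30]
median_ann = [0.21, 0.44, 0.80, 1.31]
mre = mean_relative_error_percent(median_nltha, median_ann)
```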

In Figure 12(a)–(c), it is evident that the size of the dataset influences how the IDA median curve converges to the NLTHA solution. Additionally, when using 100% of the data, the predictions of the median curve align closely with the median IDA curves. Therefore, the larger the dataset used, the more accurate the results will be.

Figure 12.

Median IDA Curves and Relative Errors of the PCA-ANN Model and the NLTHA. (a) low-rise, (b) mid-rise, and (c) high-rise IDA curves; (d) low-rise relative errors, (e) mid-rise relative errors, and (f) high-rise relative errors.

Figure 12(d)–(f) provide valuable analytical insights. For a 100% dataset, the Mean Relative Error (MRE) is recorded as follows: 1.02% for low-rise structures, 0.36% for mid-rise structures, and 0.27% for high-rise structures. It’s important to note that the highest MRE occurs with the 25% dataset across all building categories. Furthermore, there is a positive correlation between dataset size and accuracy, highlighting the crucial role that sufficient data plays in refining predictive results. This comprehensive analysis confirms the inherent relationship between dataset characteristics and the effectiveness of ANN median predictions in seismic response prediction. To enhance result accuracy, it is advisable to create a deterministic dataset that contains no irrelevant or noisy data, as such extraneous information can adversely affect the ANN’s performance, particularly when the dataset is large. However, it is essential to note that this may increase the training time of the machine learning model.

Additionally, the results presented in Figure 13 examine the effects of dataset size and the hybrid PCA-ANN and ANN algorithms used, in terms of R², PBias, and RSR, along with their acceptable limits.

Figure 13.

Comparison between PCA-ANN and ANN results for low-, mid-, and high-rise buildings using different dataset sizes: (a) R² (ANN), (b) RSR (ANN), (c) PBIAS (ANN), (d) R² (PCA-ANN), (e) RSR(PCA-ANN), (f) PBIAS (PCA-ANN).

In terms of R²: Figure 13(a) and (d) show the R² scores for the ANN and PCA-ANN models across the four datasets. The PCA-ANN model achieves the highest R² scores of 96%, 94%, and 95% for low-, mid-, and high-rise buildings, respectively, when using 100% of the dataset. Additionally, the lowest R² score for the PCA-ANN model, obtained with 25% of the dataset, is still above the acceptable limit of 75%. In contrast, the highest R² score for the ANN model, also obtained with 100% of the dataset, does not exceed 85%. The only dataset that meets the acceptable R² threshold of 75% for the ANN model is when 75% of the dataset is used.

In terms of RSR: Figure 13(b) and (e) display the RSR scores for the ANN and PCA-ANN models across the four datasets. The PCA-ANN model achieves the lowest RSR scores, measuring 18%, 21%, and 24% for low-, mid-, and high-rise buildings, respectively, when utilizing 100% of the dataset. Additionally, the highest RSR score for the PCA-ANN model, obtained with 25% of the dataset, remains below the acceptable threshold of RSR < 50%. In contrast, the lowest RSR score for the ANN model, when using the full dataset, does not exceed 30%. The only datasets yielding acceptable RSR scores are those that utilize 75% and 100% of the overall dataset.

In terms of PBias: Figure 13(c) and (f) display the PBias values for the ANN and PCA-ANN models across the four datasets. The PCA-ANN model shows the lowest PBias values, specifically 1.7%, 0.5%, and 0.3% for low-, mid-, and high-rise buildings, respectively, when using 100% of the dataset. Moreover, the highest PBias value for the PCA-ANN model, obtained with 25% of the dataset, remains below the acceptable limit of PBias < 10%. In contrast, the lowest RSR value for the ANN model, when using the 100% dataset, does not exceed 30%. The only datasets that yield an acceptable RSR are those that use 75% and 100% of the data.

The effect of integrating PCA on IDA curve prediction

This section examines the impact of using PCA on the prediction of IDA curves. The IDA curves are determined by calculating the maximum inelastic seismic response through NLTHA for each ground motion and its respective intensities. Figures 14–16 display the IDA curves for low-, mid-, and high-rise buildings using the NLTHA, PCA-ANN, and ANN approaches (Miari and Jankowski, 2022).

Figure 14.

IDA curves of the low-rise building using: (a) NLTHA, (b) PCA-ANN, (c) ANN and (d) 50% fractile using NLTHA, PCA-ANN and ANN approaches.

Figure 15.

IDA curves of the mid-rise building using: (a) NLTHA, (b) PCA-ANN, (c) ANN and (d) 50% fractile using NLTHA, PCA-ANN and ANN approaches.

Figure 16.

IDA curves of the high-rise building using: (a) NLTHA, (b) PCA-ANN, (c) ANN and (d) 50% fractile using NLTHA, PCA-ANN and ANN approaches.

Figures 14–16 present the IDA curves for the low-, mid-, and high-rise buildings, obtained with three different methodologies: NLTHA, PCA-ANN, and ANN. These figures illustrate the variation of the MIDR with PGA across the fractiles, with a particular focus on the 50% fractile, which represents the median response. Figure 14(a)–(c), Figure 15(a)–(c), and Figure 16(a)–(c) depict the IDA curves along with the 16%, 50%, and 84% fractiles for the low-, mid-, and high-rise buildings, respectively. Each set of figures allows the seismic response to be compared across building heights and analytical approaches.

According to the results depicted in Figures 14–16, the NLTHA approach exhibits a nearly linear trend, with the median (50% fractile) curve aligning well with the expected structural response, while the 16% and 84% fractiles capture the uncertainty. The PCA-ANN model closely follows the NLTHA results, preserving critical drift limits and structural behavior, though minor deviations appear in the spread of the uncertainty bounds. The ANN model, however, shows greater variability, with a broader scatter of data points and a slight overestimation of MIDR at higher PGA values. When comparing the 50% fractile curves, PCA-ANN demonstrates strong agreement with NLTHA, while ANN deviates slightly at higher intensities. This suggests that PCA-ANN provides a more reliable approximation, effectively balancing accuracy and computational efficiency. Overall, PCA-ANN outperforms ANN in capturing structural response trends, making it a viable surrogate model for seismic analysis, while NLTHA remains the most accurate but most computationally demanding approach.
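As an illustration of how fractile curves such as the 16%, 50%, and 84% curves can be extracted from a set of IDA results, the sketch below takes the MIDR of every ground motion record at each PGA level and computes the percentiles across records. It assumes that all records share the same PGA grid and uses a simple linear-interpolation percentile. The helper names and the sample values are hypothetical and do not reproduce the study's data.

```python
import math


def percentile(values, q):
    """Linear-interpolation percentile of a list of numbers, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def fractile_curves(midr_by_record, fractiles=(16, 50, 84)):
    """Compute fractile IDA curves.

    midr_by_record: one list of MIDR values per ground motion record,
    all sampled on the same PGA grid.
    Returns {fractile: [MIDR at each PGA level]}.
    """
    n_levels = len(midr_by_record[0])
    curves = {}
    for q in fractiles:
        curves[q] = [
            percentile([record[j] for record in midr_by_record], q)
            for j in range(n_levels)
        ]
    return curves


# Hypothetical example: 5 records, 3 PGA levels (g), MIDR in %.
pga_levels = [0.1, 0.2, 0.3]
midr_by_record = [
    [0.20, 0.50, 0.90],
    [0.30, 0.60, 1.20],
    [0.10, 0.40, 0.80],
    [0.25, 0.55, 1.00],
    [0.35, 0.70, 1.50],
]

curves = fractile_curves(midr_by_record)
for q, curve in curves.items():
    print(f"{q}% fractile:", [round(v, 3) for v in curve])
```

The same routine can be applied to the NLTHA results and to the predictions of a surrogate model, so that the 50% fractile curves can be overlaid as in panel (d) of each figure.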

The effect of incorporating PCA on fragility assessment

Fragility assessment is a crucial process for evaluating the seismic vulnerability and condition of any structure located in a known hazard area. Fragility curves illustrate the probability of exceeding a specific limit state, such as a performance level or damage state, as a function of an intensity measure such as PGA, PGV, or PGD. This section examines the accuracy of the fragility curves derived from the IDA curves obtained with the NLTHA, PCA-ANN, and ANN models for the case study structures. Three performance levels are defined according to the FEMA 356 guidelines (2000): immediate occupancy (IO), life safety (LS), and collapse prevention (CP), corresponding to MIDR values of 1%, 2%, and 4%, respectively.

The fragility curves are derived using the IDA method which is represented with a lognormal cumulative distribution function as shown in equation (5):

P [performance level | PGA = {PGA}_{i}] = Φ \left( \frac{\ln (x / μ)}{σ} \right)

(5)

Where P represents the probability of exceedance, x is a specific value of the intensity, PGA_i, $μ$ and $σ$ are the median and the logarithmic standard deviation of the intensity (PGA) at a specific performance level, and Φ(.) is the standard normal cumulative distribution function (CDF). The parameters of the CDF can be fitted using Baker's method (Baker, 2015). Figure 17 illustrates the fragility curves of the case study structures obtained from the IDA curves (NLTHA, PCA-ANN, ANN) for the IO, LS, and CP performance levels. The PCA-ANN fragility curves converge to the NLTHA curves for all performance levels for the low-, mid-, and high-rise buildings. Table 5 summarizes the mean absolute difference (MAD) between the NLTHA fragility curves and those of the PCA-ANN and ANN models, computed using equation (6).

MAD = \frac{1}{n} \sum_{i = 1}^{n} \left| P_{NLTHA} [performance level | {PGA}_{i}] - P_{m} [performance level | {PGA}_{i}] \right|

(6)

Where $P_{NLTHA}$ represents the probability of exceedance calculated using equation (5) from the NLTHA results, $P_{m}$ represents the probability of exceedance obtained using the ML models, and n represents the number of PGA points.
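Equations (5) and (6) can be illustrated with a minimal sketch. It assumes that, for each performance level, the IDA provides one PGA value per record at which the MIDR limit is first exceeded, and it estimates the median and the logarithmic standard deviation from the moments of ln(PGA), a simple estimator commonly used with IDA results. This is not necessarily the exact fitting procedure used in the study. All function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def fit_lognormal(pga_capacities):
    """Moment-based fit: median = exp(mean(ln PGA)), sigma = sample std of ln PGA."""
    logs = [math.log(x) for x in pga_capacities]
    mean_log = sum(logs) / len(logs)
    variance = sum((v - mean_log) ** 2 for v in logs) / (len(logs) - 1)
    return math.exp(mean_log), math.sqrt(variance)


def fragility(pga, median, sigma):
    """Equation (5): probability of exceedance Phi(ln(pga / median) / sigma)."""
    return NormalDist().cdf(math.log(pga / median) / sigma)


def mean_absolute_difference(p_reference, p_model):
    """Equation (6): mean absolute difference between two fragility curves."""
    return sum(abs(a - b) for a, b in zip(p_reference, p_model)) / len(p_reference)


# Hypothetical PGA values (g) at which each record first exceeds a drift limit.
capacities_reference = [0.40, 0.50, 0.625]
capacities_surrogate = [0.42, 0.50, 0.66]

median_ref, sigma_ref = fit_lognormal(capacities_reference)
median_sur, sigma_sur = fit_lognormal(capacities_surrogate)

# Evaluate both fragility curves on a common PGA grid and compare them.
pga_grid = [0.1 * k for k in range(1, 16)]
p_ref = [fragility(x, median_ref, sigma_ref) for x in pga_grid]
p_sur = [fragility(x, median_sur, sigma_sur) for x in pga_grid]

print(f"reference: median = {median_ref:.3f} g, sigma = {sigma_ref:.3f}")
print(f"MAD = {mean_absolute_difference(p_ref, p_sur):.4f}")
```

Because the MAD averages over the PGA grid, its value depends on the range and spacing of the grid, so curves should be compared on the same set of PGA points.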

Figure 17.

Fragility curves of IO, LS and CP performance levels for: (a) low-rise building, (b) mid-rise building, and (c) high-rise building.

Table 5.

The MAD of fragility curves for IO, LS and CP performance levels.

	IO		LS		CP
	MAD (PCA)	MAD (ANN)	MAD (PCA)	MAD (ANN)	MAD (PCA)	MAD (ANN)
Low-rise	0.000636	0.08106	0.010741	0.060602	0.002896	0.073569
Mid-rise	0.015251	0.094463	0.019967	0.086279	0.004265	0.0789
High-rise	0.029536	0.071269	0.011875	0.043264	0.003668	0.08756

Table 5 presents the MAD values that compare the fragility curves derived from the PCA-ANN and ANN models to the results from NLTHA across various performance levels (IO, LS, CP) and building heights (low-rise, mid-rise, and high-rise).

For the IO level, PCA-ANN consistently shows lower MAD values than ANN, with the lowest error observed in the low-rise building (6.4e-4) and the highest in the high-rise building (3.0e-2). In contrast, ANN exhibits larger discrepancies, particularly in the mid-rise building (9.4e-2). This indicates that PCA-ANN provides a more accurate representation of the fragility curves than ANN alone.

At the LS level, PCA-ANN also demonstrates lower MAD values, with errors ranging from 0.0107 for the low-rise building to 0.0200 for the mid-rise building. Conversely, ANN shows higher deviations, with the mid-rise building presenting the largest MAD value (0.0863). This pattern suggests that PCA-ANN aligns better with the NLTHA results.

For CP, PCA-ANN maintains lower MAD values across all building heights, with the lowest error in the low-rise building (0.0029). ANN, in contrast, shows significantly higher discrepancies, particularly in the mid-rise (0.0789) and high-rise (0.0876) buildings. This further confirms that PCA-ANN outperforms ANN in capturing fragility behavior.

Overall, PCA-ANN consistently produces fragility curves closer to NLTHA results compared to ANN, making it a more reliable approach for seismic vulnerability assessment. The differences are more pronounced in mid-rise and high-rise buildings, where ANN struggles to match NLTHA accuracy.
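The comparison in Table 5 can be checked with a few lines of code. The sketch below copies the MAD values from Table 5 and reports, for each performance level, which building has the lowest and the highest MAD for each model, and whether PCA-ANN is lower than ANN in every cell. The data structure and the helper names are illustrative choices and are not part of the study.

```python
# MAD values copied from Table 5: {building: {level: (PCA-ANN, ANN)}}
TABLE_5 = {
    "Low-rise": {"IO": (0.000636, 0.08106), "LS": (0.010741, 0.060602), "CP": (0.002896, 0.073569)},
    "Mid-rise": {"IO": (0.015251, 0.094463), "LS": (0.019967, 0.086279), "CP": (0.004265, 0.0789)},
    "High-rise": {"IO": (0.029536, 0.071269), "LS": (0.011875, 0.043264), "CP": (0.003668, 0.08756)},
}


def extremes(level, model_index):
    """Return (building with lowest MAD, building with highest MAD).

    model_index: 0 for PCA-ANN, 1 for ANN.
    """
    values = {building: levels[level][model_index] for building, levels in TABLE_5.items()}
    return min(values, key=values.get), max(values, key=values.get)


def pca_always_lower():
    """True if the PCA-ANN MAD is below the ANN MAD in every cell of the table."""
    return all(pca < ann for levels in TABLE_5.values() for pca, ann in levels.values())


for level in ("IO", "LS", "CP"):
    print(level, "PCA-ANN (min, max):", extremes(level, 0), "ANN (min, max):", extremes(level, 1))
print("PCA-ANN lower in every cell:", pca_always_lower())
```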

Conclusion

This study develops a hybrid learning model that combines principal component analysis (PCA) and artificial neural networks (ANNs) to enhance the prediction of seismic responses. It focuses on understanding how this hybrid learning approach impacts both the training process and the performance of the model, particularly for predicting the seismic responses of reinforced concrete (RC) framed structures. To conduct the investigation, a dataset of 1 million samples was generated using OpenSees software. Two machine learning models—ANN and PCA-ANN—were trained to predict seismic responses. A comparison was made between the two models, evaluating the incremental dynamic analysis (IDA) curves, fragility curves, and the relationship between dataset size and hybrid learning. These findings represent some of the most significant outcomes of the study:

• PCA-ANN enhances predictive accuracy, achieving an R² of 99.1% and reducing MSE by 87% compared to ANN.

• The model maintains strong performance even with reduced dataset sizes, improving computational efficiency compared to the supervised learning model.

• Fragility curves generated by PCA-ANN closely match NLTHA results, with discrepancies below 2%.

• PCA-ANN demonstrates lower Mean Absolute Difference (MAD) values across all performance levels and building heights, outperforming ANN.

• The method provides a practical balance between computational cost and accuracy, making it suitable for large-scale seismic risk assessment.

These findings highlight the potential of PCA-ANN as an efficient and accurate tool for seismic vulnerability analysis, effectively balancing computational cost with predictive reliability. Future research could investigate the impact of hybrid learning on performance-based seismic design, particularly focusing on how the model’s performance affects structural design decisions and earthquake response behaviors. This direction would further enhance the application of machine learning in earthquake engineering, contributing to more robust and resilient structural design practices.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Abdellatif Benbokhari

Appendix

Table A1.

The selected ground motions from the PEER database (Center, 2013).

Result id	Earthquake name	Year	Station name	Magnitude	Mechanism	Rjb (km)	Rrup (km)	Vs30 (m/sec)
1	“Imperial Valley-02”	1940	“El Centro Array #9”	6.95	Strike slip	6.09	6.09	213.44
2	“Northwest Calif-02”	1941	“Ferndale City Hall”	6.6	Strike slip	91.15	91.22	219.31
3	“Borrego”	1942	“El Centro Array #9”	6.5	Strike slip	56.88	56.88	213.44
4	“Kern County”	1952	“LA - Hollywood stor FF”	7.36	Reverse	114.62	117.75	316.46
5	“Kern County”	1952	“Pasadena - CIT Athenaeum”	7.36	Reverse	122.65	125.59	415.13
6	“Kern County”	1952	“Santa Barbara Courthouse”	7.36	Reverse	81.3	82.19	514.99
7	“Kern County”	1952	“Taft Lincoln School”	7.36	Reverse	38.42	38.89	385.43
8	“Northern Calif-03”	1954	“Ferndale City Hall”	6.5	Strike slip	26.72	27.02	219.31
9	“El Alamo”	1956	“El Centro Array #9”	6.8	Strike slip	121	121.7	213.44
10	“Borrego Mtn”	1968	“El Centro Array #9”	6.63	Strike slip	45.12	45.66	213.44
11	“Borrego Mtn”	1968	“LA - Hollywood stor FF”	6.63	Strike slip	222.42	222.42	316.46
12	“Borrego Mtn”	1968	“LB - Terminal Island”	6.63	Strike slip	199.84	199.84	217.92
13	“Borrego Mtn”	1968	“Pasadena - CIT Athenaeum”	6.63	Strike slip	207.14	207.14	415.13
14	“Borrego Mtn”	1968	“San Onofre - so Cal Edison”	6.63	Strike slip	129.11	129.11	442.88
15	“San Fernando”	1971	“2516 via Tejon PV”	6.61	Reverse	55.2	55.2	280.56
16	“San Fernando”	1971	“Anza post Office”	6.61	Reverse	173.16	173.16	360.45
17	“San Fernando”	1971	“Bakersfield - Harvey Aud”	6.61	Reverse	111.88	113.02	241.41
18	“San Fernando”	1971	“Borrego Springs Fire sta”	6.61	Reverse	214.32	214.32	338.54
19	“San Fernando”	1971	“Buena Vista - Taft”	6.61	Reverse	111.37	112.52	385.69
20	“San Fernando”	1971	“Carbon Canyon dam”	6.61	Reverse	61.79	61.79	235
21	“San Fernando”	1971	“Castaic - Old Ridge Route”	6.61	Reverse	19.33	22.63	450.28
22	“San Fernando”	1971	“Cedar Springs Pumphouse”	6.61	Reverse	92.25	92.59	477.22
23	“San Fernando”	1971	“Cedar Springs_ Allen Ranch”	6.61	Reverse	89.37	89.72	813.48
24	“San Fernando”	1971	“Cholame - Shandon Array #2”	6.61	Reverse	217.54	218.13	184.75
25	“San Fernando”	1971	“Cholame - Shandon Array #8”	6.61	Reverse	218.17	218.75	256.82
26	“San Fernando”	1971	“Colton - so Cal Edison”	6.61	Reverse	96.81	96.81	301.95
27	“San Fernando”	1971	“Fairmont dam”	6.61	Reverse	25.58	30.19	634.33
28	“San Fernando”	1971	“Fort Tejon”	6.61	Reverse	59.52	61.64	394.18
29	“San Fernando”	1971	“Gormon - Oso Pump Plant”	6.61	Reverse	43.95	46.78	308.35
30	“San Fernando”	1971	“Hemet Fire Station”	6.61	Reverse	139.14	139.14	328.09
31	“San Fernando”	1971	“Isabella dam (Aux Abut)”	6.61	Reverse	130	130.98	591
32	“San Fernando”	1971	“LA - Hollywood stor FF”	6.61	Reverse	22.77	22.77	316.46
33	“San Fernando”	1971	“LB - Terminal Island”	6.61	Reverse	58.99	58.99	217.92
34	“San Fernando”	1971	“Lake Hughes #1”	6.61	Reverse	22.23	27.4	425.34
35	“San Fernando”	1971	“Lake Hughes #12”	6.61	Reverse	13.99	19.3	602.1
36	“San Fernando”	1971	“Lake Hughes #4”	6.61	Reverse	19.45	25.07	600.06
37	“San Fernando”	1971	“Lake Hughes #9”	6.61	Reverse	17.22	22.57	670.84
38	“San Fernando”	1971	“Maricopa Array #1”	6.61	Reverse	193.25	193.91	303.79
39	“San Fernando”	1971	“Maricopa Array #2”	6.61	Reverse	108.56	109.73	443.85
40	“San Fernando”	1971	“Maricopa Array #3”	6.61	Reverse	109.01	110.18	441.25
41	“San Fernando”	1971	“Pacoima dam (upper left abut)”	6.61	Reverse	0	1.81	2016.1
42	“San Fernando”	1971	“Palmdale Fire Station”	6.61	Reverse	24.16	28.99	452.86
43	“San Fernando”	1971	“Pasadena - CIT Athenaeum”	6.61	Reverse	25.47	25.47	415.13
44	“San Fernando”	1971	“Pasadena - Old Seismo Lab”	6.61	Reverse	21.5	21.5	969.07
45	“San Fernando”	1971	“Pearblossom Pump”	6.61	Reverse	35.54	38.97	529.09
46	“San Fernando”	1971	“Port Hueneme”	6.61	Reverse	68.84	68.84	248.98
47	“San Fernando”	1971	“Puddingstone dam (Abutment)”	6.61	Reverse	52.64	52.64	421.44
48	“San Fernando”	1971	“San Diego Gas & Electric”	6.61	Reverse	205.77	205.77	354.06
49	“San Fernando”	1971	“San Juan Capistrano”	6.61	Reverse	108.01	108.01	459.37
50	“San Fernando”	1971	“San Onofre - so Cal Edison”	6.61	Reverse	124.79	124.79	442.88
51	“San Fernando”	1971	“Santa Anita dam”	6.61	Reverse	30.7	30.7	667.13
52	“San Fernando”	1971	“Santa Felita dam (Outlet)”	6.61	Reverse	24.69	24.87	389
53	“San Fernando”	1971	“Tehachapi Pump”	6.61	Reverse	61.75	63.79	669.48
54	“San Fernando”	1971	“UCSB - Fluid Mech Lab”	6.61	Reverse	124.38	124.41	322.42
55	“San Fernando”	1971	“Upland - San Antonio dam”	6.61	Reverse	61.72	61.73	487.23
56	“San Fernando”	1971	“Wheeler Ridge - ground”	6.61	Reverse	68.38	70.23	347.67
57	“San Fernando”	1971	“Whittier Narrows dam”	6.61	Reverse	39.45	39.45	298.68
58	“San Fernando”	1971	“Wrightwood - 6074 Park Dr”	6.61	Reverse	61.64	62.23	486
59	“Friuli_ Italy-01”	1976	“Barcis”	6.5	Reverse	49.13	49.38	496.46
60	“Friuli_ Italy-01”	1976	“Codroipo”	6.5	Reverse	33.32	33.4	249.28
61	“Friuli_ Italy-01”	1976	“Conegliano”	6.5	Reverse	80.37	80.41	352.05
62	“Friuli_ Italy-01”	1976	“Feltre”	6.5	Reverse	102.05	102.15	356.39
63	“Friuli_ Italy-01”	1976	“Tolmezzo”	6.5	Reverse	14.97	15.82	505.23
64	“Gazli_ USSR”	1976	“Karakyr”	6.8	Reverse	3.92	5.46	259.59
65	“Tabas_ Iran”	1978	“Bajestan”	7.35	Reverse	119.77	120.81	377.56
66	“Tabas_ Iran”	1978	“Boshrooyeh”	7.35	Reverse	24.07	28.79	324.57
67	“Tabas_ Iran”	1978	“Dayhook”	7.35	Reverse	0	13.94	471.53
68	“Tabas_ Iran”	1978	“Ferdows”	7.35	Reverse	89.76	91.14	302.64
69	“Tabas_ Iran”	1978	“Kashmar”	7.35	Reverse	193.91	194.55	280.26
70	“Tabas_ Iran”	1978	“Sedeh”	7.35	Reverse	150.33	151.16	354.37
71	“Tabas_ Iran”	1978	“Tabas”	7.35	Reverse	1.79	2.05	766.77
72	“Imperial Valley-06”	1979	“Aeropuerto Mexicali”	6.53	Strike slip	0	0.34	259.86
73	“Imperial Valley-06”	1979	“Agrarias”	6.53	Strike slip	0	0.65	242.05
74	“Imperial Valley-06”	1979	“Bonds Corner”	6.53	Strike slip	0.44	2.66	223.03
75	“Imperial Valley-06”	1979	“Brawley Airport”	6.53	Strike slip	8.54	10.42	208.71
76	“Imperial Valley-06”	1979	“Calexico Fire Station”	6.53	Strike slip	10.45	10.45	231.23
77	“Imperial Valley-06”	1979	“Calipatria Fire Station”	6.53	Strike slip	23.17	24.6	205.78
78	“Imperial Valley-06”	1979	“Cerro Prieto”	6.53	Strike slip	15.19	15.19	471.53
79	“Imperial Valley-06”	1979	“Chihuahua”	6.53	Strike slip	7.29	7.29	242.05
80	“Imperial Valley-06”	1979	“Coachella Canal #4”	6.53	Strike slip	49.1	50.1	336.49

References

Aabbas, Jarallah (2021) Comparative study of the seismic assessment according to ATC-40, FEMA-356 and FEMA-440 for existing hospital building located at Baghdad city. International Journal of Civil Engineering 1.

Abbasi, Seifollahi, Daneshfaraz, et al. (2023) Estimation of vertical settlement of earthen dams caused by earthquake using ANN model and wavelet-ANN composition. Geotechnical & Geological Engineering 41(5): 3169–3186.

Abdellatif, Chikh, Ahmed (2024) Seismic response prediction using a hybrid unsupervised and supervised machine learning in case of 3D RC frame buildings. Research on Engineering Structures and Materials. Epub ahead of print 20 March 2024.

Al Shalabi, Shaaban (2006) Normalization as a preprocessing engine for data mining and the approach of preference matrix. Piscataway: IEEE. Available at: https://ieeexplore.ieee.org/document/4024051 (accessed 9 February 2025).

Alouache, Selatnia, Lefkir, et al. (2019) Determination of the just suspended speed for solid particle in torus reactor. Water Science and Technology 80(1): 48–58.

Alwosheel, van Cranenburgh, Chorus (2018) Is your dataset big enough? Sample size requirements when using artificial neural networks for discrete choice analysis. Journal of Choice Modelling 28: 167–182.

Annad, Lefkir (2022) Analytic network process for local scour formula ranking with parametric sensitivity analysis and soil class clustering. Water Supply 22(11): 8287–8304.

Asencio-Cortés, Martínez-Álvarez, Morales-Esteban, et al. (2015) Improving earthquake prediction with principal component analysis: application to Chile. In: Onieva, Santos, Osaba (eds) Hybrid Artificial Intelligent Systems, Lecture Notes in Computer Science. Cham: Springer International Publishing, 393–404.

Asgarkhani, Kazemi, Jakubczyk-Gałczyńska, et al. (2024) Seismic response and performance prediction of steel buckling-restrained braced frames using machine-learning methods. Engineering Applications of Artificial Intelligence 128: 107388.

Baker (2015) Efficient analytical fragility function fitting using dynamic structural analysis. Available at: https://journals-sagepub-com-s.web.bisu.edu.cn/doi/abs/10.1193/021113EQS025M (accessed 22 January 2024).

Barkhordari, Jawdhari (2023) Machine learning based prediction model for plastic hinge length calculation of reinforced concrete structural walls. Advances in Structural Engineering 26(9): 1714–1734.

Benazouz, Laouami, Mebarki, et al. (2017) Seismic structural demands and inelastic deformation ratios: sensitivity analysis and simplified models. Earthquakes and Structures 13: 59–66.

Benbokhari, Chikh, Mebarki (2023) Dynamic response estimation of an equivalent single degree of freedom system using artificial neural network and nonlinear static procedure. Research on Engineering Structures and Materials 10(2): 431–444.

Benesty, Chen, Huang, et al. (2009) Pearson correlation coefficient. In: Noise Reduction in Speech Processing. Berlin, Germany: Springer, Vol. 2.

Campbell, Bozorgnia (2023) Ground-motion model for the standardized version of cumulative absolute velocity. In: Earthquake Spectra. London, England: Sage Publications, 87552930221144063.

Center (2013) PEER ground motion database. PEER NGA-West2 Database 3.

Chen (2015) Incremental dynamic analysis of corroded reinforced concrete bridge columns subjected to near-field earthquake. Journal of Hunan University 42(3): 1–8.

Chen, Huang (2020) Application of a PCA-ANN based cost prediction model for general aviation aircraft. IEEE Access 8: 130124–130135.

Chen, Yang, Liu, et al. (2023) An energy-frequency parameter for earthquake ground motion intensity measure. Earthquake Engineering & Structural Dynamics 52(2): 271–284.

Derakhshani, Foruzan (2019) Predicting the principal strong ground motion parameters: a deep learning approach. Applied Soft Computing 80: 192–201.

DIF, Stambouli (2023) Impact of site condition on Arias intensity and Cumulative absolute velocity: application for low seismicity areas. Asian Journal of Civil Engineering 24: 1411–1424.

Dukes, Mangalathu, Padgett, et al. (2018) Development of a bridge-specific fragility methodology to improve the seismic resilience of bridges. Earthquake and Structures 15(3): 253–261.

FEMA-356 (2000) Prestandard and commentary for the seismic rehabilitation of buildings. Reston: American Society of Civil Engineers.

FEMA-440 (2009) Improvement of Nonlinear Static Seismic Analysis Procedures. Washington: Federal Emergency Management Agency.

Giovanis, Fragiadakis, Papadopoulos (2016) Epistemic uncertainty assessment using incremental dynamic analysis and neural networks. Bulletin of Earthquake Engineering 14(2): 529–547.

Harirchian, Kumari, Jadhav, et al. (2020) A machine learning framework for assessing seismic hazard safety of reinforced concrete buildings. Applied Sciences 10(20): 7153.

Hou, Wang (2024) Interpretable machine learning models for predicting probabilistic axial buckling strength of steel circular hollow section members considering discreteness of geometries and material. In: Advances in Structural Engineering. Thousand Oaks: Sage Publications Ltd STM. 13694332241289175.

Kazemi, Jankowski (2023) Machine learning-based prediction of seismic limit-state capacity of steel moment-resisting frames considering soil-structure interaction. Computers & Structures 274: 106886.

Kazemi, Asgarkhani, Jankowski (2023) Machine learning-based seismic response and performance assessment of reinforced concrete buildings. Archives of Civil and Mechanical Engineering 23(2): 94.

Khan, Gupta, Sekhri (2021) A novel PCA-FA-ANN based hybrid model for prediction of fluoride. Stochastic Environmental Research and Risk Assessment 35(10): 2125–2152.

Khojastehfar, Beheshti-Aval, Zolfaghari, et al. (2014) Collapse fragility curve development using Monte Carlo simulation and artificial neural network. Proceedings of the Institution of Mechanical Engineers - Part O: Journal of Risk and Reliability 228(3): 301–312.

Lagaros, Fragiadakis (2007) Fragility assessment of steel frames using neural networks. Earthquake Spectra 23(4): 735–752.

Chen (2022) Fast seismic response estimation of tall pier bridges based on deep learning techniques. Engineering Structures 266: 114566.

Liao, Lin, Zhang, et al. (2023) Attention-based LSTM (AttLSTM) neural network for seismic response modeling of bridges. Computers & Structures 275: 106915.

Niu, Kang, et al. (2021) A hybrid PCA-SEM-ANN model for the prediction of water use efficiency. Ecological Modelling 460: 109754.

Maćkiewicz, Ratajczak (1993) Principal components analysis (PCA). Computers & Geosciences 19(3): 303–342.

Mangalathu, Heo, Jeon J-S (2018) Artificial neural network based multi-dimensional fragility development of skewed concrete bridge classes. Engineering Structures 162: 166–176.

Mebarki, Jerez, Boukri, et al. (2024) Emergency management and urban resilience under seismic risks. Part I Theoretical approach for quick post-quake damage evaluation of buildings. Available at: https://www.maxapress.com/article/doi/10.48130/emst-0024-0027 (accessed 9 February 2025).

Miari, Jankowski (2022) Incremental dynamic analysis and fragility assessment of buildings founded on different soil types experiencing structural pounding during earthquakes. Engineering Structures 252: 113118.

Mitropoulou, Papadrakakis (2011) Developing fragility curves based on neural network IDA predictions. Engineering Structures 33(12): 3409–3421.

Moriasi, Arnold, Van Liew, et al. (2007) Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. DOI: 10.13031/2013.23153 (accessed 23 January 2024).

Noureldin, Ali, Sim, et al. (2022) A machine learning procedure for seismic qualitative assessment and design of structures considering safety and serviceability. Journal of Building Engineering 50: 104190.

Xie, Chen, et al. (2022) Prediction methods of common cancers in China using PCA-ANN and DBN-ELM-BP. Available at: https://ieeexplore.ieee.org/abstract/document/9924251 (accessed 7 February 2025).

Rachedi, Matallah, Kotronis (2021) Seismic behavior & risk assessment of an existing bridge considering soil-structure interaction using artificial neural networks. Engineering Structures 232: 111800.

Rojas-Mercedes, Erazo, Di Sarno (2022) Seismic fragility curves for a concrete bridge using structural health monitoring and digital twins. Earthquake and Structures 22(5): 503–515.

Sajan, Bhusal, Gautam, et al. (2023) Earthquake damage and rehabilitation intervention prediction using machine learning. Engineering Failure Analysis 144: 106949.

Shivani, Rooban (2021) Backpropagation algorithm and its hardware implementations: a review. Available at: https://iopscience.iop.org/article/10.1088/1742-6596/1804/1/012169/meta (accessed 23 January 2024).

Sun, Zhang, Dai, et al. (2023) Seismic fragility analysis of a large-scale frame structure with local nonlinearities using an efficient reduced-order Newton-Raphson method. Soil Dynamics and Earthquake Engineering 164: 107559.

Tang, Dang, Cui, et al. (2022) Machine learning-based fast seismic risk assessment of building structures. Journal of Earthquake Engineering 26(15): 8041–8062.

Wasti, Özcebe (2003) Seismic Assessment and Rehabilitation of Existing Buildings. Berlin, Germany: Springer Science & Business Media.

Wen, Zhang, Zhai (2022) Rapid seismic response prediction of RC frames based on deep learning and limited building information. Engineering Structures 267: 114638.

Zhang, Fung, Johnson, et al. (2022) Review of seismic risk mitigation policies in earthquake-prone countries: lessons for earthquake resilience in the United States. Journal of Earthquake Engineering 26(12): 6208–6235.

Zhang (2024) Buckling critical load prediction of pultruded fiber-reinforced polymer columns and feature analysis by machine learning. Advances in Structural Engineering 27(11): 1945–1961.