Abstract
The unconfined compressive strength (Qu) is one of the most important design criteria for stabilized soil and is used to evaluate the effectiveness of soft soil improvement. The Qu of stabilized soil is strongly affected by numerous factors, such as the soil properties and the binder content. A Machine Learning (ML) approach can take these factors into account to predict Qu with high performance and reliability. The aim of this paper is to select a single ML model for designing the Qu of stabilized soil containing chemical stabilizing agents such as lime, cement and bitumen. To build the single ML model, a database is compiled from the literature. The database contains 200 data samples, 12 input variables (Liquid limit, Plastic limit, Plasticity index, Linear shrinkage, Clay content, Sand content, Gravel content, Optimum water content, Density of stabilized soil, Lime content, Cement content, Bitumen content) and the output variable Qu. The performance and reliability of the ML models are evaluated by Monte Carlo simulation, a popular validation technique, with the aid of three metrics: the coefficient of determination (R2), the Root Mean Square Error (RMSE) and the Mean Absolute Error (MAE). The ML model based on the Gradient Boosting algorithm is selected as the model with the highest performance and reliability for designing the Qu of stabilized soil. The effects of the features on Qu are explained by permutation importance, two-dimensional Partial Dependence Plots (PDP 2D) and local SHapley Additive exPlanations (SHAP) values. The ML model proposed in this investigation is a single model and is useful for practicing engineers through the Maximal dry density-Linear shrinkage map created by PDP 2D.
Keywords
Introduction
Since soil serves as the foundation for all projects, soil stabilization is a common construction practice. Stabilized soil has two primary goals: (1) to enhance the soil’s physical and mechanical characteristics in order to meet the technical criteria of the building, and (2) to increase permeability in order to increase the stability of stabilized soil. To enhance the characteristics of soils, several types of binders, such as cement, lime, fly ash, silica fume, bitumen, or natural pozzolan, are employed [1]. There are two primary chemical stabilization mechanisms: (i) in the first mechanism, chemical interactions within the additives provide strength by creating a matrix such as C-S-H and C-A-S-H; (ii) in the second mechanism, which is also referred to as the pozzolanic reaction [2], one or more chemicals are applied to a soil and react with one another and with the soil’s constituents. This makes it challenging to estimate experimentally the link between mechanical properties, such as the Unconfined Compressive Strength (Qu), and the composition of the mix design. A proper prediction model is required in order to formulate this complicated relationship. Some empirical correlations have been established to describe the relations between the UCS of stabilized soil and the related parameters. For instance, MolaAbasi et al. [3] established the power function as
Brief comparison between empirical correlation and Machine Learning technique
Machine learning (ML) techniques have become widely used in all aspects of life in the past several decades, due to the quick advancement of artificial intelligence technologies [5–7]. The application of machine learning techniques has been effective in resolving a number of challenging issues in civil engineering, including geotechnical engineering [8, 9], and materials science [1, 10, 11].
Artificial Neural Network (ANN) techniques are now among the most successful ML algorithms for solving challenging technological issues [8, 12]. ANN models can resolve challenging nonlinear problems and can give consistent results without relying on physical chemistry, mechanical equations, etc. For instance, using a group method of data handling (GMDH) type neural network (NN), MolaAbasi and Shooshpasha [13] provided a polynomial model to predict the unconfined compressive strength (Qu). Similarly, a GMDH-type neural network has been used to explore how the correlation between SPT-N60 and the undrained shear strength of soils is affected by natural moisture content, plasticity index, and effective overburden stress [14]. The investigation of Kordnaeij et al. [15] considered soil properties such as the liquid limit (LL), initial void ratio, and specific gravity. Das et al. [16] and Suman et al. [17] developed ANN approaches to forecast the parameters of stabilized soil, such as dry density and unconfined compressive strength (Qu). Both investigations considered 7 input variables in the ANN models. The best performance of these models in predicting Qu, evaluated through the coefficient of determination R2 on the testing part, is 0.724 and 0.9025, respectively [16, 17]. In both investigations, systems of empirical equations were also developed for estimating the Qu of stabilized soil. In the best case, the support vector machine (SVM), one of the popular machine learning techniques, gives the highest performance, with R2 equal to 0.839 for the testing part. Recently, Tran [1] used the same database as Das et al. [16] and Suman et al. [17], with 12 input variables, to build a hybrid ML model combining the Gradient Boosting algorithm and the popular metaheuristic algorithm Particle Swarm Optimization (PSO) for predicting and evaluating the unconfined compressive strength of stabilized soil. The hybrid model achieved high performance (R2 = 0.9655, RMSE = 0.1633 MPa for the testing dataset), and its reliability was confirmed by K-Fold Cross Validation. However, developing a hybrid ML model is computationally time-consuming, which hinders the adoption of ML models by engineers.
In order to make ML models more accessible to civil engineers for designing the unconfined compressive strength of stabilized soil, six popular ML algorithms, namely Linear Regression (LR), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting (GB) and Extreme Gradient Boosting (XGB), are considered in this study in order to select the ML model with the highest performance and reliability. The proposed ML algorithms are easily accessible through the Sklearn library of the open-source programming language Python [18]. The six algorithms use the default hyperparameters available in Sklearn, which makes the ML models easy to build. The six ML models are built on a database created from the investigation of Burroughs [19], in which 12 input variables are used: Liquid limit, Plastic limit, Plasticity index, Linear shrinkage, Clay content, Sand content, Gravel content, Optimum water content, Density of stabilized soil, Lime content, Cement content and Bitumen content. The performance and reliability of the ML models are evaluated by Monte Carlo simulation, a popular validation technique, with the aid of three evaluation metrics: the coefficient of determination R2, the Root Mean Square Error (RMSE), and the Mean Absolute Error (MAE).
In this study, a database containing 200 samples is gathered from a published thesis [19]. The 200 data samples are divided into two parts: 70% training data (140 samples) and 30% testing data (60 samples). The 12 input variables, namely Liquid limit, Plastic limit, Plasticity index, Linear shrinkage, Clay content, Sand content, Gravel content, Optimum water content (OMC), Density of stabilized soil, Lime content, Cement content and Bitumen content, are denoted X1 to X12. Qu is the sole output variable for all ML models. The data distribution is described in Table 2 and Fig. 1.
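The 70/30 split described above can be sketched with Sklearn as follows. This is a minimal illustration only: the arrays are synthetic placeholders rather than the actual database, and the seed value is an arbitrary assumption.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder arrays standing in for the real database (values are synthetic):
# 200 samples, 12 input variables (X1..X12) and one output (Qu).
X = rng.random((200, 12))
y = rng.random(200)

# 70% training / 30% testing; random_state is an arbitrary illustrative seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0
)

print(X_train.shape, X_test.shape)
```

Changing `random_state` produces a different random partition of the same 200 samples, which is the basis of the repeated-split validation used later in the study.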

Simple correlation plots between each input variable and the output variable Qu.
Statistical description of the database, including Unit, Count, Average, StD, Min, Q25, Median, Q75, Max, Sk
Sk = Skewness; StD = Standard deviation.
Specifically, the liquid limit (X1) ranges from 18 to 89%, with an average of 33.4%. Similarly, the plastic limit (X2) ranges from 12.0 to 36.0%. As shown in Table 2, the plasticity index (X3) varies from 0 to 69%. The linear shrinkage (X4) lies mainly in the range from 0 to 16.8. The clay content (X5) varies from 5 to 53%. The sand content (X6) and gravel content (X7) range from 30 to 94% and from 0 to 62%, respectively. The moisture content (X8) of stabilized soil varies from 0 to 28%, with values concentrated around 10%. The lime content (X9), cement content (X10) and bitumen content (X11) take only a few distinct values within small ranges: from 0 to 6 for both lime and cement content and from 0 to 3 for bitumen content. In contrast, the density of stabilized soil (X12) and the output unconfined compressive strength (Qu) vary from 0 to 2.2 g/cm3 and from 1.0 to 5.4 MPa, respectively, with many distinct values (cf. Fig. 1). The database is based on the experimental results of Burroughs [19]. The bitumen used in the experiments of Burroughs [19] was an emulsion that complied with Australian Standard AS 1160 [20]. According to the “Guide to Pavement Technology Part 4D: Stabilized Materials” [21], bitumen stabilization renders soil resistant to water absorption; therefore, 3% of bitumen by weight of soil is used for the stabilization of soil [22]. The original database is used unchanged in this investigation.
Figure 2 displays the correlations between the inputs and the output Qu (unconfined compressive strength). The correlation values are displayed as numbers and in various colors. The data clearly show that several of the variables, such as the liquid limit (X1) and the plasticity index (X3), have a weak correlation. Overall, there is not much of a link between the inputs and the compressive strength. In order to improve the accuracy of the proposed machine learning algorithms, all factors are taken into account. The effect of each input variable on the target output variable is discussed after the best-performing and most reliable ML model has been built.

Feature selection with the aid of the Pearson correlation between the input and output variables.
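For reference, the Pearson correlation coefficient between one input and the output can be computed as in the sketch below. The data are synthetic and the variable names (`density`, `qu`) are illustrative assumptions, not values from the database.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Synthetic example: an output that rises with one input, plus noise.
rng = np.random.default_rng(1)
density = rng.uniform(1.5, 2.1, size=200)
qu = 2.0 * density + rng.normal(0.0, 0.3, size=200)

r = pearson_r(density, qu)
print(f"Pearson r = {r:.3f}")
```

A value near 0 indicates a weak linear relationship only; a nonlinear dependence can still exist, which is one reason to keep weakly correlated inputs when using tree-based models.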
Linear regression (LR)
Linear regression is a supervised machine learning algorithm. Regression uses independent variables to model a target prediction value and is mostly used to determine the relationship between variables and for forecasting. Regression models differ in the number of independent variables they use, the type of relationship they assume between the dependent and independent variables, and other factors.
Linear regression is a prediction approach frequently employed in practice [23]. Using relevant explanatory variables, linear regression attempts to model the relationship between those variables and a scalar response of practical significance, such as product sales or home prices. Mathematically, the goal of linear regression is to minimize the sum of squared residuals between each data point and the predicted value; in other words, it minimizes the difference between the data and the estimating model. Linear regression predicts the value of a dependent variable (y) based on an independent variable (x). This regression approach therefore finds a linear relationship between x (the input) and y (the output), hence the name “linear regression”. Further information on linear regression may be found in Tibshirani [24].
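The least-squares idea can be illustrated with a short sketch for a single input. The data and the true coefficients (1.5 and 2.0) are assumptions chosen only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data following y = 1.5 * x + 2.0 with a little noise.
x = rng.uniform(0.0, 10.0, size=50)
y = 1.5 * x + 2.0 + rng.normal(0.0, 0.1, size=50)

# Design matrix with a column of ones so that the intercept is fitted too.
A = np.column_stack([x, np.ones_like(x)])

# Least squares: find the coefficients minimizing the sum of squared residuals.
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coef

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

With 12 inputs, the design matrix simply gains one column per input variable; the minimization itself is unchanged.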
K-Nearest Neighbors (KNN)
The k-nearest neighbors approach, often known as k-NN or KNN, is a supervised learning technique in machine learning. To estimate the output associated with a new input x, the approach considers, with equal weight, the k training samples whose inputs are closest to x according to a chosen distance. Because the algorithm depends on distance, its accuracy can be increased by normalizing the inputs [25, 26]. The k-NN algorithm is a nonparametric technique for classification and regression in pattern recognition. In both cases, the input consists of the k training samples closest to the new point in the feature space, and the output depends on whether the method is applied for classification or regression. In k-NN regression, the output is the value for the object, computed as the average of the values of its k nearest neighbors. The k-NN approach is a form of instance-based learning, or lazy learning, in which the function is only approximated locally and all computation is deferred until prediction. The k-NN algorithm is one of the simplest machine learning methods.
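A minimal sketch of k-NN regression, assuming Euclidean distance and equal weights, is given below. The function name `knn_predict` and the toy data are illustrative.

```python
import numpy as np

def knn_predict(X_train, y_train, x_new, k=5):
    """Average the outputs of the k training samples closest to x_new."""
    distances = np.linalg.norm(X_train - x_new, axis=1)  # Euclidean distance
    nearest = np.argsort(distances)[:k]                  # indices of the k closest
    return float(y_train[nearest].mean())

# Toy one-dimensional data in which the output equals the input.
X_train = np.array([[0.0], [1.0], [2.0], [3.0], [10.0]])
y_train = np.array([0.0, 1.0, 2.0, 3.0, 10.0])

# The three nearest neighbors of 1.4 are the samples at 1.0, 2.0 and 0.0.
pred = knn_predict(X_train, y_train, np.array([1.4]), k=3)
print(pred)  # → 1.0
```

No model is fitted in advance: all the work happens at prediction time, which is why the method is described as lazy learning.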
Support Vector Machine (SVM)
The support vector machine, a statistical learning technique for solving classification and regression problems, was first described by Vapnik [27]. SVM has been widely used for accurate prediction in geotechnical engineering [28, 29]. The basic idea of SVM is to create a hyperplane in order to separate the dataset. Using the training dataset, the original input space is mapped to a high-dimensional feature space [30, 31]. The optimal hyperplane is then determined by optimizing the class boundary. The training points located closest to the optimal hyperplane are the support vectors. Support vector regression (SVR) is used in this study, based on an ε-insensitive loss function [32]. Detailed information and the computation procedures of the SVM algorithm are presented in [27].
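The ε-insensitive loss used in SVR ignores errors smaller than ε and penalizes larger ones linearly. A minimal sketch follows; the value ε = 0.1 matches the Sklearn default for `SVR`, and the sample values are illustrative.

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero inside the epsilon tube, linear in the excess error outside it."""
    return np.maximum(0.0, np.abs(y_true - y_pred) - epsilon)

y_true = np.array([2.0, 2.0, 2.0])
y_pred = np.array([2.05, 2.30, 1.50])

# Absolute errors are 0.05, 0.30 and 0.50; only the part above 0.1 is penalized.
losses = epsilon_insensitive_loss(y_true, y_pred)
print(losses)
```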
Random Forest (RF)
Breiman [33] was the first to propose the random forest (RF), an ensemble machine learning technique used for regression and classification problems with low overfitting and good prediction accuracy. The decision tree is the core technique: several sub-datasets are produced from the original dataset by bootstrap aggregation (bagging), and a decision tree is built on each sub-dataset, with the best attribute at each split chosen from a random subset of attributes. The final output is the mean of the outputs of the individual trees. It has been reported that, when the number of trees is sufficient, RF can handle a sizable dataset and guarantee good prediction performance [34].
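The bagging mechanism can be sketched by hand with Sklearn decision trees, as below. This is an illustration of the principle under assumed settings (25 trees, synthetic data), not the `RandomForestRegressor` implementation used in the study.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic data: the output depends mainly on the first two inputs.
X = rng.random((200, 4))
y = 3.0 * X[:, 0] + X[:, 1] + rng.normal(0.0, 0.05, size=200)

trees = []
for seed in range(25):
    # Bootstrap sample: draw 200 rows with replacement.
    idx = rng.integers(0, len(X), size=len(X))
    # max_features="sqrt" makes each split consider a random subset of inputs.
    tree = DecisionTreeRegressor(max_features="sqrt", random_state=seed)
    trees.append(tree.fit(X[idx], y[idx]))

def forest_predict(X_new):
    """Average the predictions of all trees."""
    return np.mean([tree.predict(X_new) for tree in trees], axis=0)

print(forest_predict(X[:3]), y[:3])
```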
Gradient Boosting (GB)
The gradient boosting (GB) method was first proposed by Friedman [35]. The GB algorithm is an ensemble of regression trees. Each decision tree in the GB algorithm provides a new prediction, and as more trees are added, the performance of the final model steadily improves. In contrast to RF, the GB algorithm is based on boosting rather than randomization. In the boosting phase, new predictors are added sequentially to the ensemble model, each one correcting the error produced by the previous base models. GB techniques are said to offer benefits including reduced overfitting and lower memory usage, which result in quicker prediction [36]. Each added base model aims to lower a particular loss function and thereby create the best possible model. A wide range of loss functions has been derived so far, and a task-specific loss can also be designed, so the researcher ultimately chooses which loss function to use. Typical examples of these algorithms may be found in Friedman [35].
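The sequential error correction can be sketched for the squared-error loss, where the negative gradient is simply the residual. The settings below (100 trees, learning rate 0.1, depth 3) mirror the Sklearn defaults for `GradientBoostingRegressor`; the data are synthetic, and this hand-written loop is an illustration rather than the library implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(0.0, 0.1, size=200)

learning_rate = 0.1
prediction = np.full_like(y, y.mean())          # start from a constant model
mse_history = [np.mean((y - prediction) ** 2)]

for _ in range(100):
    residuals = y - prediction                  # what the ensemble still gets wrong
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residuals)
    prediction += learning_rate * tree.predict(X)   # small corrective step
    mse_history.append(np.mean((y - prediction) ** 2))

print(f"training MSE: {mse_history[0]:.4f} -> {mse_history[-1]:.4f}")
```

The learning rate shrinks each correction, so many small steps are taken instead of a few large ones; this is one of the mechanisms by which boosting limits overfitting.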
Extreme Gradient Boosting (XGB)
Extreme gradient boosting (XGB) was developed by Chen and Guestrin [37]. This method builds an ensemble of several weak learners. The XGB technique is implemented using the boosting process, which creates a strong learner by successively adding weak ones. At each stage, a gradient descent optimization approach is used to add a weak learner to the current model in order to reduce the loss function of the previous model. The XGB algorithm is said to be superior to the gradient boosting approach because it can limit overfitting through suitable regularization terms. Each new tree in the XGB model is trained on the residual error of the previous trees so as to minimize the error at the current iteration. As a result, XGB can be regarded as a better algorithm, since it builds each tree from a well-defined objective function. The following expression represents the objective function of the XGB algorithm.
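In the standard form given by Chen and Guestrin [37], the regularized objective minimized at iteration t can be written as below. The notation is theirs and may differ from the authors' own: l is the loss, f_t is the tree added at step t, T is its number of leaves, w are its leaf weights, and γ and λ are regularization coefficients.

```latex
\mathcal{L}^{(t)} = \sum_{i=1}^{n} l\!\left(y_i,\; \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\,\lambda \lVert w \rVert^{2}
```

The first term measures the fit after adding the new tree, and the second penalizes tree complexity, which is the regularization referred to above.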
Monte Carlo simulation is one of the most popular validation techniques for verifying the reliability and generality of an ML model. When numerical prediction models are combined with Monte Carlo simulation, statistical analysis can be used to explain the variation in the outputs. With the Monte Carlo approach, numerical AI models [38] can be used to calculate the impact of input variability on the outputs. The goal of the Monte Carlo method in this study is to run simulations repeatedly at random while accounting for input space variability, and then use a machine learning model to determine the associated output [39]. Statistical performance criteria of the outputs can be used to assess the robustness of the Monte Carlo simulation and the sensitivity to the input variables. The normalized convergence criterion for the statistical convergence of the Monte Carlo simulation is as follows:
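A minimal sketch of such a simulation is given below. It assumes that each run draws a new random 70/30 split and refits the model, and it uses the running mean of R2 divided by the overall mean as the normalized convergence measure; both the measure and the synthetic data are assumptions made for illustration. Only 20 runs are used here to keep the example fast, whereas the study uses 3000.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the database: 200 samples, 12 inputs.
X = rng.random((200, 12))
y = 2.0 * X[:, 8] - X[:, 3] + rng.normal(0.0, 0.1, size=200)

n_runs = 20
scores = []
for run in range(n_runs):
    # Each run uses a different random partition of the same data.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.30, random_state=run
    )
    model = GradientBoostingRegressor(random_state=run).fit(X_tr, y_tr)
    scores.append(r2_score(y_te, model.predict(X_te)))

scores = np.array(scores)
running_mean = np.cumsum(scores) / np.arange(1, n_runs + 1)
normalized = running_mean / scores.mean()   # tends to 1 as the runs accumulate

print(f"testing R2: mean = {scores.mean():.4f}, StD = {scores.std():.4f}")
```

The mean summarizes performance and the standard deviation summarizes reliability: a model whose score changes little from split to split is less sensitive to how the data happen to be partitioned.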
Evaluating the performance of the ML models after construction is crucial. The constructed models are assessed using well-known performance indices: the coefficient of determination (R2), the Root Mean Square Error (RMSE), and the Mean Absolute Error (MAE). R2 is at most 1, with a greater value indicating better model performance; it can become negative when a model predicts worse than the mean of the observed values. For the two remaining indices, in contrast, smaller values indicate higher prediction accuracy. The expressions of these indices are as follows:
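The standard definitions of the three indices can be written compactly in code, as sketched below with illustrative values. The function names are chosen for this example.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

y_true = np.array([1.0, 2.0, 3.0, 4.0])
good = np.array([1.1, 1.9, 3.2, 3.8])   # close to the observations
bad = np.array([4.0, 3.0, 2.0, 1.0])    # reversed trend

print(r2(y_true, good), rmse(y_true, good), mae(y_true, good))
print(r2(y_true, bad))                  # negative: worse than predicting the mean
```

For any set of predictions, MAE never exceeds RMSE, because squaring gives more weight to large errors.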
Selecting the best ML model with the aid of Monte Carlo simulation
Using the Monte Carlo simulation, the performance and reliability of the ML models are evaluated with three indices: R2, RMSE and MAE. Fig. 3 displays the performance metrics for the six algorithms, namely Linear Regression (LR), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting (GB), and Extreme Gradient Boosting (XGB). These metrics are (a) the coefficient of determination (R2), (b) the root mean square error (RMSE), and (c) the mean absolute error (MAE). A total of 3000 simulations are run for each machine learning method. Fig. 3 also highlights the mean values and standard deviations (StD) over the 3000 Monte Carlo simulations.

Performance index of ML models with 3000 runs of Monte Carlo simulation (a) R2 value, (b) RMSE value and (c) MAE value for both training and testing part.
For the training portion, the lowest value of R2, obtained by the LR technique, is 0.5703, and the maximum value, obtained by the XGB technique, is 0.9998. For the testing datasets, the maximum R2 value obtained is 0.9553, and the minimum R2 value is negative, which denotes an ineffective machine learning model. The RMSE values vary from 0.008 to 0.6519 for the training parts and from 0.1184 to 0.9839 for the testing parts. The MAE values are between 0.0012 and 0.8151 for the training datasets and between 0.1895 and 1.7627 for the testing datasets.
The StD values of each case are also displayed in Fig. 3. The StD values are high for all machine learning models in the testing prediction, and the StD values of the training part are lower than those of the testing part for all models.
The histogram in Fig. 3 also shows that the R2 values of the three models RF, GB and XGB are greater than those of the three models LR, KNN and SVM. Furthermore, the RMSE and MAE values of the RF, GB and XGB models are significantly lower than those of the LR, KNN and SVM models. Therefore, the three machine learning techniques RF, GB and XGB perform better than the LR, KNN and SVM models when the default hyperparameters are used. In addition, the machine learning models using default hyperparameters perform well on the training part, but their performance on the testing part is lower than on the training part.
For more information, Table 3 displays the maximum, minimum, average, and standard deviation values for each of the six machine learning algorithms (LR, KNN, SVM, RF, GB, and XGB models) over 3000 trials. The findings of the testing portion show that the RF model’s performance on the criteria is far worse than that of the GB and XGB models.
Statistics of the performance indices of the 6 single ML models after 3000 runs of Monte Carlo simulation
Besides, Fig. 3 shows that the performance of the GB and XGB models is nearly identical. More precisely, the summary of performance criteria in Table 3 shows that the reliability of the GB model is higher than that of the XGB model. On the testing part, the GB model performs better than the XGB model on almost all criteria. The mean R2 over 3000 simulations obtained by the GB model (0.7337) is higher than that of the XGB model (0.7093). After 3000 simulations, the StD of R2 obtained by the GB model is also lower than that obtained by the XGB model. The mean and StD of the MAE values also confirm the higher accuracy of the GB model compared with the XGB model. The StD values of RMSE and MAE of the GB model are the lowest among all the machine learning techniques considered in this investigation. Consequently, the GB model is regarded as superior to the other machine learning models in this study, having the highest mean R2 value and the lowest mean and StD values of MAE. Therefore, in the following part, the unconfined compressive strength Qu of stabilized soil is predicted using the single GB model with default hyperparameters.
Representative prediction results of the GB model are shown in this section. Fig. 4 compares the experimental and GB-predicted unconfined compressive strength of stabilized soil for the training and testing datasets. The comparison demonstrates that the predicted unconfined compressive strength for the testing portion is consistent with the experimental results. The error histograms between predicted and experimental values for the training (Fig. 5a) and testing (Fig. 5b) datasets also support the excellent correlation results. The errors for the training and testing datasets are comparatively small. For the testing portion, only 8 of 60 error values fall outside the range [-0.25; 0.25] MPa, and for the training dataset, 3 error values fall outside the range [-0.5; 0.5] MPa. These error values demonstrate that the proposed Gradient Boosting (GB) model is a useful tool for predicting the unconfined compressive strength of stabilized soil.

True and GB-based value of unconfined compressive strength in two parts consisting of (a) training dataset and (b) testing dataset.

GB-based prediction error in two parts consisting of (a) training dataset and (b) testing dataset.
Evaluation of ML performance of this study and previous investigation of literature
Figure 6 displays the regression graphs for the training and testing portions. The GB model has strong predictive power: the coefficients of determination are R2 = 0.9692 for the training dataset and R2 = 0.9530 for the testing dataset. Therefore, a high degree of accuracy and minimal error may be achieved when using the GB model to predict the unconfined compressive strength of stabilized soil. The model could be suitable for creating a numerical tool for estimating the mechanical characteristics of stabilized soil.

Highest performance of the GB model for predicting Qu, represented by the correlation between experimental and predicted values.
The performance and reliability of the ML models are evaluated and summarized in Table 4. Table 4 shows that the performance and reliability of the single Gradient Boosting model using default hyperparameters are higher than those in the investigations of Das et al. [16] and Suman et al. [17], in which the performance of the Support Vector Machine with tuned hyperparameters was R2 = 0.8390, RMSE = 0.80, MAE = 1.82 and R2 = 0.8464, RMSE = 0.80, MAE = 1.82, respectively, for the testing dataset. Moreover, validation techniques were not considered in those two investigations, which reduces the reliability of the Support Vector Machine models they proposed. The performance of the single Gradient Boosting model in this study is close to that obtained by the hybrid ML model Gradient Boosting-Particle Swarm Optimization (GB_PSO) of Tran [1]. Validation techniques are used both in this study and in Tran’s study [1], although the techniques applied differ, which supports the reliability of the developed ML models. As mentioned in the introduction, a hybrid ML model complicates the training process and makes ML approaches less accessible to engineers for predicting the unconfined compressive strength of stabilized soil. Therefore, the close agreement between the performance of the simple GB model and that of the hybrid GB_PSO model is meaningful for engineering practice in predicting and designing the unconfined compressive strength Qu of stabilized soil.
ML models are a reliable approach for solving complicated problems. One of the most serious criticisms of machine learning applications in civil engineering, however, is that the models are frequently black boxes that, unlike a mechanics-based or even an empirical regression-based model, provide no insights [40]. This is a fair criticism, because models that are clear about input-output relationships and sensitivities are essential for uncertainty quantification, for future experimental or analytical model improvement efforts, and for deciding when the performance of ML models is insufficient for them to be a reliable tool [41]. In this study, the outcomes of the ML models are explained using feature importance analysis, including permutation importance, two-dimensional partial dependence plots and local SHapley Additive exPlanations values [42].
Figure 7 shows the permutation importance analysis, carried out with the aid of the Sklearn library [18], for the 12 input variables influencing the value of Qu predicted by the single GB model. The features rank in importance as Maximal dry density (Density) > LS > Sand > Gravel > PL > Clay > OMC > PI > LL > Lime > Cement > Bitumen. The finding that Density and Linear shrinkage are the most important input variables, and that the chemical stabilization agents (cement, lime and bitumen) are the least important, is consistent with the feature importance analysis in Tran [1]. It is worth noting that the cement content and lime content have a less significant effect on the UCS of stabilized soil than the compaction process (Optimum moisture content and Maximal dry density) and the initial properties of the soil. The effect of binder content, typically cement content and lime content, is described more clearly by the local SHAP values (cf. Figs. 10 and 11).

Permutation importance analysis including 12 input variables on predicted value of output “Qu” compressive strength of stabilized soil.
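A minimal sketch of a permutation importance analysis with Sklearn is shown below. The three-feature synthetic dataset and its coefficients are assumptions chosen so that the expected ranking is obvious; they are not taken from the database.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
names = ["Density", "LS", "Cement"]   # illustrative feature names

# Synthetic output: strong effect of column 0, moderate of 1, tiny of 2.
X = rng.random((200, 3))
y = 3.0 * X[:, 0] - 1.0 * X[:, 1] + 0.05 * X[:, 2] + rng.normal(0.0, 0.05, size=200)

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Shuffle one column at a time and record how much the score drops.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

order = np.argsort(result.importances_mean)[::-1]
ranking = [names[i] for i in order]
for i in order:
    print(f"{names[i]:8s} {result.importances_mean[i]:.4f}")
```

A feature whose shuffling barely changes the score contributes little to the predictions of this particular model, which does not by itself mean that the feature has no physical effect.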
Among the four features LS, Maximal dry density, cement and lime, only linear shrinkage is an initial property of the soil, i.e. a pre-stabilization input, whereas the maximal dry density, cement and lime act directly on the unconfined compressive strength Qu of stabilized soil, i.e. they are post-stabilization inputs. Moreover, the variables “Maximal dry density” and “OMC” are the most important criteria of the compaction process in soil stabilization. It is therefore of interest to examine four feature interactions: Density-OMC, Density-LS, Lime-LS and Cement-LS. The interaction effects are evaluated by two-dimensional Partial Dependence Plots (PDP 2D), and the results are shown in Figs. 8 and 9. The effects of the feature interactions are presented as contour grids whose values represent the unconfined compressive strength Qu. As density increases from 1.52 g/cm3 to 2.12 g/cm3, Qu increases from 2.175 MPa to 3.423 MPa at an OMC of 5.98%, and from 2.094 MPa to 3.583 MPa at an OMC of 22.73%. A higher density implies that a higher content of chemical stabilizing agents must be used or that more compaction must be applied, which increases the unconfined compressive strength Qu of stabilized soil. Compared with density, OMC has less influence on Qu (cf. Fig. 8a).

Partial Dependence Plot in two dimensions showing (a) Density-OMC interaction effect on Qu (MPa) and (b) Density-LS interaction effect on Qu (MPa) of stabilized soil.

Partial Dependence Plot in two dimensions showing (a) Lime-LS interaction effect on Qu (MPa) and (b) Cement-LS interaction effect on Qu (MPa) of stabilized soil.
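The quantity plotted in a PDP 2D can be computed directly from its definition: fix two features at a pair of grid values for every sample, predict, and average. The sketch below does this by hand; `toy_predict` is a hypothetical stand-in for the fitted GB model, and its coefficients are assumptions.

```python
import numpy as np

def partial_dependence_2d(predict, X, i, j, grid_i, grid_j):
    """Average prediction when features i and j are forced to each grid pair."""
    pdp = np.empty((len(grid_i), len(grid_j)))
    for a, value_i in enumerate(grid_i):
        for b, value_j in enumerate(grid_j):
            X_mod = X.copy()
            X_mod[:, i] = value_i      # overwrite feature i for all samples
            X_mod[:, j] = value_j      # overwrite feature j for all samples
            pdp[a, b] = predict(X_mod).mean()
    return pdp

def toy_predict(X):
    # Hypothetical model: Qu rises with density (col 0) and falls with LS (col 1).
    return 2.0 * X[:, 0] - 0.04 * X[:, 1] + 0.1 * X[:, 2]

rng = np.random.default_rng(0)
X = np.column_stack([
    rng.uniform(1.52, 2.12, 100),   # density (g/cm3)
    rng.uniform(1.0, 16.5, 100),    # linear shrinkage (%)
    rng.uniform(0.0, 6.0, 100),     # a third, unplotted feature
])

grid_density = np.linspace(1.52, 2.12, 4)
grid_ls = np.linspace(1.0, 16.5, 5)
pdp = partial_dependence_2d(toy_predict, X, 0, 1, grid_density, grid_ls)
print(np.round(pdp, 3))
```

Each cell of the resulting grid is the value that would be read off a contour map for the corresponding pair of inputs, with the remaining features averaged over the data.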
Meanwhile, as linear shrinkage LS increases from 1 to 16.5%, Qu decreases from 2.389 MPa to 2.049 MPa at a density of 1.52 g/cm3 and from 4.192 MPa to 3.552 MPa at a density of 2.12 g/cm3 (cf. Fig. 8b). It can be concluded that the dependence of Qu on linear shrinkage is more significant than its dependence on OMC; however, the dominance of density over LS and OMC is clearly observed. As LS increases, the unconfined compressive strength Qu decreases.
The chemical stabilization agents lime and cement can improve the unconfined compressive strength; however, Qu depends more strongly on LS than on these agents (cf. Fig. 9). When LS is equal to 1%, Qu increases from 3.278 MPa to 3.288 MPa (an increase of 0.3%) as lime content varies from 0 to 6%, and from 3.249 MPa to 3.319 MPa (an increase of 2.2%) as cement content varies from 0 to 6%. When LS is equal to 16.5%, Qu increases from 2.625 MPa to 2.658 MPa (an increase of 1.3%) as lime content varies from 0 to 6%, and from 2.594 MPa to 2.666 MPa (an increase of 2.8%) as cement content varies from 0 to 6%. In contrast, at a fixed lime content of 6%, Qu decreases from 3.288 MPa to 2.658 MPa (a decrease of 19.2%) as LS varies from 1 to 16.5%; at a fixed cement content of 6%, the same variation of LS decreases Qu from 3.319 MPa to 2.666 MPa (a decrease of 19.7%). Overall, the compaction process (density, OMC) is more significant for the stabilization of soil than the chemical stabilization agents.
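The relative changes follow from the Qu values read off the PDP 2D grids. The short calculation below recomputes them from the start and end values quoted in the text; the dictionary keys are descriptive labels added for this example.

```python
def percent_change(start, end):
    """Relative change from start to end, in percent."""
    return 100.0 * (end - start) / start

# (start Qu, end Qu) in MPa, as quoted for Fig. 9.
cases = {
    "lime 0->6%, LS = 1%": (3.278, 3.288),
    "cement 0->6%, LS = 1%": (3.249, 3.319),
    "lime 0->6%, LS = 16.5%": (2.625, 2.658),
    "cement 0->6%, LS = 16.5%": (2.594, 2.666),
    "LS 1->16.5%, lime = 6%": (3.288, 2.658),
    "LS 1->16.5%, cement = 6%": (3.319, 2.666),
}

changes = {label: percent_change(a, b) for label, (a, b) in cases.items()}
for label, value in changes.items():
    print(f"{label:26s} {value:+.1f}%")
```

The binder-driven changes are all below 3%, whereas the LS-driven changes are close to 20% in magnitude.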
The effectiveness of the compaction process is illustrated by the specific cases shown in Figs. 10 and 11, which give the local SHAP values explaining the contribution of each feature to the Qu value predicted by the GB model. Four specific cases, with predicted vs experimental Qu values of 3.769 MPa vs 3.4 MPa, 2.554 MPa vs 2.62 MPa, 2.128 MPa vs 2.2 MPa, and 2.12 MPa vs 2.0 MPa, are explained by the local SHAP values in Figs. 10a, 10b, 11a, and 11b, respectively. Figures 10a and 10b show that, for similar LS values (5.4% and 6.2%), more compaction is needed in order to improve Qu, as represented by a lower OMC (9% versus 11.4%) and a higher maximal dry density (2.1 g/cm3 versus 1.69 g/cm3). In these cases, the contribution of the chemical stabilization agents (cement and lime) to Qu is insignificant. This is clearly observed when comparing the two specific cases in Fig. 11. The Qu of stabilized soil with 2% lime content is slightly higher than that with 6% lime content (2.128 MPa versus 2.12 MPa predicted and 2.2 MPa versus 2.0 MPa experimental). The Qu value appears to be governed by the LS and maximal dry density values, which are 4.5% and 1.78 g/cm3 (cf. Fig. 11a) and 5.7% and 1.93 g/cm3 (cf. Fig. 11b); increasing the maximal dry density can therefore compensate for the reduction in Qu caused by a higher LS. The four Qu values predicted by the GB model can be estimated easily and with high accuracy from the PDP 2D of Maximal dry density-LS (cf. Fig. 8b). Moreover, cement and lime need to be used at a high content, such as 6%, in order to have a more significant effect on the UCS of stabilized soil than initial soil properties such as clay content (cf. Figs. 10a and 11b).

Local SHAP value for explaining two specific cases (a) LS = 5.4, Density = 2.1, OMC = 9 and (b) LS = 6.2, Density = 1.69, OMC = 11.4.

Local SHAP value for explaining two specific cases (a) LS = 4.5, Density = 1.78, OMC = 18.5, (b) LS = 5.7, Density = 1.93, OMC = 7.9.
The PDP 2D of Maximal dry density vs Linear shrinkage helps engineers to easily estimate the unconfined compressive strength of stabilized soil when two important quantities are known: the initial soil property “Linear shrinkage” and the required compaction target “Maximal dry density”. This facilitates the design of the compressive strength of stabilized soil.

Comparison between actual Qu and predicted Qu by GB model using 6 input variables: (a) Training dataset, (b) Validation dataset, (c) Testing dataset and (d) All dataset.
According to the results of the explainable machine learning analysis in the previous section (permutation importance, PDP 2D and local SHAP), the reduced database retains the most important factors, namely the compaction process (Maximal dry density and OMC) and Linear shrinkage (LS), and omits the least important factors, namely the chemical stabilizer contents (lime, cement and bitumen). Moreover, in order to increase applicability in engineering practice, the grain size distribution variables (clay, sand and gravel contents) are also excluded from the database. The number of features is therefore reduced from 12 to 6 input variables: 4 initial soil properties, namely Linear shrinkage and the Atterberg limits (Liquid limit (LL), Plastic limit (PL) and Plasticity index (PI = LL - PL)), and 2 compaction variables (Maximal dry density and OMC).
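The reduced input set can be assembled from five measured quantities, since the plasticity index follows from the two limits. The sketch below is illustrative only: the class name `SoilSample`, the column order in `FEATURE_ORDER` and the LL and PL values of the example are assumptions, while LS, density and OMC are those of the case in Fig. 10a.

```python
from dataclasses import dataclass

# Assumed column order of the 6-variable model input (hypothetical).
FEATURE_ORDER = ("LS", "LL", "PL", "PI", "MDD", "OMC")


@dataclass(frozen=True)
class SoilSample:
    liquid_limit: float      # LL, %
    plastic_limit: float     # PL, %
    linear_shrinkage: float  # LS, %
    max_dry_density: float   # maximal dry density, g/cm3
    omc: float               # optimum water content, %

    @property
    def plasticity_index(self) -> float:
        """Plasticity index, PI = LL - PL."""
        return self.liquid_limit - self.plastic_limit

    def to_features(self) -> list:
        """Return the 6 inputs in FEATURE_ORDER."""
        return [
            self.linear_shrinkage,
            self.liquid_limit,
            self.plastic_limit,
            self.plasticity_index,
            self.max_dry_density,
            self.omc,
        ]


# LL and PL below are hypothetical values chosen for illustration.
sample = SoilSample(
    liquid_limit=30.0,
    plastic_limit=18.0,
    linear_shrinkage=5.4,
    max_dry_density=2.1,
    omc=9.0,
)
features = sample.to_features()
print(dict(zip(FEATURE_ORDER, features)))
```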
The database is divided into training, validation and test sets in a 70/20/10 ratio (140/40/20 samples), following Brownlee [43]. The Qu predictions of the GB model are presented in Fig. 12.
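A 70/20/10 partition of 200 samples can be sketched as follows. The use of a random shuffle, the fixed seed and the function name `split_indices` are assumptions for illustration; the text does not state how the samples were assigned to each set.

```python
import random


def split_indices(n_samples, ratios=(0.7, 0.2, 0.1), seed=0):
    """Shuffle sample indices and cut them into train/validation/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle (assumed)
    n_train = round(ratios[0] * n_samples)
    n_val = round(ratios[1] * n_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # remainder goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(200)
print(len(train_idx), len(val_idx), len(test_idx))
```

Assigning the remainder to the test set guarantees that every sample is used exactly once, even when the ratios do not divide the sample count evenly.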
Fig. 12 compares the actual Qu with the Qu predicted by the GB model using 6 input variables for (a) the training dataset, (b) the validation dataset, (c) the testing dataset and (d) the whole dataset. Even with the number of features reduced to 6, namely 4 soil properties (LS, LL, PL and PI) and 2 compaction variables (Maximal dry density and OMC), the GB model predicts the Qu of stabilized soil with high accuracy: R2 = 0.9456, RMSE = 0.2178 MPa and MAE = 0.1293 MPa for the training dataset; R2 = 0.9113, RMSE = 0.2719 MPa and MAE = 0.2063 MPa for the validation dataset; R2 = 0.9043, RMSE = 0.2999 MPa and MAE = 0.1224 MPa for the testing dataset; and R2 = 0.9347, RMSE = 0.2387 MPa and MAE = 0.1542 MPa for the whole dataset. Therefore, an Excel file is generated from the GB model with the aid of Python code to help engineers estimate the UCS of stabilized soil from these 6 input variables. The Excel file can be found in the supplementary file.
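The three evaluation metrics can be computed as sketched below, assuming the usual definitions (R2 as one minus the ratio of residual to total sum of squares). As a worked example, the inputs are the four predicted and experimental Qu values discussed with Figs. 10 and 11; four points are far too few to reproduce the dataset-level metrics reported above, so the result only illustrates the calculation. The function name `regression_metrics` is illustrative.

```python
import math


def regression_metrics(actual, predicted):
    """Return R2, RMSE and MAE for paired actual/predicted values."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(r * r for r in residuals)            # residual sum of squares
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(r) for r in residuals) / n,
    }


# Experimental and predicted Qu (MPa) of the cases in Figs. 10a, 10b, 11a, 11b.
actual = [3.4, 2.62, 2.2, 2.0]
predicted = [3.769, 2.554, 2.128, 2.12]

metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

RMSE and MAE carry the unit of Qu (MPa), whereas R2 is dimensionless. RMSE penalizes large individual errors more heavily than MAE, which is why the two can rank datasets differently.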
In this investigation, six well-known single machine learning (ML) algorithms, namely Linear Regression, K-Nearest Neighbors, Support Vector Machine, Random Forest, Gradient Boosting and Extreme Gradient Boosting, were used to build a high-performance and reliable ML model for designing the unconfined compressive strength (Qu) of stabilized soil. Using Monte Carlo simulation as the validation technique, aided by three evaluation metrics (R2, RMSE and MAE), this study shows that the simple Gradient Boosting (GB) algorithm with default hyperparameters yields the model with the highest performance and reliability in designing the unconfined compressive strength of stabilized soil. For the highest-performance GB model, the R2, RMSE and MAE values are 0.9530, 0.2001 MPa and 0.1504 MPa, respectively, for the testing dataset.
The predicted unconfined compressive strength is explained by three popular sensitivity analysis techniques: permutation importance, two-dimensional Partial Dependence Plots (PDP 2D) shown as contour grids, and local SHAP values. The sensitivity analyses show that Linear shrinkage is the most important initial soil property to know when designing the unconfined compressive strength. To improve the unconfined compressive strength successfully, compaction needs to be carried out extensively to reach the required Maximal dry density. The map of unconfined compressive strength as a function of Maximal dry density and Linear shrinkage created by PDP 2D is very useful for engineers in the preliminary design of the unconfined compressive strength. The Excel file generated from the Gradient Boosting model helps engineers to estimate the UCS of stabilized soil from 6 input variables: 4 soil properties (LS, PL, LL and PI) and 2 compaction variables (Maximal dry density and OMC). The application of the ML model of this investigation should be limited to the min-max range of each variable defined in Table 2.
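The range restriction can be enforced with a simple guard before the model is queried. The sketch below is an assumption-laden illustration: the function name `out_of_range` is hypothetical, and the numbers in `PLACEHOLDER_BOUNDS` are placeholders that must be replaced by the min-max values of Table 2.

```python
def out_of_range(inputs, bounds):
    """Return the names of inputs that fall outside their [min, max] bounds."""
    violations = []
    for name, value in inputs.items():
        lower, upper = bounds[name]
        if not (lower <= value <= upper):
            violations.append(name)
    return violations


# PLACEHOLDER bounds only (NOT the values of Table 2); replace before use.
PLACEHOLDER_BOUNDS = {
    "LS": (1.0, 16.5),    # %
    "LL": (10.0, 80.0),   # %
    "PL": (5.0, 50.0),    # %
    "PI": (0.0, 50.0),    # %
    "MDD": (1.2, 2.4),    # g/cm3
    "OMC": (5.0, 30.0),   # %
}

# LL, PL and PI below are hypothetical values chosen for illustration.
query = {"LS": 5.4, "LL": 30.0, "PL": 18.0, "PI": 12.0, "MDD": 2.1, "OMC": 9.0}

violations = out_of_range(query, PLACEHOLDER_BOUNDS)
if violations:
    print("Outside the training range:", ", ".join(violations))
else:
    print("All inputs lie within the training range.")
```

Tree-based models such as Gradient Boosting return a constant prediction beyond the range of the training data, so a query outside these bounds would produce a plausible-looking but unsupported Qu value rather than an error.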
Footnotes
Conflict of interest
The authors declare that there is no conflict of interest.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Availability of data and material
Data will be made available on request.
