Abstract
When making predictions and inferences, data analysts often face the challenge of selecting the best model among competing models, because a large number of regressors generates a large model space. Bayesian model averaging (BMA) is a technique designed to account for the uncertainty inherent in the model selection process. In Bayesian analysis, the choice of prior distribution is a delicate issue, and posterior model probabilities (PMP) under model uncertainty are typically sensitive to the specification of the prior distribution. This research identified a set of eleven candidate default priors (Zellner’s g-priors) prominent in the literature and applicable in Bayesian model averaging. A new robust g-prior specification for regression coefficients in Bayesian model averaging is investigated, and its predictive performance is assessed alongside the other g-prior structures in the literature. The predictive abilities of these g-prior structures are assessed using log predictive scores (LPS) and log marginal likelihood (LML). The sensitivity of posterior results to the choice of g-prior structure was demonstrated using simulated data and real-life data. The simulated data, drawn from a multivariate normal distribution, were first used as generated to demonstrate the predictive performance of the g-prior structures and were then contaminated and used for the same purpose. Similarly, the real-life data were analysed both after normalization and as obtained. Empirical findings reveal that, under different conditions, the new g-prior structure exhibited robust, competitive and consistent predictive ability when compared with the g-prior structures identified from the literature. The new g-prior offers a sound, fully Bayesian approach that features the virtues of prior input and predictive gains that minimise the risk of misspecification.
Keywords
Introduction
In regression analysis, picking a single model among competing models tends to ignore the uncertainty associated with the specification of the selected model, which results in an overstatement of the strength of evidence via
Prior distributions play a crucial role in Bayesian probability theory, and it is attractive to have conditional distributions that have a closed form under sampling (Okafor, 1999; Lee, 2004). Zellner (1983, 1986) proposed a procedure for evaluating a conjugate prior distribution referred to as Zellner’s informative g-prior, or simply the g-prior. The g-prior has been widely used in Bayesian analysis of multiple regression models because analytical results are more readily available, it is computationally efficient and it has a simple interpretation (Davison, 2008). The benchmark g-prior structure has proven universally popular in BMA, since it leads to simple closed-form expressions of posterior quantities and because it reduces prior elicitation to the choice of a single hyperparameter
Methodology
Bayesian model averaging
Bayesian Model Averaging (BMA) is a technique designed to help account for the uncertainty inherent in the model selection process; it focuses on which regressors to include in the analysis. By averaging across a large set of models, one can determine which variables are relevant to the data-generating process for a given set of priors used in the analysis (Hoeting et al., 1999). Given a linear regression model with constant term
This gives rise to 2
BMA uses each model’s posterior probability,
The posterior model probability of
where
and
The estimated posterior means and standard deviations of
The Bayesian framework calls for specifying a prior distribution on the model’s parameters
where
that is partly determined by the scalar hyperparameter
where
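For orientation, the BMA setup with Zellner’s g-prior is commonly written in the literature along the following lines. This is a sketch of the standard formulation only: the symbols used here ($K$ regressors, model $M_\gamma$, coefficients $\beta_\gamma$, OLS estimate $\hat{\beta}_\gamma$) are assumptions and may differ from the notation of the equations in this section.

```latex
% Linear model for a subset X_\gamma of the K candidate regressors
y = \alpha_\gamma + X_\gamma \beta_\gamma + \varepsilon,
\qquad \varepsilon \sim N(0, \sigma^2 I)

% Posterior model probability over the 2^K candidate models
p(M_\gamma \mid y, X) =
  \frac{p(y \mid M_\gamma, X)\, p(M_\gamma)}
       {\sum_{s=1}^{2^K} p(y \mid M_s, X)\, p(M_s)}

% Model-averaged posterior of a quantity of interest \theta
p(\theta \mid y, X) =
  \sum_{\gamma=1}^{2^K} p(\theta \mid M_\gamma, y, X)\, p(M_\gamma \mid y, X)

% Zellner's g-prior on the coefficients of model M_\gamma
\beta_\gamma \mid g \sim
  N\!\left(0,\; \sigma^2 g \,(X_\gamma' X_\gamma)^{-1}\right)

% Resulting posterior mean: the OLS estimate shrunk by g/(1+g)
E(\beta_\gamma \mid y, X, g, M_\gamma) = \frac{g}{1+g}\, \hat{\beta}_\gamma
```

Under this formulation a small $g$ shrinks the coefficients strongly towards zero, while a large $g$ lets the posterior mean approach the OLS estimate.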
Summary of Identified g-prior Structures Examined
The elicitation of
BMA inference thus hinges on posterior model probabilities and, in turn, on model priors
This research identified a set of eleven candidate default priors (Zellner’s informative g-prior that is based on a sample of
To assess the predictive ability of the g-priors, predictive criteria like Log Predictive Score (LPS) and Log Marginal Likelihood (LML) were employed.
Log predictive score (LPS)
The Log Predictive Score (LPS) assesses both the sharpness of a predictive distribution and the statistical consistency between the distributional forecasts and the observations (Kadane & Lazar, 2004). The analysis requires the splitting of the data set into a training set
The predictive ability of any model is measured by the sum of the logarithm of the posterior predictive ordinates for the observations in the hold-out set. The log score for any given model is the observed coordinate of the predictive density
where
The log predictive score is a proper scoring rule for assessing predictive performance and a smaller value of LPS makes a Bayes model a prior choice for
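The computation of a log predictive score can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes that each model's predictive density for a hold-out observation is Gaussian (a simplification), that the BMA predictive density is the mixture of these densities weighted by posterior model probabilities, and that the score is reported as the negative average log density so that smaller values indicate better predictions. The function names and arguments are hypothetical.

```python
import numpy as np


def gaussian_logpdf(y, mean, sd):
    """Log density of N(mean, sd^2) evaluated at y (elementwise)."""
    return -0.5 * np.log(2.0 * np.pi * sd ** 2) - 0.5 * ((y - mean) / sd) ** 2


def log_predictive_score(y_holdout, means, sds, weights):
    """Negative mean log predictive density over the hold-out set.

    y_holdout : shape (n_holdout,)            observed hold-out responses
    means, sds: shape (n_models, n_holdout)   per-model predictive mean and sd
    weights   : shape (n_models,)             posterior model probabilities
                                              (positive, summing to 1)
    ASSUMPTION: Gaussian per-model predictive densities.
    """
    y_holdout = np.asarray(y_holdout, dtype=float)
    means = np.asarray(means, dtype=float)
    sds = np.asarray(sds, dtype=float)
    log_w = np.log(np.asarray(weights, dtype=float))[:, None]

    # log of each weighted model density at each hold-out point
    a = log_w + gaussian_logpdf(y_holdout[None, :], means, sds)

    # log-sum-exp over models gives the log of the mixture density
    m = a.max(axis=0)
    log_mixture = m + np.log(np.exp(a - m).sum(axis=0))

    return -log_mixture.mean()
```

A sharp predictive distribution centred on the observed value yields a lower score than a diffuse one, which is the behaviour the criterion is meant to reward.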
The marginal likelihood or the model evidence is the probability of observing the data given a specific model and is defined as:
If we have two models
For more than two models, we can compute the marginal likelihoods of each and ask which among the set is the largest. The fundamental quantity in Bayesian model comparison is the marginal likelihood (sometimes also called the “evidence”), which is simply the likelihood of the data integrated over all parameter choices.
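For a fixed value of g, the comparison of marginal likelihoods across models can be sketched as below. The sketch uses the closed-form Bayes factor of each model against the intercept-only model under Zellner's g-prior, as given by Liang et al. (2008), so the returned values are log marginal likelihoods up to a constant shared by all models. A uniform prior over models and full enumeration of the model space are assumptions of this example, and the function names are hypothetical.

```python
import itertools

import numpy as np


def r_squared(y, X):
    """R^2 of an OLS fit of y on X with an intercept (0 for the null model)."""
    yc = y - y.mean()
    if X.shape[1] == 0:
        return 0.0
    Xc = X - X.mean(axis=0)
    beta, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    resid = yc - Xc @ beta
    return 1.0 - (resid @ resid) / (yc @ yc)


def log_marginal_likelihood(y, X, g):
    """Log Bayes factor of the model with regressors X against the null model.

    Closed form under a fixed-g Zellner prior:
    0.5*(n-1-k)*log(1+g) - 0.5*(n-1)*log(1 + g*(1-R^2)).
    """
    n, k = X.shape
    r2 = r_squared(y, X)
    return 0.5 * (n - 1 - k) * np.log1p(g) - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2))


def posterior_model_probabilities(y, X, g):
    """Enumerate all 2^K models; ASSUMES a uniform prior over models."""
    K = X.shape[1]
    models = [c for size in range(K + 1)
              for c in itertools.combinations(range(K), size)]
    lml = np.array([
        log_marginal_likelihood(y, X[:, np.array(m, dtype=int)], g)
        for m in models
    ])
    w = np.exp(lml - lml.max())  # subtract the maximum for numerical stability
    return models, lml, w / w.sum()
```

Calling `posterior_model_probabilities(y, X, g)` with different values of `g` shows directly how the ranking of models, and hence the posterior model probabilities, responds to the choice of the hyperparameter. Full enumeration is only feasible for a small number of regressors.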
Predictive ability under different choices of g-priors examined using log predictive scores (LPS)
Predictive ability under different choices of g-priors examined using log marginal likelihood (LML)
The effects of the set of g-priors were examined using a simulated dataset drawn from multivariate normal distributions; the simulated data were later contaminated using a chi-square distribution with 2 degrees of freedom (level of contamination
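A simulation of this kind can be sketched as follows. The details are assumptions made purely for illustration: the equicorrelated covariance matrix, the correlation `rho`, the contamination fraction `level`, and the mechanism of replacing whole rows with chi-square draws are hypothetical choices, not the settings used in the study.

```python
import numpy as np


def simulate_regressors(n, K, rng, rho=0.5):
    """Draw n observations of K regressors from a multivariate normal.

    ASSUMPTION: zero means and an equicorrelated covariance matrix
    (unit variances, correlation rho between every pair of regressors).
    """
    cov = np.full((K, K), rho) + (1.0 - rho) * np.eye(K)
    return rng.multivariate_normal(np.zeros(K), cov, size=n)


def contaminate(X, level, rng, df=2):
    """Replace a fraction `level` of the rows with chi-square(df) draws.

    ASSUMPTION: contamination by row replacement; `level` is a
    hypothetical parameter. Returns the contaminated copy and the
    indices of the replaced rows.
    """
    Xc = X.copy()
    n, K = X.shape
    n_bad = int(round(level * n))
    rows = rng.choice(n, size=n_bad, replace=False)
    Xc[rows] = rng.chisquare(df, size=(n_bad, K))
    return Xc, rows
```

Running the same BMA analysis on `X` and on the contaminated copy gives a simple check of how robust each g-prior structure is to departures from normality.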
An overview of the results using simulated data, as reported in Table 2, shows that the LPS for the twelfth g-prior structure is the lowest across the different model spaces.
An overview of the results using simulated data, as reported in Table 3, shows that the LML for the twelfth g-prior structure is the highest across the different model spaces.
Comparing the predicted values of the 72nd observation (dependent variable) with its actual value using log predictive scores (LPS) across parameter g-priors
Comparing the predicted values of the 72
Results using data provided by FLS
The effects of the set of g-priors were examined using the datasets provided by FLS (Fernandez et al., 2001a), which are prominent in the BMA literature. The analysis was based on
The results from Table 4 above show that the actual value of the dependent variable of the 72
Published real life data
The real life data for the implementation of the sensitivity and predictive performance of identified g-priors and proposed new g-prior structures were obtained from two sources:
(a) The National Bureau of Statistics (NBS), Nigeria: 2012 Annual Reports on all official statistics on socio-economic and macro-economic indicators, and on the various machinery and tools that have been brought to bear in improving the efficiency and reliability of official statistics. Data frames comprise
(b) Bulletin of Statistics, Statistics South Africa, Vol. 43. 2: 2009 Annual Reports on national accounts, including gross domestic product (GDP), percentage change in the quarterly GDP by industry and other economic indicators. Data frames comprise
Predictive ability under different choices of g-priors examined using both log predictive scores (LPS) and log marginal likelihood (LML) for the normalised real life data (a) and (b)
Predictive ability under different choices of g-priors examined using both log predictive scores (LPS) and log marginal likelihood (LML) for the un-normalised real life data (a) and (b)
Normalizing transformations were applied to the data sets to make them multivariate normal; this was achieved by standardizing the data and removing influential and extreme observations. For real-life data (a) and (b), the predictive abilities under the different choices of g-priors were examined and compared using Log Predictive Scores (LPS) and Log Marginal Likelihood (LML) (see Tables 5 and 6).
An overview of the results reported in Table 5 shows that, using normalized real-life data (a) and (b), the LPS for the twelfth g-prior structure is the lowest across the different model spaces. Similarly, the LML for the twelfth g-prior structure is the highest across the different model spaces.
An overview of the results reported in Table 6 shows that, using un-normalized real-life data (a) and (b), the LPS for the twelfth g-prior structure is the lowest across the different model spaces. Similarly, the LML for the twelfth g-prior structure is the highest across the different model spaces.
Conclusion
The study demonstrated that fixing g to arbitrary values may have unintended consequences for posterior model probabilities and, subsequently, for the predictive ability of the g-prior. Given a huge model space, whether with more observations than regressors or with fewer observations than regressors, reliable results were obtained using the new g-prior structure. This study complements the contributions of Fernandez et al. (2001), Eicher et al. (2007) and Liang et al. (2008), as it establishes a new robust g-prior structure that exhibits consistent, competitive and reliable predictive performance compared with other parameter g-priors suggested in the literature, and it further provides closed-form representations for posterior quantities under BMA analysis.
As noted in section three, the study assessed the predictive performance of the g-priors in BMA under different conditions, using simulated data and real-life data that either follow a normal distribution or do not. The empirical results tend to favour g-specifications that ascribe values between 0 and 100 to g, which effectively select the right models. Thus, the results demonstrated that fixing g-prior values runs the risk of overstating or understating the importance of some variables (i.e. the posterior inclusion probabilities of the regressors). In conclusion, the new g-prior [
