Abstract
In hospitality and tourism research, p-values continue to be the most common approach to hypothesis testing. In this article, we elaborate on some of the misconceptions associated with p-values. We discuss the advantages of the Bayesian approach and provide several practical recommendations and considerations for Bayesian hypothesis testing. Because the main challenge of Bayesian hypothesis testing is the sensitivity of the results to prior distributions, we present several priors that can be used for that purpose and illustrate their performance in a regression context.
Introduction
In hospitality and tourism research, and in related fields such as marketing and management, it is common to use p-values to test research claims or hypotheses. For the most part, p-values have been the sole criterion used to support or reject the hypotheses of interest. The American Statistical Association has recently issued a statement regarding a number of logical and practical limitations associated with p-values. The statement was mainly geared toward the misinterpretation and abuse of p-values, emphasizing the fact that “p-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone,” and that “scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold” (Wasserstein & Lazar, 2016, p. 131).
Importantly, p-values “do not measure the size of an effect or the importance of a result” and “do not provide a good measure of evidence regarding a model or hypothesis.” In a recent article, Assaf and Tsionas (2018) discussed these issues in detail and provided simulation evidence on the “dancing p-values” phenomenon, illustrating specifically how p-values can fluctuate across samples testing the same hypothesis. The authors emphasized the importance of using Bayes factors (BF) or marginal posterior densities as an attractive alternative to p-values. The Bayesian approach, which is gaining popularity across marketing and management (Zyphur & Oswald, 2015), allows researchers to provide quantitative evidence for any hypothesis, including the null hypothesis. Importantly, the Bayesian approach relies only on the data at hand and not on imaginary repeated samples that are not usually observed.
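The fluctuation of p-values across samples can be illustrated with a small simulation. The sketch below is not the simulation design of Assaf and Tsionas (2018); it assumes a simple two-sided z-test of a zero mean with known unit variance, and the function names, effect size, and sample size are hypothetical choices made only for illustration.

```python
import math
import random


def two_sided_p_value(sample, sigma=1.0):
    """Two-sided z-test p-value for H0: mean = 0, with known sigma."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate_p_values(true_mean, n, replications, seed=0):
    """Draw `replications` samples of size n; return one p-value per sample."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(replications):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        p_values.append(two_sided_p_value(sample))
    return p_values


if __name__ == "__main__":
    # Same true effect (0.3) and same sample size (40) in every replication
    # (both values are assumptions for this illustration).
    p_values = simulate_p_values(true_mean=0.3, n=40, replications=20, seed=1)
    print(" ".join(f"{p:.3f}" for p in sorted(p_values)))
    share = sum(p < 0.05 for p in p_values) / len(p_values)
    print(f"share below 0.05: {share:.2f}")
```

Every replication tests the same hypothesis under the same true effect, so any spread in the printed p-values reflects sampling variation alone.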
The goal of this note is to expand on the issues raised by Assaf and Tsionas (2018), further emphasizing the advantages of using the Bayesian approach for hypothesis testing and model comparison in hospitality and tourism research. We provide several important practical recommendations for future research.
Bayesian Hypothesis Testing
Background
In this note, we do not intend to provide a detailed background of the Bayesian approach, as this has been explained in detail elsewhere (see Assaf & Tsionas, 2018; Zyphur & Oswald, 2015). Generally, the Bayesian analysis proceeds as follows. Suppose we have a model with parameters
The denominator is also
Suppose now that we have two models, say A and B, with parameters
where
In the case of two parameters, viz.
For a Bayesian, one of the most critical parts in hypothesis testing is to look directly at the marginal posterior densities of the parameters of interest. Suppose we have a general model whose parameters are
In general, this multivariate integral is not available in closed form. To compute an approximation, notice that:
from which we obtain:
The important feature is that this holds identically for all
The numerator can be computed easily in certain instances, but the denominator is unknown. If we assume that
In this formula,
Advantages and Challenges
As Equation 2 shows, for example, and in contrast to the p-value approach, the Bayesian approach provides the researcher with several important advantages. First, the “Bayes factor is inherently comparative: It weighs the support for one model against that of another” (Andraszewicz et al., 2015, p. 529). It can be highly effective in the context of model comparison when the number of potential regressors is large and the objective is to select a parsimonious specification with good fit. Second, the BF deals only with the data at hand, and not with some imaginary samples that are not actually observed (as is the case with the frequentist approach). Third, the BF “provides evidence for and not only against
Regression Context
Suppose we have a regression model:
or, in familiar notation
Since
that is,
In this case, the prior of
Let us assume, in the interest of simplicity, the following “noninformative” or flat prior:
which is the limiting case of Equation 10 when
where
With the more general prior (9) we have:
where
and
which is the ridge-regression estimator. Interestingly, under the prior in Equation 10, it is no longer possible to integrate
where
where
where
Other Priors and Some Important Notes
As mentioned above, the main challenge for Bayesian hypothesis testing is that the results can be sensitive to the prior distributions of the parameters reflecting the hypotheses one is testing. Along with the priors in Equations 9 and 10, we present in this section other priors that can be used for Bayesian hypothesis testing.
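The sensitivity of a Bayes factor to the prior can be seen in a deliberately simplified setting. The sketch below does not use the regression priors of this article; it assumes i.i.d. normal data with known standard deviation, a point null H0: mu = 0, and an alternative H1: mu ~ N(0, tau^2). Under these assumptions the Bayes factor is the ratio of two normal densities for the sample mean and is available in closed form. The function name and the numerical inputs are hypothetical.

```python
import math


def bf01_normal_mean(ybar, n, sigma, tau):
    """Bayes factor for H0: mu = 0 against H1: mu ~ N(0, tau^2).

    Data are assumed i.i.d. N(mu, sigma^2) with sigma known, so that
    ybar ~ N(0, sigma^2/n) under H0 and ybar ~ N(0, sigma^2/n + tau^2) under H1.
    """
    s2 = sigma ** 2 / n                    # sampling variance of the mean
    z2 = ybar ** 2 / s2                    # squared z statistic
    shrink = tau ** 2 / (s2 + tau ** 2)    # share of variance due to the prior
    return math.sqrt(1.0 + tau ** 2 / s2) * math.exp(-0.5 * z2 * shrink)


if __name__ == "__main__":
    # Fixed data summary (z is about 2.2); only the prior scale tau changes.
    for tau in (0.1, 0.5, 1.0, 5.0, 50.0):
        bf01 = bf01_normal_mean(ybar=0.22, n=100, sigma=1.0, tau=tau)
        print(f"tau = {tau:5.1f}   BF01 = {bf01:8.3f}")
```

With the data held fixed, BF01 is below 1 for the smaller prior scales and rises well above 1 as tau grows, so the same data can appear to favor either hypothesis depending on how diffuse the prior under H1 is.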
Zellner and Siow (1980) Prior
Andraszewicz et al. (2015) recommended using the Jeffreys–Zellner–Siow priors as default priors for linear regression (Jeffreys, 1961, 1973; Liang et al., 2008; Zellner & Siow, 1980).
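One way to see what such a default prior implies is through the representation in Liang et al. (2008), where the Zellner–Siow prior is written as a mixture of g-priors with g ~ Inverse-Gamma(1/2, n/2). The sketch below assumes that representation, an intercept-only null model, and a larger model with p regressors and coefficient of determination R². The parameterization used in this article may differ, and the function names and the simple midpoint integration are hypothetical illustration choices rather than a recommended implementation.

```python
import math


def log_bf_g_prior(g, n, p, r2):
    """Log Bayes factor of a p-regressor model against the intercept-only
    model under a g-prior with fixed g (closed form in Liang et al., 2008)."""
    return (0.5 * (n - 1 - p) * math.log1p(g)
            - 0.5 * (n - 1) * math.log1p(g * (1.0 - r2)))


def log_inv_gamma_half(g, n):
    """Log density of g ~ Inverse-Gamma(1/2, n/2), the assumed mixing law."""
    return (0.5 * math.log(n / 2.0) - math.lgamma(0.5)
            - 1.5 * math.log(g) - n / (2.0 * g))


def bf_zellner_siow(n, p, r2, grid=20000):
    """Integrate the g-prior Bayes factor over g with a midpoint rule.

    The substitution g = t / (1 - t) maps (0, inf) onto (0, 1).
    """
    total = 0.0
    for i in range(grid):
        t = (i + 0.5) / grid
        g = t / (1.0 - t)
        log_jacobian = -2.0 * math.log1p(-t)   # dg/dt = 1 / (1 - t)^2
        total += math.exp(log_bf_g_prior(g, n, p, r2)
                          + log_inv_gamma_half(g, n) + log_jacobian)
    return total / grid


if __name__ == "__main__":
    # Hypothetical sample size, number of regressors, and R^2 values.
    for r2 in (0.0, 0.1, 0.3, 0.6):
        print(f"R^2 = {r2:.1f}   BF = {bf_zellner_siow(n=50, p=3, r2=r2):.4g}")
```

Because g is integrated out, no single value of g has to be chosen, which is the main practical appeal of this prior. The Bayes factor still depends on the assumed mixing distribution, so it remains a default rather than a prior-free answer.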
Zellner and Siow (1980) suggested the following prior:
and
This is a
To illustrate, suppose we can write the regression model as:
where
compared with
in the sense that at least one element of
To see how the BF works, it makes sense to consider the following POR:
The reader should notice that the POR is nothing but a ratio of two marginal likelihoods (and it is, therefore, a BF if the prior odds ratio is 1). The numerator imposes the restriction in
For “large” values of measure
Since
where
where
Importantly, we note that the recommendation of Andraszewicz et al. (2015) should not be taken at face value. We examined this issue using an artificial data set with

Sample Distributions of Zellner–Siow (1980)
When
We also report the marginal posterior densities of the parameters obtained under this prior (Figure 2). We apply MCMC techniques to provide access to these posteriors when

Marginal Posteriors of

Sample Distributions of
As shown in Figure 2, some marginal posteriors (those corresponding to
Zellner’s g-Prior
Another prior that can be used is Zellner’s (1986) g-prior, which rests on the ordinary least squares (OLS) result that the covariance matrix of the OLS estimator is
for some scalar
where
and
where
An Alternative Multivariate Cauchy Prior
Another option is the following multivariate Cauchy prior:
where the prior location parameter vector is
Although we cannot recommend universal adoption of this prior, it does appear that it behaves fairly well in most situations.
To validate, we again use the regression example in Section 4.1, which incidentally has a level of collinearity. We examined two cases: (a)

Sample Distributions of
From the results in Panels a and b of Figure 4 (corresponding respectively to Cases a and b above), we saw that support for a is quite limited when
In general, the multivariate Cauchy prior we propose here is remarkably “robust” in the sense that posterior inferences are approximately the same, irrespective of extreme values for
Practical Recommendations and Concluding Remarks
As presented in Andraszewicz et al. (2015, p. 540): “One may argue that in many situations, the data will pass the interocular traumatic test (i.e., when the pattern in the data is so evident that the conclusion hits you straight between the eyes; Edwards et al., 1963), and the results will be clear no matter what statistical paradigm is being used. Luckily, this is true; however, some data fail the interocular traumatic test and the results may indeed depend on the statistical paradigm that is used. In such cases, it seems worthwhile to not just base one’s statistical inference on frequentist methods alone but also rely on Bayesian techniques.”
We think one needs to be careful about this advice as it does not favor explicit articulation of one’s prior beliefs. Rarely, if ever, does one embark on research without prior knowledge. All papers cite previous literature, so, in practice, the researcher always has prior information. The question is how to incorporate such prior information into the analysis (a problem solved by Bayes’ theorem) and whether the posterior results are sensitive to reasonable deviations from the prior beliefs (that represent, e.g., beliefs that other researchers may reasonably entertain).
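A posterior sensitivity check of this kind can be sketched for a single coefficient. The example below is a minimal illustration under stated assumptions, not the procedure used in this article: it treats the likelihood for the coefficient as approximately normal, summarized by an estimate and its standard error, and combines it with normal priors by precision weighting. The function name, the estimate, and the three priors are all hypothetical.

```python
import math


def normal_posterior(estimate, std_error, prior_mean, prior_sd):
    """Combine a N(prior_mean, prior_sd^2) prior with an approximately normal
    likelihood summarized by `estimate` and `std_error` (precision weighting)."""
    w_prior = 1.0 / prior_sd ** 2
    w_data = 1.0 / std_error ** 2
    post_var = 1.0 / (w_prior + w_data)
    post_mean = post_var * (w_prior * prior_mean + w_data * estimate)
    return post_mean, math.sqrt(post_var)


if __name__ == "__main__":
    # Hypothetical coefficient estimate and standard error from the data.
    estimate, std_error = 0.40, 0.15
    # Hypothetical priors that different researchers might entertain.
    priors = {
        "skeptical": (0.0, 0.10),
        "literature-based": (0.30, 0.20),
        "diffuse": (0.0, 10.0),
    }
    for label, (m0, s0) in priors.items():
        mean, sd = normal_posterior(estimate, std_error, m0, s0)
        print(f"{label:>16}: posterior mean = {mean:.3f}, sd = {sd:.3f}")
```

If the posterior summaries are similar across the priors that other researchers may reasonably hold, the conclusion is robust; if they differ materially, the prior needs to be reported and defended explicitly.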
In this instance, looking at marginal posterior densities is quite informative and one is led to question the “validity” of prior information. Such priors are unlikely to hold up to the scrutiny of other researchers, or to be shared by them. Therefore, it is not necessary to look at both frequentist and Bayesian results; one can just look at Bayesian results and perform posterior sensitivity analysis with respect to the prior. Researchers should also train themselves to look at marginal posterior densities rather than relying on summary measures like the posterior mean and the posterior standard deviation. A frequently asked question is whether one can simply take the ratio of the posterior mean to the posterior standard deviation and assign it the role of a “t statistic.” With a conjugate prior like Equation 10, one can certainly do so, as the marginal posterior of each
How large a sample should be depends on the parameters of the model itself. For the toy model
To sum up, and as Andraszewicz et al. (2015, p. 540) write: “So should management scientists become Bayesians? Given the lack of user-friendly software and course material.” The “lack of user-friendly software” is, of course, challenging, although programs like WinBUGS can be used to perform posterior analyses and prior sensitivity checks for most models used in management, marketing, and hospitality and tourism research. On the other hand, such a lack is desirable, as it discourages unwarranted reliance on “default priors” that may not be compatible with common-sense beliefs about certain parameters. Some studies also recommend comparing p-values with the Bayesian approach to hypothesis testing. Again, we believe this recommendation is ill-advised. The two approaches measure quite different things, and it is not easy to convince applied researchers that p-values and the Bayesian approach cannot be compared. For more sound advice, we refer the reader to Wasserstein and Lazar (2016).
Supplemental Material
Supplemental material for “Bayesian Hypothesis Testing for Hospitality and Tourism Research” by A. George Assaf and Mike Tsionas, Journal of Hospitality & Tourism Research.
Supplemental material for this article is available online.
