Abstract
To elicit an informative prior distribution for a normal linear model or a gamma generalized linear model (GLM), expert opinion must be quantified about both the regression coefficients and the extra parameters of these models. The latter task has attracted comparatively little attention. In this article, we introduce two elicitation methods that aim to complete the prior structure of the normal and gamma GLMs. First, we develop a method of assessing a conjugate prior distribution for the error variance in normal linear models. The method quantifies an expert's opinions through assessments of a median and conditional medians. Second, we propose a novel method for eliciting a lognormal prior distribution for the scale parameter of gamma GLMs. Given the mean value of a gamma distributed response variable, the method is based on conditional quartile assessments. It can also be used to quantify an expert's opinion about the prior distribution for the shape parameter of any gamma random variable, if the mean of the distribution has been elicited or is assumed to be known. In the context of GLMs, the mean value is determined by the regression coefficients. Interactive graphics is the medium through which assessments for the two proposed methods are elicited. Examples illustrating use of the methods are given. Computer programs that implement both methods are available.
Introduction
Methods for quantifying expert opinion about a generalized linear model (GLM) have been proposed (Bedrick et al., 1996; Chen and Ibrahim, 2003; Garthwaite et al., 2013). These methods all focus primarily on the task of quantifying opinion about regression coefficients. For some GLMs, such as logistic regression, this determines a complete prior distribution. But with some other common GLMs, such as the normal linear model and gamma GLMs, prior opinion about an extra parameter must also be quantified in order to complete the prior distribution for all model parameters.
A normal linear model is, of course, a form of GLM and the task of quantifying opinion about normal linear models has also been addressed by Kadane et al. (1980) and Garthwaite and Dickey (1988, 1992), amongst others. Some of these elicitation procedures contain a method of assessing opinion about a linear model's error variance, but the methods have drawbacks. For example, the method of Kadane et al. (1980) requires assessments of 0.9375 quantiles of predictive distributions in the part of their procedure that relates to the error variance. Assessing quantiles well into the tails of a distribution is a difficult task that people perform poorly (Alpert and Raiffa, 1969). If a different method were used to quantify opinion about the error variance, then that part of the procedure of Kadane et al. (1980) could be dropped. Garthwaite and Dickey (1988) separate the task of quantifying opinion about regression coefficients from that of quantifying opinion about the error variance. This is potentially beneficial; decomposing a complex assessment problem into a number of smaller problems is recommended (Hogarth, 1975). However, the number of assessments that Garthwaite and Dickey (1988) elicit from the expert is the minimum number needed to determine the hyperparameters of the prior distribution for the error variance. A better approach is to elicit enough assessments to give several estimates of the hyperparameters and then to reconcile these estimates in some way (Kadane and Wolfson, 1998). The same criticism applies to methods used to quantify experts’ opinion about the variance of a multivariate normal distribution (Al-Awadhi and Garthwaite, 1998, 2001).
The first task addressed in this article is to assess a prior distribution for the error variance. The method of Garthwaite and Dickey (1988) is extended so as to obtain several estimates of the hyperparameter that is most difficult to assess (a degrees of freedom parameter). Reconciliation of these estimates (perhaps with further input from the expert) yields an overall estimate of it. Although designed to quantify opinion about an error variance, the proposed method may also prove useful in contexts where observations can be paired and their difference is normally distributed with a mean of 0. This is illustrated in an example.
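The link between median assessments and the two hyperparameters can be sketched with the standard conjugate calculation for a normal variance. The notation below (ν for the degrees of freedom, ω for the prior scale, d for the difference between two responses with the same expected value, m_0 and m_n for the unconditional and conditional medians) is assumed here for illustration and need not match the parameterization used in Section 2.

```latex
% Assumed notation, for illustration only: nu = degrees of freedom, omega = prior scale.
\begin{align*}
d = y_1 - y_2 \mid \sigma^2 &\sim N(0,\, 2\sigma^2), \qquad
  \nu\omega/\sigma^2 \sim \chi^2_\nu
  \;\Longrightarrow\; d/\sqrt{2\omega} \sim t_\nu, \\
m_0 = \operatorname{median}|d| &= \sqrt{2\omega}\; t_{\nu;\,0.75}, \\
\text{after hypothetical differences } d_1,\dots,d_n:\quad
  \nu_n &= \nu + n, \qquad
  \nu_n\omega_n = \nu\omega + \tfrac{1}{2}\sum_{i=1}^{n} d_i^2, \\
m_n = \operatorname{median}\bigl(|d| \,\big|\, d_1,\dots,d_n\bigr)
  &= \sqrt{2\omega_n}\; t_{\nu+n;\,0.75}.
\end{align*}
```

Here t_{ν;0.75} denotes the upper quartile of a t distribution on ν degrees of freedom. Under these assumptions, substituting ω from the expression for m_0 into the expression for m_n leaves an equation in ν alone, so each hypothetical data set paired with a conditional median assessment gives one estimate of the degrees of freedom.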
The second task addressed in this article is to assess prior distributions for the shape parameter of a gamma distribution and the scale parameter of gamma GLMs. It is well known that the scale parameter of a gamma GLM, which is the reciprocal of the dispersion parameter, is also the shape parameter of the gamma distribution. Bayesian methods have been developed for analysing data to estimate these parameters.
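The relationship between the scale, dispersion and shape parameters can be made explicit by writing the gamma density in terms of its mean. The symbols μ (mean), ν (shape, the GLM scale parameter) and φ (dispersion) are assumed here for illustration; the article's own notation appears in Section 3.

```latex
% Assumed symbols: mu = mean, nu = shape (GLM scale parameter), phi = dispersion.
\begin{align*}
f(y \mid \mu, \nu) &= \frac{1}{\Gamma(\nu)}
   \left(\frac{\nu}{\mu}\right)^{\nu} y^{\nu-1}
   \exp\!\left(-\frac{\nu y}{\mu}\right), \qquad y > 0, \\
\mathrm{E}(y) &= \mu, \qquad
\mathrm{Var}(y) = \frac{\mu^{2}}{\nu} = \phi\,\mu^{2}, \qquad
\phi = \frac{1}{\nu}.
\end{align*}
```

With the mean held fixed, a larger ν concentrates the distribution around μ, which is why opinion about the spread of the response carries information about ν.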
Miller (1980) proposed a general conjugate class of priors for the two parameters of the gamma distribution, but he gave no method of eliciting its hyperparameters. Sweeting (1981) made some suggestions for the Bayesian estimation of scale parameters in exponential families. The problem of unknown scale parameters in GLMs was examined by West (1985), who discussed general ideas concerning scale parameters and variance functions in non-normal models, including gamma GLMs (see also West et al., 1985). However, there does not seem to be a good method of eliciting a prior distribution for such parameters. Ibrahim and Laud (1991) suggested a Jeffreys prior for the regression coefficients and an independent marginal informative prior on the scale parameter of the gamma GLM, but they did not suggest any family of distributions for this informative prior. The method of Bedrick et al. (1996), which is regarded as the first method for eliciting informative prior distributions for GLMs, assumed the scale parameter to be known and elicited priors only for the regression coefficients. Chen and Ibrahim (2003) proposed a novel class of conjugate priors for GLMs and discussed issues and strategies for eliciting these priors. Their proposed prior structure involves the dispersion parameter as well, but no explicit elicitation method was introduced for this parameter.
Hence, although prior distributions for the shape parameter of a gamma distribution and the scale parameter of gamma GLMs have been proposed in the literature, no prior elicitation method for these parameters has been suggested. To fill this gap, we propose a new method for eliciting lognormal prior distributions for such parameters. It is based on conditional quartile assessments, where the condition is that the mean of the gamma distribution is known or has already been elicited. In the context of GLMs, this mean is given by assessments of the linear predictor.
The two methods proposed in this article are implemented in interactive graphics programmes that could be used as add-ons to any method of quantifying opinion about the regression coefficients of a normal linear model or a gamma GLM. Procedures for quantifying opinion about regression coefficients form an important area of subjective probability assessment (reviews are given in Garthwaite et al., 2005 and O'Hagan et al., 2006), but such procedures are outside the scope of this article, as assessing regression coefficients is separate from the tasks addressed here.
The article is organized as follows. In Section 2 we extend the method of Garthwaite and Dickey (1988) for eliciting the variance of random errors in normal linear models. In Section 3 a novel method for eliciting a lognormal prior distribution for the scale parameter in gamma GLMs is proposed. Implementations of the two methods are described in Section 4 and their use is illustrated through examples in Section 5. Concluding comments are given in Section 6.
Eliciting a prior distribution for the error variance in normal linear models
We suppose a dependent variable
In the method of Garthwaite and Dickey (1988), the expert is asked to suppose that two responses,
As noted in the introduction, using just two assessments to determine two hyperparameters is undesirable. Here we extend this method so that a number of different hypothetical data sets are presented to the expert, the expert repeating the above task after each one. This yields several estimates of
Each hypothetical data set consists of a number of pairs of observations, where the two observations in a pair are taken at the same values of the predictor variables. Focusing on any one hypothetical set, suppose it consists of
Clearly, from (2.1) and (2.2), given
As in (2.8) and (2.9), integrating
As
A lower bound,
Each hypothetical data set yields a separate estimate of
In this section, we propose a novel method for eliciting a lognormal prior distribution for the scale parameter of a gamma GLM. The method is a viable means of eliciting the shape parameter of any gamma distribution once the distribution's mean has been elicited (or the mean is assumed to be known).
Suppose a GLM has the form,
For the gamma GLM in (3.1), with any monotonic increasing link function
We base our method on a gamma distribution with
We require a meaningful strictly monotonic function in
We believe that the expert can quantify her opinion about quartiles most easily by using the bisection method; see, for example, Pratt et al. (1995). As shown in Figure 1, the lower quartile of

So, we choose to question the expert about the lower quartile,
The expert will be asked to assess three quartiles of her prior distribution for
John has this disorder and will spend a time in hospital. Suppose he is fortunate and does not spend as long as most people in hospital. Specifically, suppose exactly 25% of patients with John's disorder spend a shorter time in hospital than John. Give your median assessment for the length of time,
The median and quartiles divide the range of
Let
The elicitation methods proposed in Sections 2 and 3 have each been implemented as an interactive graphical procedure. We describe these implementations in turn.
Quantifying opinion about the error variance in normal linear models
So as to frame questions in a way that is meaningful to the expert, first the expert is asked whether the normal linear model relates to an experimental setting, observations on people, or observations on items. In the context of an experiment, a set of values chosen for the predictor variables is referred to as a ‘design point’, so ‘two responses at the same design point’ form each pair of observations. With observations on people (items), the observations are on ‘two people (items) whose covariate values give them identical characteristics (features).’
In a dialogue box, the expert is then asked to assume that two independent experiments have been conducted at the same design point. She assesses her median value of the absolute difference between the observed responses in these two virtual experiments. This assessed median is
Her remaining assessments are conditional assessments after being shown hypothetical data sets. The choice of the conditioning values in these data sets is an important issue. Garthwaite and Dickey (1988) note that hypothetical data should (a) be moderately different from the expert's initial beliefs, so that the data change the expert's beliefs by a measurable amount, but (b) should not be so different that the expert dismisses the data as being false and misleading. The hypothetical data set used by Garthwaite and Dickey (1988) consists of just one hypothetical difference, which they set at

Our other hypothetical data sets contain more items. This gives these data sets greater weight and means that they should have greater effect on the expert's opinions, without the expert finding them unbelievable. In our past experience of using the method of Garthwaite and Dickey (1988), occasionally an expert has said that a single datum is too inconsequential to have any impact on his beliefs, which is also a problem that larger data sets should resolve. Kadane et al. (1980) present an expert with a sequence of hypothetical data sets and suggest that an expert should not be asked to forget any hypothetical data after it has been presented. They argue that ‘…asking the experimenter to forget would impose too great a psychological burden’. (Kadane et al., 1980, p. 849). In this spirit, we make our hypothetical data sets steadily bigger, so that each contains all the hypothetical data in its predecessors.
In our implementation we generate five hypothetical data sets. The

After a suitable hypothetical data set has been generated, it is presented to the expert using the graphical interface. Figure 3 is an example in which a set of 16 hypothetical data is displayed (the arrows pointing up). Their median is the arrow pointing down and is a summary of the data that the expert may find helpful. The expert's original median assessment,
In conjunction with
In the context of a GLM, it is assumed that one design point has been specified as a reference design point and
Next, the expert is asked to assess a median value
The expert is then asked to assess her uncertainty about
The initially assessed quartiles

Under our prior model for

For statistical coherence of the assumed normal distribution of ln
The top group of tabulated values in the right-hand side panel of Figure 4 gives the values of the three suggested coherent quartiles
The lower graph in Figure 4 represents the elicited distribution of the lower quartile
As further feedback, the expert should record her assessments of the median, lower quartile and upper quartile of Q. Then she should change the median in the upper graph so that it equals her lower quartile assessment. Both tails of the pdf for
This feedback may indicate that the assessed quartiles do not correspond to the expert's opinions, in which case she should revise them. Some users of the software have found it easier to relate to this feedback, rather than Q, and have formulated their initial quartile assessments by examining the pdf of
When the expert is happy with the quartile values and the corresponding pdf graphs as a reasonable representation of her opinions, she clicks ‘Done’ and obtains the two corresponding hyperparameters
The first example uses the first elicitation method to quantify opinion about the variance of a normal distribution. While that elicitation method requires equations (2.3) and (2.5) to hold, the example illustrates that the method's usefulness is not restricted to quantifying opinion about experimental error. Application of the second elicitation method is illustrated in Section 5.2, where it is used to quantify opinion about the shape parameter of a gamma distribution.
Measurements along Pine Island Glacier
Pine Island Glacier is a large ice-stream located on the West Antarctic Ice Sheet in Antarctica. It is the fastest shrinking glacier on Earth and, in common with other glaciers that drain into the Amundsen Sea, its progress is accelerating and it is thinning rapidly. These observations have been attributed to the regional oceanography whereby the heat contained in deep water acts to melt the underside of an ice shelf, making it weaker and more likely to crack and fall into the sea.
In the present example, the expert is a polar oceanographer who has done field work on Pine Island Glacier and is in the early stages of planning further work there. He intends to measure water characteristics (temperature, salinity and pressure) at regularly spaced points on a 40 km transect along an ice shelf, having done similar work along the boundary between the sea and the Breidamerkurjökul glacier in Iceland. The spacing between the points where data are collected should be kept as large as possible while meeting scientific needs—polar expeditions must meet multiple objectives, so there is time pressure on all experiments. Here the expert quantifies his opinion about the variance of water temperature along the planned transect; this variance influences the appropriate choice of spacing between data-collection points.
Covariates that affect water temperature include depth, meltwater content, time of year and recent weather. The expert assumed that these covariates had fixed values when quantifying his opinion and, in particular, focused on temperatures at a depth of 60 meters in summer, when there had been no recent surge in meltwater, and no recent heavy precipitation. The difference in temperature between two points depends on the distance between them and it was decided to consider points that are two kilometers apart. Thus we consider pairs of points that are two kilometers apart and we let
Using the interactive software, the expert gave
A screen-shot for the last of these assessments is shown in Figure 3.
A table was then displayed that showed the expert the values of the degrees of freedom parameter that each of his assessments had implied. The value from his fourth assessment was above 20—out of line with his other assessments. The figure on which he had given his fourth assessment was re-displayed. One hypothetical datum had a large value of
Water-table depth in Wiltshire, UK
Ecologists are often interested in determining how environmental factors influence plant growth, competition and diversity, not least in future scenarios. Soil water availability is a major environmental driver in this respect, with plants demonstrating a high degree of sensitivity to it (see for example Araya et al., 2011). Soil water availability can be measured in a number of ways, the most common one being water-table depth (WTD). It is highly anticipated that any change in the availability of soil water (for example due to global warming) will have significant impact on water-dependent ecosystems such as wetlands.
A team of plant ecologists has recently been studying the levels of the WTD in wet grassland during the growing season (March–September) in Wiltshire, UK. Although data on the current levels of the WTD are now available for this region, experts’ opinions about future levels in 2050 need to be quantified. In this example, a plant ecologist (Dr. Yoseph Araya, University of London, UK) used our proposed method for the gamma distribution to quantify his opinions about the WTD in Wiltshire, UK, during the growing season of 2050. WTD is a continuous positive-valued random variable whose values in Wiltshire currently range from zero, at flooding time, to a maximum that may reach 80 cm at the peak of summer. Over the growing season of 2010, the WTD distribution was positively skewed with a mean of approximately 45 cm. WTD is often assumed to have a gamma distribution; see for example Yeh and Eltahir (2005).
Here we assume that the WTD in Wiltshire during the growing season of 2050 follows a two-parameter gamma distribution as in (3.2). The WTD is affected by climate variables, particularly rainfall, temperature and wind, whose future values, in turn, depend on the quantities of greenhouse gases that are emitted in years to come. The expert gave assessments of WTD for one ‘Representative Concentration Pathway’, called RCP6. (Under RCP6, greenhouse gas emissions peak around 2080 and then decline.) He assessed the mean value of WTD,
Suppose that there is a time point in the growing season of 2050 where the depth of water-table in Wiltshire records a value, say Q, that is greater than exactly 25% of all readings of WTD during this season. Given that the true mean value of WTD is
Responding to this question, the expert assessed his median
He was then asked to assess his lower and upper quartile of
Concluding comments
In this article, two elicitation methods have been proposed. One was designed for assessing a conjugate inverse chi-squared prior distribution for the error variance in normal linear models, although its usefulness is not limited to that application. The method quantifies an expert's opinions through assessments of a median and conditional medians of the absolute difference between two observations of the response variable that have the same expected value. A number of sets of hypothetical data are used in order to obtain several estimates of the hyperparameter that is most difficult to assess, namely, the degrees of freedom parameter of the inverse chi-squared distribution. Reconciliation of these estimates, using the geometric mean, yields an overall estimate of the number of degrees of freedom. The second hyperparameter of the inverse chi-squared prior distribution is also determined from the same assessments.
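The reconciliation step can be sketched in a few lines. The function name `reconcile_df` and the numerical estimates are hypothetical and are not taken from PEGS or from the examples in Section 5; only the use of the geometric mean comes from the text.

```python
from statistics import geometric_mean, mean


def reconcile_df(estimates):
    """Combine several degrees-of-freedom estimates by their geometric mean.

    `estimates` holds one positive estimate per hypothetical data set.
    """
    return geometric_mean(estimates)


# Hypothetical estimates, one per hypothetical data set (illustrative values only).
df_estimates = [6.0, 8.5, 5.2, 21.0, 7.4]

overall = reconcile_df(df_estimates)
print(f"geometric mean:  {overall:.2f}")
print(f"arithmetic mean: {mean(df_estimates):.2f}")
```

A single estimate that is out of line with the others pulls the geometric mean upwards less than it pulls the arithmetic mean, which seems a sensible property when one assessment may have been distorted by an unusual hypothetical datum.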
The other method elicits a lognormal prior distribution for the scale parameter of a gamma GLM, or the shape parameter of any gamma distribution. The method depends only on quantifying an expert's opinion about the lower quartile of a gamma distributed random variable. This lower quartile is itself a random variable, for which the expert assesses a median value as a point estimate, and an interquartile range. The lower quartile is a monotonic increasing function of the scale parameter so the expert's assessments can be transformed to quartiles of the lognormal distribution, and hence to the hyperparameters of the lognormal distribution. Examples of questions that can be addressed to the expert were given.
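The transformation from quartiles of the lower quartile Q to lognormal hyperparameters can be sketched as follows. This is a minimal illustration using only the Python standard library, not the PEGS implementation: the function names, the series used for the gamma distribution function, the search bracket for the shape parameter and all numerical assessments are assumptions. In particular, the spread of the lognormal is taken from the interquartile range on the log scale, which is one simple choice and not necessarily the article's way of making the three quartiles coherent.

```python
import math
from statistics import NormalDist


def gamma_cdf(x, shape, mean):
    """P(Y <= x) for a gamma variable with the given shape and mean.

    Uses the power series for the regularised lower incomplete gamma
    function; adequate for the moderate arguments arising here.
    """
    if x <= 0:
        return 0.0
    z = x * shape / mean              # rescale to a unit-scale gamma
    term = 1.0 / shape
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (shape + n)
        total += term
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))


def shape_from_lower_quartile(q, mean, lo=1e-3, hi=1e3):
    """Shape whose gamma distribution (with the given mean) has lower quartile q.

    Relies on the lower quartile increasing with the shape when the mean is
    fixed, so bisection on the log scale applies. The bracket is an assumption.
    """
    if not 0 < q < mean:
        raise ValueError("the lower quartile must lie between 0 and the mean")
    for _ in range(80):
        mid = math.sqrt(lo * hi)
        if gamma_cdf(q, mid, mean) > 0.25:
            lo = mid                  # quartile for this shape is below q
        else:
            hi = mid
    return math.sqrt(lo * hi)


def lognormal_from_quartiles(lower, median, upper):
    """Hyperparameters (mu, sigma) of ln(shape) ~ N(mu, sigma^2)."""
    z75 = NormalDist().inv_cdf(0.75)
    mu = math.log(median)
    sigma = (math.log(upper) - math.log(lower)) / (2 * z75)
    return mu, sigma


# Hypothetical assessments: a mean of 45 and quartiles of Q of 30, 33 and 36.
mean_y = 45.0
shape_quartiles = [shape_from_lower_quartile(q, mean_y) for q in (30.0, 33.0, 36.0)]
mu, sigma = lognormal_from_quartiles(*shape_quartiles)
print("quartiles of the shape parameter:", [round(s, 3) for s in shape_quartiles])
print("lognormal hyperparameters:", round(mu, 3), round(sigma, 3))
```

Because the mapping from Q to the shape parameter is monotonic increasing, quartiles of Q map directly to the corresponding quartiles of the shape parameter, and no density transformation is needed.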
The use of interactive graphical software greatly facilitates the tasks that the expert must perform in both these elicitation methods. Computer programmes that implement the methods form part of Prior Elicitation Graphical Software (PEGS) that is freely available and can be downloaded from http://statistics.open.ac.uk/elicitation. PEGS-Normal can be used as an add-on to any other elicitation software for normal linear models. PEGS-Gamma is a stand-alone version that can be used to quantify opinion about the shape parameter of a gamma distribution with a known mean. An important component of PEGS is PEGS-GLM, which is interactive graphical software for quantifying opinion about a generalized linear or piecewise-linear model (Garthwaite et al., 2013). PEGS-GLM now incorporates versions of PEGS-Normal and PEGS-Gamma. Thus, quantifying opinion about error variance can be a natural part of the elicitation process for ordinary linear regression models (as it should be), and assessing the scale parameter
Expert opinion is most commonly quantified when data are absent. For example, Garthwaite et al. (2008) report work where PEGS was used in an NHS study aimed at improving the provision of treatment for bowel cancer in England and Wales. In the complex model that was constructed, there were a small number of parameters for which no data were available and the judgements of clinical experts were quantified to fill these gaps. At the same time, when sample data are available, an important question is whether quantifying expert opinion through our methods is still worthwhile. That is, will a better posterior distribution be obtained through using an expert's prior distribution rather than a non-informative prior distribution? Experiments are needed to examine this question and we hope the availability of software will encourage such experiments.
Acknowledgements
The work reported here was supported by a studentship from The Open University, UK. We are very grateful to Dr. Mark Brandon (a polar oceanographer and a reader in the Department of Environment, Earth and Ecosystems, The Open University, UK) whose opinion was quantified in the polar-ice example, and to Dr. Yoseph Araya (a plant ecologist and a lecturer in Environmental Geography at Birkbeck College, University of London, UK) whose opinion was quantified in the water-table depth example. We must also thank a referee for helpful comments that greatly improved this article.
