Abstract
A model that extends the Rasch model and the Partial Credit Model to account for subject-specific uncertainty when responding to items is proposed. It is demonstrated that ignoring the subject-specific uncertainty may yield biased estimates of model parameters. In the extended version of the model, uncertainty and the underlying trait are linked to explanatory variables. The parameterization makes it possible to identify subgroups that differ in uncertainty and the underlying trait. The modeling approach is illustrated using data on the confidence of citizens in public institutions.
Introduction
Individual-specific tendencies to respond to items irrespective of content can affect the reliability and validity of scale scores. In particular, response styles and their impact on reliability have been thoroughly investigated. Response styles include a tendency to middle or extreme categories, a tendency to agree with items regardless of content, or a tendency to respond to items carelessly. An overview that includes more response styles was given by Van Vaerenbergh and Thomas (2013).
Various methods for investigating response styles in latent trait theory have been proposed. One method uses multi-trait models, which assume that several distinct traits influence category selection, with one or more of the traits representing response styles; see, for example, Bolt and Johnson (2009), Johnson and Bolt (2010), Bolt and Newton (2011), Wetzel and Carstensen (2017), and Falk and Cai (2016). Johnson (2003) considered a cumulative type model for extreme response styles, Wetzel and Carstensen (2017) and Plieninger (2016) proposed partial credit models (PCMs) that account for specific response styles, and Jin and Wang (2014) and Tutz et al. (2018) extended the PCM to accommodate the extreme response style. An alternative strategy for measuring response style is the use of finite mixtures, which was introduced by Rost et al. (1996). It is assumed that the observed response is a mixture of a finite number of latent responses; that is, the whole population can be divided into disjoint latent classes. After classes have been identified, it is investigated whether item characteristics differ between classes, potentially revealing differing response styles; see, for example, Eid and Rauber (2000), Gollwitzer et al. (2005), Maij-de Meij et al. (2008), Moors (2010), and Van Rosmalen et al. (2010). An instructive overview of mixture-distribution and HYBRID Rasch models was given by von Davier and Yamamoto (2007). More recently, tree-based methods to investigate response styles have been proposed; see Böckenholt (2017) and Böckenholt and Meiser (2017).
This article investigates a specific response behavior that does not amount to a response style in the traditional sense, although it shares similarities with the noncontingent response style (NCR), which is found if persons have a tendency to respond to items carelessly, randomly, or nonpurposefully (Baumgartner & Steenkamp, 2001; Van Vaerenbergh & Thomas, 2013). The response behavior that is considered is characterized by varying degrees of uncertainty. This means that respondents may respond in a deliberate way, knowing exactly which category they prefer, or suffer from a high degree of uncertainty, responding nonpurposefully. Traditionally, the term “response style” is used to describe an individual’s tendency to choose a certain kind of response category, for example, extreme or middle categories, irrespective of item content and the individual’s trait value. Modeling of potential uncertainty is somewhat different. If a person is very certain about the category he or she prefers, the person will have a very high probability of choosing a specific category for any given item; however, the chosen categories will differ across items if the item parameters differ across items. Therefore, the person will not prefer a specific kind of response category, and thus the behavior is not a response style in the traditional sense. Although one could consider it a response style in a wider sense, because the person has a specific way of responding to items that is not driven by content, the authors will not refer to it as a response style, to avoid confusion with the term as it is commonly used.
In recent years, the inclusion of uncertainty in ordinal regression has been investigated by Piccolo (2003), Iannario and Piccolo (2016), Gottard et al. (2016), Tutz et al. (2017), and Simone and Tutz (2018); a comprehensive overview has been given by Piccolo and Simone (2019). The basic assumption behind the so-called CUB models (CUB stands for Combination of a Uniform and a shifted Binomial distribution) is that the choice of a response category is determined by a mixture of a distinct preference and uncertainty. The latter is represented by a uniform distribution over the response categories. However, CUB models are designed as regression models without assuming repeated measurements: they are not latent trait models, uncertainty is linked to explanatory variables, and they do not account for subject-specific response styles.
The modeling strategy proposed in the following is the explicit modeling of uncertainty by introducing subject-specific parameters that are consistent throughout items and might be determined by external explanatory variables. The proposed model explicitly aims at modeling the heterogeneity in the population. The authors consider in detail extensions of the PCM (Masters, 1982), a model for polytomous data that reduces to the binary Rasch model when applied to dichotomous data.
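As a reminder of how the PCM relates to the binary Rasch model, the following Python sketch computes PCM category probabilities in the standard form of Masters (1982) and can be used to check that a single threshold yields the Rasch model. The names `theta` (person parameter) and `thresholds` (item thresholds) are notation assumed for this sketch, and the numeric values are arbitrary.

```python
import math


def pcm_probs(theta, thresholds):
    """Category probabilities P(Y = 0), ..., P(Y = k) under the PCM.

    thresholds holds the k item thresholds delta_1, ..., delta_k.
    """
    # Cumulative sums of (theta - delta_s); category 0 has an empty sum.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    top = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(c - top) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def rasch_prob(theta, delta):
    """P(Y = 1) under the binary Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


# With one threshold the PCM has two categories and equals the Rasch model.
binary = pcm_probs(theta=0.3, thresholds=[-0.5])

# With several thresholds, adjacent categories follow a local Rasch model:
# log(P(Y = r) / P(Y = r - 1)) = theta - delta_r.
polytomous = pcm_probs(theta=0.3, thresholds=[-1.0, 0.0, 1.2])
adjacent_log_odds = [
    math.log(polytomous[r] / polytomous[r - 1]) for r in range(1, 4)
]
```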
Subject-specific factors for binary models have been considered before. The approach proposed by Reise (2000) has been critically discussed by Conijn et al. (2011), who investigated in particular problems with the representation as a multilevel logistic regression model. More recently, Ferrando (2016) proposed a normal-ogive model that contains item and person discrimination parameters. The presence of two factors necessitates difficult estimation procedures. Therefore, Ferrando (2016) proposed a two-step approach, which works only under rather specific assumptions. The model proposed here differs from the models proposed by Ferrando and others in several respects. The authors consider extensions of the PCM, not the graded response model. Moreover, the authors include explanatory variables and use marginal estimation methods that allow the slope parameters to be correlated with content-related parameters.
In section “Unobserved Heterogeneity and the Occurrence of Invalid Parameters,” it is demonstrated that ignoring heterogeneity in variance over subpopulations may yield strongly misleading parameter estimates. In section “Heterogeneity in Uncertainty,” models are proposed that account for the heterogeneity by including subject-specific parameters. After investigating the properties of the parameter estimates in a simulation study (section “Simulation Study”), an application is given. In section “Alternative Item Response Models,” it is briefly shown that multiplicative effects function quite differently in different models.
Unobserved Heterogeneity and the Occurrence of Invalid Parameters
In the following, a specific form of unobserved heterogeneity that can cause severe problems in latent trait models is considered. It is of interest because it can be seen as one of the sources of uncertainty and a motivation for the model that is proposed. For simplicity, the authors consider the binary Rasch model although the same problems are found in latent trait models with more than two response categories. The binary Rasch model assumes that the response
In achievement tests,
The ability or attitude is determined by the continuous random variable
The link between the unobserved variable
If one assumes that the noise variable
In this representation parameters are not identifiable; therefore, constraints on the parameters are needed. Typically, one uses the scale constraint
The derivation uses implicitly that the dispersion parameter
This entails peculiar effects if one wants to compare parameters. Actually, one has two Rasch models, one that holds in the subpopulation
with
That means, if
It should be noted that the Rasch model does not hold in the total population. However, it holds in each subpopulation and can be legitimately fitted within subpopulations. But parameters (and parameter estimates) cannot be compared because parameters in each subpopulation are scaled using the scale constraint
Even if one does not want to compare parameter estimates, it is obvious that one runs into problems if one ignores heterogeneity and fits a simple Rasch model to the total population. The heterogeneity of the person parameters is less severe because although the persons come from different subpopulations, each person has his or her own parameter. However, estimates of item parameters tend to be biased because persons from different subpopulations respond to items with different difficulty parameters. For males, the difficulties are
Similar problems with unobserved heterogeneity have been found for binary and ordinal regression models, and Allison (1999) showed that misleading parameter estimates can occur if one fits a binary logit model in separate groups. Some methods to correct parameter estimates in regression were considered by Williams (2009), Mood (2010), Karlson et al. (2012), Breen et al. (2014), and Tutz (2018). In item response theory, mixture-distribution approaches can be used to this end; for an overview, see von Davier and Rost (2016).
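The scaling problem discussed in this section can be illustrated numerically. The following minimal Python sketch assumes a latent variable of the form theta - delta + sigma * noise with standard logistic noise, and a response of 1 whenever the latent variable is positive. The function name, the symbols, and the numeric values are illustrative assumptions rather than the article's notation.

```python
import math


def response_prob(theta, delta, sigma=1.0):
    """P(Y = 1) when the latent variable is theta - delta + sigma * noise,
    the noise is standard logistic, and Y = 1 if the latent variable > 0."""
    return 1.0 / (1.0 + math.exp(-(theta - delta) / sigma))


# Same person parameter and item difficulty on the latent scale ...
theta, delta = 1.0, -0.5

# ... but the two subpopulations differ in the dispersion of the noise.
p_group_a = response_prob(theta, delta, sigma=1.0)
p_group_b = response_prob(theta, delta, sigma=2.0)

# Group B is indistinguishable from a unit-dispersion Rasch model whose
# person and item parameters are both divided by its dispersion.
p_group_b_rescaled = response_prob(theta / 2.0, delta / 2.0, sigma=1.0)
```

In this sketch the Rasch model holds within each group, but the fitted parameters of group B live on a scale shrunk by its dispersion, so item parameters estimated separately in the two groups are not directly comparable.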
Heterogeneity in Uncertainty
In the following, the authors consider models that are able to avoid the occurrence of biased estimates caused by unobserved heterogeneity. The family of models that is considered is the Rasch model family represented by the PCM.
The PCM
Let
where
The defining property of the PCM is seen if one considers adjacent categories. The resulting presentation,
shows that the model is locally (given response categories
An Extended PCM
The extended version of the PCM that is proposed has the form:
Thus, the usual predictor in the PCM,
which contains the additional subject-specific parameter
Interpretation of subject-specific parameters
Let us start with the simplest case of a binary response (
If
If
If
In the general PCM, one has to distinguish between two cases, ordered thresholds and unordered thresholds. In the case of ordered thresholds
If
For
For
For illustration, the impact of the parameter

Response probabilities in an extended PCM for four values of
In the case of three response categories (
For all persons
For persons with
For persons with
Thus, the inverse structure of thresholds yields a more distinct avoidance of the middle category than the traditional PCM.
For
As has been demonstrated, the parameter
The uncertainty can also explain the occurrence of response patterns that are unlikely in a unidimensional model in which uncertainty is ignored. The responses of a person with high uncertainty are hardly predictable because he or she shows random behavior.
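The link between uncertainty and nearly random responses can be sketched in Python. The sketch assumes that the subject-specific parameter enters as a positive multiplicative factor exp(alpha) on the usual PCM predictor, in line with the multiplicative effect referred to later in the article; the exact parameterization, the sign convention for `alpha`, and all names and values below are assumptions made for illustration only.

```python
import math


def extended_pcm_probs(theta, thresholds, alpha):
    """Category probabilities when each PCM predictor (theta - delta_s)
    is multiplied by the assumed subject-specific factor exp(alpha)."""
    factor = math.exp(alpha)
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + factor * (theta - delta))
    top = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(c - top) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


thresholds = [-1.5, -0.5, 0.5, 1.5]  # ordered thresholds, five categories
theta = 0.4

# Strongly negative alpha: the predictor vanishes, responses are near uniform.
very_uncertain = extended_pcm_probs(theta, thresholds, alpha=-5.0)
# alpha = 0: the factor is 1 and the ordinary PCM is recovered.
ordinary_pcm = extended_pcm_probs(theta, thresholds, alpha=0.0)
# Large alpha: almost all probability mass falls on a single category.
very_certain = extended_pcm_probs(theta, thresholds, alpha=3.0)
```

Under these assumptions, a person with a strongly negative `alpha` chooses every category with almost equal probability regardless of the item, which is one way to read the random behavior described above.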
It should be noted that the uncertainty parameter
The UPCM extends the PCM to account for uncertainty of respondents. In the same way, the
which includes an item-specific slope parameter
The UPCM and the UGPCM extend the PCM and its generalized version to include the uncertainty of respondents. The parameterization is quite different from the extensions of the PCM considered by Jin and Wang (2014) and Tutz et al. (2018). The latter aim at modeling extreme response styles and assume that the distance between thresholds of adjacent categories is subject-specific.
Including Subject-Specific Characteristics
In the UPCM, each person has its own uncertainty parameter
Figure 2 shows the resulting response probabilities if a binary predictor (male:

Response probabilities in an extended PCM with a binary predictor and varying parameters
Parameters are estimated by marginal likelihood where it is assumed that the person parameters
Simulation Study
A variety of simulations was conducted to evaluate the performance of the method and the possible consequences of ignoring uncertainty. Both model versions the authors propose (UPCM and UGPCM) are compared with their simpler counterparts PCM and GPCM, respectively. The number of observations (
The item parameters
Overall, nine different simulation settings were inspected, each with 100 replications. Figure 3 displays boxplots of the mean squared error (MSE) of the threshold parameters

Boxplots illustrating the MSE (on log-scale) for estimates of the threshold parameters
Exemplarily, one specific simulation setting (UPCM with

Boxplots for estimates of the four threshold parameters
Figure 5 displays the estimates of the random effects covariance matrix

Boxplots for estimates of random effects covariance parameters from covariance

Boxplots for estimates of covariate effects
Figure 7 displays boxplots of the MSEs of the threshold parameters

Boxplots illustrating the MSE (on log-scale) for estimates of the threshold parameters
An Application
For illustration, the authors use data from the ALLBUS, the general survey of social science carried out by the German institute GESIS (http://www.gesis.org/allbus). The data contain the answers of 2,535 respondents from the questionnaire in 2012. In particular, the authors consider eight items that refer to the degree of confidence the participants have in public institutions and organizations. These institutions are the federal court, the Bundestag (parliament), the justice system, TV, press, government, police, and political parties. The items are measured on a scale from 1 (no confidence at all) to 7 (excessive confidence). The following person characteristics were used as explanatory variables for the trait effects and for the uncertainty effects: Age (age of participant in years), Gender (0: male; 1: female), Income (income of participant in euros), and WestEast (1: East Germany/former GDR; 0: West Germany/former FRG).
To ensure that all covariate effects are comparable in their size, all variables were standardized. Both a simple PCM and the UPCM were fitted to the data. The variance of the random effect for the trait parameters in the PCM was estimated to be
Although there seems to be no correlation between both random effects, it seems that the random uncertainty effect with an estimate of
is very similar to the covariance obtained for the UPCM with covariates.
Figure 8 displays the estimates of the item parameters of both the simple PCM and the proposed UPCM. In particular, the estimates for the exterior thresholds differ between the two models, whereas the estimates for the inner thresholds are rather similar. For the version of the UPCM without covariates, the estimates of the item parameters are very similar to the estimates from the regular UPCM and therefore are not shown.

Item parameter estimates for confidence data, separately for simple PCM and the proposed UPCM.
Table 1 collects the parameter estimates of both the trait effects and the uncertainty parameters of the explanatory variables together with the corresponding standard errors. It is seen that with the exception of the gender and age effects of trait confidence all effects turned out to be significant (for
Parameter Estimates for Effects of Explanatory Variables (Together with Standard Errors), Both for Trait Effects
For the interpretation of the effects, the authors propose a visualization tool that is particularly helpful when many explanatory variables are available. As motivation, consider again the UPCM, which can be given by:
From this representation, it is seen that the person and item parameters determine the log-odds of observing category
a multiplicative effect
a location effect that shifts the second part of the predictor by
We plot for each variable the effect point

(Exponential) effects of explanatory variables in ALLBUS data together with confidence intervals both for trait effects
Alternative Item Response Models
The PCM is an extension of the binary Rasch model, but not the only one. Samejima’s graded response model (Samejima, 1997) and the sequential model (Tutz, 1989; Verhelst et al., 1997) are also extensions of the binary model and contain the Rasch model as a special case. In the same way as the PCM, these models can be extended to contain an additional subject-specific uncertainty component. This is straightforward in the sequential model, which assumes that items are solved stepwise, with every step specified as a dichotomous IRT model. The graded response model, which works well in personality questionnaires and attitude scales, can also be derived from an underlying latent trait. The graded response model has the form
For
For
For
In particular, the last case (
The difference in interpretation is caused by the specific property of the PCM that modification of the local responses (given
Concluding Remarks
The extended UPCM that is proposed adds a subject-specific uncertainty component to the traditional PCM. It can in particular be used to investigate if uncertainty is determined by person characteristics. Ignoring the uncertainty component can yield biased estimates. Subject-specific uncertainty is not a response style in the traditional sense, but can be seen as a response style in a wider sense, representing a consistent pattern of response behavior.
The proposed models (both UPCM and UGPCM) are implemented in the statistical software R (R Core Team, 2019). The implementation is available from the authors and will be available from CRAN soon. Further details on the estimation procedure can be found in the online appendix.
Supplemental Material
Supplemental material, sj-pdf-1-apm-10.1177_0146621620920932 for Uncertainty in Latent Trait Models by Gerhard Tutz and Gunther Schauberger in Applied Psychological Measurement
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
