Response Styles in Rating Scales

Abstract

Heterogeneity in response styles can affect the conclusions drawn from rating scale data. In particular, biased estimates can be expected if one ignores a tendency to middle categories or to extreme categories. An adjacent categories model is proposed that simultaneously models the content-related effects and the heterogeneity in response styles. By accounting for response styles, it provides a simple remedy for the bias that occurs if the response style is ignored. The model allows one to include explanatory variables that have a content-related effect as well as an effect on the response style. A visualization tool is developed that makes the interpretation of effects easily accessible. The proposed model is embedded into the framework of multivariate generalized linear models, which entails that common estimation and inference tools can be used. Existing software can be used to fit the model, which makes it easy to apply.

Keywords

adjacent categories, rating scales, response styles, ordinal data, generalized linear models

1. Introduction

In behavioral research, rating scales have been used for a long time to investigate attitudes and behaviors. However, observed ratings may not represent the true opinion; in particular, response styles may affect the response behavior (see, e.g., Baumgartner and Steenkamp, 2001; Messick, 1991). An extensive overview on response styles in survey research was given more recently by Van Vaerenbergh and Thomas (2013). A response style is considered as a consistent pattern of responses that is independent of the content of a response (Johnson, 2003).

In the present article, we consider symmetric response categories of the form strongly disagree, moderately disagree, …, moderately agree, strongly agree and focus on response styles that are characterized by a disproportionate tendency to middle categories or to extreme categories, that is, the highest and lowest response categories. The preference to extreme categories is often called extreme response style and has been a topic of research for some time. Its counterpart, the tendency to choose middle categories, has been investigated, for example, by Baumgartner and Steenkamp (2001).

In many studies, the presence of response styles has been found. Response styles can differ, for example, across nations (Clarke, 2000; Van Herk, Poortinga, & Verhallen, 2004), ethnicity (Marin, Gamba, & Marin, 1992), or educational level (Meisenberg & Williams, 2008). In particular, in the psychometric literature, extreme response styles have been discussed within the framework of item response models. Bolt and Johnson (2009) and Bolt and Newton (2011) considered a multitrait model, which is a version of the nominal response model proposed by Bock (1972). Johnson (2003) considered a cumulative-type model for extreme response styles. Eid and Rauber (2000) considered a mixture of partial credit models that are able to detect response styles. More recently, tree-type approaches have been proposed. They typically assume a nested structure, where first a decision about the direction of the response and then about the strength is obtained. Models of this type have been proposed by Suh and Bolt (2010), De Boeck and Partchev (2012), Thissen-Roe and Thissen (2013), Jeon and De Boeck (2015), Böckenholt (2012), Khorramdel and von Davier (2014), and Plieninger and Meiser (2014).

In contrast to research in item response theory, where the focus is on the modeling of individual differences in terms of latent traits based on answers to several items without accounting for explanatory variables, we aim at investigating the influence of explanatory variables on the content-related choice and the response style for one item. The strength of the model is that it simultaneously accounts for both effects. Therefore, it allows us:

to investigate content-related effects that are undisturbed by the response style for a single item,

to investigate the response style undisturbed by content-related effects,

to use covariates to disentangle content and style, and

to avoid biased estimates of the content-related effects, which are the parameters of interest in most studies.

Approaches to simultaneous modeling of content-related effects and response styles seem to be scarce. Most approaches rely on the calculation of specific indices that can be corrected by regression techniques (see, e.g., Baumgartner and Steenkamp, 2001). Exceptions are the latent class approaches considered, for example, by Moors (2004), Kankaraš and Moors (2009), Moors (2010), and Van Rosmalen and Van Herk (2010). Latent class models are a strong tool, but they require specific software, and the existence of latent classes is a strong assumption on which the interpretation has to rely. The crucial difference between these latent variable approaches and the proposed adjacent categories model is that the response style is not perceived as an individual trait but exists solely in relation to the covariates. The model does not need the additional assumptions that accompany latent variable modeling.

The proposed modeling of response styles generated by covariates for one item uses a concept of the response style that differs from the usual concept. In the psychometric literature, a response style typically is considered as a tendency in how a rating scale is used across items yielding a consistent pattern of responses that is independent of the content of a response (Johnson, 2003). When using this concept, multiple items are a necessity. In our approach, the tendency to extreme or middle categories is separated from the content-related effects by using the symmetry of the response categories and letting covariates determine the tendency to specific categories. Nevertheless, since the model provides an explicit modeling of a tendency to extreme or middle categories, the term response style seems also appropriate within our modeling framework.

In Section 2, the basic model is introduced. An illustrative example is given and a visualization tool is developed. In Section 3, the effects of parameters are discussed, and the potential bias of estimates is investigated. Section 4 is devoted to inference, and tools for the estimation of parameters are provided in Section 5. In Section 6, further applications that illustrate the method are given. In Section 7, we consider possible extensions and compare the approach to alternatives proposed, in particular, in item response theory.

2. An Extended Rating Scale Model

Let Y_i ∈{1, … , k}, i = 1, … , n denote the observed responses on a rating scale; the categories 1, … , k represent graded agree–disagree attitudes with a natural symmetry like strongly disagree, moderately disagree, …, moderately agree, strongly agree. If the number of response categories is odd, there is a neutral middle category, and if k is even, there is none and the respondent is forced to exhibit at least a weak form of agreement or disagreement. Let x_i denote a vector of explanatory variables that is observed together with the response Y_i. Several models that link the explanatory variables to the ordinal response are available. Common model classes are the cumulative models, the sequential and adjacent categories models (see, e.g., Agresti, 2009; Tutz, 2012). We will focus on the adjacent categories model, which has the advantage that no constraints on the parameters are needed. Moreover, a specific version of the model is widely used in item response modeling. The so-called partial credit model (Masters, 1982) uses the adjacent logit link to model item difficulties but does not include explanatory variables. In the following, we first consider the basic model and then the extensions that account for response styles.

2.1. Adjacent Categories Model

The model proposed here is an extension of the adjacent categories model. The basic form of the model with logit link is given by:

\log \frac{π_{i, r + 1}}{π_{i r}} = θ_{r} + x_{i}^{T} β, r = 1, \dots, k - 1,

where π_ir = P(Y_i = r| x _i ) denotes the conditional probability of response category r. The model assumes that the adjacent categories logits $\log (π_{i, r + 1} / π_{i r})$ are determined by an intercept θ_r, which is specific for the adjacent categories and a linear effect of the explanatory variables, $x_{i}^{T} β$ . The ordering of the categories is modeled implicitly by assuming that the weight parameter does not depend on r. If one lets the parameter depend on the category, one obtains the classical multinomial logit model, which does not exploit the ordering of the categories (Agresti, 2009).

The interpretation of the parameters of the model is seen best when the parameters are given as functions of probabilities. For covariate vector x ^T = (x₁, … , x_p) and corresponding parameter vector β^T = (β₁, … , β_p), it may be derived that the parameter of the jth covariate is determined by:

e^{β_{j}} = \frac{π_{r + 1} (x_{j} + 1) / π_{r} (x_{j} + 1)}{π_{r + 1} (x_{j}) / π_{r} (x_{j})},

where π_r(x_j) denotes the probability of response category r for the vector of explanatory variables with the jth covariate having value x_j, and π_r(x_j + 1) is the probability of response category r if the jth covariate is increased by one unit to x_j + 1; all other variables are fixed. Thus, $e^{β_{j}}$ is the odds ratio that compares the odds for categories r + 1 and r when the jth covariate is increased by one unit.
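The odds-ratio interpretation can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name `adjacent_category_probs` and all numeric values (thresholds, weight, covariate value) are hypothetical and chosen only for illustration. It computes the category probabilities implied by the adjacent categories logits and confirms that the odds ratio equals $e^{β_{j}}$ for every pair of adjacent categories.

```python
import math


def adjacent_category_probs(theta, eta):
    """Return pi_1..pi_k for log(pi_{r+1} / pi_r) = theta[r-1] + eta, r = 1..k-1."""
    log_weights = [0.0]  # log of pi_1 up to a common constant
    for threshold in theta:
        log_weights.append(log_weights[-1] + threshold + eta)
    top = max(log_weights)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in log_weights]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical values, chosen only for illustration (k = 5 categories).
theta = [0.5, 0.0, -0.3, -0.6]
beta_j = 0.7
x_j = 1.2

probs_before = adjacent_category_probs(theta, x_j * beta_j)
probs_after = adjacent_category_probs(theta, (x_j + 1) * beta_j)

# Odds ratio for each pair of adjacent categories r + 1 versus r.
odds_ratios = [
    (probs_after[r + 1] / probs_after[r]) / (probs_before[r + 1] / probs_before[r])
    for r in range(len(theta))
]
print("odds ratios:", odds_ratios)
print("exp(beta_j):", math.exp(beta_j))
```

Because the weight does not depend on r, the same odds ratio is obtained for all k − 1 pairs of adjacent categories.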

2.2. Accounting for Response Styles

For simplicity, let us first consider the case of three response categories, k = 3. Then the model is given by the two equations that specify $\log (π_{i 2} / π_{i 1})$ and $\log (π_{i 3} / π_{i 2})$ . The extended model proposed here contains the additional parameter δ_i and has the form:

\log \frac{π_{i 2}}{π_{i 1}} = θ_{1} + x_{i}^{T} β + δ_{i}, \log \frac{π_{i 3}}{π_{i 2}} = θ_{2} + x_{i}^{T} β - δ_{i} .

The parameter δ_i specifies the response style. If δ_i → ∞, one obtains π_i2 → 1, which means a strong tendency to the middle category. If δ_i → −∞, one obtains π_i2 → 0, which means a strong tendency to the response Categories 1 and 3 corresponding to the extreme response style. It is important that the response style is separated from the preference represented by the linear term $x_{i}^{T} β$ . While $x_{i}^{T} β$ represents the content-related effect, δ_i represents the response style toward the middle category or away from it.

The effect of the additional parameter is illustrated in Figure 1 for a univariate explanatory variable with β = 1. It is seen that a person with δ_i = 2 has a stronger tendency to choose the middle category than a person with δ_i = 0, whereas a person with δ_i = −2 hardly uses the middle category. Although the numeric values change, the shapes of the response functions for Categories 1 and 3 are very similar for all values of δ_i.

Figure 1.

Response functions for several values of δ_i.

The strength of the model is that the parameter δ_i can be specified as a function of explanatory variables. Let z_i be an additional vector of variables which are assumed to determine the response style. The z_i can be different from x_i but can also be the same. With $δ_{i} = z_{i}^{T} γ$ , one obtains the model:

\log \frac{π_{i 2}}{π_{i 1}} = θ_{1} + x_{i}^{T} β + z_{i}^{T} γ, \log \frac{π_{i 3}}{π_{i 2}} = θ_{2} + x_{i}^{T} β - z_{i}^{T} γ .

The model has some interesting properties. From:

\log \frac{π_{i 3}}{π_{i 1}} = θ_{1} + θ_{2} + 2 x_{i}^{T} β,

one sees that the log odds for the categories that actually represent agreement and disagreement are not affected by the term that determines the response style. On the other hand:

\log \frac{π_{i 2} / π_{i 1}}{π_{i 3} / π_{i 2}} = θ_{1} - θ_{2} + 2 z_{i}^{T} γ,

shows that specific odds ratios do not depend on the content-related term.
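Both properties can be verified with a few lines of code. The sketch below uses hypothetical thresholds and a hypothetical content term (none of these numbers come from the data); the helper `three_category_probs` is introduced only for this illustration. It varies the response-style term while holding the content term fixed.

```python
import math


def three_category_probs(theta1, theta2, content, delta):
    """Probabilities (pi_1, pi_2, pi_3) of the extended model with k = 3."""
    log_w2 = theta1 + content + delta            # log(pi_2 / pi_1)
    log_w3 = log_w2 + theta2 + content - delta   # log(pi_3 / pi_1)
    weights = [1.0, math.exp(log_w2), math.exp(log_w3)]
    total = sum(weights)
    return [w / total for w in weights]


theta1, theta2 = 0.2, -0.2   # hypothetical thresholds
content = 0.5                # hypothetical value of x^T beta

results = []
for delta in (-2.0, 0.0, 2.0):   # delta plays the role of z^T gamma
    p1, p2, p3 = three_category_probs(theta1, theta2, content, delta)
    agreement_log_odds = math.log(p3 / p1)            # theta1 + theta2 + 2 x^T beta
    style_contrast = math.log((p2 / p1) / (p3 / p2))  # theta1 - theta2 + 2 z^T gamma
    results.append((delta, p2, agreement_log_odds, style_contrast))
    print(f"delta={delta:+.1f}  pi_2={p2:.3f}  "
          f"log(pi_3/pi_1)={agreement_log_odds:.3f}  contrast={style_contrast:.3f}")
```

The log odds of Category 3 versus Category 1 stay constant across the three values of the response-style term, whereas the probability of the middle category grows with it.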

It is noteworthy that the parameters of the content-related term are the same as in the simple adjacent categories model. This may be seen from simple derivation of the parameters for the simple adjacent categories model. For three response categories, an even more intuitive form than Equation 1 is given by:

e^{2 β_{j}} = \frac{π_{3} (x_{j} + 1) / π_{1} (x_{j} + 1)}{π_{3} (x_{j}) / π_{1} (x_{j})},

which shows the explicit dependence on the categories that refer to agreement or disagreement. For the parameters of the response-style effects, one obtains:

e^{2 γ_{j}} = \frac{π_{2} (z_{j} + 1) / π_{1} (z_{j} + 1)}{π_{3} (z_{j} + 1) / π_{2} (z_{j} + 1)} / \frac{π_{2} (z_{j}) / π_{1} (z_{j})}{π_{3} (z_{j}) / π_{2} (z_{j})} .

The explicit form of the parameters also ensures that the model is identifiable.

2.2.1. The general model for k response categories

In the general case, one has to distinguish between an odd and even number of response categories. For k odd, let m =[k/2] + 1 denote the middle category. Then the rating scale model that accounts for the tendency to the middle or extreme categories has the form:

\begin{array}{l} \log \frac{π_{i, r + 1}}{π_{i r}} = θ_{r} + x_{i}^{T} β + z_{i}^{T} γ, r = 1, \dots, m - 1, \\ \log \frac{π_{i, r + 1}}{π_{i r}} = θ_{r} + x_{i}^{T} β - z_{i}^{T} γ, r = m, \dots, k - 1. \end{array}

The term $θ_{r} + x_{i}^{T} β$ represents the usual effects of covariates x _i in an adjacent categories model; if $x_{i}^{T} β$ is large, higher categories are preferred, and if it is small, low categories are chosen.

Positive values of the term $δ_{i} = z_{i}^{T} γ$ increase the probabilities of higher categories for r = 1, … , m − 1 but decrease them for r = m, … , k − 1. Thus, δ_i determines whether middle categories or extreme categories are preferred. The effect is also seen when considering extreme values of δ_i. For $δ_{i} = z_{i}^{T} γ \to \infty$ , one obtains π_im → 1 and therefore a tendency to the middle category while δ_i → −∞ entails π_i2, … , π_i,k−1 → 0 and therefore a preference of the extreme categories.

It should be noted that the modeling approach differs from alternative perspectives on response styles. In the literature, response styles are often defined as preferring the outer or the midpoint categories across many unrelated/weakly related items. In our model, a negative value of the response-style parameter indicating extreme response style captures not only a preference for the extremes “strongly agree” compared to the adjacent category “agree” but also a preference for “agree” compared to “somewhat agree.” The response-style γ parameter thus picks up not only the tendency to select the extremes but a general tendency to prefer more extreme categories, given the substantive stand of the respondent.

For k even, the model has a slightly different form. Let in this case m = k/2 denote the split between agreement and disagreement categories. Then the proposed model has the form:

\begin{array}{l} \log \frac{π_{i, r + 1}}{π_{i r}} = θ_{r} + x_{i}^{T} β + z_{i}^{T} γ, r = 1, \dots, m - 1, \\ \log \frac{π_{i, m + 1}}{π_{i m}} = θ_{m} + x_{i}^{T} β, \\ \log \frac{π_{i, r + 1}}{π_{i r}} = θ_{r} + x_{i}^{T} β - z_{i}^{T} γ, r = m + 1, \dots, k - 1. \end{array}

The effect of the term $δ_{i} = z_{i}^{T} γ$ is the same as in the case where k is odd; large positive values indicate a tendency to the middle categories, and large negative values, a tendency to the extreme response style.
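The two displayed model forms (k odd and k even) translate directly into category probabilities. The following minimal sketch implements them as written; the function name `rsrs_probs` and the numeric settings (all thresholds and the content term set to zero, a response-style term of ±2) are assumptions made purely for illustration.

```python
import math


def rsrs_probs(theta, content, delta):
    """Category probabilities for thresholds theta (length k - 1),
    content term x^T beta and response-style term delta = z^T gamma."""
    k = len(theta) + 1
    odd = k % 2 == 1
    m = k // 2 + 1 if odd else k // 2
    log_weights = [0.0]
    for r in range(1, k):                 # adjacent pair (r, r + 1)
        if r < m:
            sign = 1.0                    # +delta below the middle
        elif r == m and not odd:
            sign = 0.0                    # no style term across the split (k even)
        else:
            sign = -1.0                   # -delta from the middle upward
        log_weights.append(log_weights[-1] + theta[r - 1] + content + sign * delta)
    top = max(log_weights)                # stabilize the exponentials
    weights = [math.exp(v - top) for v in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical settings: all thresholds and the content term are zero.
middle_7 = rsrs_probs([0.0] * 6, 0.0, 2.0)    # k = 7, positive style term
extreme_7 = rsrs_probs([0.0] * 6, 0.0, -2.0)  # k = 7, negative style term
middle_6 = rsrs_probs([0.0] * 5, 0.0, 2.0)    # k = 6, positive style term

for label, probs in (("k=7, delta=+2", middle_7),
                     ("k=7, delta=-2", extreme_7),
                     ("k=6, delta=+2", middle_6)):
    print(label, [round(p, 3) for p in probs])
```

In this symmetric setting, a positive response-style term concentrates the probability mass on the middle category (k odd) or on the two categories around the split (k even), and a negative term moves it to the outer categories.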

For simplicity, we will use the abbreviation RSRS (rating scale model accounting for response styles) for the model with k odd or even. Before discussing the effects in detail, we first consider an application.

2.2.2. An illustrative example

Although estimation methods have not yet been given, we consider an application to illustrate the effects obtained by using the extended model. We consider data from the Survey on Household Income and Wealth (SHIW) by the Bank of Italy that have been used before by Gambacorta and Iannario (2013). They are available from http://www.bancaditalia.it/statistiche/indcamp. The response is the happiness index indicating the overall life well-being measured on a Likert-type scale from 1 (very unhappy) to 10 (very happy). As explanatory variables, we consider gender (0 = male, 1 = female), the marital status (single, married, separated, and widowed), the place of living (north, south, and center), the general degree of confidence in other people from 1 (low) to 10 (high), the atmosphere the interview took place in (1–10), the citizenship, and the age in decades. The respondents were also asked to assess whether the household income is sufficient to see the family through to the end of the month, rated from 1 (with great difficulty) to 5 (very easily). The analysis is based on a subset of 3,816 respondents of the SHIW of 2010; age was centered around 60 and confidence around 5. We fitted a simple adjacent categories model with all of the covariates and the extended version that accounts for response styles, where all the variables are allowed to have content-related and response-style effects. For the variables age and confidence, we also included quadratic and cubic terms because their effects seem to be not negligible.

First of all, it is of interest whether the style-related effects are needed in the model. The likelihood ratio test for the null hypothesis H₀ : γ = 0 has the χ² value 1,101.11 on 15 degrees of freedom. Therefore, style effects are definitely present. The estimated effects and standard errors for both models are given in Table 1. It is seen that the estimates as well as the standard errors of the content-related effects differ for the adjacent categories model and its extended version. If one ignores the response style, the estimates are larger in some cases and smaller in others (see also Section 3). As far as the effects on the response style are concerned, it is seen that gender had no effect on the response style, but, for example, sufficiency of income, age, and confidence had effects on the response style that cannot be ignored. The weight −0.09 on sufficiency of income with very small standard error indicates that confidence in the sufficiency of income increases the tendency to choose extreme categories. Instead of discussing the various effects in detail, visualization tools are developed in the next sections.

Table 1.

Parameter Estimates and Standard Errors for the SHIW Study

		Extended Adjacent		Adjacent
Effect type	Covariates	Estimate	SE	Estimate	SE
Content-related effects (x variables)	Gender	−.0302	.0155	−.0292	.0154
	Married	.0256	.0240	.0475	.0223
	Separated	.0291	.0373	.0200	.0325
	Widow	.0116	.0338	.0170	.0292
	Center	.1666	.0192	.1887	.0195
	South	.0169	.0172	.0170	.0166
	Income (sufficient)	.0100	.0060	.0153	.0059
	Atmosphere	.0162	.0054	.0173	.0047
	Citizen (foreign)	−.0413	.0414	−.0545	.0373
	Confidence	.0035	.0072	.0029	.0070
	Confidence²	−.0084	.0011	−.0082	.0011
	Confidence³	.0008	.0004	.0011	.0004
	Age	−.0123	.0086	−.0160	.0088
	Age²	−.0041	.0031	−.0029	.0028
	Age³	.0010	.0013	.0015	.0013
Response-style effects (z variables)	Gender	.0034	.0317
	Married	−.4208	.0477
	Separated	.0067	.0701
	Widow	.1063	.0642
	Center	−.0385	.0387
	South	.1336	.0350
	Income (sufficient)	−.0908	.0124
	Atmosphere	−.1079	.0106
	Citizen (foreign)	.3206	.0806
	Confidence	.0073	.0146
	Confidence²	−.0228	.0024
	Confidence³	.0006	.0010
	Age	.0003	.0182
	Age²	−.0259	.0062
	Age³	.0078	.0028

Note. SHIW = Survey on Household Income and Wealth.
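As a rough screening of the response-style effects in Table 1, one can relate each estimate to its standard error. The sketch below computes such ratios for a selection of rows copied from the table; treating the ratio as an approximate standard normal statistic with the cutoff 1.96 is a common heuristic assumed here for illustration, not necessarily the inference procedure used for the study.

```python
# (estimate, standard error) of selected response-style effects from Table 1.
style_effects = {
    "Gender": (0.0034, 0.0317),
    "Married": (-0.4208, 0.0477),
    "South": (0.1336, 0.0350),
    "Income (sufficient)": (-0.0908, 0.0124),
    "Atmosphere": (-0.1079, 0.0106),
    "Citizen (foreign)": (0.3206, 0.0806),
}

# Ratio of estimate to standard error (Wald-type statistic, assumed heuristic).
wald_z = {name: est / se for name, (est, se) in style_effects.items()}

for name, z in wald_z.items():
    flag = "|z| > 1.96" if abs(z) > 1.96 else "|z| <= 1.96"
    print(f"{name:22s} z = {z:7.2f}  ({flag})")
```

The ratio for gender is close to zero, whereas the ratio for sufficiency of income is far beyond the cutoff, in line with the verbal description of the response-style effects.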

2.2.3. Visualization of effects

The extended model contains more parameters than a simple rating scale model. In particular, when various explanatory variables are included, it is hard to keep track of all the relevant effects. Therefore, we provide some visualization tools to investigate the effect strength. We explicitly consider the case of an odd number of response categories (Model 2) and start with the visualization of linear effects. It is immediately seen that the odds of adjacent categories have the form:

\frac{π_{i, r + 1}}{π_{i r}} = e^{θ_{r}} {(e^{β_{1}})}^{x_{i 1}} \dots {(e^{β_{p}})}^{x_{i p}} {(e^{γ_{1}})}^{z_{i 1}} \dots {(e^{γ_{q}})}^{z_{i q}}, r = 1, \dots, m - 1,

\frac{π_{i, r + 1}}{π_{i r}} = e^{θ_{r}} {(e^{β_{1}})}^{x_{i 1}} \dots {(e^{β_{p}})}^{x_{i p}} {(e^{- γ_{1}})}^{z_{i 1}} \dots {(e^{- γ_{q}})}^{z_{i q}}, r = m, \dots, k - 1,

where the explanatory variables for content-related effects have length p and the response-style effects length q. Thus, if the jth x variable increases by one unit, the multiplicative effect on the odds between adjacent categories is given by $e^{β_{j}}$ .

If the jth z variable increases by one unit, the multiplicative effect on the odds between adjacent categories depends on the category. It is $e^{γ_{j}}$ for categories smaller than m and $e^{- γ_{j}}$ for the higher categories. If the x and z variables are the same, the effects are seen by plotting the tuple $(e^{γ_{j}}, e^{β_{j}})$ . If a covariate is present only as an x or z variable, one of the components in the tuple is 1.

For the SHIW study, we show the effects of the marital status, gender, and the area of living in Figure 2. In the figure, pointwise confidence intervals are included. We use stars with the horizontal and vertical lengths corresponding to the .95 confidence intervals of $e^{γ_{j}}$ and $e^{β_{j}}$ , respectively. It is seen from the left panel that there is no difference between men and women in the response style ( $e^{γ_{j}}$ close to one), but women tend to choose lower happiness categories ( $e^{β_{j}}$ around 0.97). For the variable marital status, we chose “single” as the reference category, which has the value $(e^{γ_{j}}, e^{β_{j}}) = (1, 1)$ . It is seen that all other categories have higher happiness scores, although in particular the effect of the category “widow” is not significantly different from the reference category “single.” As far as the response styles are concerned, separated and widowed persons showed a tendency to the middle, whereas married people gave a more distinct response when compared to the reference category “single.” From the right panel, it is seen that people living in the center of Italy have significantly higher happiness scores than people living in the south or in the reference category “north.” The difference in response style between the categories “center” and “north” can be neglected, but there is a significant difference between the categories “south” and “north.” People living in the south tend to choose less extreme response categories. It should be noted that, to keep the visualization easily accessible, the confidence intervals shown do not account for the correlation between estimates. Moreover, correlations tend to be small (see next section).

Figure 2.

Visualization of estimated effects for the Survey on Household Income and Wealth study including pointwise confidence intervals.
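The coordinates of such stars can be obtained from the estimates and standard errors in Table 1. The following sketch is an illustration under an explicit assumption: the interval limits are computed by exponentiating estimate ± 1.96 standard errors, which is one standard construction but is not stated in the text as the one used for Figure 2. The helper `multiplicative_effect` is a hypothetical name.

```python
import math


def multiplicative_effect(estimate, se, z=1.96):
    """Return (point, lower, upper) on the multiplicative scale.

    Assumption: limits are exp(estimate -/+ z * se).
    """
    return (math.exp(estimate),
            math.exp(estimate - z * se),
            math.exp(estimate + z * se))


# ((gamma_hat, SE), (beta_hat, SE)) for the extended model, copied from Table 1.
effects = {
    "Gender": ((0.0034, 0.0317), (-0.0302, 0.0155)),
    "Married": ((-0.4208, 0.0477), (0.0256, 0.0240)),
    "Center": ((-0.0385, 0.0387), (0.1666, 0.0192)),
    "South": ((0.1336, 0.0350), (0.0169, 0.0172)),
}

# Horizontal axis: response style e^gamma; vertical axis: content e^beta.
stars = {
    name: (multiplicative_effect(*gamma), multiplicative_effect(*beta))
    for name, (gamma, beta) in effects.items()
}

for name, (style, content) in stars.items():
    print(f"{name:8s} e^gamma = {style[0]:.3f} [{style[1]:.3f}, {style[2]:.3f}]  "
          f"e^beta = {content[0]:.3f} [{content[1]:.3f}, {content[2]:.3f}]")
```

For gender, the horizontal interval covers 1 and the vertical point is about 0.97, which matches the description of the left panel.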

2.2.4. Visualization of nonlinear effects

In the example, the explanatory variables confidence and age contain quadratic and cubic terms in addition to linear terms. Then it is not sensible to plot the effects of the parameters separately. Instead, one can understand the effects as functions of the corresponding explanatory variables. For example, the content-related effect of confidence is the cubic polynomial $f_{c}^{C} (conf) = conf β_{c,1}^{C} + {conf}^{2} β_{c,2}^{C} + {conf}^{3} β_{c,3}^{C}$ (C indicating content), and the response-style effect is given by $f_{c}^{R} (conf) = conf β_{c,1}^{R} + {conf}^{2} β_{c,2}^{R} + {conf}^{3} β_{c,3}^{R}$ (R indicating response style). Omitting for simplicity the linear effects of the other covariates, one has the model:

\frac{π_{i, r + 1}}{π_{i r}} = e^{θ_{r}} (e^{f_{c}^{C} (conf)}) (e^{f_{a}^{C} (age)}) (e^{f_{c}^{R} (conf)}) (e^{f_{a}^{R} (age)}), r = 1, \dots, m - 1,

\frac{π_{i, r + 1}}{π_{i r}} = e^{θ_{r}} (e^{f_{c}^{C} (conf)}) (e^{f_{a}^{C} (age)}) (e^{- f_{c}^{R} (conf)}) (e^{- f_{a}^{R} (age)}), r = m, \dots, k - 1,

where $f_{a}^{C} (age), f_{a}^{R} (age)$ represent the content-related and response-style-related effects of the variable age.

Parameters in polynomial terms are hard to interpret, but one can plot the corresponding nonlinear effects. Figure 3 shows the effects of content (first row) and response style (second row). In the plots we used the same scale in order to reveal the strength of the impact of the covariates. It is seen that happiness increases with increasing confidence up to a value of about 5 and slightly decreases above 5. For the response style, one gets a distinctly quadratic effect; the tendency to extreme categories (negative values of $f_{c}^{R} (conf)$ ) is very strong for high and low values of confidence and zero for middle values of confidence. The content effect of age is not significant. Instead of omitting it, we show the estimated curve, which is an almost horizontal line close to zero. Concerning the response style, it is seen that younger people have a tendency to extreme response styles; the effect vanishes at age 50 and is close to zero for all values greater than 50.

Figure 3.

Nonlinear effects of content and response style for confidence and age (Survey on Household Income and Wealth study); upper panels show the content and lower panels, the response-style effects.

As an alternative to these conventional plots for nonlinear effects, we propose to visualize them in a similar way as for linear effects by using axes that correspond to effects of response style and effects of content. The corresponding plot is obtained for the covariate confidence by plotting $(e^{f_{c}^{R} (conf)}, e^{f_{c}^{C} (conf)})$ as a function of conf (10 points). However, instead of one point as in the visualization of linear effects, one obtains a curve in two dimensions representing the multiplicative effects on the proportion of the probabilities of adjacent categories concerning content-related and response-style-related effects. Figure 4 shows the plots for the variables confidence and age. They show how both effects evolve with increasing value of the corresponding covariate. Again we use the same scale for both effects. The curves for confidence show the initial increase and subsequent weak decrease in happiness with the turning point at about 5. In particular for values of confidence between 5 and 10, the variation on the y axis representing the variation of the happiness score is weak. Much stronger variation is found for the response styles (x axis). The tendency to extreme categories weakens with increasing confidence and then gets stronger with the same turning point at 5. The curve for age shows that the effect on happiness is weak with hardly any variation on the y axis. However, the effect on the response style is rather strong. The tendency to extreme categories found for 30 years of age diminishes strongly up to about 50 years of age and then hardly changes. The visualization by curves is useful for polynomial terms but can also be used for alternative smooth functions as considered briefly in Section 7.

Figure 4.

Curves for confidence (left) and age (right) for Survey on Household Income and Wealth study.
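The nonlinear effects of confidence and the corresponding curve can be traced from the estimates in Table 1. The sketch below is an illustration under stated assumptions: confidence is taken to enter the polynomial as the centered value conf − 5 (the text reports centering around 5), and the table rows Confidence² and Confidence³ are taken to be powers of that centered value. The helper `cubic_effect` is a hypothetical name.

```python
import math

CENTER = 5.0  # assumption: confidence enters as conf - 5 (centering reported in the text)

# (linear, quadratic, cubic) estimates for confidence, extended model, Table 1.
content_coefs = (0.0035, -0.0084, 0.0008)
style_coefs = (0.0073, -0.0228, 0.0006)


def cubic_effect(conf, coefs, center=CENTER):
    """Evaluate the cubic polynomial effect at an uncentered confidence value."""
    c = conf - center
    linear, quadratic, cubic = coefs
    return linear * c + quadratic * c ** 2 + cubic * c ** 3


grid = list(range(1, 11))  # the 10 confidence values
content_curve = [cubic_effect(v, content_coefs) for v in grid]
style_curve = [cubic_effect(v, style_coefs) for v in grid]

# Points (e^{f^R}, e^{f^C}) of the two-dimensional curve.
curve_points = [(math.exp(s), math.exp(c))
                for s, c in zip(style_curve, content_curve)]

for v, (x_axis, y_axis) in zip(grid, curve_points):
    print(f"conf={v:2d}  e^fR={x_axis:.3f}  e^fC={y_axis:.3f}")
```

Under these assumptions, both polynomials peak at a confidence value of 5 on the integer grid, and the variation along the response-style axis is clearly larger than along the content axis.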

3. Effects in the RSRS Model

One of the strengths of the extended RSRS model is that the content-related effects are separated from the tendency to middle or extreme categories. We will investigate the separation for the case k odd; for k even the separation works in a similar way.

Let the model be given by Equation 2 and again m =[k/2] + 1 denote the middle category. Then one may derive that the parameters of the x variables are determined by:

e^{2 r β_{j}} = \frac{π_{m + r} (x_{j} + 1) / π_{m - r} (x_{j} + 1)}{π_{m + r} (x_{j}) / π_{m - r} (x_{j})}, r = 1, \dots, m - 1,

where π_r(x_j) again denotes the probability of response category r for the vector of explanatory variables with the jth covariate having value x_j and π_r(x_j + 1) is the probability of response category r if the jth covariate is increased by one unit to x_j + 1; all other covariates are fixed. The representation (Equation 4) compares the probabilities for the categories m + r and m − r, that means categories with equal distance to the middle category. For k = 7 and therefore m = 4, it compares the probabilities of Categories 5 and 3, 6 and 2, and 7 and 1. Thus, it shows the effect of the explanatory variable in a symmetric way, namely, how strong is the preference of, for example, Category 5 compared to 3 if the explanatory variable increases by one unit.

It is essential that the parameter β_j does not depend on the term $z_{i}^{T} γ$ , even if x _i = z_i. That means also in the simple adjacent categories model, where $z_{i}^{T} γ = 0$ , the parameters β_j are given by Equation 4. Therefore, the content-related effects in the model are distinctly separated from the tendency to middle or extreme categories.

For the parameters that determine the response style, one obtains:

γ_{j} = 1 / (2 r) (\log \frac{π_{m} (z_{j} + 1) / π_{m - r} (z_{j} + 1)}{π_{m + r} (z_{j} + 1) / π_{m} (z_{j} + 1)} - \log \frac{π_{m} (z_{j}) / π_{m - r} (z_{j})}{π_{m + r} (z_{j}) / π_{m} (z_{j})}), r = 1, \dots, m - 1,

where in a similar way as before π_r(z_j) denotes the probability of response category r for the vector of explanatory variables with jth covariate z_j and π_r(z_j + 1) is the probability of response category r if the jth covariate is increased by one unit to z_j + 1; all other covariates are fixed. The parameter γ_j depends only on response probabilities of categories m, m + r, and m − r for different values of z_j. It represents how the concentration of the probability mass is increased in the middle if z_j is increased by one unit. In the same way as β_j is separated from $z_{i}^{T} γ$ , the parameter γ_j is separated from the term $x_{i}^{T} β$ , signaling the separation of the weights on x variables and z variables. One effect of the separation of the effects is that estimates of γ_j, β_j if x_j = z_j typically show weak correlation. For an illustration see Figure 5 where the estimates (1,000 replications) of one normally distributed explanatory variable with x = z are shown for various parameters β,γ and increasing sample size n. However, the separation of effects does not mean that the response style can be ignored when estimating the content-related effects of variables (see next section).

Figure 5.

Estimates for several values of β,γ and sample sizes: The explanatory variable follows a standard normal distribution; the true values are given in gray.
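The representations of β_j and γ_j given above can be derived by summing the adjacent categories logits of the model for k odd. The following derivation is a sketch of the omitted steps; it uses only the model equations and the notation already introduced.

```latex
% Content-related effect: sum the logits from category m - r up to m + r.
\log \frac{π_{i, m + r}}{π_{i, m - r}}
  = \sum_{s = m - r}^{m - 1} (θ_{s} + x_{i}^{T} β + z_{i}^{T} γ)
  + \sum_{s = m}^{m + r - 1} (θ_{s} + x_{i}^{T} β - z_{i}^{T} γ)
  = \sum_{s = m - r}^{m + r - 1} θ_{s} + 2 r x_{i}^{T} β .

% The response-style terms cancel (r terms with +, r terms with -).
% Increasing x_j by one unit therefore changes the left-hand side by 2 r β_j.

% Response-style effect: take the difference of the two partial sums instead.
\log \frac{π_{i m} / π_{i, m - r}}{π_{i, m + r} / π_{i m}}
  = \sum_{s = m - r}^{m - 1} (θ_{s} + x_{i}^{T} β + z_{i}^{T} γ)
  - \sum_{s = m}^{m + r - 1} (θ_{s} + x_{i}^{T} β - z_{i}^{T} γ)
  = \sum_{s = m - r}^{m - 1} θ_{s} - \sum_{s = m}^{m + r - 1} θ_{s} + 2 r z_{i}^{T} γ .

% Here the content terms cancel; increasing z_j by one unit changes the
% left-hand side by 2 r γ_j, which gives the factor 1 / (2 r).
```

The cancellation in each case is what separates the weights on the x variables from the weights on the z variables.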

3.1. Accuracy of Estimates if the Response Style is Ignored

If one is not aware of response styles, one fits a regression model that contains only the effect of explanatory variables on the response. In the following, it is demonstrated that this procedure can result in strongly biased estimates and poor accuracy of the estimates of β, which are the parameters of interest in most studies. For simplicity, we consider the case of only one explanatory variable, which follows a standard normal distribution. Figure 6 shows the mean squared errors (MSEs), the variances, and the bias of the maximum likelihood (ML) estimate of β if one fits a simple adjacent categories model, which ignores the presence of differing response styles, and if one fits the extended model that accounts for the response style. The data generating model is the extended model with seven categories for varying values of γ and θ_r = 0, β = 1. The upper panels show the case where x = z; therefore, one is estimating the content-related effect of an explanatory variable that also has an effect on the response style. It is seen that the MSEs for both models are about the same for very small values of γ. For large absolute values of γ, the MSE is much larger if the response style is ignored. The poor performance is mainly caused by the bias. Even for moderate values of γ, one obtains strongly biased estimates that underestimate the size of the effect. The effect is shown for the true value β = 1. The same strength of the bias is found if β = −1, but then the parameter β is overestimated instead of underestimated. The tendency is the same: one sees attenuation of the effects. In extreme cases, if γ = 2, the absolute value of the estimate, $| \hat{β} |$ , is almost half of the true value |β|.

Figure 6.

Mean squared errors, variances, and bias as a function of γ; in the upper panel, one has x = z, and in the lower panel, x and z differ and are independent. Dashed lines indicate the model without accounting for the response style, and solid lines indicate the model with response-style effects.

One might suspect that the bias is so strong because the variable has two effects, one on the preference and one on the response style. Therefore, we also investigated the case with a predictor η_r = θ_r + xβ + zγ, where x, z are independently normally distributed variables. The lower panel of Figure 6 shows the resulting curves. It is seen that one also obtains biased estimates if a variable that is independent of x generates varying response styles but is ignored, that is, if one ignores heterogeneity of response styles in the population.

In Figure 6, the effect is always attenuation of effects, a familiar phenomenon that also occurs in random effects models if heterogeneity is ignored (see, e.g., Tutz, 2012, Chapter 14). However, if response styles are ignored, one can in some cases also see stronger effects. In Figure 7, MSE, bias, and variance are shown for the same models as in Figure 6, but now the thresholds have been changed to θ₁ = 0, θ₂ = −0.4, θ₃ = −0.8, …. For these thresholds, higher categories are preferred for all of the values of the explanatory variables. It is seen that the bias is again negative for all values of γ if x and z are uncorrelated (lower panel), but one obtains overestimation of the true value of β = 1 in the case where x = z if γ is positive (upper panel). Therefore, if there is a tendency to higher categories and the effect β is positive, and one ignores the tendency to select middle categories (γ positive), this is interpreted by the model without response effect as a stronger β. The consequence is that larger values of β are obtained; the estimated effect tends to be larger than the true effect. The same effects are also found if more than just two variables are included in the model. For illustration of the effects, we considered values of γ from a wide range. Although large values of γ might occur, in the real data sets we considered, |γ| did not exceed 1. An indicator of potential nonnegligible bias might be strong differences in estimates for the model with response style and the model without response style.

Figure 7.

3.2. Effect of Sample Sizes

It has been demonstrated that biased estimates can be avoided by accounting for the response style when estimating the content-related effects. A quite different question is which observations contribute to the estimation accuracy when differing response styles are present and accounted for in the model. Intuitively, the accuracy of estimates will be lower if many respondents prefer the middle category because then less information about β tends to be available. This can be illustrated by looking at the effect of β in the simple case of three response categories and a simple binary predictor x representing, for example, gender. As already shown in Section 2, the true effect is given by:

e^{2 β} = \frac{π_{3} (f) / π_{1} (f)}{π_{3} (m) / π_{1} (m)},

where π_r(f), π_r(m) denote the probability of a response in category r for females and males, respectively. If in one of the two populations there is a strong tendency to the middle categories, the relative frequencies corresponding to π₃(⋅)/π₁(⋅) will be estimated very unstably because only few observations will be observed in Categories 1 and 3. Consequently, the accuracy of $\hat{β}$ will suffer.
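The identity for e^{2β} can be verified in two lines. The derivation below is a sketch that assumes the extended model for k = 3 has the form log(π₂/π₁) = θ₁ + xβ + zγ and log(π₃/π₂) = θ₂ + xβ − zγ, with x = 1 for females and x = 0 for males; this coding of x is an assumption made only for the illustration.

```latex
\log\frac{\pi_3(x)}{\pi_1(x)}
  = \log\frac{\pi_3(x)}{\pi_2(x)} + \log\frac{\pi_2(x)}{\pi_1(x)}
  = (\theta_2 + x\beta - z\gamma) + (\theta_1 + x\beta + z\gamma)
  = \theta_1 + \theta_2 + 2x\beta ,
\qquad\text{hence}\qquad
\log\frac{\pi_3(f)/\pi_1(f)}{\pi_3(m)/\pi_1(m)}
  = 2\beta\,(1 - 0) = 2\beta .
```

The response-style term zγ cancels in the first line, which is why the ratio π₃/π₁ carries the information about β irrespective of γ.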

To demonstrate the effect, we show simulation results. We consider a binary predictor x ∈ {0,1} and effect strengths β = 1 and γ = 1. Figure 8 shows the MSEs for a range of sample sizes, where n₀ denotes the sample size of population x = 0 and n₁ the sample size of population x = 1. In the left panel, the thresholds were θ₁ = θ₂ = 0, yielding probability vectors (0.33, 0.33, 0.33) for x = 0 and (0.06, 0.468, 0.468) for x = 1. Therefore, in the population x = 1, the ratio π₃(x = 1)/π₁(x = 1) is rather extreme and unstable to estimate. It is seen from Figure 8 that increasing the number of observations in the population x = 0 improves estimation accuracy only slightly, while increasing the number of observations in the population x = 1 improves it very strongly. In the right panel of Figure 8, the thresholds are θ₁ = −2, θ₂ = 0, yielding probability vectors (0.787, 0.106, 0.106) for x = 0 and (0.33, 0.33, 0.33) for x = 1. Now the ratio π₃(x = 0)/π₁(x = 0) is rather extreme and unstable to estimate. As is seen from the right panel, increasing the number of observations in the population x = 0 strongly improves the estimates, while increasing the number of observations in the population x = 1 hardly matters.
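The probability vectors quoted for the two panels can be recomputed from the model. The following Python sketch is not the authors' code; it assumes an odd number of categories, x = z, and a predictor in which the response-style term zγ is added for categories below the middle category and subtracted otherwise. The function name `category_probs` is hypothetical.

```python
import math

def category_probs(thetas, x, z, beta, gamma):
    """Category probabilities of the extended adjacent categories model (odd k only)."""
    k = len(thetas) + 1
    if k % 2 == 0:
        raise ValueError("this sketch covers an odd number of categories only")
    m = (k + 1) // 2                      # middle category
    log_unnorm = [0.0]                    # log(pi_1 / pi_1)
    for r in range(1, k):
        sign = 1.0 if r < m else -1.0     # +gamma below the middle, -gamma from the middle on
        eta = thetas[r - 1] + x * beta + sign * z * gamma   # log(pi_{r+1} / pi_r)
        log_unnorm.append(log_unnorm[-1] + eta)
    top = max(log_unnorm)                 # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in log_unnorm]
    total = sum(weights)
    return [w / total for w in weights]

# Settings described for Figure 8: beta = gamma = 1, binary predictor with x = z (assumed).
for label, thetas in [("left panel", [0.0, 0.0]), ("right panel", [-2.0, 0.0])]:
    for x in (0.0, 1.0):
        probs = category_probs(thetas, x, x, beta=1.0, gamma=1.0)
        print(label, "x =", int(x), [round(p, 3) for p in probs])
```

Under these assumptions the computed vectors agree with the values reported in the text up to the third decimal.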

Figure 8.

Mean squared error as a function of the sample sizes n₀, n₁ for subpopulations x = 0, x = 1, respectively.

Thus, if extreme proportions occur in one population, which can be induced by response styles, estimation accuracy benefits from increasing the number of observations in that population. The effect cannot be exploited in a first investigation, but if one has a pilot study that gives first results on the probabilities to expect, it can be used to stratify the sample in future studies to improve the accuracy of estimates.
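A rough way to turn pilot probabilities into a stratification rule is a delta-method approximation. The Python sketch below is an illustration added here, not part of the simulation study. It assumes three categories and a binary predictor with x = z, so that the four-parameter model is saturated and $\hat{β}$ equals half the difference of the two sample log ratios log(p₃/p₁); all function names are hypothetical.

```python
import math

def log_ratio_weight(probs):
    """Factor c in Var(log(p3_hat / p1_hat)) ~ c / n for a three-category multinomial sample."""
    return 1.0 / probs[0] + 1.0 / probs[2]

def approx_var_beta(probs0, probs1, n0, n1):
    """Approximate variance of beta_hat = 0.5 * [log(p3/p1 | x=1) - log(p3/p1 | x=0)]."""
    return 0.25 * (log_ratio_weight(probs0) / n0 + log_ratio_weight(probs1) / n1)

def best_share_x1(probs0, probs1):
    """Share of a fixed total sample placed in x = 1 that minimizes the approximation."""
    s0 = math.sqrt(log_ratio_weight(probs0))
    s1 = math.sqrt(log_ratio_weight(probs1))
    return s1 / (s0 + s1)

# Left-panel setting of Figure 8 (theta1 = theta2 = 0, beta = gamma = 1).
e2 = math.exp(2.0)
probs_x0 = [1 / 3, 1 / 3, 1 / 3]
probs_x1 = [1 / (1 + 2 * e2), e2 / (1 + 2 * e2), e2 / (1 + 2 * e2)]

for n0, n1 in [(50, 50), (200, 50), (50, 200)]:
    print(n0, n1, round(approx_var_beta(probs_x0, probs_x1, n0, n1), 4))
print("approximate best share for x = 1:", round(best_share_x1(probs_x0, probs_x1), 2))
```

The approximation ignores small-sample effects such as empty cells, so it should be read as a planning heuristic rather than as a replacement for the simulated MSEs.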

4. Estimation of Parameters and Inference

Estimation and testing of the model are simplified by embedding the model into the framework of (multivariate) generalized linear models (GLMs). Let the data be given by (y_i, x_i, z_i), i = 1, …, n. Given x_i, z_i, one assumes a multinomial distribution, y_i ∼ M(1, π_i), where $π_{i}^{T} = (π_{i 1}, \dots, π_{i k})$ with components π_ir = P(Y_i = r|x_i, z_i). It is straightforward to show that the extended model can be given in the form:

g (π_{i}) = X_{i} δ,

where X_i is a design matrix composed of the values x_i, z_i, δ is the total vector of parameters containing the parameters θ₁, … , θ_k−1, β, γ, and g(⋅) is a vector-valued link function g = (g₁, … , g_k−1) : ℝ^{k − 1} → ℝ^{k − 1} given by:

g_{r} (π_{1}, \dots, π_{k - 1}) = \log (\frac{π_{r + 1}}{π_{r}}), r = 1, \dots, k - 1.

An equivalent form of the link between explanatory variables and response is:

π_{i} = h (X_{i} δ),

where h = (h₁, … , h_k−1) = g⁻¹ is the so-called response function. Equations 5 and 6 represent the structural assumption of a multivariate GLM. Maximum likelihood estimates and inference for multivariate GLMs are extensively discussed in Fahrmeir and Tutz (2001) and Tutz (2012). For example, one can use likelihood ratio tests, score tests, or Wald tests to test linear hypotheses of the form H₀ : Cδ = ξ against H₁ : Cδ ≠ ξ, where C is a fixed matrix of full rank and ξ is a fixed vector.
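The link function and the response function can be written down directly. The Python sketch below is illustrative only; for convenience both functions work with the full probability vector (π₁, …, π_k) rather than its first k − 1 components, and the names `link` and `response` are hypothetical.

```python
import math

def link(pi):
    """Adjacent categories link: g_r(pi) = log(pi_{r+1} / pi_r), r = 1, ..., k-1."""
    return [math.log(pi[r + 1] / pi[r]) for r in range(len(pi) - 1)]

def response(eta):
    """Response function h = g^{-1}: maps the k-1 predictors back to k probabilities."""
    cum = [0.0]
    for value in eta:
        cum.append(cum[-1] + value)       # log(pi_r / pi_1) as a running sum of predictors
    top = max(cum)                        # subtract the maximum for numerical stability
    weights = [math.exp(c - top) for c in cum]
    total = sum(weights)
    return [w / total for w in weights]

pi = [0.10, 0.20, 0.40, 0.20, 0.10]
eta = link(pi)                # four adjacent-category log odds
pi_back = response(eta)       # recovers the probabilities
print([round(v, 3) for v in eta])
print([round(v, 3) for v in pi_back])
```

Applying `response` to the linear predictor X_i δ then gives π_i, which is all that is needed to evaluate the multinomial log-likelihood.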

An interesting aspect is the covariance of the estimates, which is asymptotically given by the inverse of the expected information or Fisher matrix, F(δ) = E(−∂²ℓ/∂δ∂δ^T), where ℓ denotes the log-likelihood. The Fisher matrix has the form:

F (δ) = \sum_{i = 1}^{n} X_{i}^{T} W_{i} (δ) X_{i} .

The blocks W_i(δ) of the weight matrix are given by $W_{i} (δ) = {(\frac{\partial g (π_{i})}{\partial π^{T}} Σ_{i} (δ) \frac{\partial g (π_{i})}{\partial π})}^{- 1}$ . If the two sets of explanatory variables are the same, that is, x_i = z_i, one can see from the model Equations 2 and 3 that the column that codes the variable x_j and the column that codes the corresponding z variable are orthogonal. Therefore, the estimates of the effects β_j and γ_j are asymptotically uncorrelated; the effects become orthogonal, really separating the content-related effect and the response-style effect.
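The orthogonality of the two columns is easy to check numerically. The Python sketch below builds one design matrix X_i for an odd number of categories, assuming that row r contains the threshold indicators, the x values, and the z values with a positive sign below the middle category and a negative sign otherwise. The function name `design_matrix` is hypothetical, and the check concerns only the plain inner product of the columns.

```python
def design_matrix(x, z, k):
    """Rows r = 1, ..., k-1 of X_i: threshold indicators, x, and sign-flipped z (odd k)."""
    if k % 2 == 0:
        raise ValueError("this sketch covers an odd number of categories only")
    m = (k + 1) // 2                                  # middle category
    rows = []
    for r in range(1, k):
        indicators = [1.0 if s == r else 0.0 for s in range(1, k)]
        sign = 1.0 if r < m else -1.0
        rows.append(indicators + list(x) + [sign * v for v in z])
    return rows

def column(matrix, j):
    """Extract column j of a matrix stored as a list of rows."""
    return [row[j] for row in matrix]

k = 5
x_i = [1.7]                      # one covariate, used as both x and z
X_i = design_matrix(x_i, x_i, k)
x_col = column(X_i, k - 1)       # column coding x
z_col = column(X_i, k)           # column coding the corresponding z
inner = sum(a * b for a, b in zip(x_col, z_col))
print(inner)
```

With k odd there are as many rows with a positive sign as with a negative sign, so the products cancel for every observation.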

5. Implementation and Available Programs

The model can be estimated and evaluated by using the flexible R package VGAM (vector generalized linear and additive models; Yee, 2010, 2014), which was also used for estimation and testing in our applications. The function vglm() allows one to estimate the so-called vector GLMs (Yee & Wild, 1996). The extended RSRS model can be seen as a special case of this general family of models. One has to use the family function acat(reverse = FALSE), which specifies the link function that corresponds to the adjacent categories model in the ordering considered here. The argument parallel = FALSE ~ 1 ensures that only intercepts are category specific. When using the function, one has to distinguish between x and z variables. The x variables are not category specific, whereas the z variables represent a special case of category-specific covariates for which only the sign differs for categories below and above the middle category. For category-specific covariates, one takes advantage of the argument xij. One just has to specify the design matrices by including the z variables in the specific form of Models 2 and 3, and estimation of the extended model by vglm() is obtained. An R function that automatically generates the design matrix and estimates the model is available from the authors. Embedding the estimation procedure into the framework of VGAM also has the advantage of quite fast computation. For more details, see the supplemental material in the online version of the journal.

6. Further Applications

6.1. Health Care

As a second application, we use data from the ALLBUS, the general social survey of social science carried out by the German institute GESIS. They are available from http://www.gesis.org/allbus. For our analysis, we consider data from 2012 consisting of 2,899 persons. The response is the confidence in the health-care system measured on a scale from 1 (no confidence at all) to 7 (excessive confidence). Explanatory variables that we include in our model are gender (0 = male, 1 = female), income in thousands of euro, age in decades, and the medical condition of the person on a scale from 1 (very good) to 5 (bad). Again we estimated a simple adjacent categories model and the extended model, where all covariates were allowed to have content-related and response-style effects. In a second step, we refitted the model including only the covariates with a significant effect in each part. The estimated coefficients and the corresponding standard errors are given in Table 2. Concerning variable selection, the covariates gender and income are excluded from the x variables, and the covariate age is excluded from the z variables. The likelihood ratio test statistic for the global hypothesis H₀ : γ = 0 is 44.6 on 8 degrees of freedom. Thus, response-style effects should not be neglected. The ordinal predictor medical condition with reference very good has significant content-related effects as well as significant response-style effects. Figure 9 shows the tuple $(e^{{\hat{γ}}_{j}}, e^{{\hat{β}}_{j}})$ of the extended model including pointwise confidence intervals represented by stars. The estimated coefficients show that the confidence in the health-care system decreases with deteriorating medical condition. In addition, there is a significant tendency to choose extreme categories for persons with a bad medical condition. For females compared to males, there is a significant tendency to middle categories. The explanatory variables income and age also contain quadratic and cubic terms.
Figure 10 shows the estimated nonlinear effects of content (first row) and response style (second row). The covariate income has no significant effect on the confidence. However, with increasing income, there is an increasing tendency to middle categories. The effect is not far from being linear, but the quadratic and cubic terms are significant. Concerning age, the confidence in the health system decreases up to age 40 and increases between 40 and 80. The decrease after 80 should not be overinterpreted since it is based on few observations. There seems to be no effect of age on the response style (given the other covariates). We do not show the two-dimensional curves for this example because they are not informative.

Table 2.

Parameter Estimates and Standard Errors for the Health-Care Data

Effect type	Covariates	Estimate (extended adjacent)	SE (extended adjacent)	Estimate (adjacent)	SE (adjacent)
Content-related effects (x variables)	Age	.0694	.0168	.0702	.0168
	Age²	.0206	.0043	.0225	.0044
	Age³	−.0052	.0024	−.0055	.0022
	Good	−.0073	.0472	−.0416	.0414
	Mostly good	−.1621	.0479	−.1499	.0446
	Partly good	−.2663	.0548	−.2491	.0543
	Bad	−.3011	.0718	−.2834	.0788
Response-style effects (z variables)	Gender	.138	.0434
	Income	.0733	.0238
	Income²	−.0071	.0030
	Income³	.0001	.0001
	Good	.1263	.0676
	Mostly good	−.0356	.0685
	Partly good	−.1602	.0822
	Bad	−.3140	.1172

Figure 9.

Visualization of estimated effects of covariate medical condition for the health-care data.

Figure 10.

Nonlinear effects of content and response style for income and age (health care); upper panels show the content and lower panels, the response-style effects.

6.2. Motivation of Students

As a third example, we consider data from a student questionnaire. The question evaluated was what effect the students' expectation of getting an appropriate job has on their motivation. The response is the effect on motivation on a scale from 1 (often negative) to 5 (often positive), with intermediate values “sometimes negative/positive” and no effect. For our analysis, we use data from 343 students; the subject area (psychology, physics, or teaching) serves as the explanatory variable. The data are given in Table 3. Overall there is a strong preference for the middle categories, which is characteristic for this sort of question. The comparison of the simple adjacent categories model and the extended model yields the likelihood ratio test statistic 6.14 on 2 degrees of freedom. Thus, response-style effects again should not be neglected. The estimated coefficients for both models are given in Table 4; a visualization of the effects of the extended model including pointwise confidence intervals is shown in Figure 11, where the subject teaching was chosen as the reference category.
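The two likelihood ratio statistics reported for the applications (44.6 on 8 degrees of freedom for the health-care data and 6.14 on 2 degrees of freedom here) can be converted into p-values. The Python sketch below assumes the usual asymptotic chi-square reference distribution and uses the closed-form tail probability that is available for an even number of degrees of freedom; the function name `chi2_sf_even_df` is hypothetical.

```python
import math

def chi2_sf_even_df(stat, df):
    """Upper tail probability P(X > stat) for a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("the closed form used here requires a positive even df")
    half = stat / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j          # (stat / 2)^j / j!
        total += term
    return math.exp(-half) * total

for label, stat, df in [("health care", 44.6, 8), ("student motivation", 6.14, 2)]:
    print(label, stat, df, chi2_sf_even_df(stat, df))
```

For the student data the p-value is roughly 0.046, so the response-style effects are significant at the 5% level, although far less clearly than in the health-care example.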

Table 3.

Data From a Student Questionnaire

Effect on Motivation
Subject Area	Often Negative	Sometimes Negative	None or Mixed	Sometimes Positive	Often Positive
Psychology	9	26	53	8	6
Physics	8	22	100	20	6
Teaching	26	20	35	0	4

Table 4.

Parameter Estimates and Standard Errors for the Student Questionnaire

Effect type	Covariates	Estimate (extended adjacent)	SE (extended adjacent)	Estimate (adjacent)	SE (adjacent)
Content-related effects (x variables)	Psychology	.4462	.1867	.6338	.1688
	Physics	.6616	.1821	.8798	.1633
Response-style effects (z variables)	Psychology	.2147	.2308
	Physics	.5259	.2226

Figure 11.

Visualization of estimated effects of covariate subject area for the student questionnaire.

The estimates in the content-related part of the model show that students of psychology and physics see more positive effects on their motivation than students of the teaching profession. In fact, job prospects for students of the teaching profession are poor nowadays. The estimated response-style effects show a significant tendency to middle categories for students of physics as compared to students of the teaching profession.
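The significance statements can be traced back to Table 4 with simple Wald statistics. The Python sketch below assumes approximate normality of the estimates and a two-sided test; it uses only the response-style estimates and standard errors printed in Table 4, and the function name `wald_test` is hypothetical.

```python
import math

def wald_test(estimate, se):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # equals 2 * (1 - Phi(|z|))
    return z, p

# Response-style effects of the extended model (reference category: teaching).
response_style = {"Psychology": (0.2147, 0.2308), "Physics": (0.5259, 0.2226)}
for name, (est, se) in response_style.items():
    z, p = wald_test(est, se)
    print(f"{name}: z = {z:.2f}, p = {p:.3f}, exp(gamma_hat) = {math.exp(est):.2f}")
```

Only the physics effect exceeds the conventional 5% threshold, in line with the interpretation given above.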

A comparison of the content-related effects in Table 4 for the simple and the extended model shows that the estimates of the simple model are considerably larger. Thus, one observes a positive bias in the estimated β coefficients of the x variables when ignoring response-style effects. One reason for the positive bias is the peculiar distribution of the data. Table 3 shows that most observations are in the middle category (none or mixed), and at the same time, there is a general shift to the left or to low categories. Therefore, ignoring the tendency to the middle category leads to an overestimation of the β coefficients.

7. Extensions and Comparison With Alternative Approaches

In the following, we briefly sketch possible extensions of the modeling approach. The first concerns the handling of nonlinear effects. If one has continuous covariates, one can replace the linear term x^Tβ by an additive term $f_{1}^{C} (x_{1}) + \dots + f_{p}^{C} (x_{p})$ and the linear term z^T γ by $f_{1}^{R} (z_{1}) + \dots + f_{q}^{R} (z_{q})$ , where $f_{j}^{C} (\cdot), f_{j}^{R} (\cdot)$ are unspecified functions. In the SHIW example, we already considered the effects as functions, but they were restricted to be polynomials. Within the more general framework of additive modeling, the functions can be considered as unknown without being specified as polynomials. Typically the unknown functions are approximated by an expansion in basis functions. For example, one assumes $f_{j}^{C} (x) = \sum_{r = 1}^{M} β_{j r} φ_{j r} (x)$ , where φ_jr are fixed basis functions, for example, Gaussian kernels or B-splines. The latter have been advocated, in particular, by Eilers and Marx (1996). Then one estimates the parameters β_jr, which can be estimated in the usual way because the influential term is linear in the parameters. One option is to use few basis functions, say four to six; then estimation is still stable. A more flexible approach is to use many basis functions, say 40, but use penalization techniques that still allow one to estimate the larger number of parameters. When the basis functions are chosen as B-splines, one obtains the so-called penalized splines (P-splines; for details see Eilers & Marx, 1996). By adapting these smoothing methods to the current problem, the modeling of response styles can be extended to include additive terms in the tradition of generalized additive models (Hastie & Tibshirani, 1986). We do not consider the approach in detail because it involves more advanced penalization techniques, which might detract from the main objective of the article.
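The basis expansion can be illustrated in a few lines. The Python sketch below uses Gaussian kernels as basis functions and a squared difference penalty on adjacent coefficients of the kind used for P-splines; applying that penalty to Gaussian kernel coefficients is a simplification chosen for the illustration, and the function names, knot locations, and width are hypothetical.

```python
import math

def gaussian_basis(x, centers, width):
    """Evaluate the Gaussian kernel basis functions phi_r(x) at a single point x."""
    return [math.exp(-0.5 * ((x - c) / width) ** 2) for c in centers]

def smooth_effect(x, coefs, centers, width):
    """f(x) = sum_r coefs[r] * phi_r(x); the function is linear in the coefficients."""
    return sum(b * phi for b, phi in zip(coefs, gaussian_basis(x, centers, width)))

def difference_penalty(coefs, order=2):
    """Sum of squared order-th differences of adjacent coefficients."""
    diffs = list(coefs)
    for _ in range(order):
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return sum(d * d for d in diffs)

centers = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical, equally spaced kernel centers
coefs = [0.2, 0.5, 0.9, 0.6, 0.1]     # hypothetical coefficients beta_{j r}
print(round(smooth_effect(1.5, coefs, centers, width=1.0), 4))
print(round(difference_penalty(coefs), 4))
```

In a penalized fit, the penalty multiplied by a smoothing parameter would be subtracted from the log-likelihood, so that neighboring coefficients are pulled toward a smooth sequence.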

The model considered here by construction disentangles the effects of response style and content for one item. The basic concept to include a subject-specific term (added for response categories r = 1, … , m − 1 and subtracted for categories r = m, … , k − 1 if k is odd) can also be used when one wants to model the response style for more than one item. The additional effect can be a simple subject-specific effect, representing heterogeneity of persons, or can depend on covariates in the way specified here. Then one obtains a specific extended partial credit model that accounts for response styles. Although the extension is straightforward as a model, the estimation procedures used here might not be the best choice. In a partial credit model that accounts for the response style in the way proposed here, one has to estimate the item difficulties, the person abilities, and the additional response-style parameters, either as subject-specific parameters or as depending on covariates or both. If one uses just a linear term depending on covariates (and no subject-specific response-style parameter), the proposed estimation procedure can be used directly. However, it is certainly more attractive to model the heterogeneity by including a separate subject-specific response-style parameter, for example, as a random effect. Modeling it as a random effect reduces the number of structural parameters to estimate since one only has to estimate the variance of the random effects. However, then specific estimation procedures for the maximization of the marginal likelihood are needed and have to be developed. An additional problem is that the response style might depend on the item. The assumption that it is the same for all items is rather strong. If one lets it depend on items, one gets an inflation of parameters that calls for regularization techniques or other novel estimation techniques.
The extended partial credit model is certainly worth investigating, but the investigation of the possible models and the development of appropriate estimation tools need further research that is beyond the scope of the present article.

Nevertheless, we will briefly consider the differences between the method used here and some of the modeling approaches to response styles that have been proposed, in particular, in item response theory. A traditional way to account for differences in the use of rating scales is the use of mixture models. For example, Eid and Rauber (2000) investigated measurement invariance in organizational surveys by using the polytomous mixed Rasch model. The basic assumption is that the whole population can be subdivided into disjoint latent classes yielding parameters that are linked to the classes. Typically one fits models with two or three classes obtaining class-specific parameters that have to be interpreted. As Eid and Rauber (2000) demonstrated, when fitting a model with two latent classes, the classes might represent different response styles. The main difference to the approach advocated here is that response styles are not explicitly modeled. The resulting classes can represent extreme response styles or a tendency to the middle categories but do not have to. It might occur that no specific pattern referring to response styles is found for the latent classes. Although finite mixture models are an interesting approach to modeling heterogeneity, they have drawbacks; in particular, the number of latent classes is not easy to determine, and if one fits a model with more classes, one might obtain quite different estimates and therefore different interpretations. Similar problems are found for the class of multidimensional extensions of response models that account for response styles as considered, for example, by Bolt and Johnson (2009). By including further latent traits in the predictor, one obtains multidimensional models. The additional traits can represent response styles. Again the difference is that response styles are not explicitly searched for. Of course, one might see this as an advantage.
However, there is again some arbitrariness concerning the number of latent traits and the interpretation. The arbitrariness is augmented if the estimates have to be rotated (see, e.g., Bolt & Johnson, 2009) to obtain a simple interpretation. If one suspects different response styles, we find it more attractive to model them explicitly. If one accounts for them by construction, one can see if they are present or not.

More explicit modeling of response styles is found in tree-type models as considered, for example, by Thissen-Roe and Thissen (2013) and more recently by Jeon and De Boeck (2015). The models assume a sequential decision model. In a first stage, it is distinguished between a positive and a negative response, and in subsequent steps, the strength of the response is determined. Models of this type can be seen more generally as nested models (Suh & Bolt, 2010). For ordinal responses with covariates, they have been used earlier by Tutz (1989). The models are similar in spirit to the approach proposed here; they model response styles by parameters and have to distinguish between odd and even numbers of categories. The main differences are in the sequential decision procedure and the parameterization. In step models, one assumes 1PL or 2PL models for the separate steps. In the approach considered here, there is no sequential mechanism assumed, and the parameters are embedded into an adjacent categories model.

Finally, we want to mention approaches to validate the interpretation of response style. In the case of several items, this may be done by either selecting two item subsets that are weakly related or unrelated (Moors, 2003, 2004) or using many items (Johnson, 2003; Van Herk et al., 2004) that are unrelated (Baumgartner & Steenkamp, 2001; Clarke III, 2001; Weijters, Cabooter, & Schillewaert, 2010). This allows researchers to be certain that a persistent tendency across unrelated items can be ascribed to style (unrelated to item content). In our approach, only one item is used to detect response styles, but the model is constructed in a way to pick up the response style linked to the particular question that is asked.

8. Concluding Remarks

A model is proposed that simultaneously accounts for content-related effects and response styles that have a tendency to middle or extreme categories. Thus, content-related effects can be studied without being influenced by the presence of specific response styles and vice versa. In traditional ways to investigate extreme response styles, for example, by computing an index for extreme response styles as the relative number of scores given on the extreme categories as used among others by Bachman and O’Malley (1984) and Van Herk, Poortinga, and Verhallen (2004), it is not known how the content-related effects are linked to the index. This is avoided by simultaneous modeling.
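For comparison, the traditional index mentioned above is simple to compute. The Python sketch below takes a list of ratings on a scale from 1 to k and returns the share that falls on the two extreme categories; the function name `extreme_response_index` and the example ratings are hypothetical.

```python
def extreme_response_index(responses, k):
    """Share of ratings that fall in the lowest (1) or the highest (k) category."""
    if not responses:
        raise ValueError("at least one rating is required")
    extreme = sum(1 for r in responses if r == 1 or r == k)
    return extreme / len(responses)

ratings = [1, 7, 4, 4, 6, 7, 2, 1]    # hypothetical answers of one respondent, k = 7
print(extreme_response_index(ratings, k=7))
```

The index summarizes the ratings without any reference to covariates or item content, which is exactly the link that the simultaneous model makes explicit.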

A particular strength of the approach is that it provides an easy-to-use tool and may avoid biased estimates. Of course, it cannot solve all the problems connected to rating scales. For example, it does not address problems linked to the number of response categories and response category labels (Weijters et al., 2010) or the tendency to show greater acquiescence (Baumgartner & Steenkamp, 2001) but can ameliorate some of the effects that come with specific response styles. Since researchers should “do whatever they can to control for response styles” (Van Vaerenbergh & Thomas, 2013), an easy-to-use tool should also be used.

Acknowledgments

We thank three reviewers and the coeditor for their constructive comments which helped to improve the article considerably.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

Agresti (2009). Analysis of ordinal categorical data (2nd ed.). New York, NY: Wiley.

Bachman, J. G., & O’Malley (1984). Yea-saying, nay-saying, and going to extremes: Black-White differences in response styles. Public Opinion Quarterly, 48, 491–509.

Baumgartner & Steenkamp, J.-B. E. (2001). Response styles in marketing research: A cross-national investigation. Journal of Marketing Research, 38, 143–156.

Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29–51.

Böckenholt (2012). Modeling multiple response processes in judgment and choice. Psychological Methods, 17, 665–678.

Bolt, D. M., & Johnson, T. R. (2009). Addressing score bias and differential item functioning due to individual differences in response style. Applied Psychological Measurement, 33, 335–352.

Bolt, D. M., & Newton, J. R. (2011). Multiscale measurement of extreme response style. Educational and Psychological Measurement, 71, 814–833.

Clarke (2000). Extreme response style in cross-cultural research: An empirical investigation. Journal of Social Behavior & Personality, 15, 137–152.

Clarke III (2001). Extreme response style in cross-cultural research. International Marketing Review, 18, 301–324.

De Boeck & Partchev (2012). IRTrees: Tree-based item response models of the GLMM family. Journal of Statistical Software, 48, 1–28.

Eid & Rauber (2000). Detecting measurement invariance in organizational surveys. European Journal of Psychological Assessment, 16, 20.

Eilers, P. H. C., & Marx, B. D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11, 89–121.

Fahrmeir & Tutz (2001). Multivariate statistical modelling based on generalized linear models. New York, NY: Springer.

Gambacorta & Iannario (2013). Measuring job satisfaction with CUB models. Labour, 27, 198–224.

Hastie & Tibshirani (1986). Generalized additive models (c/r: p. 310–318). Statistical Science, 1, 297–310.

Jeon & De Boeck (2015). A generalized item response tree model for psychological assessments. Behavior Research Methods. doi:10.3758/s13428-015-0631-y

Johnson, T. R. (2003). On the use of heterogeneous thresholds ordinal regression models to account for individual differences in response style. Psychometrika, 68, 563–583.

Kankaraš & Moors (2009). Measurement equivalence in solidarity attitudes in Europe: Insights from a multiple-group latent-class factor approach. International Sociology, 24, 557–579.

Khorramdel & von Davier (2014). Measuring response styles across the big five: A multiscale extension of an approach using multinomial processing trees. Multivariate Behavioral Research, 49, 161–177.

Marin, Gamba, R. J., & Marin, B. V. (1992). Extreme response style and acquiescence among Hispanics: The role of acculturation and education. Journal of Cross-Cultural Psychology, 23, 498–509.

Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149–174.

Meisenberg & Williams (2008). Are acquiescent and extreme response styles related to low intelligence and education? Personality and Individual Differences, 44, 1539–1550.

Messick (1991). Psychology and methodology of response styles. In R. E. Snow & D. E. Wiley (Eds.), Improving inquiry in social science: A volume in honor of Lee J. Cronbach (pp. 161–200). Hillsdale, NJ: Lawrence Erlbaum.

Moors (2003). Diagnosing response style behavior by means of a latent-class factor approach. Socio-demographic correlates of gender role attitudes and perceptions of ethnic discrimination reexamined. Quality and Quantity, 37, 277–302.

Moors (2004). Facts and artefacts in the comparison of attitudes among ethnic minorities. A multigroup latent class structure model with adjustment for response style behavior. European Sociological Review, 20, 303–320.

Moors (2010). Ranking the ratings: A latent-class regression model to control for overall agreement in opinion research. International Journal of Public Opinion Research, 22, 93–119.

Plieninger & Meiser (2014). Validity of multiprocess IRT models for separating content and response styles. Educational and Psychological Measurement, 74, 875–899. doi:10.1177/0013164413514998

Suh & Bolt, D. M. (2010). Nested logit models for multiple-choice item response data. Psychometrika, 75, 454–473.

Thissen-Roe & Thissen (2013). A two-decision model for responses to Likert-type items. Journal of Educational and Behavioral Statistics, 38, 522–547.

Tutz (1989). Compound regression models for categorical ordinal data. Biometrical Journal, 31, 259–272.

Tutz (2012). Regression for categorical data. Cambridge, MA: Cambridge University Press.

Van Herk, Poortinga, Y. H., & Verhallen, T. M. (2004). Response styles in rating scales: Evidence of method bias in data from six EU countries. Journal of Cross-Cultural Psychology, 35, 346–360.

Van Rosmalen, Van Herk, H. G. P. (2010). Identifying response styles: A latent-class bilinear multinomial logit model. Journal of Marketing Research, 47, 157–172.

Van Vaerenbergh & Thomas, T. D. (2013). Response styles in survey research: A literature review of antecedents, consequences, and remedies. International Journal of Public Opinion Research, 25, 195–217.

Weijters, Cabooter, & Schillewaert (2010). The effect of rating scale format on response styles: The number of response categories and response category labels. International Journal of Research in Marketing, 27, 236–247.

Yee, T. W. (2010). The VGAM package for categorical data analysis. Journal of Statistical Software, 32, 1–34.

Yee, T. W. (2014). VGAM: Vector generalized linear and additive models. R package version 0.9-4 available from http://CRAN.R-project.org/package=VGAM

Yee, T. W., & Wild, C. J. (1996). Vector generalized additive models. Journal of Royal Statistical Society B, 58, 481–493.
