Abstract
This article presents an analysis of interviewer effects on the process leading to cooperation or refusal in face-to-face surveys. The focus is on the interaction between the householder and the interviewer on the doorstep, including initial reactions from the householder, and interviewer characteristics, behaviors, and skills. In contrast to most previous research on interviewer effects, which analyzed final response behavior, the focus here is on the analysis of the process that leads to cooperation or refusal. Multilevel multinomial discrete-time event history modeling is used to examine jointly the different outcomes at each call, taking account of the influence of interviewer characteristics, call histories, and sample member characteristics. The study benefits from a rich data set comprising call record data (paradata) from several face-to-face surveys linked to interviewer observations, detailed interviewer information, and census records. The models have implications for survey practice and may be used in responsive survey designs to inform effective interviewer calling strategies.
Introduction
The falling response rates in many surveys and countries (Baruch and Holtom 2008; de Heer 1999; de Leeuw and de Heer 2002) have shifted attention to the determinants of the response process in sample surveys. For interviewer-administered surveys, it has been recognized that interviewers play an important role in gaining both contact (e.g., Purdon, Campanelli, and Sturgis 1999) and cooperation from sample survey members (Couper and Groves 1992; Groves and Couper 1998; Jaeckle et al. 2009; O’Muircheartaigh and Campanelli 1999; Pickery and Loosveldt 2002, 2004). In face-to-face surveys, the interviewer needs to make one or several visits, referred to as calls, to each household, first to establish contact and then to gain cooperation. The interaction process between householders and interviewers at each contact is often hypothesized to be a key element in persuading sample members to take part in a survey (Bates, Dahlhamer, and Singer 2008; Couper and Groves 2002; Groves, Cialdini, and Couper 1992; Groves and Couper 1996; Groves and Heeringa 2006; Groves and McGonagle 2001; Sturgis and Campanelli 1998). This process may be best investigated by analyzing the actual call process and the interactions between the householder and the interviewer on the doorstep (Groves and Couper 1998).
This article aims to analyze the influence of the interviewer on the process leading to cooperation or refusal in face-to-face surveys. Of particular interest is the interaction process between the householder and the interviewer, including initial reactions from the householder to the survey request, and interaction effects between householder and interviewer characteristics. The article investigates the role of interviewer strategies, behaviors, and attitudes in the cooperation process. Rather than focusing on the final outcome of response or nonresponse, as in most previous research on interviewer effects (Durrant et al. 2010; Groves and Couper 1998; O’Muircheartaigh and Campanelli 1999; Pickery and Loosveldt 2000), the analysis here is carried out at the call level, focusing on the outcome at each call.
Although the analysis of call record data has in recent years found increasing attention (Bates et al. 2008; Groves and Heeringa 2006; Wagner 2013), the influence of the interviewer on a call-by-call basis on the response process has not yet been studied. One reason for this is that it is not obvious how best to analyze such complex time-dependent call record data. Durrant, D’Arrigo, and Steele (2011, 2013) introduce methodologies to analyze such complex data and showcase this using an application. Durrant et al. (2011) focus on the correlates of time to first contact, in particular on best times to contact a household. Durrant et al. (2013) analyze the correlates of the process to cooperation and refusal using a basic model, including household-level variables and some standard call record variables only but without analysis of interviewer effects and household–interviewer interactions. Their focus is on best times to achieve cooperation. The article here extends this previous research by using the principles of the methods developed but applying them to the investigation of interviewer and sample member interaction effects. More specifically, this analysis asks questions such as “Does the initial householder reaction predict the probability of cooperation at a future call?” “Does the way the interviewer makes contact with a household at a particular call influence cooperation rates at the current or future calls?” “What is the influence of interviewer characteristics, behaviors, and skills, in particular those related to tailoring, on cooperation rates across calls?” and “How do influences of call characteristics and the initial reactions of the householder change depending on the experience and the tailoring ability of the interviewer?”
Most of the previous research on call record data has focused on ways to establish contact with a household (Durrant et al. 2011; Kulka and Weeks 1988; Wagner 2013; Weeks et al. 1980). Since the contact and the cooperation/refusal stages are two quite distinct processes (Groves and Couper 1998; Lynn and Clarke 2002; Nicoletti and Peracchi 2005), analyzing time to first contact is not of direct relevance here. To investigate the influence of tailoring and the interaction process on the doorstep, which is the key focus of this article, a call-by-call analysis with information on what happens on the doorstep is most promising. So far, however, only a few researchers have started to look at the influence of the interviewer on nonresponse during the call process. Blom, de Leeuw, and Hox (2010) investigate multilevel logistic models using data from the European Social Survey, taking account of households within interviewers and countries, to study cross-country differences. Jaeckle et al. (2009) focus on the effect of interviewer personality traits. However, information on nonresponding units is limited in both cases. Bates et al. (2008) make use of doorstep concerns and call histories to predict survey response, but without controlling for the influence of the interviewer. Purdon et al. (1999) investigate the effects of calling times on cooperation, but present only descriptive statistics and do not account for the clustering of calls and households within interviewers. Where interviewer influences were analyzed, information about the household was usually very limited, meaning that interaction effects could not be investigated, or call information was not taken into account (e.g., Blom et al. 2010; Groves and Couper 1998; Jaeckle et al. 2009).
In this article, a multilevel multinomial model is used to examine jointly the different outcomes at each call, taking account of the influence of interviewer characteristics, call histories, and sample member characteristics. The multilevel modeling approach for the analysis of call record data, which can be complex with many different hierarchies, was first presented for the cooperation process in Durrant et al. (2013). This method is used and extended here. As in the previous work, the multilevel model accounts for clustering of calls within households and interviewers. Here, and this is new, the focus is on the influence of call variables that describe part of the interaction, interviewer characteristics, skills and behaviors, and cross-level interactions. By including interviewer-level variables, we aim to explain part of the significant interviewer variance.
The work is guided by the conceptual framework for response behavior, as developed in Groves and Couper (1998), and by sociological and psychological concepts (Goyder 1987; Groves et al. 1992). In addition to household characteristics, influences that have been hypothesized to play a key role in the response process include (a) call-level characteristics (Bates et al. 2008; Blom, Lynn, and Jaeckle 2008; Groves and Couper 1996), (b) the initial interviewer–householder interaction on the doorstep (Bates et al. 2008; Groves and McGonagle 2001), and (c) the influence of the interviewer (Blom et al. 2010; Groves and Couper 1998; Groves and McGonagle 2001; O’Muircheartaigh and Campanelli 1999; Pickery, Loosveldt, and Carton 2001). Our analysis aims to investigate the influence of all of these components. In the literature, primarily three factors have been identified that may drive interviewer effects on nonresponse outcomes: interviewer experience (Durbin and Stuart 1951; Groves and Couper 1998), attitudes and confidence of the interviewer (Groves and Couper 1998), and strategies and behaviors. In particular, the importance of being responsive toward the individual sample member, including tailoring of strategies toward respondents and awareness of respondent concerns (Groves and Couper 1998; Morton-Williams 1993), has been stressed. It may be hypothesized that fixed effects of interviewer strategies, behaviors, and tailoring approaches are more important at the call level than for final response analysis.
This article benefits from rich information on interviewers based on a survey of interviewers working for the U.K. Office for National Statistics (ONS). This information was linked to the survey outcome from six U.K. household surveys. An advantage of the study is that rich information from interviewer observations and U.K. Census records is available for both responding and nonresponding sample members. In addition, comparatively detailed call record information, so-called paradata (Couper 1998), is available, including the time and day of the call, the outcome of the call, initial reactions from household members, and basic characteristics of the person the interviewer talked to on the doorstep.
The research is anticipated to influence field decisions, particularly in adaptive and responsive survey designs, where survey data collection outcomes are continuously monitored, allowing early intervention and alteration of the survey design (Groves and Heeringa 2006; Laflamme, Maydan, and Miller 2008; Kirgis and Lepkowski 2013). The results may guide strategies for interviewer training and performance management. The work may also contribute to further methodological development and provide guidance to survey practitioners on how to use and analyze call record data and information on interviewers.
The remainder of the article is structured as follows. First, the different linked data sources are discussed. Then, the multilevel multinomial model and the results are presented. The final section provides a summary of the main findings and a discussion of the results.
Data
Information on Interviewers
The study makes use of the U.K. 2001 Census link study (Durrant et al. 2010), which combines the survey outcome of six face-to-face U.K. household surveys with detailed information on sample members from the U.K. 2001 Census, at the individual and household level, and rich paradata, including interviewer observation data about the households and areas, call record data, and information on interviewers. (Another U.K. Census has recently been conducted; however, linked data of this type have not yet been available for analysis.)
A comprehensive survey of interviewers working for the ONS in 2001 was conducted, designed to coincide with the 2001 Census (Freeth, Kane, and Cowie 2002). The survey collected information on sociodemographic characteristics, interviewer work history and qualifications, interviewer attitudes (including attitudes to the persuasion of reluctant respondents and to working at different times and days of the week), interviewing behaviors, strategies, and doorstep approaches, in particular indicators of tailoring abilities. Conceptually, this interviewer survey builds on previous work by Groves and Couper (1998) and Hox and de Leeuw (2002). The response rate was relatively high (84 percent). The interviewer survey was designed to be conducted prior to the fieldwork in question. (A very small number of interviewers filled in the questionnaire slightly later than planned. However, the implications for data analysis are expected to be small.) Participation in the survey was voluntary, and interviewers who participated were paid for 1 hour of their time. The survey was not anonymous since identifying information was needed to link the interviewer information to all other data sources.
Call Record Data and Interviewer Observations
The call data contain, in addition to the standard recordings, such as day, time, and outcome of the call, information about the initial reactions of the householder, such as whether the householder asked any questions and whether any positive or negative comments were made. The interviewer also recorded basic characteristics of the householder at each call, such as gender and approximate age, and information on how the contact was established (i.e., face-to-face, via an intercom system, or through a window or door). This type of information, often not collected in standard call record data, allows analysis of the householder–interviewer interaction and possible tailoring effects. The outcome of each call, the dependent variable in our analysis, distinguishes cooperation, refusal, making an appointment for another time, and “postponements.” Such “postponements” are defined as broken appointments and circumstances where the interviewer withdraws to come back later, for example if the interviewer is unable to make contact with a responsible resident or feels threatened. Cooperation is defined as at least one household member agreeing to respond to the survey. The call data are recorded at every call and are therefore call dependent, that is, time varying. In addition, the interviewer collected so-called interviewer observation variables, that is, basic information about the household and immediate neighborhood, such as type of accommodation, indications of the presence of children, the condition of the house relative to others in the area, whether the house is part of a council housing estate, and an indication of how safe the interviewer feels walking in the area after dark. These variables are collected only once, if possible at the first call. They are therefore time invariant.
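The distinction between time-varying call variables and time-invariant interviewer observations can be illustrated with a minimal sketch of a call-level data layout. The field names, identifiers, and example records below are hypothetical and are not taken from the ONS data; the sketch only shows how one record per contact call, nested within household and interviewer, can carry a lagged outcome from the previous contact call.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ContactCall:
    """One contact call (hypothetical layout, not the ONS file format)."""
    interviewer_id: str
    household_id: str
    call_number: int                  # order of the contact call within the household
    outcome: str                      # refusal, appointment, postponement, cooperation
    householder_asked_question: bool  # time varying: recorded at every call
    accommodation_type: str           # time invariant: interviewer observation
    prev_outcome: Optional[str] = None  # lagged outcome, filled in below


def add_lagged_outcome(calls: List[ContactCall]) -> List[ContactCall]:
    """Sort calls within households and attach the previous call's outcome."""
    ordered = sorted(calls, key=lambda c: (c.household_id, c.call_number))
    last_outcome: Dict[str, str] = {}
    for call in ordered:
        # None for the first contact call to a household
        call.prev_outcome = last_outcome.get(call.household_id)
        last_outcome[call.household_id] = call.outcome
    return ordered


# Hypothetical example records
calls = [
    ContactCall("int01", "hh001", 2, "cooperation", True, "detached"),
    ContactCall("int01", "hh001", 1, "appointment", False, "detached"),
    ContactCall("int01", "hh002", 1, "refusal", False, "flat"),
]
ordered = add_lagged_outcome(calls)
for call in ordered:
    print(call.household_id, call.call_number, call.outcome, call.prev_outcome)
```

In this layout the accommodation type is simply repeated on every call record of a household, whereas the householder reaction and the lagged outcome can change from call to call.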
Quality of the Paradata
Following the increasing use of paradata, the quality of such data has been a topic of recent debate (Sinibaldi, Durrant, and Kreuter 2013; West 2013). Since call record data are not part of the standard survey data, they do not undergo the same editing checks and may therefore require further cleaning and editing before use. Paradata may be subject to missing data and measurement error. For this study, we were able to work closely with the U.K. ONS, which ensured the linkage of the paradata to the survey and census records and reduced the likelihood of errors using additional editing checks. The interviewer-observed paradata are subject to a comparatively small amount of missing values, ranging from less than 2 percent for standard call variables, such as date and time of the call or type of house, to about 12 percent for call record variables that are more difficult to observe, such as presence of children. Furthermore, the unique features of this data set, in particular the linkage to the U.K. Census records, made it possible to assess the measurement error properties of some of the interviewer-observed variables. Comparing interviewer observations with Census records, such as the type of house, adults in employment, ethnicity, and presence of children, showed a high correspondence between the variables analyzed (between 88 and 97 percent agreement), meaning that the interviewer-observed variables are of comparatively high quality (Sinibaldi et al. 2013).
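A percentage agreement of this kind can be computed as the share of households for which the interviewer observation matches the census value. The following is a minimal sketch with made-up values; the function name, the data, and the choice to drop pairs with a missing value are assumptions for illustration and need not match the procedure used in Sinibaldi et al. (2013).

```python
from typing import Optional, Sequence


def percent_agreement(observed: Sequence[Optional[str]],
                      reference: Sequence[Optional[str]]) -> float:
    """Percentage of complete pairs in which both sources agree.

    Pairs with a missing value (None) in either source are dropped,
    which is an assumption of this sketch.
    """
    if len(observed) != len(reference):
        raise ValueError("both sources must cover the same households")
    pairs = [(o, r) for o, r in zip(observed, reference)
             if o is not None and r is not None]
    if not pairs:
        raise ValueError("no complete pairs to compare")
    matches = sum(1 for o, r in pairs if o == r)
    return 100.0 * matches / len(pairs)


# Hypothetical values for five households
interviewer_obs = ["house", "flat", "house", None, "flat"]
census_record = ["house", "flat", "flat", "house", "flat"]
agreement = percent_agreement(interviewer_obs, census_record)
print(f"{agreement:.1f} percent agreement")
```

Simple agreement does not correct for matches that would occur by chance, so it is best read as a descriptive summary.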
Analysis Sample
Our analysis sample contains 38,816 contact calls. Noncontact calls are not investigated in this analysis (for an analysis of contact/noncontact calls, see Durrant et al. 2011). Households that were never contacted, vacant and nonresidential addresses, reissues, and unusable records were excluded from the analysis. The final analysis sample includes a total of 15,782 households, nested within 565 interviewers. Most of the guidelines to interviewers provided by the survey organization refer to contacting strategies. Some general guidelines are provided to interviewers on how to avoid or deal with a refusal on the doorstep, including calling back at least once after a refusal. It should be noted that the call record data are not from a randomized experiment but are observational, which has both advantages and disadvantages (see also the discussions of this issue in Conrad et al. 2013; Durrant et al. 2013; Groves and Couper 1998; Purdon et al. 1999). The data allow analysis at each call, in particular of the interaction process between householder and interviewer, a key component in determining response behavior. However, causal effects are difficult to assess.
The six face-to-face surveys included in the study are the Expenditure and Food Survey (EFS), the Family Resources Survey (FRS), the General Household Survey (GHS), the Omnibus Survey (OMN), the National Travel Survey (NTS), and the Labor Force Survey (LFS). The final refusal rates across the six surveys range from about 14 percent for the LFS to about 30 percent for the EFS, which may be explained by the differences in survey topics, interview length, length of data collection period, and additional requirements such as a diary.
Statistical Data Analysis
The Event History Model
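The analysis makes use of a multilevel multinomial logistic discrete-time event history regression model (Durrant et al. 2013; Steele, Diamond, and Wang 1996), which accounts for the clustering of calls within households and interviewers. The dependent variable in the model, $y_{tij}$, the response outcome at call t made to household i by interviewer j, conditional on contact being achieved at call t, contains four response outcomes: (1) refusal, (2) appointment made, (3) other form of postponement, and (4) cooperation. Modeling the log odds of outcome s (s = 1, 2, 3) relative to outcome 4 (cooperation), the multilevel multinomial logistic model employed here can be expressed as follows:
$$
\log\left(\frac{\pi^{(s)}_{tij}}{\pi^{(4)}_{tij}}\right) = \boldsymbol{\beta}^{(s)\prime}\mathbf{x}^{(s)}_{tij} + \lambda^{(s)} u_{ij} + \zeta^{(s)} v_{j}, \qquad s = 1, 2, 3, \qquad (1)
$$
where $\pi^{(s)}_{tij}$ is the probability of outcome s at call t to household i by interviewer j, $\mathbf{x}^{(s)}_{tij}$ is a vector of covariates for outcome s, which may be time varying (call level) or time invariant (household or interviewer level), with coefficient vector $\boldsymbol{\beta}^{(s)}$, $u_{ij}$ is a household random effect with outcome-specific loading $\lambda^{(s)}$, and $v_{j}$ is an interviewer random effect with outcome-specific loading $\zeta^{(s)}$. The loadings for s = 1 are constrained to equal one.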
To estimate the models, maximum likelihood estimation is employed, using the aML software (Lillard and Panis 2003). To evaluate model fit, likelihood ratio tests are used (Goldstein 2011). This allows the comparison of nested models, for example, to evaluate whether a model including call record and interviewer-level variables leads to an improvement compared to a standard model including only basic household information. Predicted probabilities are derived using the method described in detail in Durrant et al. (2013; see also Rasbash et al. 2009) to help interpret the results, in particular of interaction effects, and to investigate effect sizes, given the large number of contact calls.
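To show how log odds relative to cooperation translate into probabilities for the four call outcomes, the following minimal sketch applies the standard multinomial logit transformation. It treats the three log odds as given numbers, for example linear predictors evaluated with the random effects set to zero; this is an assumption for illustration, the input values are made up, and the sketch is not the prediction method of Durrant et al. (2013), which may handle the random effects differently.

```python
import math
from typing import Dict


def multinomial_probabilities(log_odds: Dict[str, float]) -> Dict[str, float]:
    """Convert log odds relative to cooperation into outcome probabilities.

    `log_odds` maps each nonparticipation outcome (refusal, appointment,
    postponement) to its log odds relative to the reference outcome,
    cooperation.
    """
    exp_terms = {outcome: math.exp(value) for outcome, value in log_odds.items()}
    denominator = 1.0 + sum(exp_terms.values())
    probabilities = {outcome: term / denominator
                     for outcome, term in exp_terms.items()}
    probabilities["cooperation"] = 1.0 / denominator  # reference category
    return probabilities


# Hypothetical log odds for one contact call
example = multinomial_probabilities(
    {"refusal": -1.0, "appointment": -0.5, "postponement": -2.0}
)
for outcome, probability in example.items():
    print(f"{outcome}: {probability:.3f}")
```

Because the four probabilities sum to one, a covariate that lowers the log odds of all three nonparticipation outcomes necessarily raises the predicted probability of cooperation.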
Details of Modeling and Choice of Explanatory Variables
A multilevel modeling framework for the analysis of call record data on cooperation was first introduced by Durrant et al. (2013), providing a detailed justification for the method. That article presented a basic model, controlling only for standard household and call-level variables to illustrate the method. The model employed here extends this approach. In the current article, the focus is on the influence of interviewer characteristics, cross-level interaction effects, that is, statistical interaction effects between householder and interviewer characteristics, and the effects of time varying variables, in particular variables that describe the initial interaction with the household on the doorstep. It is of interest to what extent the observable interviewer characteristics explain part of the unexplained interviewer variance. Here, the model conditions on contact being made with the household, since the focus is on the interaction between the householder and interviewer.
The following modeling strategy is used here. First, random effect models without any covariates are fitted to explore the random structure. Then, various sets of explanatory variables are added into the model one at a time to investigate their relevance on the response process. After first controlling for household-level variables, from both census and interviewer observation data, call record variables and finally interviewer characteristics and cross-level interactions are included. The models aim to control for survey design differences. A model including call characteristics and interviewer observation variables but not census variables is also explored.
Motivated by the conceptual framework for response behavior (Groves and Couper 1998), the final model explores the influence of (a) call-level characteristics describing the initial interaction between householder and interviewer, (b) interviewer-level characteristics and strategies, and (c) cross-level interaction effects, all of which had not been explored in the basic model by Durrant et al. (2013). The model controls for basic call record variables and household characteristics, as already included in the simpler model by Durrant et al. (2013).
The advantage of using a multinomial model, rather than fitting separate binary logistic models for each type of outcome, is that the effects of household and interviewer characteristics on the probability of refusal, appointment, and postponement may be evaluated simultaneously and tested for equivalence. The specification in (1) allows for a different set of covariates to be included in the three outcome equations. For covariates in the model, their effects may differ for the three outcome types and it may be of interest to test whether a given characteristic has the same effect on the three outcomes. The final model investigates differential effects of interviewer characteristics across the three different forms of nonparticipation (dependence of
Often, cross-level interactions between householder and interviewer characteristics cannot be investigated since information on one or both is not available. An advantage of our data set is that the exploration of cross-level interactions, in particular with interviewer doorstep approaches, is possible. We are interested in whether certain interviewers are more effective in handling more difficult cases. For example, more confident and more experienced interviewers may be expected to be more successful in achieving response from younger people, single households, people who make negative comments or have questions, and households that do not allow face-to-face contact. Further, certain doorstep approaches may have a positive influence on the response behavior of certain types of households. Tailoring strategies may also have an impact on response as main effects: interviewers who use certain tailoring strategies may perform better on the doorstep. Although sociodemographic interviewer characteristics as main effects have not been found to play a role (Groves and Couper 1998; O’Muircheartaigh and Campanelli 1999), they may have an influence in the interaction with householder characteristics, particularly with characteristics of the person on the doorstep. For example, older interviewers may be more successful in achieving cooperation with older householders. The modeling approach here allows testing for such effects.
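As a worked illustration of what a cross-level interaction means, consider a simplified linear predictor for one nonparticipation outcome with a single call-level indicator $x_{tij}$ (say, whether the householder made a negative comment) and a single interviewer-level variable $z_{j}$ (say, years of experience). These symbols and the choice of variables are illustrative assumptions and are not the article's notation:

$$
\eta_{tij} = \beta_0 + \beta_1 x_{tij} + \beta_2 z_{j} + \beta_3 x_{tij} z_{j}
$$

The difference in log odds between calls with and without a negative comment is then

$$
\eta_{tij}(x_{tij} = 1) - \eta_{tij}(x_{tij} = 0) = \beta_1 + \beta_3 z_{j},
$$

so the effect of the householder's reaction depends on the interviewer. Under this sketch, a negative $\beta_3$ would mean that a negative comment raises the log odds of, for example, refusal by less for more experienced interviewers.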
Results
Differential effects of observed and unobserved interviewer characteristics and the influence of time varying call record information on the three nonparticipation outcomes are investigated. First, the random effects structure is discussed and then characteristics of the call including initial reactions of the householder are explored. Finally, main and interaction effects of interviewer characteristics on the process leading to cooperation are presented.
Random Interviewer Effects
We investigate the influence of unmeasured interviewer characteristics, represented by $\zeta^{(s)} v_{j}$, on the three forms of nonparticipation. Different specifications of model (1) were explored. Across all models, we found significant residual variation ($\sigma_{v}$) in the log odds of a nonresponse outcome between interviewers, which holds also after controlling for household and basic area characteristics, call record variables, and interviewer characteristics. This implies that interviewers have a significant influence on the response outcome of a household, as would be expected in line with previous research on final response outcome (O’Muircheartaigh and Campanelli 1999; Pickery and Loosveldt 2002; Durrant et al. 2010). Slightly surprisingly, we did not find evidence for differential random interviewer effects on the three nonparticipation outcomes due to unobserved interviewer characteristics: The loadings $\zeta^{(s)}$ (s = 1, 2, 3) on the interviewer random effect can be assumed to be equal. (The likelihood ratio test statistic for a test of the null hypothesis $H_0$: $\zeta^{(1)} = \zeta^{(2)} = \zeta^{(3)} = 1$ is 2.80 on 2 df, p = .246.) This means that unmeasured interviewer characteristics have the same effect on the log odds of each of the three nonparticipation outcomes. The final models explored are therefore a simplification of model (1), with a random effect for interviewers, $v_{j}$, but without the loadings $\zeta^{(s)}$ (i.e., the interviewer random effect loadings $\zeta^{(s)}$ are constrained to be equal to one).
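The p value of a likelihood ratio test can be checked by hand from the reported test statistic and degrees of freedom. The sketch below uses the closed-form upper tail of the chi-square distribution, which exists when the degrees of freedom are even; the function name is hypothetical, and the test statistics are those reported in the text and in the note to Table 1.

```python
import math


def chi2_upper_tail_even_df(statistic: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2k the tail has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    half = statistic / 2.0
    term = 1.0   # i = 0 term of the sum
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Test of equal interviewer loadings: statistic 2.80 on 2 df
p_loadings = chi2_upper_tail_even_df(2.80, 2)

# Model 2 versus model 3: statistic 2 x 56.65 on 24 df (note to Table 1)
p_model3 = chi2_upper_tail_even_df(2 * 56.65, 24)

print(f"equal loadings: p = {p_loadings:.3f}")
print(f"model 2 vs 3:   p = {p_model3:.2e}")
```

The first value comes out at roughly .247, close to the reported p = .246; the small gap is presumably due to rounding of the printed test statistic. The second is far below .001, in line with the significant improvement in fit reported for model 3.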
Table 1 presents an overview of estimated household and interviewer random effect parameters for different specifications of the multilevel multinomial model: the null model (model 0) includes only random effects, model 1 adds household-level variables from both the census and interviewer observations, model 2 adds call record variables, and model 3, the final model, also adds interviewer-level variables. Even after controlling for call record and household characteristics, the common interviewer random variance remains significant, supporting the hypothesis that interviewers play an important role in the nonresponse outcome at a particular call. Adding interviewer-level and interaction effects explains part of this variation, as would be expected, and significantly improves the fit of the model (see the likelihood ratio test between models 2 and 3). However, only a relatively small reduction in the interviewer variance can be observed. At the household level, the results also show significant residual variation in the log odds of a nonresponse outcome between households across all models. In contrast to the common interviewer random effect, there is, in addition, evidence of differential effects of unmeasured household characteristics $u_{ij}$ across the three outcomes (based on t-tests that the loadings for appointment and postponement are equal to one: t = 3.1, p = .002 for $H_0$: $\lambda^{(2)} = 1$ and t = 5.1, p < .001 for $H_0$: $\lambda^{(3)} = 1$). These differential effects indicate a stronger household effect for postponement and a weaker effect for appointments across all models. As one may expect, the household random effect reduces by about half when household and call record variables are entered (models 1 and 2), implying that a significant part of the household variation is explained by these characteristics. The household random effect remains stable when interviewer effects are included (model 3).
The likelihood ratio test statistic between models 1 and 2 indicates that adding call record variables significantly improves the fit of the model, which supports the finding in Bates et al. (2008) that information on the call “greatly improve[s]” nonresponse models. Slightly surprising at first sight is the observed increase in the interviewer random effect when call record variables are entered. However, as discussed in Snijders and Bosker (1999:217, 228-29) for the case of a multilevel logistic model, entering highly significant level-1 variables (here the call record variables) tends to increase the random effects at higher levels (here the interviewer random effects), which is explained on the basis of the threshold representation.
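The scaling argument can be made concrete with a worked sketch for a simplified binary outcome (for example, refusal versus cooperation). The multinomial model used here is more involved, and the symbols below are illustrative rather than the article's own notation. In the threshold representation, the observed outcome arises from a latent continuous variable,

$$
y^{*}_{tij} = \mathbf{x}_{tij}'\boldsymbol{\beta} + u_{ij} + v_{j} + e_{tij}, \qquad \operatorname{Var}(e_{tij}) = \frac{\pi^{2}}{3} \approx 3.29,
$$

with the outcome observed when $y^{*}_{tij}$ exceeds zero. Because the call-level residual variance is fixed at $\pi^{2}/3$ by the logistic link, it cannot shrink when strong call-level predictors are added. Instead, the latent scale expands: the variance explained by $\mathbf{x}_{tij}'\boldsymbol{\beta}$ is added on top of a residual that stays the same size, so estimated coefficients and higher-level variances, such as that of $v_{j}$, tend to grow with it even if the underlying differences between interviewers are unchanged.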
Table 1. Estimated Household and Interviewer Random Effects from Different Specifications of the Multilevel Multinomial Logistic Regression Model.
Note: Likelihood ratio test statistics: between models 0 and 1, 2 × 287.73 on 33 df, p < .001; between models 1 and 2, 2 × 13,366.41 on 84 df, p < .001; between models 1 and 3, 2 × 13,423.06 on 108 df, p < .001; between models 2 and 3, 2 × 56.65 on 24 df, p < .001.
a Constrained to equal 1.
***Significantly different from zero at the 1 percent level.
Household–interviewer Doorstep Interactions
Table 2 presents parameter estimates of two multilevel multinomial models. For easier comparisons, we have included the estimates from the basic model (a), which only controls for time-invariant household-level variables and some basic call record variables but neither interviewer nor interaction effects (this model was presented in Durrant et al. 2013). Then, the parameter estimates for the extended model (b) are presented, including in addition call record variables describing the interaction at the doorstep, interviewer characteristics, and cross-level interaction effects, which is the focus of this article.
Table 2. Estimated Coefficients (and Standard Errors in Parentheses) of Two Multilevel Multinomial Logistic Models: (A) Basic Model Controlling Only for Household Characteristics and Basic Call Variables (but without Interaction Call Variables, Interviewer Characteristics, and Cross-level Interactions; Model Presented in Durrant et al. 2013), and (B) Extended Model Including in Addition Call Variables from the Contact Process at the Doorstep, Interviewer-level Variables, and Cross-level Interviewer Interactions (Model 3).
Note: Time-invariant variables included in both models as controls (coefficients not shown here; for further information on their effects, see Durrant et al. 2013): Interviewer observations: type of accommodation, house in a better or worse condition than others in area; Household-level variables: household type, preschool children present, London indicator, urban/rural indicator, indicator if adults in employment, qualification of household reference person, survey indicator. The models are estimated using full information maximum likelihood. As a closed form solution to the maximum likelihood function does not exist, the residuals at each level are “integrated out” numerically using Gauss-Hermite quadrature. The number of quadrature points used is 16. Approximate standard errors are computed based on an approximation to the Hessian matrix. The missing value categories have been suppressed to save space. Coding of time of call: a.m. = 0.00–12.00, p.m. = 12.00–17.00, eve = 17.00–0.00.
*Significant at the 10 percent level. **Significant at the 5 percent level. ***Significant at the 1 percent level.
Comparing the two models (models A and B), the sizes of the effects of the basic call record variables change slightly, but there are no noticeable differences in the significance and the direction of these effects, as one would expect. The basic call record variables, such as the day and time of the call, whether an appointment was previously made, the number of calls until first contact, and the number of intermediate noncontact calls, are all found to be significant in predicting response at a call. For example, if the previous call was an appointment, this reduces the probability of a refusal at the next call. These basic call record effects were already explored in the example model in Durrant et al. (2013) to illustrate the method of analyzing time-varying variables and are not discussed further here.
We now turn to the effects of the initial interaction between householder and interviewer on the doorstep and the effects of interviewer characteristics (model B), the main focus of this article. As may be expected, the way the contact between the interviewer and the householder was made on the doorstep (directly face-to-face, indirectly via a closed window or door, or through an intercom system) seems to make a difference: Direct contacts lead to lower rates of refusals, appointments, and postponements. Not opening the door for the interviewer may indicate a higher suspicion toward strangers. (It should be noted that this effect remains after controlling for household and area characteristics.)
The direct response from the householder on the doorstep is found to be a good indicator of cooperation, in line with the findings in Bates et al. (2008). If the householder shows some interest by asking at least one question, refusals, appointments, and postponements are less likely to occur. Similarly, if the householder makes at least one positive or neutral comment, the likelihood of a refusal or a postponement is reduced in comparison to no comment being made. Interestingly, the likelihood of an appointment increases in this case. As might be expected, if the householder makes at least one negative comment, refusals, appointments, and postponements are significantly more likely. The findings support the results in Bates et al. (2008), who also report significant effects of variables describing the initial reaction of the householder to the survey request. Such information improves models predicting response in comparison to models based only on time-invariant information or models accounting only for the basic call history, such as the number of previous calls.
In addition, characteristics of the person on the doorstep, such as gender and approximate age, seem to be useful in predicting the outcome of a call. Women are significantly more likely to make appointments than men; postponements are also more likely to occur. This may reflect a greater reluctance toward strangers or a fear of crime among women (Clemente and Kleiman 1977; Morton-Williams 1993). Other factors due to differences in lifestyles may also contribute to this effect. Women may be more likely to be looking after children when at home and may prefer the interviewer to call back at a more convenient time. No differences in the immediate refusal rates between men and women are observed. The older the householder on the doorstep, the less likely are refusals, appointments, and postponements, in particular for householders aged 60 years and older. If the person is younger than 16, refusals, appointments, and postponements are most likely.
Interestingly, with an increasing number of contact calls, refusals, appointments, and postponements seem to be less likely to occur; that is, the odds of cooperation increase with each additional contact made. This supports the findings of Groves and Heeringa (2006) and Sangster and Meekins (2004), who also report a significant positive effect of a prior contact with the household on the likelihood of a main interview. The number of contact calls may be interpreted as an indirect measure of an interaction between the householder and interviewer. The finding may imply that an ongoing interaction between the interviewer and the householder may be more likely to lead to a positive outcome, which would support the interaction hypothesis of Groves and Couper (1996, 1998) that an ongoing interaction may impact positively on the likelihood of response. The effect could also indicate that interviewers are persistent in returning to a household if they feel they have a chance of a positive outcome.
Influence of Interviewer Characteristics and Cross-level Interactions
A range of interviewer-level characteristics are explored that may constitute part of the significant interviewer variance. We first comment on potential interviewer main effects before describing cross-level interactions. First, the model controls for interviewer work experience and qualification, often found to be significant predictors in the analysis of interviewer effects (Groves and Couper 1992; Groves and Couper 1998; Hox and de Leeuw 2002). We find that interviewers with up to two years’ experience have significantly higher immediate refusal rates than interviewers with three or more years. In fact, when we used the experience variable with a finer categorization of four groups, we found that the immediate refusal rate decreased with increasing years of experience and was particularly low for interviewers with nine or more years’ experience. This supports findings in Groves and Couper (1998), who report a linear relationship between (final) response rates and length of experience. Interestingly, interviewers with lower levels of experience also show higher appointment and postponement rates. One may expect higher refusal rates for interviewers with a lower qualification. Here, no noticeable differences in the immediate refusal rates for different levels of interviewer qualification have been found, at least not after controlling for interviewer experience. Interviewers with low or no qualifications are less likely to experience appointments or postponements.
In line with previous research, we find significant evidence for the importance of interviewer attitudes (Groves and Couper 1998; Hansen 2007; Hox and de Leeuw 2002). An indicator of the interviewer’s self-confidence seems to be predictive of the outcome: Interviewers who are more confident in their ability to convince reluctant respondents do indeed have significantly fewer refusals. Interestingly, these interviewers also experience significantly fewer appointments and postponements. Interviewers who agree that they should persuade most reluctant respondents are also significantly less likely to experience a refusal. No differences in making appointments are observed.
We also find that all of the household-level variables included in the model have significant effects on making appointments. For example, households with children and households living in a house or in an urban area are significantly more likely to book appointments (estimates of coefficients not individually listed in Table 2). In summary, the likelihood of making appointments seems to depend on all three types of characteristics: interviewer, household, and call characteristics.
A number of variables indicating the tailoring ability of the interviewer are explored. However, we did not find much conclusive evidence. The variables were either only marginally significant (e.g., if the interviewer finds it difficult to modify their approach depending on the respondent) or not significant (e.g., if interviewers think they can vary their approach). For one variable, the significant coefficient was in the opposite direction to that anticipated (interviewers who disagree with the statement that they can vary their approach from situation to situation are less likely to get a refusal). Once we included important controls, such as interviewer experience, these effects were no longer significant in the final model. Such tailoring indicators are explored further in interactions with the householder on the doorstep (see subsequently). Similarly, interviewing strategies, such as altering the introduction depending on the household they visit or how best to introduce themselves and the survey, are explored, but again, once variables such as interviewer experience and qualifications are controlled for, these variables are no longer significant. A potential problem with most of these variables is that they report what the interviewer generally does and reflect the interviewer’s own perception of their behavior, but do not necessarily indicate what actually happens at each call (see also the argument in Groves and Couper 1998). One variable, however, is recorded at the call level (in addition to the self-report from the interviewer on what they generally do), which allows testing of call-specific influences. Leaving a card or message behind, recorded for each call, reduces the likelihood of a postponement at the next call but otherwise does not show a significant effect (and is therefore not included in the final model).
We have already noted that interactions with the householder and tailoring ability of the interviewer have been hypothesized to play an important role in successful interviewing strategies. A number of interaction effects are investigated and two are included in the final model (Table 3). We hypothesized that more experienced interviewers may be better at responding to comments and questions. We do not find any difference between the various levels of interviewer experience regarding the ability to handle negative or positive comments. However, if there are no comments, possibly indicating a lack of engagement with the survey request and a potentially more difficult respondent, more experienced interviewers seem to be better at achieving cooperation and have a lower probability of receiving a refusal than less experienced interviewers. Similarly, we find that, if the householder does not ask any questions, potentially indicating a more difficult case, then less experienced interviewers have lower cooperation, higher refusal, and higher postponement rates (effect not included in final model). If the householder asks at least one question, then both experienced and less experienced interviewers seem to perform equally well. Experience did not significantly interact with the way the contact was made on the doorstep or with demographic characteristics of the person on the doorstep, for example with age and gender. Interactions between the level of confidence and questions asked or comments made were either not significant or not easily interpretable. If a householder does not ask any questions, indicating a potentially more difficult case, then interviewers who can use a wide variety of approaches have higher cooperation, slightly lower refusal, and lower postponement rates. This may support the hypothesis that interviewers who are more able to tailor their approach to the respondent may be more successful (Groves and Couper 1998).
Householder–interviewer interactions on sociodemographic characteristics (such as age and gender of the interviewer with age and gender of the person on the doorstep) were not found to be significant. This is in line with previous results that also did not find much support for sociodemographic interaction effects (Durrant et al. 2010; O'Muircheartaigh and Campanelli 1999).
Predicted Probabilities of Each Outcome (in Percentage) for Cross-level Interviewer Interactions (Model 3).
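Predicted probabilities of this kind follow from the multinomial logit form of the model. The sketch below assumes that cooperation is the reference outcome, that each nonparticipation outcome has its own linear predictor, and that one interviewer random effect enters all three predictors equally. The coefficient values, the covariate names, and the function `predicted_probabilities` are made up for illustration and are not estimates from the models in this article.

```python
import math

OUTCOMES = ("refusal", "appointment", "postponement")

# Hypothetical coefficients per outcome (NOT estimates from the article):
# intercept, experienced interviewer, no comment from householder, interaction.
COEF = {
    "refusal": {"const": -1.2, "experienced": -0.3, "no_comment": 0.6, "interaction": -0.4},
    "appointment": {"const": -0.5, "experienced": -0.2, "no_comment": 0.1, "interaction": 0.0},
    "postponement": {"const": -1.5, "experienced": -0.2, "no_comment": 0.3, "interaction": -0.1},
}


def predicted_probabilities(experienced, no_comment, u=0.0):
    """Outcome probabilities at a call; u is a shared interviewer random effect."""
    exp_eta = {}
    for outcome in OUTCOMES:
        b = COEF[outcome]
        eta = (
            b["const"]
            + b["experienced"] * experienced
            + b["no_comment"] * no_comment
            + b["interaction"] * experienced * no_comment
            + u
        )
        exp_eta[outcome] = math.exp(eta)
    denom = 1.0 + sum(exp_eta.values())  # the 1.0 belongs to the reference outcome
    probs = {outcome: value / denom for outcome, value in exp_eta.items()}
    probs["cooperation"] = 1.0 / denom
    return probs


# Compare less and more experienced interviewers when no comment is made.
for experienced in (0, 1):
    probs = predicted_probabilities(experienced, no_comment=1)
    print(experienced, {k: round(100 * v, 1) for k, v in probs.items()})
```

Setting `u` to zero gives probabilities for an interviewer at the mean of the random effect distribution; other values of `u` shift all three nonparticipation outcomes relative to cooperation.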
Summary and Implications for Survey Practice
This article presents an analysis of interviewer effects in the nonresponse process using interviewer call record data. Of particular interest are interaction effects between householders and interviewers on the doorstep, influences of call characteristics, tailoring strategies of interviewers, and the effects of interviewer characteristics. The aim is to better understand the process leading to cooperation or refusal, analyzing response outcome at each call, rather than focusing on predicting final response. The main findings are as follows:
Call characteristics are important predictors when analyzing the response outcome of each call. Characteristics of the interaction process between the interviewer and the householder and information about the initial reaction of the householder on the doorstep measured at the call-level are of particular relevance, including how contact was established, sociodemographic interviewer observations of the person at the door, and whether this person asked questions or made comments.
We find that the more contact calls made, the higher the odds of cooperation. This may provide some evidence that keeping in contact with the household may increase the chances of a successful interview. The finding could support the hypothesis expressed in Groves and Couper (1996, 1998) that maintaining the interaction with the household is more likely to lead to cooperation. Rather than pressing for immediate cooperation, the interviewer may be advised to keep the conversation and the contact with the household going, for example by making an appointment for another time.
Unmeasured interviewer characteristics have a significant effect on nonparticipation outcomes, in line with previous research that investigated final response outcome. Interestingly, no evidence is found for differential effects of unmeasured interviewer characteristics on the three nonparticipation outcomes; that is, the influence of the interviewer random effect is the same across the three nonparticipation outcomes.
A number of interviewer characteristics are found to have significant effects on the process leading to cooperation or refusal. In line with previous research on final response outcome, the attitude of the interviewer toward refusal conversion and the interviewer’s self-confidence play an important role. The length of interviewer experience is significantly negatively associated with refusal on the doorstep (although, using observational data, we cannot disentangle whether interviewer experience is cause or outcome of refusal probability). Interestingly, interviewers with lower levels of experience also show higher appointment and postponement rates. We find some evidence that more experienced interviewers handle more difficult cases better. Analyzing main interviewer effects, we do not find conclusive evidence that interviewers who report tailoring their approach are more effective. We do not find much support for differential interviewer effects on the three nonparticipation outcomes. For example, interviewer experience, qualification, and confidence affect all three nonparticipation outcomes.
The likelihood of making an appointment seems to be dependent on all three types of influences: interviewer and household characteristics and the circumstances of the call. For example, householders who are female, younger than 60 years of age (in particular if younger than 16), live in a house or have preschool children are more likely to make an appointment. If the call is made in the evening, the probability of appointment is significantly higher than for a call during daytime.
A potential limitation of the data is that, although the majority of characteristics are recorded at the call level, some information on specific interviewing strategies reflects only what an interviewer does in general. More information at the call level may therefore be beneficial to identify general trends in interviewer tailoring abilities. Another potential limitation is that the data are observational rather than obtained via a controlled experiment, so statements about causal effects may be limited. Response patterns may change over time and, although our models include a wide variety of variables to control for these effects, further research will be required to analyze trends over time, for example using call record data from a longitudinal study.
The findings have various implications for survey practice. Such models and the variables identified here as important may be used in responsive survey designs (Groves and Heeringa 2006; Kirgis and Lepkowski 2013; Laflamme et al. 2008), where the continuous measurement and monitoring of the process and survey data offer the opportunity to alter the design during the course of the data collection. The overall aim is not nonresponse adjustment for postsurvey use and estimation, but to inform the survey process during data collection, for example to reduce costs and to improve the quality of the resulting survey data. Specific recommendations for survey practice and responsive survey designs include the following:
Time-varying call record information, such as features of the call history and of the current call, plays a key role in predicting the outcome of each call. The routine collection of such variables would therefore be beneficial to survey agencies. An increased number of initial or intermediate noncontacts and certain comments and questions from a householder already indicate a reduced likelihood of response at a future call. Such signs help interviewers and survey agencies to flag more difficult cases early on and to inform intervention schemes that survey agencies can employ before the end of the data collection period to reduce final nonresponse rates. The findings may help survey agencies to determine how best to approach a household at the next call. The survey agency can then respond to such early clues by changing the contacting strategy, by offering a higher incentive, or by sending a more targeted invitation letter or a more experienced interviewer. Survey agencies may use the information to decide when best to call to achieve cooperation, in particular if no prior appointment has been made.
The models also inform improvements to interviewer calling strategies, interviewer training, interviewer selection, and evaluation of interviewer performance. Interviewers may be trained to pick up important clues about the characteristics of the household or the future response behavior early on and to feed these back to the field management via an automated system, which is particularly useful for responsive survey designs. More recently, some survey agencies have started to ask the interviewer to evaluate a household’s willingness to respond, which has been shown to be a good indicator of future response behavior (Copas and Farewell 1998; Eckman 2011; Wagner and Guyer 2005). The research also identifies areas where interviewers may be better trained in responding to initial reactions of the householder on the doorstep. Interviewer experience and confidence were found to play a key role, with more experienced and more confident interviewers showing higher likelihoods of achieving cooperation. Less experienced and less confident interviewers showed significantly higher appointment and postponement rates.
Interviewer observations such as characteristics of the house and neighborhood have been shown in previous studies to be useful in predicting response (Durrant et al. 2011; West and Kreuter 2011). Here, we find that call characteristics, observations of the initial interaction and characteristics of the householder play a key role. Survey agencies may therefore collect such information routinely on their surveys.
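As one way to picture how such call-level information could feed a responsive design, the following sketch flags cases for review from a simple call history. The record fields, the thresholds, and the function `flag_difficult_case` are hypothetical and chosen only for illustration; in practice a survey agency would more plausibly base such a rule on predicted probabilities from a fitted model than on fixed cut-offs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Call:
    contact: bool           # was anyone reached at this call?
    negative_comment: bool  # householder made at least one negative comment
    asked_question: bool    # householder asked at least one question


def flag_difficult_case(calls: List[Call], max_noncontacts: int = 3) -> bool:
    """Return True if the call history suggests a harder-to-convert case.

    Hypothetical rule: many noncontact calls, or a contact with a
    negative comment and no question from the householder.
    """
    noncontacts = sum(1 for c in calls if not c.contact)
    if noncontacts >= max_noncontacts:
        return True
    return any(
        c.contact and c.negative_comment and not c.asked_question for c in calls
    )


# Hypothetical call history for one household.
history = [
    Call(contact=False, negative_comment=False, asked_question=False),
    Call(contact=True, negative_comment=True, asked_question=False),
]
print(flag_difficult_case(history))  # True: negative comment without a question
```

A flagged case could then be routed to one of the interventions listed above, such as a more targeted letter or a more experienced interviewer.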
Footnotes
Acknowledgment
We thank two anonymous referees for their very helpful comments.
Authors’ Note
This work contains statistical data from Office for National Statistics (ONS) which is Crown copyright and reproduced with the permission of the controller of Her Majesty's Stationery Office (HMSO) and Queen's Printer for Scotland. The use of the ONS statistical data in this work does not imply the endorsement of the ONS in relation to the interpretation or analysis of the statistical data. This work uses research data sets that may not exactly reproduce National Statistics aggregates.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research was funded by the U.K. Economic and Social Research Council (ESRC), “The Use of Paradata in Cross-Sectional and Longitudinal Research,” grant number: RES-062-23-2997.
