Abstract
Survey records are increasingly being linked to administrative databases to enhance the survey data and increase research opportunities for data users. A necessary prerequisite to linking survey and administrative records is obtaining informed consent from respondents. Obtaining consent from all respondents is a difficult challenge and one that faces significant resistance. Consequently, data linkage consent rates vary widely from study to study. Several studies have found significant differences between consenters and nonconsenters on sociodemographic variables, but no study has investigated the underlying mechanisms of consent from a theory-driven perspective. In this study, we describe and test several hypotheses related to respondents’ willingness to consent to an earnings and benefit data linkage request based on mechanisms related to financial uncertainty, privacy concerns, resistance toward the survey interview, level of attentiveness during the interview, the respondents’ preexisting relationship with the administrative data agency, and matching respondents and interviewers on observable characteristics. The results point to several implications for survey practice and suggestions for future research.
Introduction
Sample surveys are facing significant challenges due to falling response rates (Curtin, Presser, and Singer 2005; De Leeuw and de Heer 2002), noncoverage of the target population (Blumberg and Luke 2007), and increasing survey costs. All of these challenges have the potential to threaten the quality of the collected survey data and reduce confidence in the conclusions drawn from surveys. Despite these challenges, the demand for more detailed and comprehensive data structures has grown in the research and policy realms. To meet these demands, survey organizations may be forced to rely more heavily on complex, but less proven, methodologies to overcome response rate problems, reduce survey costs, and improve the quality and utility of the data.
One promising method toward achieving these objectives is to link administrative data to survey data. Linking these data sources can significantly enhance the quality and utility of the survey data and enable data users to answer important substantive and methodological research questions that cannot be addressed using a single data source (Calderwood and Lessof 2009; Lillard and Farmer 1997). Administrative data linkage may also lead to shorter interviews, less respondent burden, and an overall reduction in survey costs. Furthermore, administrative data can potentially offer a significant increase in the number of auxiliary variables that may be used for nonresponse bias adjustment.
For most surveys, a necessary prerequisite to directly linking survey and individual-level administrative records is obtaining informed consent from respondents. Informed consent is needed to ensure that respondents are aware of the risks and benefits involved in releasing and linking their administrative records for research purposes (GAO 2001). Obtaining linkage consent from all respondents is a challenging task. Linkage consent rates (conditional on survey response) vary substantially within and between surveys, thereby increasing the risk of bias for key estimates. Table 1 shows the variability in consent rates (and corresponding response rates) across different surveys and administrative data types. Consent rates range widely, from a low of 19.0 percent (McCarthy et al. 1999) to a high of 96.5 percent (Rhoades and Fung 2004).
Table 1. List of Studies Linking Survey Records and Administrative Data Sources
Because respondent consent is not universal, there are concerns that inferences drawn from linked data sets may be biased. Several studies have found systematic differences between consenters and nonconsenters based on sociodemographic characteristics (see Kho et al. 2009 and Dunn et al. 2004 for reviews); however, there is little consistency across studies. For example, some studies find that older persons are more likely to consent than younger persons (Bryant et al. 2006; Dunn et al. 2004), while other studies find that younger persons are more likely to consent than older persons (Huang et al. 2007; Yawn et al. 1998), and yet some studies find no age effect at all (Buckley et al. 2007; Harris et al. 2005). Gender and income exhibit similarly inconsistent relationships with the likelihood of consent: gender and income are significantly related to consent in some studies (Huang et al. 2007; Olson 1999) but not others (Al-Shahi, Vousden, and Warlow 2005; Harris et al. 2005). The inconsistent findings make it difficult to advance the study of linkage consent for purposes of minimizing bias and informing survey practice.
The inherently different populations that have been studied (e.g., patients, caregivers, older birth cohorts) may partially explain the lack of consistency across studies. The conflicting findings may also be driven by different underlying mechanisms of consent. The existing literature does not adequately develop and test theory-driven hypotheses on why some respondents consent to data linkage and others do not. For example, one hypothesized reason why respondents may consent to administrative data linkage is because they are unable to accurately self-report information contained in their administrative record (e.g., self-reports of unemployment spell duration in a labor market survey linked to employment records) and therefore find it easier to provide this information through data linkage. Another hypothesis is that respondents refuse to consent due to privacy concerns or general resistance toward the survey interview. These two hypotheses have never been tested jointly on the same survey data set. Other hypothesized mechanisms of consent involve the effects of interviewers on influencing respondent consent, and whether respondents have a preexisting relationship with the administrative data agency, among others.
The purpose of this article is to develop and test hypotheses of data linkage consent using a national population-based panel survey of older adults who were asked for consent to link earnings and benefit records at the end of the interview. In doing so, this article fills an important research gap by utilizing theory-driven hypotheses to examine respondents’ likelihood of linkage consent.
Hypothesized Mechanisms of Consent
Accuracy/Uncertainty
Administrative records (e.g., tax records, medical billing records) often contain detailed and relatively complex information collected over a long period of time. Such information can be burdensome for respondents to recall with great precision and expensive for survey organizations to collect during the survey interview. Still, it is common for surveys to ask respondents to report on some variables that can be found in administrative records (e.g., household income, health care utilization) in conjunction with the linkage request. A respondent who is unable to report their administrative information with certainty may be more likely to consent to the subsequent data linkage request in order to provide the survey organization with a more accurate assessment of their situation.1,2 This hypothesis of consent has never been tested.
Privacy/Confidentiality Concerns
One of the most frequently posited hypotheses concerning why many respondents are unwilling to consent to administrative data linkage is that they are concerned about a potential breach of confidentiality, fear that their records could be used for nefarious purposes, or feel that the information contained in the records is too personal to be released. Studies exploring this notion have the disadvantage of looking at relatively indirect indicators of privacy concerns, such as refusal to answer sensitive items (e.g., income; Jenkins et al. 2006; Sala, Burton, and Knies 2012), rather than direct or interviewer-observed assessments of such concerns. Obtaining more direct assessments of respondents’ privacy concerns and relating them to their likelihood of consent would be useful for purposes of understanding when nonconsent is likely to occur and the mechanism behind its occurrence.
General Resistance Toward the Survey Interview
During the survey recruitment process, sample members may be resistant to the interview request due to lack of interest, burden, and/or distrust of the survey sponsor, among other reasons (Groves and Couper 1998). Despite their initial resistance, many persons eventually agree to participate in the survey interview. It has been hypothesized that these initially reluctant respondents are less thoughtful (and less accurate) in answering survey questions (Fricker 2007; Olson 2006). It is also plausible that these resistant respondents may be less willing to comply with in-survey requests, such as linkage consent. For example, there is evidence in the biomarker consent literature that respondents who required the most contact attempts (an indirect indicator of resistance) or were observed by interviewers to be uncooperative during a prior-wave interview were less likely to consent to biomarker participation in the current wave (Sakshaug, Couper, and Ofstedal 2010). Determining whether respondents who are initially resistant to the survey interview, but ultimately participate, are less likely to give linkage consent is an important step that could help survey practitioners to preidentify (and possibly intervene on) respondents who are potentially resistant to the linkage request.
Preexisting Relationship With the Administrative Data Agency
A respondent’s relationship with the administrative data agency may also influence whether they consent to the linkage request. One hypothesized reason why respondents are more likely to consent to a specific linkage request is because they receive services or benefits from the administrative data agency. For example, respondents who are currently receiving social security payments may feel incentivized to consent to link their benefit records due to the saliency of the information being requested, familiarity with the government agency charged with administering their benefit, and reciprocation for the benefits received. There is some evidence in the medical literature that tangentially supports this hypothesis. Dunn et al. (2004) found that survey responders who reported the health symptoms under investigation were more likely to consent to a follow-up interview. Woolf et al. (2000) reported that people in poorer health were more likely to consent to review of medical records. Petty et al. (2001) found that people receiving more repeat prescriptions were more likely to consent to medication review. These studies suggest that respondents who receive services or benefits or who have experience or familiarity with the information being requested are more likely to give consent. How this relates to linkage consent and whether respondents with a vested interest in the type of administrative information being requested are overrepresented in data linkage studies are unclear.
A counterhypothesis is that service or benefit receipt from the administrative data agency may actually reduce the likelihood of consent. Respondents who have a preexisting relationship with the agency may be more concerned about the potential linkage, especially if the relationship has been negative (e.g., loss of eligibility due to a negative assets assessment by the agency). Furthermore, benefit recipients may be concerned about someone “checking up” on them and examining whether they are claiming benefits fraudulently. This would motivate some respondents to refuse the linkage and negate the reciprocation effect put forward in the first hypothesis.
Acquiescence/Attentiveness
Several studies have shown that respondents have a tendency to acquiesce when presented with difficult survey questions (Gage, Leavitt, and Stone 1957; Hanley 1962; Trott and Jackson 1967). In general, this notion is related to the theory of survey “satisficing,” which suggests that subsets of respondents take various cognitive shortcuts that lead to suboptimal response formation (Krosnick 1991, 1999). Given the complexities involved with describing the purpose, process, and implications of data linkage, it is plausible that acquiescent or inattentive respondents may give consent without taking the time to consider all of the information provided to them about the linkage request. In a sense, these respondents are choosing the simpler task, which is the provision of consent over the more demanding task of carefully reading all of the linkage materials before they make a decision. An alternative hypothesis is that acquiescent respondents may be more likely to reject the consent request because they perceive that to be the easier task. The notions of acquiescence and satisficing have not been studied in the context of linkage consent.
Interviewer Effects
A relatively unexplored source of consent variation is that due to the interviewer. Interviewers are charged with administering the consent request, helping respondents understand what is being asked for, and addressing any related respondent concerns. Given that interviewers are usually not incentivized to obtain linkage consent from respondents nor are they instructed to attempt consent refusal conversions, it is unclear how effective they are in obtaining consent. It is also unclear whether interviewers vary in their ability to obtain consent in the same way that interviewers vary in their ability to recruit survey respondents (O’Muircheartaigh and Campanelli 1999), though this seems quite plausible. Furthermore, it is unknown whether the same interviewer attributes that have been linked to increases in survey participation are also associated with linkage consent. Some studies have shown that interviewer attitudes toward survey refusal conversion and interviewer personality traits (measured using the “Big Five” personality inventory; John and Srivastava 1999) are positively related to recruitment of survey respondents (Jäckle et al. 2010; Kennickell 1999; Lehtonen 1996). Whether these interviewer traits are also related to the likelihood of consent is an open question, but most evidence points to the lack of a relationship. In a study of consent to link health and benefit records in the United Kingdom, Sala et al. (2012) found no relationship between interviewers’ attitudes toward persuading reluctant persons to participate in the survey and their likelihood of obtaining consent from respondents. The authors also found no relationship between interviewer personality traits and respondents’ likelihood of consent. If, as the evidence suggests, interviewer personality and persuasion traits do not explain the significant interviewer-level variation in consent rates (Sala et al. 2012), then other potential interviewer-level attributes should be considered. Identifying such factors may lead to improved interviewer training resulting in higher consent rates.
One hypothesis of consent is that interviewers who possess critical views toward data privacy are less likely to obtain linkage consent from respondents, especially if those respondents possess similarly critical views. Interviewer demographic characteristics and their interaction with respondents may also play a role in the likelihood of obtaining consent. It has been shown that matching respondents and interviewers on demographic characteristics can have a positive influence on survey participation (Webster 1996). But what is unknown at this point is whether matching interviewers to respondents based on common demographic characteristics can lead to a higher likelihood of consent. Alternatively, matching interviewers and respondents on demographic characteristics could backfire if they both share similarly sensitive views on data privacy. Interviewer experience may also influence whether a respondent provides consent. Prior interviewing experience has been shown to be beneficial for the recruitment of survey participants (Lipps and Pollien 2010), but whether it helps interviewers obtain consent from respondents is an open question. The expectation is that experienced interviewers perform better than their inexperienced counterparts in obtaining consent from respondents due to their greater familiarity with administering the consent request and addressing respondent concerns.
In the following sections of this article, we address these hypotheses using data from the 2008 Health and Retirement Study (HRS; http://hrsonline.isr.umich.edu/), a nationally representative panel survey of older adults. This rich longitudinal data source allows us to test the aforementioned hypotheses using indicators derived from current- and prior-wave data collections, interviewer observations, and call record histories.
Method
Sample and Data Source
The HRS is a federally funded longitudinal survey of adults over the age of 50 conducted by the Institute for Social Research (ISR) at the University of Michigan. The study began in 1992 with a cohort of then preretirement-aged individuals born between 1931 and 1941. New cohorts were added in 1993 and 1998 to round out the sample over age 50, and additional cohorts are enrolled every 6 years (e.g., in 2004, 2010) to refresh the sample at the younger ages. Response rates range from 70 percent to 82 percent in the baseline wave (depending on birth cohort and entry year), and from 87 percent to 89 percent at each follow-up wave. 3 The HRS conducts about 20,000 interviews every 2 years using a combination of telephone and in-person interviews. In 2006, HRS launched what is referred to as an enhanced face-to-face (E-FTF) interview, which includes a set of physical measures and biomarkers and a self-administered questionnaire on psychosocial topics. A random half of the sample was assigned to receive the E-FTF interview in 2006 and the other half in 2008. The plan is to repeat the E-FTF interview every other wave (or every 4 years) on the alternating half sample.
Consent Procedures
In accordance with an agreement with the Social Security Administration (SSA), HRS respondents are periodically asked to grant ISR permission to obtain their earnings and benefit histories as reported to the SSA. Until 2004, the SS linkage consent was retrospective; that is, it covered linkage of past earnings histories and benefits up to the date on which the consent was granted. Starting in 2006, HRS began asking for consent for prospective linkage of SS records. HRS has had the best success when the SS linkage request is administered during an in-person interview (as opposed to a telephone interview). As a result, the SS linkage request was folded into the E-FTF interview starting in 2008. In that wave, a random half of respondents (the 2008 E-FTF sample) were asked for prospective consent to release their SSA records. The consent form contains the following statements:
We would like to obtain a history of your earnings and any benefits from programs administered by the Social Security Administration applied for or received through 2023. Since most people cannot recall this information very well, we are asking for your permission to obtain from government records the following: 1) Your earnings reported to Social Security. 2) Any information about benefits from programs administered by the Social Security Administration applied for or received through 2023.
SSA requires separate signatures for earnings records and benefits records. Consenting respondents were asked to provide their social security number to facilitate the linkage, but this piece of information was not required to perform the linkage. The consent request was administered at the end of the interview. A small minority of respondents in the E-FTF sample (about 5 percent) declined the in-person interview but agreed to complete their interview by telephone. These individuals were still asked for SS linkage consent at the end of the interview. The permission form was mailed to them and they were asked to complete the form and send it back to the home office. Overall, about 68 percent of respondents consented to the SS linkage.
Indicators of Consent Mechanisms
Accuracy/uncertainty
One hypothesis of consent to be examined in this study asserts that respondents who do not know or are unable to report their administrative information with high accuracy are more likely to consent to the linkage request in order to facilitate the transaction of this information. The HRS asks respondents to report on several variables that can be found in the linked SSA earnings records, including Social Security income (Do you currently receive any income from Social Security?), Supplemental Security Income (Did you receive any income last month from Supplemental Security Income, also called SSI?), and other sources of government income receipt, including welfare (Did you receive any income from welfare in the last calendar year, not including SSI?), veteran benefits or military pension (Are you currently receiving any income from veteran benefits or a military pension?), and food stamps (Did you receive government food stamps at any time since month of previous interview?). An additive index was created that is the sum of “don’t know” responses given to these five income items (range: 0–5). This uncertainty index is used as an indicator of respondents’ uncertainty about their earnings. The hypothesized expectation is that the index will be positively related to the likelihood of consent.
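The additive index described above can be sketched in a few lines. This is only an illustration: the item keys and the "dk" response code are hypothetical placeholders, not HRS variable names or codes.

```python
from typing import Mapping

# Hypothetical keys for the five income items; actual HRS variable names differ.
INCOME_ITEMS = (
    "social_security",
    "ssi",
    "welfare",
    "veteran_benefits",
    "food_stamps",
)
DONT_KNOW = "dk"  # hypothetical code for a "don't know" response


def uncertainty_index(responses: Mapping[str, str]) -> int:
    """Count "don't know" answers across the five income items (range 0-5)."""
    return sum(1 for item in INCOME_ITEMS if responses.get(item) == DONT_KNOW)


# Illustrative respondent with two "don't know" answers and one refusal.
example = {
    "social_security": "yes",
    "ssi": "dk",
    "welfare": "no",
    "veteran_benefits": "dk",
    "food_stamps": "refused",
}
print(uncertainty_index(example))
```

A count of item refusals over the same five items would follow the same pattern with a different response code.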
Privacy/confidentiality concerns
Another hypothesis of consent is that respondents are less likely to consent to data linkage if they have concerns regarding privacy and data confidentiality. We test this hypothesis using interviewer observations collected during the prior-wave HRS interview in 2006. 4 In addition, we construct an indicator which is the number of item refusals given to the five financial questions used to construct the uncertainty index as described in the preceding paragraph. At the end of each interview, interviewers were asked to report on whether respondents ever inquired about the need to know the answers to some of the survey items, whether respondents expressed concerns about the confidentiality of their responses, and whether they believed respondents answered truthfully to the financial questions. The specific questions asked of interviewers were as follows:
“During the interview, how often did the respondent ask you why you needed to know the answer to some questions? (never,
“During the interview, how often did the respondent express concern about whether his or her answers would be kept confidential? (never,
“How truthful do you believe the respondent was regarding his or her answers to financial questions? (completely truthful,
We treat these items as indicators of respondents having concerns regarding privacy and data confidentiality. 5 A confidentiality concern index (range: 0–3) was created by summing the number of times any indication of a confidentiality concern (denoted by the underlined responses) was given to these three items. 6 Our expectation is that this index, as well as the number of item refusals to financial questions, will be negatively related to the likelihood of consent.
General resistance toward the survey interview
This hypothesis states that some respondents express initial resistance toward the survey interview for a variety of reasons (e.g., lack of interest, burden) and may be less likely to fully cooperate during the interview and less likely to comply with in-survey requests, such as data linkage. The HRS collects several indicators of resistance and uncooperativeness based on call records and interviewer observations measured at the end of each interview. The call record data include the number of call attempts needed to complete the interview, whether the respondent initially refused the survey request, whether the respondent failed to participate in any of the previous wave interviews, and whether the respondent refused the E-FTF interview and opted instead for a telephone interview. Interviewer observations of cooperation include an assessment of how often the respondent asked how much longer the interview would last, a rating of the respondent’s cooperation during the interview, a rating of the respondent’s resistance level, and a rating of the respondent’s enjoyment with the interview. The exact wording of the relevant interviewer observations is as follows:
“During the interview, how often did the respondent ask how much longer the interview would last? (never,
“How was respondent’s cooperation during the interview? (excellent,
“How would you describe the level of resistance from the respondent? (low/passive,
“How much did the respondent seem to enjoy the interview? (a great deal, quite a bit, some,
We use these four items, collected in the prior-wave interview, to create an additive uncooperativeness index (range: 0–4). A value of 1 was assigned to each item if the interviewer observation yielded an indication of uncooperative behavior (denoted by the underlined responses) and 0 otherwise. 7 This index was constructed using prior-wave interview observations (2006) instead of the current-wave (2008) observations to avoid the interviewer impressions being influenced by whether the respondent complied with the consent request. The hypothesized expectation is that the uncooperativeness index will be negatively related to the likelihood of consent.
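As a rough sketch of how such an index might be assembled, the snippet below sums four binary flags. Which response categories count as uncooperative is not recoverable from the text here, so the flags are taken as already-coded inputs; the field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PriorWaveObservations:
    """Binary flags derived from the four prior-wave interviewer ratings.

    Each flag is True when the interviewer's rating fell in a category
    treated as uncooperative; that coding is assumed to happen upstream.
    """
    asked_how_much_longer: bool
    poor_cooperation: bool
    high_resistance: bool
    low_enjoyment: bool


def uncooperativeness_index(obs: PriorWaveObservations) -> int:
    """Sum the four flags, giving an index from 0 to 4."""
    return sum(
        int(flag)
        for flag in (
            obs.asked_how_much_longer,
            obs.poor_cooperation,
            obs.high_resistance,
            obs.low_enjoyment,
        )
    )


# Illustrative respondent flagged on two of the four items.
example_obs = PriorWaveObservations(True, False, False, True)
print(uncooperativeness_index(example_obs))
```

The confidentiality concern index could be built the same way from its three interviewer items.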
Preexisting relationship with the administrative data agency
Another mechanism of consent under consideration relates to respondents’ relationship with the agency responsible for maintaining the administrative data being requested. In particular, respondents who have a preexisting relationship with the administrative data agency may be more likely to comply with the linkage request if they receive services or benefits from the agency. Conversely, receiving services from the agency may actually decrease the likelihood of consent if the respondent’s relationship with the agency has been negative, or if the respondents fear that the linkage would lead to examination of whether they are claiming benefits or services fraudulently. Two important services that the SSA provides are the distribution of Social Security and Supplemental Security Income. HRS asks respondents to report whether they are currently receiving these two income sources. These two variables are combined into a single binary indicator of whether benefits are being received from SSA. We also create a binary indicator of whether a respondent currently receives any other government benefits, including welfare, veteran benefits or military pension, and/or food stamps. Our hypothesis is two sided. If receiving services or benefits from SSA (or another government agency) incentivizes respondents to consent to the linkage, then we would expect the likelihood of consent to be higher for those receiving such benefits. Alternatively, if respondents who receive these benefits are concerned about the linkage for the reasons discussed above, then we would expect these respondents to have a lower likelihood of consent.
Acquiescence/attentiveness
We test the notion that inattentive or acquiescent respondents are either more or less likely to consent based on their perception of what the easier task might be. Without any direct measures of respondents’ acquiescent or inattentive behavior during the survey interview or at the time of the linkage request, we utilize a proxy indicator of respondents’ attentiveness from the prior-wave interview. At the conclusion of the prior-wave interview, interviewers were asked to rate respondents’ level of attentiveness during the interview [“How attentive was the respondent to the questions during the interview? (
Interviewer effects
There are several interviewer-related hypotheses that we attempt to address in this study. The first hypothesis is that interviewers who possess critical views toward data privacy are less likely to obtain consent from respondents. We do not have direct measurements of interviewers’ attitudes toward data privacy. Instead, we use interviewer education as a weak proxy. We assert that interviewers with the lowest levels of education possess the most negative attitudes toward data privacy. Our rationale is based on studies that find that persons with the lowest levels of education are more likely to express concerns about privacy and data confidentiality in survey research studies (Singer, Mathiowetz, and Couper 1993; Westin 1990). In addition to main effects, we test the interaction between interviewers’ and respondents’ level of education to assess whether matching respondents with interviewers who share the same education level (a weak proxy for shared attitudes on data privacy) produces lower or higher rates of linkage consent relative to other interviewer–respondent education combinations.
The second interviewer-related hypothesis is that matching interviewers and respondents on demographic variables increases the likelihood of consent as it seems to increase the likelihood of survey participation (Webster 1996). We examine the effects of matching based on age, gender, and race/ethnicity variables collected on both HRS respondents and interviewers. Our expectation is that matching on any one of these variables will increase the likelihood of consent to a greater extent than that of not matching. Finally, the third interviewer-related hypothesis we consider claims that experienced interviewers obtain consent at a higher rate than inexperienced interviewers. We operationally define interviewer experience to be dichotomous, where “inexperienced” interviewers are those who have never worked on the HRS prior to the current wave and “experienced” interviewers are those who have worked on at least one HRS wave prior to the current one. 8 In addition, two running count variables indicating the number of consent requests administered by the interviewer in all interviews prior to the current one, as well as the number of consents obtained in those interviews, are used as dynamic measures of interviewer experience accrued during the field period. 9 We expect that the number of consent requests administered by the interviewer and number of consents that they obtain prior to the current interview will have a positive effect on obtaining linkage consent.
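The two running count variables can be illustrated with a short sketch. It assumes interviews are supplied in chronological order as (interviewer ID, consent outcome) pairs; this input layout and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def running_experience(
    interviews: Iterable[Tuple[str, bool]]
) -> List[Tuple[int, int]]:
    """For each interview, return the interviewer's counts accrued so far.

    The first element is the number of consent requests the interviewer
    administered before this interview; the second is the number of
    consents obtained in those earlier interviews.
    """
    requests = defaultdict(int)
    consents = defaultdict(int)
    counts = []
    for interviewer_id, consented in interviews:
        # Record the counts before updating, so the current interview
        # never contributes to its own covariates.
        counts.append((requests[interviewer_id], consents[interviewer_id]))
        requests[interviewer_id] += 1
        consents[interviewer_id] += int(consented)
    return counts


# Illustrative field period with two interviewers.
field_period = [("A", True), ("B", False), ("A", False), ("A", True)]
print(running_experience(field_period))
```

Recording the counts before the update keeps the current interview's outcome out of its own predictors.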
Outcome Measure
The primary outcome measure is a dichotomous indicator of whether HRS respondents consented to the SSA data linkage request. Consent is operationally defined as returning the signed consent form to the HRS interviewer or to ISR. Consenting respondents were asked to provide a Social Security number to facilitate the linkage, but this was not required. In the preliminary analysis, two different definitions of consent were considered: (1) consent with Social Security number (conditional) and; (2) consent with or without Social Security number (unconditional). 10 Both outcomes yielded similar results with the unconditional consent definition producing slightly stronger effects than the conditional definition. We use the unconditional definition of consent for all analyses presented below.
Statistical Analyses
A multilevel random effects logistic regression model was used to quantify respondents’ likelihood of consenting to the data linkage request conditional on respondent- and interviewer-level covariates. The logistic regression was performed using the NLMIXED procedure in SAS 9.2 (Littell et al. 1996). HRS utilizes complex sample survey design features that were incorporated into the statistical analysis. The sample consists of 112 primary sampling units paired within 56 sampling strata. Sampling weights were used to adjust for differential probabilities of selection, nonresponse, and sample noncoverage. The weights were incorporated into the NLMIXED procedure to obtain valid point estimates. Standard errors were computed using a jackknife variance procedure (with 56 replications) that accounts for the effects of stratification and clustering (Berglund 2002). A total of 6,613 randomly selected respondents belonging to the target population of noninstitutionalized adults over the age of 50 received the consent request. Interviewer observations from the prior-wave interview were missing for 229 respondents and these cases were excluded from the analysis. This yielded a final analytical sample size of 6,384. No significant differences were found between cases that were included or excluded based on sociodemographic variables.
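To illustrate the general idea of a jackknife with one replicate per stratum, the sketch below estimates a weighted consent rate and its standard error for a design with two PSUs per stratum. It is a simplified stand-in, not the procedure used in the analysis: it applies a common paired-PSU scheme (drop one PSU, double the weight of its pair) to a simple proportion rather than to model coefficients, and the record layout is hypothetical.

```python
from typing import List, Tuple

# Each record is (stratum, psu, weight, consented), with psu coded 0 or 1
# within its stratum. This layout is hypothetical, not the HRS file format.
Record = Tuple[int, int, float, int]


def weighted_rate(records: List[Record], weights: List[float]) -> float:
    """Weighted proportion of consenters under the supplied weights."""
    total = sum(weights)
    return sum(w * r[3] for r, w in zip(records, weights)) / total


def paired_jackknife(records: List[Record]) -> Tuple[float, float]:
    """Return (estimate, standard error) using one replicate per stratum."""
    full = weighted_rate(records, [r[2] for r in records])
    variance = 0.0
    for h in sorted({r[0] for r in records}):
        replicate = []
        for stratum, psu, weight, _ in records:
            if stratum != h:
                replicate.append(weight)        # other strata unchanged
            elif psu == 0:
                replicate.append(0.0)           # drop the first PSU
            else:
                replicate.append(2.0 * weight)  # double its paired PSU
        variance += (weighted_rate(records, replicate) - full) ** 2
    return full, variance ** 0.5


# Toy data: two strata, two PSUs each.
toy = [
    (1, 0, 1.0, 1), (1, 0, 1.0, 0),
    (1, 1, 1.0, 1), (1, 1, 1.0, 1),
    (2, 0, 2.0, 0), (2, 1, 2.0, 1),
]
estimate, se = paired_jackknife(toy)
print(estimate, se)
```

With 56 strata this scheme produces 56 replicates, which matches the replicate count reported above, though the exact replication scheme used in the analysis is not specified here.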
Results
Sample Characteristics
The overall consent rate was 67.8 percent (weighted and unweighted). Table 2 shows weighted estimates of the target population of adults over the age of 50 who were eligible for the data linkage consent request. Overall, males were more likely to provide consent than females (69.1 percent vs. 66.9 percent), white respondents were more likely to give consent than blacks and other racial groups (69.1 percent vs. 60.4 percent vs. 65.8 percent), and married respondents provided consent at a higher rate than separated/divorced, widowed, and never married ones (69.5 percent vs. 64.9 percent vs. 65.3 percent vs. 65.9 percent). There were no statistically significant consent rate differences by birth cohort, Hispanic ethnicity, and education.
Characteristics of Sample Respondents in the 2008 Health and Retirement Study (Weighted; N = 6,384)
Note. Parenthetical entries are standard errors.
Comparison is statistically significant (P < .05).
Model Building and Evaluation
A weighted logistic regression model for consent was fit by incorporating covariates related to respondent background characteristics and indicators of the hypothesized mechanisms. To assess the relative impact of each mechanism on consent, the set of indicators associated with that mechanism was removed from the full model, one mechanism at a time. The reduced model was then fit and contrasted against the full model. Three diagnostic measures were used to assess the impact of each mechanism. The first diagnostic measure is the likelihood ratio (LR) statistic, which compares the log-likelihood of the reduced model with that of the full model.
The second diagnostic measure is the relative Akaike’s information criterion (AIC).
Finally, the third diagnostic measure is the relative Schwarz criterion (SC).
All of these measures produce values that are greater than or equal to zero. The diagnostic values increase as a function of the relative importance of the mechanism (and its corresponding set of indicators) in terms of improving model fit. Table 3 presents the diagnostic results. In general, the interview resistance indicators had the biggest impact on model fit based on the diagnostic measures (LR, 283.62, p < .01; Rel AIC, 3.63; Rel SC, 2.73), followed by the privacy/confidentiality indicator (LR, 43.48, p < .01; Rel AIC, 0.56; Rel SC, 0.44). All other mechanisms had relatively minimal impact on model fit.
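The three diagnostics can be illustrated with a short Python sketch. The LR statistic follows its standard definition for nested models. The exact definitions of the relative AIC and relative SC are not available here, so the percentage-change form used below is an assumption, as are the function and argument names.

```python
import math

def fit_diagnostics(ll_full, ll_reduced, k_full, k_reduced, n):
    """Contrast a reduced model against the full model.

    ll_*: maximized log-likelihoods; k_*: number of estimated parameters;
    n: number of observations entering the Schwarz criterion.
    """
    # Standard likelihood ratio statistic for nested models.
    lr = -2.0 * (ll_reduced - ll_full)

    aic_full = -2.0 * ll_full + 2.0 * k_full
    aic_reduced = -2.0 * ll_reduced + 2.0 * k_reduced
    sc_full = -2.0 * ll_full + k_full * math.log(n)
    sc_reduced = -2.0 * ll_reduced + k_reduced * math.log(n)

    # ASSUMPTION: "relative" is taken to mean the percentage increase in
    # the criterion when the mechanism's indicators are removed.
    rel_aic = 100.0 * (aic_reduced - aic_full) / aic_full
    rel_sc = 100.0 * (sc_reduced - sc_full) / sc_full

    return {"LR": lr, "df": k_full - k_reduced,
            "rel_AIC": rel_aic, "rel_SC": rel_sc}
```

Under this assumed form the relative criteria can fall slightly below zero when the removed indicators add almost nothing to the fit, so the definitions actually used in the analysis may differ in detail.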
Model Fit Statistics Computed After Removing Indicators for Each Mechanism in the 2008 Health and Retirement Study
Note. AIC = Akaike’s information criterion; SC = Schwarz criterion.
Likelihood of Consent Based on Indicators of Hypothesized Mechanisms
Table 4 shows weighted odds ratios (ORs) of the likelihood of linkage consent. After controlling for respondent background characteristics and the indicators of each hypothesized mechanism, education was the only respondent demographic characteristic that yielded a statistically significant relationship with consent: college graduates tended to be more likely to consent relative to persons who did not complete high school (16+ years: OR, 1.32; 95 percent confidence interval, CI [1.02, 1.70]). A marginally significant result (p < .10) was observed for black respondents who were less likely to consent relative to white respondents (OR, 0.80; 95 percent CI [0.63, 1.01]). A respondent’s foreign citizenship status, self-reported health rating, and numeracy skill (based on a numeracy assessment module) were unrelated to the likelihood of consent.
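The link between the reported odds ratios, their confidence intervals, and the significance labels can be sketched in Python. The sketch assumes Wald-type intervals with a normal critical value of 1.96; the jackknife-based inference in the analysis may rely on a different reference distribution, so the recovered p-values are approximate. The function names are hypothetical.

```python
import math

Z_95 = 1.96  # assumed normal critical value for a two-sided 95 percent CI

def odds_ratio_ci(beta, se, z=Z_95):
    """Odds ratio and Wald confidence limits from a logit coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def wald_p_from_or(odds_ratio, lower, upper, z=Z_95):
    """Recover the coefficient, its standard error, and a two-sided
    normal p-value from a reported odds ratio and confidence interval."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    # erfc(|z| / sqrt(2)) equals the two-sided standard normal tail area.
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))
    return beta, se, p_value
```

For example, `wald_p_from_or(0.80, 0.63, 1.01)` gives a p-value between .05 and .10, in line with the marginally significant result for black respondents, while `wald_p_from_or(1.32, 1.02, 1.70)` gives a p-value below .05 for college graduates.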
Final Logistic Regression Model of Consent to Data Linkage on Respondent Background Characteristics and Indicators of Consent Mechanisms
Note. HRS = Health and Retirement Study; SSI = Supplemental Security Income; SS = Social Security; AHEAD = Study of Asset and Health Dynamics Among the Oldest Old; CODA = Children of the Depression; WB = War Babies; EBB = Early Baby Boomers.
p < .10. *p < .05. **p < .01. ***p < .001.
Substantive financial variables had a mixed relationship with consent. Household income 11 was found to be unrelated to consent, while net worth 12 was significantly related to consent. As net worth increased, the likelihood of consent decreased (second quartile: OR, 0.75; 95 percent CI [0.62, 0.92]; third quartile: OR, 0.63; 95 percent CI [0.48, 0.82]; fourth quartile: OR, 0.54; 95 percent CI [0.38, 0.76]).
Effect of Financial Uncertainty on the Likelihood of Consent to Social Security Record Linkage
We did not find any support for the uncertainty hypothesis of consent. The number of “don’t know” responses given to financial questions in either the prior wave (2006) or the current wave (2008) was not significantly associated with consent to the SSA linkage request in the current wave (2006: OR, 1.21; 95 percent CI [0.52, 2.83]; 2008: OR, 0.83; 95 percent CI [0.45, 1.52]). We also examined the percentage of “don’t know” responses to financial questions across consent propensity strata. Based on the final consent model, consent propensity scores were generated for all cases. Following suggestions from Rosenbaum and Rubin (1983), five equal-sized propensity score groups were formed, ordered from low to high propensity of consent. Respondents were assigned to group 1 if their estimated propensity to consent lay between 0.0006 and 0.60, group 2 for propensities between 0.60 and 0.69, group 3 for propensities between 0.69 and 0.74, group 4 for propensities between 0.74 and 0.78, and group 5 for the remaining respondents, who had the highest consent propensities (0.78 to 0.91). Table 5 shows the percentage of “don’t know” responses across the propensity strata. Contrary to expectations, a roughly negative trend is observed for each year: the largest percentages of “don’t know” responses to financial questions occur among the lowest consent propensity groups and the smallest percentages occur among the highest propensity groups (2006: trend test, p = .333; 2008: trend test, p = .083), although neither trend is statistically significant at the .05 level. Overall, we conclude that there is no evidence supporting the hypothesis that persons who are less familiar with their administrative information are more likely to give linkage consent.
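The stratification step can be sketched in Python as follows. The sketch ranks the estimated propensities and cuts the ranking into five equal-sized groups, then averages an item-level measure (such as the count of “don’t know” responses) within each group. The function names and the tie-handling rule (ties broken by input order) are assumptions for illustration only.

```python
def propensity_strata(scores, n_groups=5):
    """Assign each case to one of n_groups equal-sized strata,
    numbered 1 (lowest propensity) to n_groups (highest)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    strata = [0] * len(scores)
    for rank, i in enumerate(order):
        strata[i] = rank * n_groups // len(scores) + 1
    return strata

def mean_by_stratum(values, strata):
    """Unweighted mean of values within each stratum."""
    totals, counts = {}, {}
    for v, s in zip(values, strata):
        totals[s] = totals.get(s, 0.0) + v
        counts[s] = counts.get(s, 0) + 1
    return {s: totals[s] / counts[s] for s in sorted(totals)}
```

A weighted version would replace the simple mean with a weighted mean using the survey weights, which is what the weighted estimates in Table 5 would require.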
Percentage of Item “Don’t Know” and Refusals to Financial Questions by Consent Propensity Stratum in the 2008 Health and Retirement Study
Note. Parenthetical entries are standard errors.
Effect of Privacy/Confidentiality Concerns on the Likelihood of Consent
In contrast to the uncertainty hypothesis, we find strong support for the privacy/confidentiality hypothesis. The confidentiality concern index is negatively related to the likelihood of consent (OR, 0.62; 95 percent CI [0.51, 0.74]). That is, as the number of confidentiality-related concerns expressed by the respondent during the 2006 wave interview increases, the likelihood of consent in 2008 decreases. A second indicator of privacy concerns is the number of times a respondent refused to answer relevant financial questions in the survey. This indicator, measured in both 2006 and 2008, is significantly related to consent in the current wave (2006: OR, 0.88; 95 percent CI [0.79, 0.97]; 2008: OR, 0.68; 95 percent CI [0.47, 0.99]); as the number of item refusals increases, the likelihood of consent decreases. In addition, we examined the percentage of item refusals across consent propensity strata. Table 5 shows a negative trend indicating that the largest percentages of item refusals occur among the lowest consent propensity groups and the smallest percentages among the highest propensity groups (2006: trend test, p = .017; 2008: trend test, p = .083). All indications suggest that there is strong support for the hypothesis that privacy and confidentiality concerns are negatively related to the likelihood of consent.
Effect of Interview Resistance on the Likelihood of Consent
We find support for the notion that respondents are more likely to refuse the consent request if they exhibit initial resistance toward the survey interview. The uncooperativeness index—constructed from interviewer assessments of respondents’ cooperativeness in the prior-wave interview—is significantly related to the likelihood of consent (OR, 0.76; 95 percent CI [0.70, 0.82]). More specifically, interviewer observations of uncooperative behavior in the prior-wave interview tend to be negatively correlated with the likelihood of consent in the current wave. Other indicators of interview resistance were negatively related to the likelihood of consent, including initial refusal to be interviewed in the current wave (OR, 0.57; 95 percent CI [0.38, 0.85]) and at least one noninterview in a previous wave of HRS data collection (OR, 0.64; 95 percent CI [0.50, 0.82]). Respondents who refused the E-FTF interview in favor of a telephone interview were also less likely to consent to data linkage (OR, 0.34; 95 percent CI [0.23, 0.48]). Overall, we find strong support for the notion that interview resistance is negatively related to the likelihood of consent.
Effect of Preexisting Relationship With the Administrative Data Agency on the Likelihood of Consent
We find mixed support for the preexisting relationship hypothesis. Respondents receiving Social Security or Supplemental Security Income were neither more nor less likely to consent to the SSA data linkage request compared to persons not receiving income from the SSA (OR, 1.05; 95 percent CI [0.87, 1.28]). However, respondents receiving other types of government income (including welfare, veteran benefits, food stamps) were more likely to consent to SSA data linkage (OR, 1.34; 95 percent CI [1.10, 1.65]). Thus, there is mixed support for the hypothesis that respondents receiving services from a government agency are more likely to consent to a government data linkage request.
Effect of Respondent Attentiveness on the Likelihood of Consent
There is some evidence that respondents’ level of attentiveness during the survey interview (a proxy indicator for acquiescence) influences their likelihood of consent. Specifically, we find that respondents who were rated by the interviewer as being not “very attentive” in the prior-wave interview were more likely to consent to the linkage in the current wave (OR, 1.22; 95 percent CI [0.96, 1.56]). Although this effect is only marginally significant (p < .10), it suggests that inattentive respondents may acquiesce during the linkage request and give consent because that is perceived to be the easier task relative to not giving consent.
Interviewer Effects
We find little support for the interviewer-related hypotheses. Interviewer education—a weak proxy for attitudes toward data confidentiality and privacy—was not related to the consent outcome. The interaction between respondent and interviewer education levels (not shown) was also not statistically significant. In general, we find no evidence that matching respondents and interviewers on demographic characteristics leads to a higher likelihood of consent based on gender, race/ethnicity, age, and education. Furthermore, inexperienced HRS interviewers were not more or less successful at obtaining consent relative to experienced interviewers. The number of consent requests administered by the same interviewer in previous interviews had no effect on the likelihood of consent in the current interview, but the number of previous consents obtained by an interviewer had a marginally significant (p < .10) and positive effect on the likelihood of consent (OR, 1.03; 95 percent CI [1.00, 1.06]). This suggests that interviewers tend to benefit from early success when obtaining consent in subsequent interviews over the course of the field period. We also find a marginally significant interviewer variance component (
Discussion
Sample surveys are increasingly being linked to administrative data sources to enhance the quality of the survey data and increase research opportunities for data users. However, obtaining linkage consent from respondents remains a persistent challenge that threatens the validity of inferences obtained from linked data sources. Little is known about the underlying mechanisms driving the consent decision. We investigated this issue by addressing several hypotheses of linkage consent in the HRS, a large nationally-representative panel survey of older adults linked to SSA records.
We did not find any support for the hypothesis that respondents are more likely to consent to the linkage if they are unable to self-report items contained in their administrative record. That is, it does not appear that respondents compensate for their uncertainty by increasing their likelihood of consent. The fact that the consent request was not framed this way may explain the lack of support for this hypothesis. That is, respondents who were unable to answer certain administrative items were not encouraged to overcome their uncertainty by consenting to the data linkage. Indeed, it may matter whether this connection is brought to the respondent’s attention or not during the consent request. Experimental research on the framing of the consent request is needed to further explore this possibility.
An alternative post-hoc hypothesis is that respondents choose not to release their records to survey organizations because they are uncertain of what the records contain. We find some evidence supporting this hypothesis: the highest frequency of “don’t know” responses to administrative financial items occurred among respondents who had the lowest likelihood of consent. On the other hand, a “don’t know” response may in fact be a passive refusal motivated not by uncertainty but by an unwillingness to divulge confidential information to the survey organization. If so, such responses would be expected to accompany a lower likelihood of consent, as is evidenced by our finding that explicit refusal to answer financial items was associated with a lower likelihood of consent.
We find strong support for the privacy and interview resistance hypotheses based on prior-wave interviewer observations of respondent behavior and item refusals to income questions in the prior- and current-wave interviews. The fact that privacy is related to the consent outcome is not surprising and is consistent with the literature (Bates 2005). Regarding interview resistance, the evidence suggests that uncooperative behavior exhibited in the prior-wave interview tends to be associated with a lower likelihood of consent. These findings have important practical implications for ongoing surveys planning future data linkages. Indications of interview resistance or confidentiality concerns observed prior to the linkage request may be used to preidentify (and possibly intervene on) persons who are less likely to consent. For cross-sectional surveys, respondents’ refusal to answer income questions could be used for similar preidentification and intervention strategies for subsequent linkage requests.
A respondent’s preexisting relationship with the administrative data agency, indicated by whether the respondent currently receives income from the SSA, was not related to consent. That is, we find no evidence that Social Security income recipients are overrepresented in the consent group. However, respondents who received other types of government income tended to consent at a higher rate than those who were not receiving government income. This finding is somewhat consistent with the medical literature, which finds that patients with the characteristic of interest tend to consent to medical record review and follow-up interview at a higher rate than those who do not have the characteristic of interest (Dunn et al. 2004; Petty et al. 2001; Woolf et al. 2000). This finding raises potential overrepresentation issues that could introduce bias. Accounting for the selectivity of consent through weighting adjustments or other statistical procedures may need to be considered for data linkage studies. HRS is currently in the process of developing weighting adjustments for nonconsent.
We found that inattentive respondents tend to provide linkage consent at a higher rate than attentive respondents. This acquiescent behavior raises the question of whether respondents take the time to consider all of the information about the linkage request that is provided to them by the survey organization. Linkage requests often contain technical and relatively complex administrative details that may be too burdensome for some respondents to pore over. Rather than expend the effort needed to read through and fully understand the request, some respondents may simply give consent because that is the easier task. Future research on respondents’ comprehension of the linkage request is needed to determine whether current methods of administering the consent request are effective, or whether alternative methods should be considered.
Although interviewers play a crucial role in administering the consent request and addressing respondent concerns, we found little evidence of systematic interviewer effects. Specifically, we found no relationship between interviewers’ demographic characteristics and respondents’ likelihood of consent. Matching respondents and interviewers based on demographic characteristics did not increase the likelihood of consent. This result conflicts with the survey participation literature, which finds that “matching” interviewers with respondents has a positive effect on participation (Webster 1996). We also found no strong effect of interviewer experience on the likelihood of consent. However, the number of consents obtained by an interviewer in previous interviews tended to have a positive effect on obtaining consent in the current interview, which suggests that interviewers who have success obtaining consent from respondents early in the data collection period also tend to be successful later on.
The results also indicated unexplained consent variation attributable to the interviewer. Indeed, a wide range of consent rates was observed across interviewers: the 25th, 50th, and 75th percentiles of interviewer-level consent rates were 35.4 percent, 56.5 percent, and 69.1 percent, respectively. It may be of interest to survey organizations to monitor these rates and intervene when interviewers are performing below expectation. In addition, it is worthwhile to explore other interviewer-level factors that may influence a respondent’s likelihood of consent. One potential factor, interviewers’ attitudes toward data privacy, may play a role in how effective interviewers are in administering the consent request and addressing respondents’ concerns. Findings based on interviewer assessments at the U.S. Census Bureau indicate that some interviewers share the same concerns about data privacy and confidentiality as their respondents and are even skeptical of the promise of confidentiality that they themselves give to respondents during survey recruitment (Mayer 2001). More direct measures of interviewer attitudes are needed to explore how these attitudes interact with those of respondents and influence respondents’ likelihood of linkage consent. A key challenge for future work will be to identify interviewer-level factors that can be leveraged during interviewer recruitment and training in ways that could improve respondents’ likelihood of consent and minimize selection bias.
A major strength of the present study is the use of rigorously collected population-based data obtained from respondents and interviewers over multiple waves of data collection. In addition, our study benefits from the joint modeling of respondent- and interviewer-level characteristics, which is rare in studies of consent. Several limitations of this study should be noted. The observational nature of the data sources is informative for hypothesis generation and exploration but limits our ability to make definitive statements about the factors affecting consent. Although our models were built to control for a wide range of possible confounders, it is possible that extraneous factors were not accounted for. Another potential limitation is that we studied a sample of panel respondents drawn from the older adult population. Panel respondents may be more cooperative than a cross-sectional sample of respondents, and this may reduce the generalizability of our findings. Furthermore, the consent mechanisms of older age groups may be quite different from those of younger age groups. We suspect that older people are more civic minded, yield higher response rates, and may be less concerned about privacy than younger adults, in which case these data may underestimate the consent problem. On the other hand, these panel respondents have demonstrated a vested interest in the survey (some respondents have participated in the HRS for almost 2 decades), and yet roughly one third of them did not consent to the linkage request. Their persistent cooperation in the HRS, combined with nontrivial resistance toward linkage requests, makes them an intriguing case for studying the underlying mechanisms of consent.
Footnotes
Declaration of Conflicting Interests
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially funded by the Alexander von Humboldt Foundation. The HRS (Health and Retirement Study) is sponsored by the National Institute on Aging (grant number NIA U01AG009740) and is conducted by the University of Michigan.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
