Abstract
This study investigates the reliability of Florida’s voter registration files through a phone survey, asking respondents to verify their records. We find 17.7% of registrants fail to verify at least one identifying piece of information. Applying the total survey error (TSE) framework, we classify these errors as due to coverage error, measurement error, or processing error. These inconsistencies create election administration and campaign inefficiencies, which lead to poorer voter experiences, and challenge the validity of some research based on these data. Furthermore, if registration records do not accurately capture the members of protected groups, the data are less helpful in both government monitoring and enforcement. We suggest voter registration forms should be treated like survey questionnaires so as to improve data quality with better form design, and that some vote overreport bias is attributable to limitations of voter file data, not to respondents’ vote misreporting.
State and local election officials maintain voter registration files that contain a wealth of information, including registrants’ names, birthdates, current addresses, and past vote histories. These data, which are critical to the administration of elections, are also extensively used by political campaigns to target messages and mobilize supporters. In addition, scholars use these data as sampling frames in voter persuasion and mobilization experiments (e.g., Gerber & Green, 2000), to make inferences about the effects of registration laws on who registers and votes (e.g., Fraga et al., 2018; Holbein & Hillygus, 2016; Shino & Smith, 2018), and to validate survey respondents’ self-reports (e.g., Ansolabehere & Hersh, 2012; Silver et al., 1986), among other uses. Importantly, voter registration lists determine who may cast ballots in an election, which means registration errors could either permit votes by those who are ineligible or, more likely, complicate or prevent voting by people with mismatches between their registration record and the identification they provide at the polls (Ansolabehere & Hersh, 2014). Thus, the accuracy of these data is important for the efficiency and fairness of election administration, measurement validity, and the ability of individuals to exercise their fundamental right to vote.
Despite the importance of these records for election administration, campaigns, research, and voters, some have questioned their reliability. After the American National Election Studies (ANES) discontinued its valiant, though frustrating and costly, efforts to validate respondents’ self-reports of turnout in the 1980s by checking official records, Traugott et al. (1992) warned that “. . . administrative records should be treated with some care” (p. 13). The 2002 Help America Vote Act’s requirement that states maintain and regularly update centralized electronic voting files has reduced the costs of voter validation efforts but has not completely eliminated concerns about their reliability (Berent et al., 2016; Burden, 2014; McDonald, 2007).
Here, we explore the reliability of information found in Florida’s voter registration file. Our insight is that a voter registration application is similar to a survey questionnaire, or even a ballot (Kimball & Kropf, 2005). We examine voter registration errors using the total survey error (TSE) framework (Groves & Lyberg, 2010; Weisberg, 2005) and a test/retest methodology (Guttman, 1945) in which we ask a sample of registered voters to verify their voter registration record. This approach mitigates the false negative and false positive matches that may arise from the record linkage methodologies used in prior vote validation efforts (Elmagarmid et al., 2007). Unlike a prior test/retest mail-back study (Ansolabehere et al., 2010), we employ an adaptive phone survey design to probe causal explanations as to why individuals may provide a survey response inconsistent with their administrative record. We find 17.7% of respondents fail to verify at least one field included in the publicly available Florida voter file, including name, address, birthdate, sex, or race. These inconsistencies create election administration and campaign inefficiencies, lead to poorer voter experiences, and challenge the validity of some prior research based on these data.
A Voter Registration Application as a Survey Questionnaire
Florida’s paper voter registration application asks for some open-ended responses (the registrant’s name, address, and other identifying information) and some closed-ended responses (gender, party affiliation, and race/ethnicity). The National Voter Registration Form asks for much of the same information, with some differences regarding party affiliation, race, and gender. Using Groves and Lyberg’s (2010) TSE framework, we classify three validity challenges that arise in voter registration records: coverage error, measurement error, and processing error. TSE has more elements, but we focus here on those endemic to the accuracy of voter registration records.
Coverage error occurs when the voter registration list does not accurately reflect the population who should qualify as registered voters in a given jurisdiction. In this context, overcoverage is commonly referred to as “deadwood”: registration records that have been rendered obsolete by changes in residence or death of the registrant (Ansolabehere & Hersh, 2014). Undercoverage, in this context, can include people who were improperly purged from the voter rolls, but who are legally entitled to continue voting in the jurisdiction. One particular challenge is to update voter registration addresses in a mobile population, as federal and state laws govern the process of updating and removing records of persons no longer eligible to vote at an address.
Measurement error occurs when respondents fail to give accurate responses to questions that are asked. Some measurement error might be attributable to respondents who satisfice by providing a quick answer in lieu of the best answer, or those who have difficulty understanding the question due to illiteracy or language barriers. Poorly designed forms can also induce measurement error, as when a crowded form causes a respondent to miss or misread a question, or when an abbreviation (“M” and “F”) is ambiguous or not understood by all respondents. Social constructs, such as race, are particularly susceptible to measurement error as they depend on social norms. Respondents may have multiple race identifications, or considerations, that lead to unstable responses in repeated surveys (Harris & Sim, 2002). Florida’s application is particularly vulnerable, as race and ethnicity are combined into a single question that forces a single response. In contrast, many survey organizations ask separate questions that permit respondents to self-identify their race and their Hispanic ethnicity, and the U.S. Census allows multiple responses to the race question rather than using a “multiracial” category.
Processing error occurs when election officials incorrectly enter data from voter registration applications. These errors arise from the operator’s misinterpretation of ambiguous responses (such as illegible handwriting), carelessness, or fatigue. Similar errors affect other election records. For example, a North Carolina legislator was erroneously recorded as having voted in both North and South Carolina until officials determined that his mother had signed the wrong poll book line (Ochsner, 2016).
Survey Design
To assess the relative frequency of these types of errors in Florida voter registrations, we conducted a survey based on a random sample of 60,000 registered voters selected from the February 2017 voter file. Our purpose and approach are similar to those of the Ansolabehere et al. (2010) study, which sent two first class mail pieces to a sample of registered voters in Florida and Los Angeles County, California. Our innovation is that we field an adaptive telephone survey that allows interviewers to ask follow-up questions of respondents whose survey answers differ from their registration records.
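The adaptive step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors’ interviewing software: the field names, the `normalize` helper, and the example record are all hypothetical and invented for the sketch.

```python
# Hypothetical field names; the actual survey instrument is not reproduced here.
FIELDS = ("name", "address", "birthdate", "sex", "race")

def normalize(value):
    """Lowercase and collapse whitespace so formatting differences are not counted as mismatches."""
    return " ".join(str(value).lower().split())

def fields_needing_follow_up(record, answers):
    """Return the fields where the respondent's answer differs from the voter file record."""
    return [
        field
        for field in FIELDS
        if field in answers and normalize(answers[field]) != normalize(record[field])
    ]

# Invented example record and answers, for illustration only.
record = {"name": "Pat Q Doe", "address": "12 Oak St", "birthdate": "1970-03-14",
          "sex": "U", "race": "Other"}
answers = {"name": "pat q doe", "address": "98 Pine Ave", "birthdate": "1970-03-14",
           "sex": "F", "race": "Other"}

mismatches = fields_needing_follow_up(record, answers)
print(mismatches)  # fields that would trigger a follow-up question
```

In this invented case the interviewer would be prompted to probe the address and sex fields, while the name is treated as verified despite the difference in capitalization.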
Although telephone number is labeled as optional on Florida’s registration form, 31.6% of our sample registrants provided one. L2, a voter list vendor, augmented our sample with phone numbers, yielding a total of 31,725 phone numbers, or 52.9% of our sample. From that sampling frame, we completed 402 interviews out of 6,227 unique phone numbers called, a 6.5% response rate. Using computer-assisted telephone interview software with voter file information, bilingual interviewers asked respondents questions identical to those on the voter registration application. If a respondent provided an inconsistent answer, we asked follow-up questions to investigate these failures and help ensure valid responses to the survey. A more complete description of our methodology is found in the Supplemental Material.
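The sampling figures in this paragraph can be checked with simple arithmetic. The sketch below uses only the counts reported above; the variable names are ours, and the response rate is computed as the text defines it (completed interviews divided by unique numbers called).

```python
sample_size = 60_000        # registrants drawn from the voter file
numbers_available = 31_725  # records with a phone number after vendor augmentation
numbers_called = 6_227      # unique phone numbers actually dialed
completed = 402             # completed interviews

phone_coverage = numbers_available / sample_size
response_rate = completed / numbers_called

print(f"phone coverage: {phone_coverage:.4f}")
print(f"response rate:  {response_rate:.4f}")
```

Both ratios agree with the reported 52.9% and 6.5% after rounding to one decimal place.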
Results
In all, 71 of 402 respondents (17.7%) failed to verify at least one of their name, address, birthdate, sex, or race. We report unweighted statistics, as we analyze small frequency subsamples that could be distorted by weights. Ansolabehere et al. (2010) report a similar estimated total error rate in their mail-back survey, which included name, address, birthdate, sex, race, and questions regarding vote history and party affiliation. We organize our discussion of the results presented in Table A1 in the Supplemental Material by grouping questions into coverage, measurement, and processing errors. Likely, processing errors are present in the coverage and measurement error topics, but we cannot always easily disentangle processing errors from these other error sources.
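Because the headline rate rests on 402 interviews, it can help to attach a sampling interval to it. The sketch below computes a 95% Wilson score interval for 71 of 402. This interval is our addition, not a figure reported in the study, and it reflects sampling variability only, assuming simple random sampling and ignoring nonresponse.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95% coverage)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

rate = 71 / 402
low, high = wilson_interval(71, 402)
print(f"rate={rate:.4f}, interval=({low:.4f}, {high:.4f})")
```

Under these assumptions the interval runs from roughly 14% to 22%, so the point estimate of 17.7% should be read with that margin in mind.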
Coverage Errors
Registration Status
Eight (2.0%) of the respondents sampled who confirmed other personal identifying information reported they were not registered to vote in Florida. Upon further questioning, five reported being registered in the past year, and two reported being registered within the past 3 years. None reported being registered to vote in another state or Florida county. It is possible that these respondents mistakenly thought they had been removed from the voter rolls.
Measurement Errors
Name
Four of our respondents (1.0%) reported that their name was listed incorrectly. Two are likely processing errors: one reported not having a middle name, though one was recorded in the voter file, and another reported that their first name was misspelled. The other two are likely measurement issues: two women reported a different last name than the one recorded in the official voter file, which could be due to a marriage or divorce, although we did not include a follow-up question to that effect.
Addresses
Incorrect address information may indicate poor data integrity. In our phone survey, we find that 22 respondents, or 5.5% of the registrants who answered that question, reported a different physical address than the one recorded in the voter file. Measurement error is the explanation if a voter’s address was accurate at the time of registration but has since changed; processing error is the explanation if the voter provided the wrong address at the time of registration or the transcriber entered the address incorrectly from the registration form. Our findings suggest that, if address errors occur at a similar rate nationally, there may be millions of similarly problematic records.
Gender
Thirteen respondents (3.2%) reported that the gender designation in the voter registration record was incorrect. Of these, 12 were recorded with an “unknown” gender, but they provided a male or female response in our survey. The “unknown” gender on the official administrative record could be indicative of a respondent having an ambiguous or nonbinary gender at the time of registration, but it is more likely the result of an applicant simply failing to mark a response to the optional gender question. Although most new registrants provide an answer to the gender question, as we noted above, the gender “M” and “F” response boxes are small and are not positioned near other required information on the form. Applicants may thus inadvertently miss this question, fail to interpret the abbreviations “M” and “F,” or simply choose to skip this question as it is optional.
Race
The most frequent verification failure in our survey is with respect to race and ethnicity. As noted above, the Florida voter registration application offers six closed-ended response items to a single race and ethnicity question, plus an “Other” option with a very small box to write in one’s racial identification, while the National Voter Registration Form instructs registrants who opt to answer this question to write in one of the same seven options (including “other,” but with no additional space for a more specific response). Some 96.2% of all registrants supplied an answer corresponding to one of the six provided categories, 1.5% checked “other,” and 2.3% skipped the question on the registration form. In our survey, we first asked respondents to self-identify their race and ethnicity with response items identical to those on the voter registration form. Respondents who did not verify being “Hispanic” were asked a separate Hispanic ethnicity question. In response to our question that mimicked the question on the registration application, 37 of the 402 (9.2%) respondents initially reported a race or ethnicity that did not match their registration. Five respondents who were recorded as Hispanic in the voter file self-reported an identification other than Hispanic, but all five self-identified as Hispanic in response to our follow-up question. Similarly, we observed 22 of the 37 inconsistent self-reports were for registrants listed in the voter file as “Multiracial” or “Other.” Thus, a majority of the inconsistencies were from respondents with multiple identifications, either Hispanics who also identify as Black or White, or people who reported a mixed racial heritage. Those voters might have provided truthful answers (or considerations) on both the registration form and in our survey, as different identifications may have been salient (or made salient by the question format) to the registrant between the time of registration and the time of the survey.
Party Registration
Party registration is also a voluntary category on the voter registration form. Thirteen of 402 (3.2%) respondents failed to verify their party registration. There were no clear patterns among these respondents, as three reported their correct party registration was Democrat, four reported Republican, four reported No Party Affiliation, and one reported a Minor party affiliation. The initial registration dates for these respondents ranged from 1958 to 2016, or from 1 to 59 years prior to our survey. The median respondent in this group registered in 2007, so it is likely that intervening party system dynamics in Florida produced mismatches between these voters’ original party registrations and their current party identifications. Regardless of the cause, Florida’s status as a closed-primary state means that these voters will not be able to participate in their preferred party’s primary.
2016 General Election Vote
Most surveys find a greater percentage of individuals reporting that they voted than aggregate turnout statistics indicate. While a number of factors contribute to this phenomenon, overreporting (nonvoters claiming to have voted) usually accounts for most of the gap between actual and reported turnout (Jackman, 2019). Our survey fits this pattern, as 14 respondents who self-reported 2016 general election participation did not have a vote history record in Florida, while only one respondent reported not voting when a voting record existed. Our novel finding is that five of these 14 “overreporters” said they voted in another state. While we did not verify whether they actually voted outside of Florida, this does suggest some overreporting in election surveys may be attributed to respondents registering and voting elsewhere, for whatever reason. An important implication concerns vote-validation research, which finds that overreporters tend to be high-propensity voters (Ansolabehere & Hersh, 2012): this pattern may be an artifact of people truthfully reporting their vote, but in another state.
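The distinction drawn here between apparent overreporting and voting elsewhere can be written as a small classification rule. The sketch below is illustrative only: the function name and category labels are hypothetical, and the out-of-state vote is self-reported rather than verified, as in the survey.

```python
def classify_turnout(self_report_voted, record_shows_vote, says_voted_elsewhere=False):
    """Compare a self-reported vote with the Florida vote history record."""
    if self_report_voted and record_shows_vote:
        return "validated voter"
    if self_report_voted and says_voted_elsewhere:
        return "possible out-of-state voter"  # unverified self-report
    if self_report_voted:
        return "apparent overreport"
    if record_shows_vote:
        return "apparent underreport"
    return "validated nonvoter"

# Counts reported in the text.
apparent_overreporters = 14
said_voted_elsewhere = 5
share_elsewhere = said_voted_elsewhere / apparent_overreporters
print(f"share citing another state: {share_elsewhere:.3f}")
```

By these counts, a little over a third of the apparent overreporters offered an explanation that a Florida-only validation would misclassify.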
Processing Errors
Birthdate
Anticipating that many respondents may be concerned about potential identity theft, we asked three sequential questions regarding birth year, birth month, and day of birth rather than directly asking respondents to confirm their recorded birthdate. Sixty respondents refused to answer any of these questions. Among the 342 valid responses to these questions, 11 (3.2%) responded with information inconsistent with their voter registration record. Some responses appear to be processing errors, with a single incorrect digit in a day or a year.
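A single incorrect digit is the kind of discrepancy that can be flagged mechanically. The sketch below is a hypothetical check, not the procedure used in the study: it treats birthdates as `YYYYMMDD` strings (an assumed format) and reports whether two dates differ in exactly one character position. The example dates are invented.

```python
def differs_by_one_digit(recorded, reported):
    """True when two same-length date strings differ in exactly one position."""
    if len(recorded) != len(reported):
        return False
    return sum(a != b for a, b in zip(recorded, reported)) == 1

# Invented dates in an assumed YYYYMMDD format.
likely_typo = differs_by_one_digit("19700314", "19700324")   # one digit of the day differs
unlikely_typo = differs_by_one_digit("19700314", "19851102")  # many digits differ

# Share of valid birthdate responses that were inconsistent, from the counts in the text.
inconsistent_share = 11 / 342
print(likely_typo, unlikely_typo, f"{inconsistent_share:.4f}")
```

A mismatch that passes this check is more plausibly a keying error than a different person or a misremembered date, although the check alone cannot establish the cause.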
Discussion
Big data provide opportunities for small errors. We find in our phone survey of Florida’s voter registration database that 17.7% of survey respondents sampled from the Florida voter file failed to verify at least one piece of their identifying information on their voter registration record: name, birthdate, address, gender, or race. This rate is similar to a 2010 mail-back survey in Florida finding 18% of respondents failed to verify similar information, in addition to vote history and race (Ansolabehere et al., 2010). We recognize our survey is limited in that we completed only 402 interviews, but the errors identified in our small sample are troubling if we scale them to the hundreds of millions of records nationally. Nevertheless, there are two advantages to our adaptive survey design approach when probing why respondents fail to verify their voter registration records. First, we have greater confidence that the issues we observe are not caused by random respondent errors when respondents provide a causal justification for erroneous data on their record. Second, we can better understand the data-generating process that leads to potential recording errors in registration files.
The proportions of errors that Ansolabehere et al. (2010) and we find in any one field (race, birthday, vote history) are small, but they suggest caution for both researchers and election administrators (Ansolabehere & Hersh, 2014). The TSE framework can alert researchers to the presence of measurement error in concepts like self-identified party registration, race, and gender on voter registration forms. Responses to questions about racial identity not only produce the most frequent verification failures but also reflect a troubling aspect of the reliability of these data. Southern states tend to ask registrants’ race so that these states, the Department of Justice, and the courts can use these data to evaluate whether voting rights violations have occurred. If registration records do not accurately reflect how registrants who are members of protected groups identify themselves, these data are less helpful in monitoring and enforcement.
Examining the accuracy of voter registration records through the TSE lens of processing error is another area worthy of study, which may help election administrators improve voter file accuracy. In many ways, filling out a voter registration form is akin to filling out a ballot (Kimball & Kropf, 2005; Kropf & Kimball, 2012). Considering the form as a survey may yield low-cost solutions that improve election administration. For example, understanding how sources of voter registration input (from motor vehicle offices, registration drives, or paper or online registration) affect error rates can illuminate trade-offs between costs and data quality. Survey methodologists find questionnaire design and mode affect data quality (Dillman & Christian, 2005), which suggests that different modes of data collection by paper, online registrations, and interactions with humans, such as department of motor vehicles (DMV) clerks or campaign workers, might affect processing error rates and measurement error biases. But there are trade-offs with the TSE framework, including constraints such as “costs, ethics, and time” (Lyberg & Weisberg, 2016, p. 27). Thinking of the voter registration application as a survey instrument thus helps illuminate potential biases of these data, assists in creating more savvy consumers of these data, and supports more reliable records for election administration.
Supplemental Material
APR_appendix – Supplemental material for Verifying Voter Registration Records
Supplemental material, APR_appendix for Verifying Voter Registration Records by Enrijeta Shino, Michael D. Martinez, Michael P. McDonald and Daniel A. Smith in American Politics Research
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The University of Florida’s Informatics Institute provided a $50,000 seed grant to conduct the survey for this study.
Supplemental Material
Supplemental material for this article is available online.
