Abstract
In this article, we examine unintentional and intentional anomalies in real voter registration data from a U.S. state. In collaboration with the Idaho Secretary of State’s office, we identify and characterize anomalies including missing values in required fields, abnormal age entries, unspecified gender types, non-unique driver’s license numbers, and formatting errors. We also present techniques, including a tailored approximate string matching algorithm, for detecting potential intentional anomalies in the real data. Understanding these anomalies is crucial for ensuring the integrity and accuracy of voter registration data. To that end, we have developed software, in ongoing partnership with the Idaho Secretary of State’s office, that identifies many of these anomalies. This software-based approach has proven effective and can be adapted for use in other states.
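To make the kinds of checks named above concrete, the following is a minimal sketch, not the software described in this article. It assumes a hypothetical record layout (`name`, `birth_date`, `license`), hypothetical age bounds, and plain Levenshtein edit distance as a stand-in for the tailored approximate string matching algorithm, whose details are not given here.

```python
from collections import Counter
from datetime import date

# Hypothetical required fields; a real registration schema will differ.
REQUIRED_FIELDS = ("name", "birth_date")


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def field_anomalies(records, today, min_age=18, max_age=110):
    """Flag missing required fields, out-of-range ages, and repeated license numbers.

    The age bounds are illustrative assumptions, not values from the article.
    """
    flags = []
    license_counts = Counter(r.get("license") for r in records if r.get("license"))
    for idx, rec in enumerate(records):
        for field in REQUIRED_FIELDS:
            if not rec.get(field):
                flags.append((idx, "missing " + field))
        born = rec.get("birth_date")
        if born:
            # Subtract one year if this year's birthday has not occurred yet.
            age = today.year - born.year - ((today.month, today.day) < (born.month, born.day))
            if not min_age <= age <= max_age:
                flags.append((idx, "abnormal age"))
        lic = rec.get("license")
        if lic and license_counts[lic] > 1:
            flags.append((idx, "non-unique license"))
    return flags


def near_duplicates(records, max_distance=1):
    """Return index pairs with the same birth date and nearly identical names."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            a, b = records[i], records[j]
            if a.get("birth_date") != b.get("birth_date"):
                continue
            name_a, name_b = a.get("name", ""), b.get("name", "")
            if name_a != name_b and edit_distance(name_a, name_b) <= max_distance:
                pairs.append((i, j))
    return pairs


# Made-up records for illustration only.
records = [
    {"name": "JOHN SMITH", "birth_date": date(1980, 5, 1), "license": "AB123"},
    {"name": "JON SMITH", "birth_date": date(1980, 5, 1), "license": "AB123"},
    {"name": "MARY JONES", "birth_date": date(1850, 1, 1), "license": ""},
]
flags = field_anomalies(records, today=date(2024, 1, 1))
pairs = near_duplicates(records)
```

The pairwise comparison in `near_duplicates` is quadratic in the number of records, so a statewide file would need some form of blocking (for example, grouping by birth date first) before a comparison like this is practical.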
