Abstract
Policymakers are interested in the long-term services and supports (LTSS) needs of people living with dementia. The National Core Indicators-Aging and Disability (NCI-AD) survey is conducted to evaluate LTSS care needs. However, dementia reporting in NCI-AD varies across states, and is either obtained from state administrative records or self-reported during the survey. We explored the implications of identifying dementia from administrative records versus self-report. We analyzed 24,569 NCI-AD respondents aged 65+, of whom 22.4% had dementia. To assess dementia accuracy by data source, we fit separate logistic regression models using the administrative and self-reported subsamples. We applied model coefficients to the population whose dementia status came from the opposite source. Using the administrative model to predict self-reported dementia resulted in higher sensitivity than using the self-report model to predict administrative dementia (43.8% vs. 37.9%). The self-report model’s diminished sensitivity suggests administrative records may capture cases of dementia missed by self-report.
What this paper adds
• Characterizes ascertainment of dementia in the National Core Indicators-Aging and Disability survey (NCI-AD).
• Applies predictive modeling techniques to empirically compare different methods of measuring dementia.
Applications of study findings
• Analysts using NCI-AD data can understand systematic bias introduced by self-reported dementia, and adjust models and estimates accordingly.
• NCI-AD can work with state partners to increase the number of states reporting administrative dementia records in future waves.
Introduction
Providing high-quality long-term services and supports (LTSS) to people living with dementia is a national priority (Wagner et al., 2021). Medicaid is the largest payer of LTSS, which encompasses an array of services including institutional care, home health care, homemaker services, and adult day services (Cacchione, 2020). While the design and structure of LTSS programs vary greatly between states, all states aim to design LTSS with the goal of helping people living with dementia meet their care needs (Cacchione, 2020; Cohen & Feder, 2018).
States may encounter clinical, conceptual, and administrative challenges to improving LTSS for people living with dementia. Clinically, Medicaid beneficiaries with dementia report poorer health and are more likely to live alone than their counterparts who are not receiving Medicaid (Garfield et al., 2016). In addition, care provided to people living with dementia is, on average, more difficult and time intensive. Conceptually, there is no consensus on the definition of quality in LTSS; however, there is growing recognition that person-centeredness is one key domain of LTSS quality, especially for persons living with dementia (Applebaum & Mahoney, 2016). Given state direction of LTSS program design and delivery, there are administrative challenges to collecting data on program outcomes, including quality (Applebaum & Mahoney, 2016). Dementia is also difficult to diagnose, which means identifying people living with dementia and their outcomes in data is subject to misclassification (Amjad et al., 2018; Borson et al., 2013; Eichler et al., 2014). When states lack high-quality data on dementia status, they may be unable to accurately estimate dementia prevalence, and consequently, may underestimate the dementia-specific considerations needed to improve their LTSS programs (Applebaum & Mahoney, 2016).
To fill data gaps and enable states to better understand LTSS outcomes, the Human Services Research Institute (HSRI) and Advancing States developed the National Core Indicators-Aging and Disability (NCI-AD) survey. The NCI-AD survey is administered by state Medicaid, aging, and disability agencies to assess the quality of available services and unmet needs of the people they serve (NCI-AD, n.d.). NCI-AD is fielded to people living in nursing homes or receiving home and community-based services (HCBS) through Medicaid, state-specific programs, or the Older Americans Act. State participation in NCI-AD is voluntary. Annual data collection began in 2015, and as of 2019, NCI-AD surveys have captured information from approximately 54,000 respondents in 23 states. Approximately 15% of NCI-AD respondents are documented as living with dementia; however, the methods used to identify people living with dementia vary by state. Some state partners submit administrative records of dementia status (e.g., ICD-10 diagnosis codes or I4200 or I4800 Minimum Data Set fields). In other states, dementia status is ascertained during the survey by self-report. This variation in how dementia status is collected may introduce measurement error and inconsistent classification when evaluating LTSS outcomes for people living with dementia.
Both administratively determined and self-reported dementia status have been shown to result in dementia misclassification. For example, Grodstein et al. (2022) found that in a cohort of fee-for-service (FFS) Medicare beneficiaries with clinically determined dementia, claims-based algorithms detect dementia with high specificity (88–93%) but lower sensitivity (64–79%). McCarthy and colleagues (2022) found that various claims-based algorithms also detected dementia with low sensitivity (31.3–56.8%) when compared to dementia determined by the Health and Retirement Study (HRS) cognitive assessments.
Self-report of dementia status is also prone to misclassification. Prior research using national survey data has found that respondents underreport dementia diagnoses. McGrath et al. (2021) found that over 90% of respondents classified as having possible dementia using cognitive assessments in the HRS did not report a dementia-related diagnosis. Additionally, Savva and Arthur (2015) found that 58% of respondents with dementia determined using the HRS Aging, Demographics and Memory Study (ADAMS) assessments did not report a dementia diagnosis. Finally, a study using data from the National Health and Aging Trends Study (NHATS) found that among 585 respondents classified as having “probable dementia” via cognitive assessment, 39.5% did not have a recorded dementia diagnosis in their Medicare FFS claims, and an additional 19.2% had a recorded dementia diagnosis but self-reported never being diagnosed with dementia (Amjad et al., 2018). While these findings show that administrative and self-reported dementia underestimate dementia prevalence, more research is needed to understand how well these sources classify dementia both relative to one another and in the context of LTSS.
To our knowledge, there has been no systematic evaluation of measurement heterogeneity of dementia in NCI-AD survey data. To fill this gap, we aimed to use predictive modeling approaches to assess measurement heterogeneity of dementia diagnosis using NCI-AD data. Specifically, we applied a parallel modeling approach, where we treated both the administrative and self-report subpopulations as having their dementia status determined via a gold standard, and then used each subpopulation to predict dementia status for the opposite subpopulation. We then compared the relative performance of these models in order to detect differences in dementia classification. We hypothesized that the models fit with administrative and self-reported dementia diagnosis (yes or no) would perform similarly if the two data sources captured dementia status with equal efficiency. If one model outperforms the other, this suggests that the better performing data source more reliably captures dementia status.
Materials and Methods
Data
We used 4 years of NCI-AD data (2015–2018) for this analysis. The NCI-AD survey collects annual, cross-sectional data through in-person interviews. In order to use the NCI-AD survey, states must agree to interview at least 400 people who receive LTSS funded by Medicaid, other state programs, or the Older Americans Act. The survey consists of four sections: 1) pre-survey form, 2) background information section, 3) full in-person survey, and 4) the interviewer feedback form. If an LTSS service recipient is unable to participate in the survey, a proxy is interviewed in their place. A service recipient can also ask a proxy to help answer specific questions. If an observation had someone other than the service recipient respond to more than 50% of the questions, we treated the observation as a proxy respondent.
For our analyses, we extracted data from the NCI-AD background information section. The background information section collects demographic and service use information. When possible, HSRI pre-populates the background information with data from states’ administrative records. If items cannot be obtained from administrative records, surveyors are instructed to ask questions during the in-person survey. The in-person survey consists of approximately 90 questions that assess LTSS quality in domains such as safety, community inclusion, and care planning and coordination.
We excluded respondents under age 65 (n = 21,045), who were missing information regarding the type of LTSS waiver they were enrolled in (n = 2145), who were missing dementia status (yes or no; n = 3279), who had a traumatic brain injury (n = 2196), and who lived with a developmental disability (n = 486). This resulted in a final analytic sample of 24,569 respondents from 23 states.
Measures
Dementia
We used the background section of the NCI-AD survey to determine whether a respondent had a dementia diagnosis (yes or no) and the source of diagnostic data (state administrative record or self-reported). When a dementia diagnosis is not populated using state administrative records, the surveyor asks the respondent the following question: “has the person been told that he/she has Alzheimer’s disease or other dementia (decline in memory or other thinking skills that may reduce a person’s ability to perform everyday activities).” We categorized respondents who answered “yes” as having dementia and respondents who answered “no” as not having dementia.
Predictors
For this analysis, we selected demographic and memory indicators available from NCI-AD survey data that were consistently collected across the 4 years of data. Demographic predictors included respondent age, gender, race/ethnicity, marital status, language (English, Spanish, other), whether a proxy completed the survey, and ZIP code density (metropolitan, micropolitan, small town, and rural). Race is reported using six race/ethnicity indicators: American Indian or Alaska Native, Asian or Pacific Islander, Black or African American, white, Hispanic or Latino, and other/multiracial. Our study team manually reviewed indicator patterns and a free text other/specify field to code respondents who reported more than one race/ethnicity.
Functional need predictors included indicators for whether a respondent lived in a nursing home during the interview, has a physical disability, had 2+ falls in the previous 6 months, needs round-the-clock care, and whether the respondent indicated they forget things (yes or no) or talked to a medical provider about forgetting things (yes or no).
Analysis
We descriptively evaluated the frequencies of predictors for the sample overall and by source of dementia status. We then applied our predictive modeling approach which is illustrated in Figure 1. First, we trained models using 70% of the population whose dementia status was administratively determined and then tested these models with the remaining 30% of the population (i.e., internal validation data; Model A in Figure 1). We also tested the models on the entirety of the population whose dementia status was obtained from self or proxy report (i.e., external validation data; Model B in Figure 1). Second, we trained models to predict dementia using 70% of the survey population, tested these models with the remaining 30% of the survey population (Model D in Figure 1) and on the entirety of the population whose dementia status came from administrative records (Model C in Figure 1). We applied this parallel predictive modeling strategy hypothesizing that the models would perform similarly, regardless of being fit with administrative or survey data, if the two data sources captured dementia status with equal efficiency.
Figure 1. Dementia status source subpopulations and predictive modeling approach.
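The four train/test pairings described above can be laid out in a short Python sketch. This is an illustration of the design only, not the SAS code used for the analysis: the function names, the random seed, and the rounding of the 70% cut point are assumptions, and the records are stand-in integers rather than NCI-AD observations.

```python
import random

def split_70_30(records, seed=0):
    """Shuffle a copy of the records and split it 70% / 30% (assumed rounding)."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def validation_plan(admin, self_report, seed=0):
    """Return (training data, test data) for validations A-D in Figure 1."""
    admin_train, admin_test = split_70_30(admin, seed)
    self_train, self_test = split_70_30(self_report, seed)
    return {
        "A": (admin_train, admin_test),        # administrative model, internal 30%
        "B": (admin_train, list(self_report)), # administrative model, all self-report
        "C": (self_train, list(admin)),        # self-report model, all administrative
        "D": (self_train, self_test),          # self-report model, internal 30%
    }

# Stand-in record IDs sized like the two subpopulations reported in the Results.
plan = validation_plan(range(10540), range(14029))
for label, (train, test) in plan.items():
    print(label, len(train), len(test))
```

Note that the external validations (B and C) use the entire opposite-source population, so no external record is ever used to fit the model it is tested on.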
We fit logistic regression models using all candidate predictors to train our 70% administrative and self-report models. We also explored applying a LASSO regression with a binomial distribution, logit link, and the Bayesian information criterion (BIC) as the selection criterion in order to create a more parsimonious prediction model. LASSO regression applies a shrinkage constant to coefficients, driving them towards the null. Across LASSO iterations, the shrinkage factor is made smaller, allowing more parameters into the model until a full model is reached (Fu, 2003). The iteration with the lowest BIC score was selected for the predictive model (Kuha, 2004).
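The BIC-based choice among LASSO iterations can be illustrated with a minimal Python sketch. It uses the standard definition BIC = k ln(n) − 2 ln(L); the path values below are hypothetical, and how the SAS procedure counts parameters (for example, whether the intercept is included in k) is an assumption here rather than something stated in the text.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def select_by_bic(path, n_obs):
    """Return the index of the lowest-BIC step, plus all BIC scores.

    `path` is a list of (log_likelihood, n_nonzero_params) tuples, one per
    LASSO iteration, ordered from most to least shrinkage.
    """
    scores = [bic(ll, k, n_obs) for ll, k in path]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Hypothetical path: fit improves as shrinkage relaxes, but with diminishing
# returns, so the BIC penalty eventually outweighs the gain in likelihood.
path = [(-520.0, 2), (-480.0, 5), (-476.0, 9), (-475.0, 14)]
best_step, scores = select_by_bic(path, n_obs=1000)
print(best_step, [round(s, 1) for s in scores])
```

In this toy path the second step is selected: later steps add parameters faster than they improve the log-likelihood, which is the parsimony trade-off the BIC is meant to capture.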
We classified observations with ≥50% predicted probability of dementia as having dementia, and compared predicted dementia status to the dementia status reported in the survey using sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). We descriptively examined these metrics across the four model validations to detect differences in dementia classification in the two data sources. We also produced and visually compared receiver operating characteristic (ROC) curves for the four modeling approaches to further examine differences in model performance.
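As a minimal sketch of these definitions, the Python below applies a 50% threshold and computes the four accuracy measures from the resulting 2 × 2 agreement table. The function names and the six toy observations are hypothetical and serve only to show the calculation.

```python
def classify(probabilities, threshold=0.5):
    """Label an observation 1 (dementia) if its predicted probability >= threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]

def accuracy_measures(observed, predicted):
    """Sensitivity, specificity, PPV, and NPV from observed vs. predicted labels."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true cases correctly flagged
        "specificity": ratio(tn, tn + fp),  # non-cases correctly cleared
        "ppv": ratio(tp, tp + fp),          # flagged cases that are true cases
        "npv": ratio(tn, tn + fn),          # cleared cases that are true non-cases
    }

# Hypothetical predicted probabilities and recorded dementia status.
probabilities = [0.9, 0.6, 0.5, 0.4, 0.2, 0.1]
observed = [1, 1, 0, 1, 0, 0]
measures = accuracy_measures(observed, classify(probabilities))
print(measures)
```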
Missingness of predictors ranged from 0% to 5%, and we addressed this using single imputation by the discriminant function method with categorical variable effects included (SAS Institute Inc., 2020). All analyses were performed using SAS version 9.4. Our Institutional Review Board deemed the work exempt from human research subject protocols.
Results
Our analytic sample consisted of 24,569 respondents, of whom 10,540 (42.9%) had data on dementia diagnosis (yes or no) from administrative records and 14,029 (57.1%) self-reported their dementia status. Among the 10,540 respondents with prepopulated data from administrative records, 3043 (28.9%) had a dementia diagnosis. Among the 14,029 respondents without prepopulated data from administrative records, 2459 (17.5%) had a self/proxy reported dementia diagnosis.
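The percentages in this paragraph, and the overall 22.4% dementia prevalence given in the Abstract, follow directly from the reported counts. The short Python check below reproduces them; the variable names are illustrative.

```python
# Counts reported in the Results.
admin_n, admin_dementia = 10540, 3043
self_n, self_dementia = 14029, 2459
total_n = admin_n + self_n

def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

summary = {
    "administrative share": pct(admin_n, total_n),
    "self-report share": pct(self_n, total_n),
    "administrative prevalence": pct(admin_dementia, admin_n),
    "self-report prevalence": pct(self_dementia, self_n),
    "overall prevalence": pct(admin_dementia + self_dementia, total_n),
}
for name, value in summary.items():
    print(f"{name}: {value}%")
```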
Predictor Frequencies by Dementia Status Source in NCI-AD 2015-2018, Respondents age 65+.
*p-values comparing frequencies in the administrative and self-report subpopulations generated using chi-square test.
Dementia Prediction Using Administrative Data
In the full logistic regression model fit with the same 70% training subsample, proxy status, older age, nursing home residence, round-the-clock care, and discussing forgetting things with a medical provider were significantly and positively associated with dementia diagnosis. Speaking a non-English language, Black race relative to white, and physical disability were significantly and negatively associated with dementia in the administrative model. All coefficients selected in the LASSO model had the same directionality as, but lower magnitude than, those observed in a full multivariable logistic regression model (Supplementary Table 2). Due to the similar performance of the logistic and LASSO regression models, we focus on the logistic model results in the interest of interpretability.
Logistic Predictive Model Validation Agreement Frequencies and Accuracy Measures.
PPV = positive predictive value; NPV = negative predictive value.
ROC curves for each of the model validations are displayed in Figure 2. The area under the curve (AUC) was generally high for all ROC curves examined. Applying the administrative model to administrative data resulted in an AUC of .78, and applying the same administrative model to a self-reported dementia population produced an AUC of .82.
Figure 2. Receiver operating characteristic curves for logistic model validation data.
Dementia Prediction Using Self-Report Data
Multivariate Logistic Training Model Coefficients for 70% Administrative and Self-report Dementia Source NCI-AD Populations.
As above, we report the predictive accuracy of the logistic model. Applying coefficients from the logistic model fit with the 70% self-report survey training sample to the 30% self-report validation sample (see D in Figure 1) resulted in sensitivity of 49.7%, specificity of 96.4%, PPV of 74.7%, and NPV of 90.0%. When applying these same coefficients to predict dementia status in the administrative population (see C in Figure 1), sensitivity, specificity, PPV, and NPV all decreased (37.9%, 93.7%, 70.9%, and 78.8%, respectively).
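Because the external validation for Model C used the entire administrative population, its PPV and NPV can be cross-checked from the reported sensitivity, specificity, and the administrative dementia prevalence (3043 of 10,540). The Python sketch below applies the standard Bayes relationship between these quantities; the function name is illustrative, and the inputs are the rounded percentages from the text, so the result is an approximate consistency check rather than a reconstruction of the underlying agreement table.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV implied by sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence                  # true-positive proportion
    fn = (1.0 - sensitivity) * prevalence          # false-negative proportion
    tn = specificity * (1.0 - prevalence)          # true-negative proportion
    fp = (1.0 - specificity) * (1.0 - prevalence)  # false-positive proportion
    return tp / (tp + fp), tn / (tn + fn)

# Model C: self-report model applied to the administrative population.
admin_prevalence = 3043 / 10540
ppv, npv = predictive_values(0.379, 0.937, admin_prevalence)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")  # prints: PPV = 70.9%, NPV = 78.8%
```

The implied values agree with the reported 70.9% and 78.8%. The same relationship shows why NPV is lower in the administrative population: with a higher dementia prevalence, a given number of missed cases makes up a larger share of those predicted not to have dementia.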
ROC plots for these model validations are displayed in Figure 2. Applying the self-report model to the 30% self-report subsample resulted in an AUC of .86, and applying the same self-report model to the administrative dementia population produced an AUC of .76.
Discussion
We compared the predictive accuracy of models fit with different data sources (administrative vs. self-report) used to identify people with dementia in the NCI-AD survey, a voluntary effort by states to evaluate the quality of their LTSS programs. NCI-AD provides valuable data for policymakers and researchers seeking to evaluate the quality of LTSS for vulnerable beneficiaries, including people living with dementia. However, differences in state administrative record keeping, methods for obtaining medical diagnoses, and missing diagnosis data present challenges for identifying NCI-AD participants with dementia. We provide the first assessment of dementia accuracy in NCI-AD by source of diagnostic data. Of note, administrative records may be more reliable than self-report when identifying NCI-AD respondents with dementia; yet, only 42.9% of respondents had any administrative data related to their diagnosis. As more states participate in NCI-AD, it is important to ensure states embed their administrative records within the survey. Practically, our findings also provide an approach for investigators to impute dementia status for NCI-AD respondents who have missing diagnoses. Investigators should be aware that using the logistic coefficients to impute dementia status will likely underestimate the true number of people living with dementia.
Overall, our predictive models had low sensitivity when tested using the internal and external validation samples. These findings indicate our models do not reliably identify many NCI-AD respondents who either had a record of dementia or reported a dementia diagnosis. It is not uncommon for dementia prediction algorithms to have low sensitivity. For example, the sensitivity of algorithms developed using the HRS to predict dementia was as low as 53% (Gianattasio et al., 2019), and a Medicare claims algorithm produced a sensitivity of 51% (Zhu et al., 2019). Previous HRS models have incorporated cognitive assessment data into their prediction algorithms. NCI-AD does not perform or collect cognitive assessments, so it is unsurprising that our models produced lower sensitivity than HRS-based models. HRS-based models were also able to draw on a much larger set of demographic and socioeconomic predictors that are not collected in NCI-AD, such as educational attainment (Gianattasio et al., 2019), which has been shown to be associated with dementia diagnosis and cognitive performance (Contador et al., 2017; Lövdén et al., 2020; Xu et al., 2016). NCI-AD’s focus is on understanding LTSS consumers’ experience of care and the survey has limited clinical and demographic measures. Predictive models from HRS compared to our study highlight the importance of clinical data and cognitive assessments to more accurately predict dementia. Predictive models should also include socioeconomic measures such as education or literacy, since cognitive assessments may be biased by these measures (Ramos-Henderson et al., 2022).
The NCI-AD-based dementia prediction models performed comparably to models fit with other data sources, but we also sought to explore the implications of combining different sources of dementia status. To understand the implications of combining self-reported and administratively determined dementia status, we assessed the relative predictive accuracy of the two sources by running parallel predictive models. In this approach, we treated each of these sources as the gold standard for prediction. As stated earlier, all models resulted in high specificity, regardless of which sub-sample was used to fit the model or which data set the model was applied to. However, there are evident differences in sensitivity between the two data sources. The predictive model based on administrative data produced a sensitivity of 41.0% in the internal validation sample, and sensitivity increased modestly to 43.8% when applied to cases with self-reported dementia diagnosis. In contrast, the model fit with the self-report sample produced a sensitivity of 49.7% when applied to other self-report cases, but sensitivity decreased to 37.9% when applied to the administrative population. These findings suggest that the administrative dementia source may be capturing more, potentially milder, cases of dementia that are not as reliably identified through self-report.
Our initial sensitivity and specificity were based on a 50% predicted probability threshold for identifying dementia cases. We produced receiver operating characteristic (ROC) curves to better understand how our data sources would perform under various dementia identification thresholds. The models had higher area under the curve (AUC) values when applied to a self-report population (.82 and .86) than when applied to an administrative population (.78 and .76), regardless of which population was used to fit the model. This ROC performance, along with the similar statistically significant coefficients in the administrative and self-report logistic models, indicates that the accuracy of these NCI-AD models is relatively agnostic to the subpopulation used to fit the model. Rather, accuracy is more a function of the “gold standard” dementia cases in the subpopulation the models are applied to. We suspect that the self-report subpopulation produces higher AUC because self/proxy reported cases may generally be more severe and have higher predicted probabilities in the models. This allows the model to more efficiently discriminate between cases that did and did not report dementia, whereas the administrative data may flag some milder cases (e.g., poor performance on a Minimum Data Set item) that produce lower predicted probabilities in our models. This is consistent with prior work from Savva and Arthur where ADAMS participants who had cognitive scores consistent with dementia, but did not report a dementia diagnosis, showed less severe cognitive impairment than participants who reported a dementia diagnosis (Savva & Arthur, 2015).
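The threshold-free nature of the AUC can be made concrete with a small Python sketch. It uses the rank-based interpretation of the AUC: the probability that a randomly chosen dementia case receives a higher predicted probability than a randomly chosen non-case, with ties counted as one half. The function name and toy data are hypothetical, and this pairwise form is a teaching illustration rather than the procedure used to draw Figure 2.

```python
def auc(observed, scores):
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    cases = [s for o, s in zip(observed, scores) if o == 1]
    non_cases = [s for o, s in zip(observed, scores) if o == 0]
    if not cases or not non_cases:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))

# Hypothetical predicted probabilities and recorded dementia status.
observed = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.6, 0.5, 0.4, 0.2, 0.1]
print(round(auc(observed, scores), 3))
```

Under this reading, a recorded case with a low predicted probability (here the case scored 0.4) lowers the AUC whatever threshold is chosen, which is the mechanism suggested above for milder administratively flagged cases.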
Though these results suggest administratively identified dementia captures a broader range of dementia severity in NCI-AD relative to self-report, it is still unclear how well either source captures true dementia status. Both administrative and self-reported dementia sources are prone to false-negative dementia cases (Amjad et al., 2018; McCarthy et al., 2022; McGrath et al., 2021), and this bias may lead programs to underestimate the total LTSS need associated with consumers who have dementia. For example, Black Americans living with dementia are more likely to be underdiagnosed by both self-report and claims algorithms (McCarthy et al., 2022; McGrath et al., 2021), so underestimating dementia-related LTSS need may further compound long-term care disparities for Black communities. This research demonstrates the challenges that may arise from relying on self-report diagnostic data, and will hopefully encourage states to share administrative records of dementia status with NCI-AD in future waves of data collection. Future research should make efforts to collect gold-standard cognitive assessment data for NCI-AD respondents so states can more accurately estimate dementia prevalence and associated LTSS need in the populations they serve.
Limitations
There are several limitations to this study. Previous dementia predictive modeling research has utilized rigorous clinical assessments to determine gold-standard cases of dementia (Gianattasio et al., 2019; Grodstein et al., 2022; Power et al., 2020; Wu et al., 2013). NCI-AD data is not linked to clinical cognitive assessment data, so gold-standard dementia status is not available for survey respondents. Yet, the NCI-AD survey provides a rich and important source of consumer data, and is one of only a few existing sources of data on national HCBS use. Though we assume dementia cases in our data are true cases for the sake of comparison, both sources likely misclassify some respondents. Nevertheless, our analysis provides data on the potential implications of heterogeneous dementia classification in NCI-AD survey data. NCI-AD survey questions primarily focus on individual preferences and experiences with the LTSS program the respondent is enrolled in, rather than general health and functional status. We used a limited number of demographic and service utilization variables from the NCI-AD background information section as candidate predictors. This relatively small predictor pool limited our models’ predictive capability, but still allowed them to discriminate between respondents with and without dementia reasonably well. Dementia diagnosis in NCI-AD data has been identified through multiple administrative sources across states including ICD-10 codes, Minimum Data Set items, and state-specific databases; however, the NCI-AD dataset does not differentiate between these specific sources. This introduces additional heterogeneity within the administrative subpopulation that we were unable to analyze. Finally, our models are specifically fit using NCI-AD respondents age 65 and older, and these models should not be applied to outside data sources or the under 65 NCI-AD population without validation.
Conclusion
NCI-AD is one of the only state-level surveys to evaluate LTSS outcomes. NCI-AD survey data detects dementia with low sensitivity and high specificity, and relying on self-reported dementia status may be more prone to generating false negatives. Our findings highlight the importance of understanding nuance around data collection methods when working with NCI-AD survey data, or any other data source that relies on combining data from agencies operating in different states or localities. These findings specifically demonstrate how differences in state-level data collection may lead to systematic underrepresentation of service need for consumers with dementia. Moving towards more uniform and rigorous reporting standards for dementia status could produce more accurate measures of dementia prevalence, and allow for a better understanding of states’ true dementia-related LTSS need.
Supplemental Material
Supplemental Material - Comparing Dementia Classification by Self-Report and Administrative Records in the National Core Indicators-Aging and Disability Survey: A Predictive Modeling Approach
Supplemental Material for Comparing Dementia Classification by Self-Report and Administrative Records in the National Core Indicators-Aging and Disability Survey: A Predictive Modeling Approach by John F. Mulcahy, Taylor Bucy, Tetyana Shippee, and Eric Jutkowitz in Journal of Applied Gerontology.
Research Ethics And Research Participant Consent
The University of Minnesota Institutional Review Board deemed the work exempt from human research subject protocols.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Research reported in this publication was supported by the National Institute On Aging of the National Institutes of Health under Award Number RF1AG069771. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Data were provided by the Human Services Research Institute and Advancing States.
