Predicting Conversion Time from Mild Cognitive Impairment to Dementia with Interval-Censored Models

Abstract

Background:

Mild cognitive impairment (MCI) patients are at a high risk of developing Alzheimer’s disease and related dementias (ADRD) at an estimated annual rate above 10%. It is clinically and practically important to accurately predict MCI-to-dementia conversion time.

Objective:

To accurately predict MCI-to-dementia conversion time by using easily available clinical data.

Methods:

The dementia diagnosis often falls between two clinical visits, and such survival outcome is known as interval-censored data. We utilized the semi-parametric model and the random forest model for interval-censored data in conjunction with a variable selection approach to select important measures for predicting the conversion time from MCI to dementia. Two large AD cohort data sets were used to build, validate, and test the predictive model.

Results:

We found that the semi-parametric model can improve the prediction of the conversion time for patients with MCI-to-dementia conversion, and it also has good predictive performance for all patients.

Conclusions:

Interval-censored data should be analyzed by using models that were developed for interval-censored data to improve the model performance.

Keywords

Alzheimer’s disease, interval-censored data, MCI-to-dementia conversion, model selection, random forest, survival data

INTRODUCTION

Alzheimer’s disease (AD) is a progressive neurological disorder predominantly impacting cognitive abilities such as memory, executive function, and behavior.1,2 It stands as the leading cause of dementia among the elderly.3,4 Mild cognitive impairment (MCI) patients are at a higher risk of developing dementia than healthy normal controls in later years. Some MCI patients may remain stable for a long time, while more than 10% of MCI patients progress to dementia annually. 5 The early identification of MCI patients who are at risk of conversion to dementia (MCI-to-dementia conversion), along with the proper estimation of the potential timeline for this progression, is critically significant. Such knowledge is instrumental in shaping clinical decision-making, helping determine the most beneficial interventions for specific patients, and assisting families in planning for future care if needed.6,7

Current research in predicting dementia encounters several challenges. First, numerous models depend on sophisticated biomarkers obtained from positron emission tomography (PET) scans or cerebrospinal fluid (CSF) analyses. These are not widely accessible due to their cost, invasiveness, and the specialized setting required for their use.8,9 The predictive performance of a model using blood biomarker data only is not as good as that of a model using clinical data. The predictive performance may be improved by adding the blood biomarker data to the model with clinical data, although the improvement is small. 10 Consequently, the utilization of easily obtainable clinical data is crucial, as it provides wider and more immediate applicability. Second, most studies concentrate on classification models which only take binary outcomes (either MCI-to-dementia conversion or not). The traditional logistic regression often has worse predictive performance than the machine learning methods. 11 The conversion from MCI to dementia often spans several years, sometimes decades, with many individuals not reaching a dementia diagnosis before the end of a study. 12 These classification models do not effectively utilize the MCI-to-dementia conversion data and the censoring information. 13 Several studies applied traditional time-to-event statistical methods based on survival models for right-censored data.13 - 15 In fact, the exact MCI-to-dementia conversion time is unknown. We only observe the time interval that contains the dementia onset event. This type of censored data is known as interval-censored data. The statistical methods for right-censored data are not applicable to interval-censored data since the time to the event cannot be directly observed, and event times are often poorly estimated.16,17 Misuse of regression methods may lead to biased estimates and invalid statistical inferences.

Several regression analysis methods for interval-censored data were developed in the literature including the Cox proportional hazards (PH) model.18 - 26 A general and flexible modeling framework referred to as the transformation models, which includes the Cox model as a special case, was developed to relax the restrictive assumption of the Cox model.27 - 30 The penalized spline method was used to analyze interval-censored data to reduce the bias of parameter estimation. 30

In addition to these semi-parametric regression models, one may also consider machine learning and deep learning methods for interval-censored data. Cho et al. proposed interval-censored recursive forests as a solution to split bias arising from discrepancy among survival probabilities in traditional tree-based techniques. 31 This approach refines survival estimates through an iterative process. 32 Yao et al. developed the conditional inference forest model which is a random forest approach based on the weighted Kaplan-Meier estimate.33 - 36 As compared to the traditional random survival forests that treat all terminal nodes with equal weights, the conditional inference forest assigns large weights to terminal nodes with a substantial number of subjects at risk.33,37 - 39 Sun and Ding introduced an innovative neural network designed for interval-censored data. 17 Their neural network leverages Bernstein polynomials to address the challenge of approximating the baseline cumulative hazard function and covariate effects. These methods (e.g., random forest) flexibly handle complex interactions and non-linear relationships, enabling greater adaptability in modeling, but it remains a challenge to obtain from them the direct relationship between the outcome of interest (e.g., MCI-to-dementia conversion) and each measurement.

METHODS

Data sets

We utilized two data sets in this project: 1) the National Alzheimer’s Coordinating Center (NACC) data downloaded on December 21, 2023, and 2) the AD Neuroimaging Initiative (ADNI) data downloaded on January 15, 2024. From both data sets, we selected patients who were diagnosed with MCI at baseline. There are three versions of NACC data spanning from 2005. We combined version 1 and version 2 of the NACC data (NACC-v1v2) as the training data with patients from 2005 to 2015. The third version (referred to as NACC-v3) was used as the validation data set with patients from March 2015 in the NACC study. The ADNI MCI cohort was used as the testing data set.

We prepared the data sets by using an approach similar to those in the literature. 14 For example, any codes representing missing values (e.g., -4) were all transformed into NA, and variables that had more than 50% missing values were removed from the analysis. We utilized the first visit as the baseline date. For MCI-to-dementia conversion patients, the conversion event was recorded in the study. Suppose T_i is the MCI-to-dementia conversion time from baseline and (L_i, R_i] is the time interval for the dementia onset event for the i-th patient, i = 1, 2, ⋯, N, where N is the total number of participants in a study. For interval-censored data, R_i is the time from baseline to the MCI-to-dementia conversion visit, and L_i is the time from baseline to the visit right before the conversion. It is only known that the dementia onset time T_i is between L_i and R_i. MCI patients who remained stable as MCI at the end of a study are considered as right-censored with R_i as ∞ and L_i as the time from baseline to the last visit.
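The construction of (L_i, R_i] described above can be illustrated with a minimal Python sketch. The visit layout, the diagnosis labels "MCI" and "dementia", and the function name are hypothetical and chosen only for illustration; they are not taken from the NACC or ADNI data dictionaries.

```python
import math

def conversion_interval(visits):
    """Return (L, R) for one patient.

    visits: list of (months_from_baseline, diagnosis) tuples, where
    diagnosis is the hypothetical label "MCI" or "dementia".
    """
    last_mci_time = 0.0
    for time, diagnosis in sorted(visits):
        if diagnosis == "dementia":
            # Conversion first observed at this visit: T lies in (L, R].
            return (last_mci_time, time)
        last_mci_time = time
    # No conversion observed: right-censored at the last visit, R = infinity.
    return (last_mci_time, math.inf)

converter = [(0, "MCI"), (12, "MCI"), (25, "dementia")]
stable = [(0, "MCI"), (13, "MCI"), (24, "MCI")]
print(conversion_interval(converter))  # (12, 25)
print(conversion_interval(stable))     # (24, inf)
```

The sketch assumes every patient has a baseline visit at time 0 with an MCI diagnosis, in line with the cohort selection described above.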

Statistical models for interval-censored data

We considered two models for interval-censored data: the Cox PH model (Cox-I) and the random forest method based on the conditional inference approach (RF-I) by Yao et al.33,40 Unlike the RF-I model, the Cox-I model relies on the PH assumption. However, the Cox-I model provides hazard ratios for each measure, therefore improving our ability to interpret the relationship between the survival outcome and each measure. The Cox-I model and the RF-I model can be implemented by using the R packages icenReg and ICcforest, respectively.33,41

We also included the random forest model for right-censored data (RF-R) from the R package randomForestSRC in the model comparison as it was frequently used in the literature, although it may not be proper here for interval-censored data. When data were treated as right-censored, we only utilized the data of (L_i = 0, R_i] for MCI patients who reported the dementia onset during a visit, where R_i was the time from baseline to the visit with the dementia onset report. For MCI stable patients, the data used in the right-censored model were the same as those in the interval-censored model. It can be seen that the interval-censored models have more detailed information on L_i for patients with MCI-to-dementia conversion. One goal of this research is to quantify the improvement in model prediction gained by analyzing the interval-censored data properly.
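One reading of this recoding is sketched below in Python: a converter's event time is taken to be the diagnosing visit R_i (so L_i is ignored), and a stable patient stays censored at the last visit. The function name and the (time, event) output layout are assumptions made for illustration, not the interface of randomForestSRC.

```python
import math

def to_right_censored(left, right):
    """Map an interval (left, right] to a (time, event) pair.

    event = 1 means dementia onset is treated as observed at `right`;
    event = 0 means the patient is censored at `left` (the last visit).
    """
    if math.isinf(right):
        return (left, 0)   # MCI stable: same information as the interval form
    return (right, 1)      # converter: L is discarded, i.e. the interval (0, right]

print(to_right_censored(12, 25))        # (25, 1)
print(to_right_censored(24, math.inf))  # (24, 0)
```

The first example shows the information loss: the interval-censored models know that onset occurred after month 12, while the right-censored form only records month 25.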

Model evaluation performance metrics

We calculated the following three performance metrics to compare different models: 1) the integrated Brier score (IBS), 2) P_within: the proportion of the predicted median survival time within the observed time interval, and 3) P_before: the proportion of the predicted median survival time before the MCI-to-dementia conversion. We only calculated P_before for MCI-to-dementia conversion patients, as there is no observed dementia conversion time for MCI stable patients.

The value of P_within can be calculated for all patients, MCI-to-dementia conversion patients, and MCI stable patients, as P_within(all), P_within(MCI-to-dementia), and P_within(MCI-stable), respectively. The value of P_within(MCI-to-dementia) was calculated from the MCI sub-population who had the dementia onset event, as the proportion of the predicted median survival time being within the observed time interval, while P_within(MCI-stable) can be computed similarly from the MCI stable patients. A model with large values of P_within and P_before is preferable.
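The two proportions can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: "within" is taken to mean L_i < predicted median ≤ R_i (so any prediction beyond the last visit counts for a stable patient), and "before" is taken to mean predicted median ≤ R_i. The text does not spell out these boundary conventions, and the toy numbers are invented.

```python
import math

def p_within(predicted_medians, intervals):
    """Share of patients whose predicted median lies in (L, R]."""
    hits = sum(1 for m, (l, r) in zip(predicted_medians, intervals) if l < m <= r)
    return hits / len(predicted_medians)

def p_before(predicted_medians, intervals):
    """Share of converters whose predicted median is at or before R."""
    pairs = [(m, r) for m, (l, r) in zip(predicted_medians, intervals)
             if not math.isinf(r)]  # converters only
    return sum(1 for m, r in pairs if m <= r) / len(pairs)

# Toy data: two converters followed by two stable patients (months).
intervals = [(12, 25), (24, 36), (24, math.inf), (40, math.inf)]
predicted = [20, 40, 30, 35]
print(p_within(predicted, intervals))          # 0.5  (all patients)
print(p_within(predicted[:2], intervals[:2]))  # 0.5  (converters)
print(p_within(predicted[2:], intervals[2:]))  # 0.5  (stable patients)
print(p_before(predicted, intervals))          # 0.5
```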

The IBS is a weighted squared distance between the estimated survival function and the empirical survival function, and it measures the overall model performance. Suppose S(t|Z_i) is the survival function given measures Z_i from the i-th patient. Let t_max be the largest observed follow-up time of all MCI patients. Then, the IBS is defined as

$IBS (\hat{S}) = \frac{1}{N} \sum_{i = 1}^{N} \frac{1}{t_{max}} \int_{0}^{t_{max}} {[I ((T_{i} > t) | Z_{i}) - \hat{S} (t | Z_{i})]}^{2} dt,$ (1)

where I() is the empirical survival function. 17 A model with a low value of IBS is preferable.
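A numerical version of the IBS definition above can be sketched in Python with a midpoint rule. A strong simplifying assumption is made here purely for illustration: each event time T_i is treated as exactly known, so the empirical term is the indicator 1{T_i > t}. With interval-censored data that indicator is only partially observed, and the text does not state how it is evaluated inside (L_i, R_i], so this is not a reproduction of the authors' computation.

```python
def integrated_brier_score(event_times, predicted_survival, t_max, n_steps=1000):
    """Approximate the IBS by a midpoint Riemann sum.

    event_times: list of T_i (assumed exactly known in this sketch).
    predicted_survival: list of callables, one S_hat_i(t) per patient.
    """
    dt = t_max / n_steps
    total = 0.0
    for t_i, s_hat in zip(event_times, predicted_survival):
        integral = 0.0
        for k in range(n_steps):
            t = (k + 0.5) * dt                    # midpoint of the k-th slice
            indicator = 1.0 if t_i > t else 0.0   # empirical survival at t
            integral += (indicator - s_hat(t)) ** 2 * dt
        total += integral / t_max
    return total / len(event_times)

times = [10.0, 30.0]
# A prediction that matches the indicator exactly scores 0.
perfect = [lambda t, T=T: 1.0 if T > t else 0.0 for T in times]
# An uninformative prediction of 0.5 everywhere scores about 0.25.
coin_flip = [lambda t: 0.5 for _ in times]
print(integrated_brier_score(times, perfect, t_max=40.0))
print(integrated_brier_score(times, coin_flip, t_max=40.0))
```

The two reference cases bracket typical values: the IBS values reported in this article fall well below the 0.25 of an uninformative constant prediction.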

The median survival time can be directly obtained from the two fitted interval-censored models (Cox-I and RF-I). For the RF-R model, the survival probabilities at each possible time point were provided. We calculated the median survival time as the earliest time such that the survival probability is below 50%.
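The rule for the RF-R model, the earliest time at which the survival probability drops below 50%, can be written as a short Python sketch. Returning infinity when the curve never falls below 0.5 is an assumption of this sketch; the text does not say how that case was handled.

```python
def median_survival_time(times, survival_probs):
    """Earliest time whose predicted survival probability is below 0.5.

    times must be sorted in increasing order and aligned with survival_probs.
    """
    for t, s in zip(times, survival_probs):
        if s < 0.5:
            return t
    return float("inf")  # assumption: median not reached on the time grid

grid = [6, 12, 18, 24]
print(median_survival_time(grid, [0.9, 0.7, 0.45, 0.3]))  # 18
print(median_survival_time(grid, [0.9, 0.8, 0.7, 0.6]))   # inf
```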

Variable selection in interval-censored statistical processing

The NACC-v1v2 data set was used to build the predictive model based on the Cox-I model which provides the coefficient estimates of the relationship between the outcome (e.g., MCI-to-dementia conversion) and each measure such as age and clinical dementia rating-sum of boxes (CDR-SB). In the NACC-v1v2 data set, there were 211 variables after removing the categorical variables whose highest frequency is above 97% to avoid model fitting issues. We also removed the measures from the clinician judgment of symptoms form as these measures are often not available in other studies. After that, we had 195 measures as the initial variables. These variables are from demographic data (e.g., age), physical measures (e.g., BMI), genetic data (Apolipoprotein E ɛ4), neuropsychological battery scores including sub-scales (e.g., Trail-B score, MMSE, CDR-SB), and functional activities questionnaire (FAQ) data. The FAQ assesses instrumental activities of daily living (IADLs), which require more cognitive ability than basic daily tasks. 42
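The two screening rules stated in the Methods (dropping variables with more than 50% missing values, and dropping categorical variables whose most frequent level exceeds 97%) can be sketched in Python as below. The function name is hypothetical, only the missing code -4 mentioned in the text is handled, and computing the top frequency among non-missing values is an assumption of this sketch.

```python
from collections import Counter

MISSING_CODES = {-4}  # example code from the text; real dictionaries list more

def keep_variable(values, is_categorical, max_missing=0.50, max_top_freq=0.97):
    """Return True if a variable passes both screening rules."""
    cleaned = [None if v in MISSING_CODES else v for v in values]
    n_missing = sum(v is None for v in cleaned)
    if n_missing / len(cleaned) > max_missing:
        return False  # too many missing values
    if is_categorical:
        observed = [v for v in cleaned if v is not None]
        top_count = Counter(observed).most_common(1)[0][1]
        if top_count / len(observed) > max_top_freq:
            return False  # nearly constant, may cause model fitting issues
    return True

print(keep_variable([1, 2, -4, -4, -4], is_categorical=False))  # False
print(keep_variable([0] * 98 + [1] * 2, is_categorical=True))   # False
print(keep_variable([0] * 60 + [1] * 40, is_categorical=True))  # True
```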

After data cleaning, there were 3,529 MCI patients, with 1,453 MCI stable patients and 2,076 MCI-to-dementia patients. We excluded the MCI patients who had multiple conversions during the follow-up visits, such as MCI-dementia-MCI. We then performed the following steps to determine the measures in the final predictive model. 9

Step 1: Perform the Cox-I model with the MCI-to-dementia conversion time as the interval-censored outcome and each measure as a covariate to calculate the log-likelihood of each model. A model with a higher log-likelihood is better. These measures are ranked by the log-likelihood values from the largest to the smallest. The measures from the clinician judgment of symptoms are often not available in other studies. For that reason, these measures are re-ranked to the bottom of the list.

Step 2: From the fitted Cox-I model, if the direction of the estimated coefficient is not as expected, such measures are moved to the bottom of the list. The first 30 variables with the largest log-likelihood values are selected in the following forward selection method.

Step 3: The first measure with the largest log-likelihood value among the 30 measures from Step 2 is selected as X₁.

Step 4: We add each of the remaining 29 measures (X_i, i = 2, 3, ⋯, 30) to the model already having X₁. Suppose the model with X₁ and X_i as covariates has the largest log-likelihood and their relationships with the outcome are as expected. Then, we select X_i as the second measure in the final model.

Step 5: We repeat the approach in Step 4 to select the third measure from the remaining 28 measures. The next 15 measures can be determined by using a similar approach with a total of 18 measures.

After the forward model selection approach, we fitted the Cox-I model by using the first K measures to calculate the IBS score where K = 1, 2, ⋯, 18. The model with the best performance with regard to the IBS was selected as the final model. In Step 2, we selected 30 variables as the initial set of measures that have the best predictive performance. In the literature, the number of final selected measures is often below 10. 14 For that reason, we selected the first 18 top measures to compare the predictive performance.
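Steps 3 to 5 and the final choice of K can be summarized in a Python sketch. The callables `log_likelihood`, `sign_ok`, and `ibs` are hypothetical stand-ins for fitting the Cox-I model (done in R with icenReg in this study), checking that coefficient directions are as expected, and computing the IBS. The toy scores are invented solely to make the sketch runnable.

```python
def forward_select(candidates, log_likelihood, sign_ok, n_select):
    """Greedy forward selection by log-likelihood with a direction check."""
    selected, remaining = [], list(candidates)
    while remaining and len(selected) < n_select:
        # Score every candidate added to the current model; skip unexpected signs.
        scored = [(log_likelihood(selected + [c]), c)
                  for c in remaining if sign_ok(selected + [c])]
        if not scored:
            break
        _, best = max(scored)
        selected.append(best)
        remaining.remove(best)
    return selected

def best_prefix_size(selected, ibs):
    """K whose first-K model has the lowest IBS."""
    return min(range(1, len(selected) + 1), key=lambda k: ibs(selected[:k]))

# Invented toy scores: additive log-likelihood gains per measure.
gains = {"A": 50.0, "B": 30.0, "C": 20.0, "D": 40.0}
toy_ll = lambda names: -1000.0 + sum(gains[n] for n in names)
toy_sign_ok = lambda names: "D" not in names        # D has an unexpected sign
toy_ibs = lambda names: 0.10 + 0.01 * abs(len(names) - 2)

order = forward_select(gains, toy_ll, toy_sign_ok, n_select=3)
print(order)                             # ['A', 'B', 'C']
print(best_prefix_size(order, toy_ibs))  # 2
```

In the study itself the candidate pool is the 30 pre-ranked measures, 18 measures are selected, and K is then chosen by IBS among K = 1 to 18.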

RESULTS

Predictive model

We used the Cox-I model to select the top 18 measures by following the forward model selection approach as presented above. We then utilized the Monte Carlo Cross Validation (MCCV) with 2,000 simulations to compare the three predictive models: the Cox-I model, the RF-I model, and the RF-R model. The MCCV has better performance than the traditional k-fold cross-validation with regard to prediction accuracy when the study sample size is small to medium. 43 In each simulation, 90% of the data from NACC-v1v2 were used as the training data, and the remaining 10% of the data were used as the testing data to calculate the model performance evaluation metrics. The NACC-v1v2 data set had a sample size of 3,529. The MCI patient demographic characteristics of the NACC-v1v2 data set are presented in Table 1.
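The repeated 90%/10% random splitting used by MCCV can be sketched in Python as follows. The function name, the fixed seed, and rounding the test-set size to the nearest integer are assumptions of this sketch; model fitting and metric calculation would take place inside the loop over splits.

```python
import random

def mccv_splits(n_samples, n_repeats, test_fraction=0.10, seed=0):
    """Yield (train_indices, test_indices) for each Monte Carlo repetition."""
    rng = random.Random(seed)
    n_test = round(n_samples * test_fraction)
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)  # a fresh random partition for every repetition
        yield shuffled[n_test:], shuffled[:n_test]

# Sample size from the text (3,529); 3 repetitions shown instead of 2,000.
for train, test in mccv_splits(3529, n_repeats=3):
    print(len(train), len(test))  # 3176 353
```

Unlike k-fold cross-validation, the test sets of different repetitions may overlap, which is what allows an arbitrary number of repetitions.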

The average of each performance metric as a function of the number of measures (K) in the predictive model is presented in Fig. 1 for IBS, Fig. 2 for P_within, and Fig. 3 for P_before. In Fig. 1, the computed IBS of the Cox-I model was almost independent of the number of measures. Its IBS was lower than the IBS values of the RF-I model and the RF-R model. The RF-R model had the worst performance with regard to IBS. In Fig. 2, the Cox-I model had the highest P_within(MCI-to-dementia), followed by the RF-I model and the RF-R model. For MCI stable patients, the RF-R model had some advantages over the Cox-I model for P_within(MCI-stable) when the number of measures was 8 or above. Fig. 3 compares P_before for patients with MCI-to-dementia conversion: the Cox-I model had better performance than the RF models.

Table 1

Patient baseline characteristics of NACC-v1v2 MCI cohort

	Overall	MCI stable	MCI-to-dementia	p
Sample size	3,529	1,453 (41.17%)	2,076 (58.83%)
Follow-up time in months (SD)	34.1 (27.44)	36.5 (30.50)	32.4 (24.90)	0.0004
Age in years (SD)	74.7 (8.28)	73.9 (8.28)	75.3 (8.23)	<0.0001
Sex, n (%)				0.0298
Male	1,827 (51.8%)	720 (49.6%)	1,107 (53.3%)
Female	1,702 (48.2%)	733 (50.4%)	969 (46.7%)
Education in years (SD)	15.2 (3.36)	14.9 (3.41)	15.4 (3.31)	<0.0001

Fig. 1

The average IBS from MCCV simulations for all data as a function of the first K measures in the three models, by using the training data set: NACC-v1v2.

Fig. 2

The average P_within(MCI-to-dementia) for MCI-to-dementia conversion patients, P_within(MCI-stable) for MCI stable patients, P_within for all patients, from MCCV simulations as a function of the first K measures in the three models, by using the training data set: NACC-v1v2.

Fig. 3

The average P_before from MCCV simulations for MCI-to-dementia conversion patients as a function of the first K measures in the three models, by using the training data set: NACC-v1v2.

The Cox-I model had the lowest IBS for the model with the first 7 measures. These 7 measures included the CDR-SB score, number of Apolipoprotein E (APOE) ɛ4 copies, logical memory IIA delayed score, Trail B score, total number of vegetables named in 1 minute, remembering dates from the functional activities questionnaire (FAQ), and the MMSE score. The identified 7 measures are listed in the first column of Table 2. The second column gives the equivalent measurement names in the ADNI study, and the third and fourth columns give the detailed information of these measures.

Table 2

Selected 7 variables in predictive models

Name in NACC	Name in ADNI	Description	Value range
CDRSUM	CDRSB	CDR - sum of boxes	0.0, 0.5, 1.0, 1.5,..., 18.0
MEMUNITS	LDELTOTAL	Logical memory IIA - delayed - total number of story units recalled	Integers from 0 to 25
TRAILB	TrailB	Trail making test part B — total number of seconds to complete	Integers from 0 to 300
NACCNE4S	APOE4	Number of APOEɛ4 alleles	0 = No ɛ4 allele; 1 = 1 copy of ɛ4 allele; 2 = 2 copies of ɛ4 allele
VEG	CATVEGESC	Vegetable — total number of vegetables named in 60 seconds	Integers from 0 to 7
REMDATES	FAQREM	In the past four weeks, did the subject have any difficulty or need help with: Remembering appointments, family occasions, holidays, medications	0 = Normal; 1 = Has difficulty, but does by self; 2 = Requires assistance; 3 = Dependent
NACCMMSE	MMSE	MMSE score	Integers from 0 to 30

Model testing

After we determined the 7 measures in the final predictive model, we applied the developed models to another two data sets: NACC-v3 and the ADNI MCI cohort. The MCI patient demographic characteristics of these two data sets are presented in Tables 3 and 4, respectively. As compared to the training data of the NACC-v1v2 with a sample size of 3,529, the NACC-v3 and the ADNI MCI cohort had 1,199 and 350 patients, respectively, with complete data on the 7 measures and the conversion status. In the NACC-v3, the measure of CRAFTDVR was used to replace the measure of MEMUNITS after rescaling the range of CRAFTDVR to 0–25 to match the range of MEMUNITS, and the MMSE score was converted from the MoCA score.44,45

Table 3

Patient baseline characteristics of NACC-v3 MCI cohort with complete data

	Overall	MCI stable	MCI-to-dementia	p
Sample size	1,199	699 (58.30%)	500 (41.70%)
Follow-up time in months (SD)	30.0 (17.50)	32.1 (18.7)	27.0 (15.3)	<0.0001
Age in years (SD)	72.5 (7.32)	72.4 (7.39)	72.7 (7.21)	0.4183
Sex, n (%)				0.1000
Male	639 (53.3%)	358 (51.2%)	281 (56.2%)
Female	560 (46.7%)	341 (48.8%)	219 (43.8%)
Education in years (SD)	16.3 (2.87)	16.3 (2.84)	16.2 (2.93)	0.8508

Table 4

Patient baseline characteristics of the ADNI MCI cohort with complete data

	Overall	MCI stable	MCI-to-dementia	p
Sample size	350	152 (43.43%)	198 (56.57%)
Follow-up time in months (SD)	41.7 (40.14)	56.3 (49.40)	30.4 (26.30)	<0.0001
Age in years (SD)	74.9 (7.23)	75.1 (7.47)	74.7 (7.04)	0.5225
Sex, n (%)				0.8292
Male	229 (65.4%)	98 (64.5%)	131 (66.2%)
Female	121 (34.6%)	54 (35.5%)	67 (33.8%)
Education in years (SD)	15.6 (2.93)	15.6 (3.02)	15.7 (2.86)	0.8860

Patients in the NACC-v3 had shorter follow-up times than those in the NACC-v1v2, and the rate of MCI-to-dementia conversion was lower in the NACC-v3: 41.7% as compared to 58.8% in the NACC training data set. Meanwhile, we found that the ADNI MCI cohort had a similar MCI-to-dementia conversion rate as compared to the NACC training data: 56.6% versus 58.8%. The MCI-to-dementia conversion group had an average follow-up time of 32.4 months in the NACC-v1v2, 27.0 months in the NACC-v3, and 30.4 months in the ADNI, respectively. The average ages in the MCI-to-dementia conversion and MCI stable groups were found to be similar within each MCI cohort with the mean baseline age from 72 to 75. In the NACC-v1v2 data set, the MCI-to-dementia conversion group exhibited slightly higher baseline ages than the MCI stable group by 1.5 years. The NACC had a good balance on sex with an almost equal number of females and males, while the ADNI study had more males than females. The cognitive outcomes were worse in the MCI-to-dementia conversion group as compared to the MCI stable group in all three cohorts as presented in Supplementary Tables 1-3.

We conducted external validation on the NACC-v3 data set by using the final predictive model with the identified 7 measures. The computed IBS values were 0.118, 0.123, and 0.124 for the Cox-I model, the RF-I model, and the RF-R model, respectively. For the MCI-to-dementia conversion group, the Cox-I model had a much higher value of P_within(MCI-to-dementia) and P_before as compared to the other two RF models (9% to 14% difference). For MCI stable patients, the RF-R model had the highest P_within(MCI-stable) value of 77%, followed by the Cox-I model of 65%, and the RF-I model at 59%. These findings were similar to the results from using the NACC-v1v2 data set.

For the ADNI MCI cohort, the Cox-I model had the lowest IBS of 0.099, followed by the RF-R model of 0.110, and the RF-I model of 0.113. For the MCI-to-dementia conversion patients, the P_within(MCI-to-dementia) of the Cox-I model was close to 16% which was higher than 12% from the RF-I model and 12% from the RF-R model. The P_before of the Cox-I model was above 55% while the P_before of the RF-I model was below 45%. For MCI stable patients, the Cox-I model had P_within(MCI-stable) = 45% chance to predict the dementia onset time beyond the last observed time, while the RF-I model reduced that probability close to P_within(MCI-stable) = 42%.

DISCUSSION

When interval-censored data are analyzed by using right-censored methods, the parameter estimates in the model are often biased as the right-censored models assume that the event time can be exactly observed if it occurs. In the AD research to study MCI-to-dementia conversion, the exact time is unknown. For that reason, the right-censored models are not appropriate. The considered interval-censored models have better performance with much lower IBS scores than the right-censored models. Another contribution of this article is the feature selection technique that utilizes the forward model selection and the relationship between the outcome and each variable. Without the feature selection process, inconsistencies may arise between the variable correlation to the outcome and the estimated coefficient from survival models.9,46,47

We identified the 7 measures to predict MCI-to-dementia conversion. Among these 7 measures, three measures were found in the predictive model by using right-censored models: the CDR score, the logical memory score, and the difficulty level in remembering appointments. 14 For the remaining 4 measures, the Trail B score is commonly used for evaluating cognitive impairment, and it is often associated with cognitive performance. 48 The MMSE score at baseline was found to be predictive of conversion from MCI to dementia. 49 The APOE ɛ4 is the primary genetic risk factor of dementia.50,51 The last measure, total number of vegetables named in 60 seconds, was selected in a model to predict disease progression from normal to MCI and MCI to dementia. 52

It is noted that the P_within(MCI-stable) value had a 2% increase for the model with K = 8 measures from the model with K = 7 measures. The 8th measure was TAXES from the FAQ in the NACC study to measure the level of difficulty in assembling tax records, business affairs or other papers. This finding may indicate that the TAXES measure can improve the prediction of the dementia onset time among MCI stable patients. That trend did not occur in the MCI-to-dementia conversion group.

The proportion of data censoring may impact the model performance in survival models. This has been discussed in several medical research studies in which the focus was primarily on the right-censored data. Rahman et al. observed that censoring levels affected the variability in predictive accuracy measures, particularly in data with a medium to high censoring rate. 53 Persson and Khamis found that the accuracy of hazard ratio estimates from the Cox PH model was affected by censoring types and proportions in different hazard scenarios. 54 Their study showed that increased censoring proportions generally led to higher biases in estimates, especially under the early censoring in which the actual censoring time was shorter than the original simulated censoring time. These findings emphasized the importance of accounting for censoring proportions in survival analysis to ensure the reliability and validity of survival models.

In machine learning methods (e.g., RF), both parameters and hyperparameters are essential factors that have influences on model performance. Parameters include the regression coefficients corresponding to each measure. Hyperparameters such as mtry (number of variables randomly sampled as candidates at each split), and node size are adjusted to achieve a balance between individual tree robustness and minimizing correlation among trees. As the optimal hyperparameter values are contingent upon the dataset, default settings may not ensure the optimal performance. 55 Among these hyperparameters, mtry has a significant influence, with its optimal value related to the number of variables incorporated in the model. Notably, addressing over-fitting is crucial during hyperparameter tuning, as it can cause the model to overly fit the training data, potentially leading to poor performance on new data sets. Employing cross-validation during the tuning process can alleviate this issue to some extent but not completely. 56 We consider this as future work to further improve the performance of RF methods.

AUTHOR CONTRIBUTIONS

Yahui Zhang (Data curation; Methodology; Software; Writing – original draft; Writing – review & editing); Yulin Li (Formal analysis; Writing – original draft); Shangchen Song (Resources; Software; Writing – original draft); Zhigang Li (Conceptualization; Writing – original draft); Minggen Lu (Conceptualization; Methodology; Writing – original draft); Guogen Shan (Conceptualization; Formal analysis; Funding acquisition; Investigation; Methodology; Software; Supervision; Validation; Writing – original draft; Writing – review & editing).

ACKNOWLEDGMENTS

We would like to thank the editor, associate editor, and reviewers for their comments, which helped us improve the manuscript.

FUNDING

Shan’s research is partially supported by grants from the National Institutes of Health: R03AG083207, R03CA248006, and R01AG070849.

DATA AVAILABILITY

The dataset used in this study was obtained from a third-party organization “Alzheimer’s Disease Neuroimaging Initiative” (ADNI) database. The data are available from the ADNI database (adni.loni.usc.edu) upon registration and compliance with the data usage agreement. For up-to-date information, see www.adni-info.org. The proposed algorithm uses this ADNI data repository. The data that support the findings of this study are available on reasonable request from the authors. All ADNI studies are conducted according to the Good Clinical Practice guidelines, the Declaration of Helsinki, and U.S. 21 CFR Part 50 (Protection of Human Subjects), and Part 56 (Institutional Review Boards). Written informed consent was obtained from all participants before protocol-specific procedures were performed. The ADNI protocol was approved by the Institutional Review Boards of all of the participating institutions. This study was approved by the Institutional Review Boards of all of the participating institutions, such as the Office for the Protection of Research Subjects at the University of Southern California. A complete listing of ADNI investigators and affiliations can be found at . Informed written consent was obtained from all participants at each site. The investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. More details can be found at adni.loni.usc.edu.

The NACC database is funded by NIA/NIH Grant U24 AG072122. NACC data are contributed by the NIA-funded ADRCs: P30 AG062429 (PI James Brewer, MD, PhD), P30 AG066468 (PI Oscar Lopez, MD), P30 AG062421 (PI Bradley Hyman, MD, PhD), P30 AG066509 (PI Thomas Grabowski, MD), P30 AG066514 (PI Mary Sano, PhD), P30 AG066530 (PI Helena Chui, MD), P30 AG066507 (PI Marilyn Albert, PhD), P30 AG066444 (PI John Morris, MD), P30 AG066518 (PI Jeffrey Kaye, MD), P30 AG066512 (PI Thomas Wisniewski, MD), P30 AG066462 (PI Scott Small, MD), P30 AG072979 (PI David Wolk, MD), P30 AG072972 (PI Charles DeCarli, MD), P30 AG072976 (PI Andrew Saykin, PsyD), P30 AG072975 (PI David Bennett, MD), P30 AG072978 (PI Neil Kowall, MD), P30 AG072977 (PI Robert Vassar, PhD), P30 AG066519 (PI Frank LaFerla, PhD), P30 AG062677 (PI Ronald Petersen, MD, PhD), P30 AG079280 (PI Eric Reiman, MD), P30 AG062422 (PI Gil Rabinovici, MD), P30 AG066511 (PI Allan Levey, MD, PhD), P30 AG072946 (PI Linda Van Eldik, PhD), P30 AG062715 (PI Sanjay Asthana, MD, FRCP), P30 AG072973 (PI Russell Swerdlow, MD), P30 AG066506 (PI Todd Golde, MD, PhD), P30 AG066508 (PI Stephen Strittmatter, MD, PhD), P30 AG066515 (PI Victor Henderson, MD, MS), P30 AG072947 (PI Suzanne Craft, PhD), P30 AG072931 (PI Henry Paulson, MD, PhD), P30 AG066546 (PI Sudha Seshadri, MD), P20 AG068024 (PI Erik Roberson, MD, PhD), P20 AG068053 (PI Justin Miller, PhD), P20 AG068077 (PI Gary Rosenberg, MD), P20 AG068082 (PI Angela Jefferson, PhD), P30 AG072958 (PI Heather Whitson, MD), P30 AG072959 (PI James Leverenz, MD).

CONFLICT OF INTEREST

The authors have no conflict of interest to report.

The supplementary material is available in the electronic version of this article: .

References

1. Ravina, Cummings, McDermott, et al. Clinical Trials in Neurology: Design, Conduct, Analysis. Cambridge University Press; 2012.

2. Cummings, Lee, Nahed, et al. Alzheimer’s disease drug development pipeline: 2022. Alzheimers Dement (N Y) 2022; 8: e12295.

3. van Dyck, Swanson, Aisen, et al. Lecanemab in early Alzheimer’s disease. N Engl J Med 2023; 388: 9–21.

4. Sims, Zimmer, Evans, et al. Donanemab in early symptomatic Alzheimer disease. JAMA 2023; 330: 512.

5. Shigemizu, Akiyama, Higaki, et al. Prognosis prediction model for conversion from mild cognitive impairment to Alzheimer’s disease created by integrative analysis of multi-omics data. Alzheimers Res Ther 2020; 12: 145.

6. Cummings. Meaningful benefit and minimal clinically important difference (MCID) in Alzheimer’s disease: Open peer commentary. Alzheimers Dement (N Y) 2023; 9: e12411.

7. Wang, Chen, Du, et al. Plasma p-tau181 level predicts neurodegeneration and progression to Alzheimer’s dementia: a longitudinal study. Front Neurol 2021; 12: 695696.

8. Shan, Bernick, Caldwell JZK, et al. Machine learning methods to predict amyloid positivity using domain scores from cognitive tests. Sci Rep 2021; 11: 4822.

9. Shan, Lu, Li, et al. ADSS: A composite score to detect disease progression in Alzheimer’s disease. J Alzheimers Dis Rep 2024; 8: 307–316.

10. Planche, Bouteloup, Pellegrin, et al. Validity and performance of blood biomarkers for Alzheimer disease to predict dementia risk in a large clinic-based cohort. Neurology 2023; 100: E473–E484.

11. Kuang, Zhang, Cai, et al. Prediction of transition from mild cognitive impairment to Alzheimer’s disease based on a logistic regression–artificial neural network–decision tree model. Geriatr Gerontol Int 2021; 21: 43–47.

12. James, Ranson, Everson, et al. Performance of machine learning algorithms for predicting progression to dementia in memory clinic patients. JAMA Network Open 2021; 4: e2136553.

13. Grueso and Viejo-Sobera. Machine learning methods for predicting progression from mild cognitive impairment to Alzheimer’s disease dementia: a systematic review. Alzheimers Res Ther 2021; 13: 162.

14. Song, Asken, Armstrong, et al. Predicting progression to clinical Alzheimer’s disease dementia using the random survival forest. J Alzheimers Dis 2023; 95: 535–548.

15. Shan, Banks, Miller, et al. Statistical advances in clinical trials and clinical research. Alzheimers Dement (N Y) 2018; 4: 366–371.

16. , Wang, Bandyopadhyay, et al. Sieve estimation of a class of partially linear transformation models with interval-censored competing risks data. Stat Sin 33: 685–704.

17. Sun and Ding. Neural network on interval-censored data with application to the prediction of Alzheimer’s disease. Biometrics 2023; 79: 2677–2690.

18.

Sun

, Sun

, Zhu

. Testing the proportional odds model for interval-censored data. Lifetime Data Anal 2007; 13: 37–50.

19.

Zhu

, Tong

, Cai

, et al. Maximum likelihood estimation for the proportional odds model with mixed interval-censored failure time data. J Appl Stat 2021; 48: 1496–1512.

20.

Zhang

, Hua

, Huang

. A spline-based semiparametric maximum likelihood estimation method for the Cox model with interval-censored data. Scand J Stat 2010; 37: 338–354.

21.

Shan

. Randomized two-stage optimal design for interval-censored data. J Biopharm Stat 2022; 32: 298–307.

22.

Wang

, Logovinsky

, Hendrix

, et al. ADCOMS: A composite clinical outcome for prodromal Alzheimer’s disease trials. J Neurol Neurosurg Psychiatry 2016; 87: 993–999.

23.

Finkelstein

. A proportional hazards model for interval-censored failure time data. Biometrics 1986; 42: 845–854.

24.

Pan

. Extending the Iterative Convex Minorant Algorithm to the Cox model for interval-censored data. J Comput Graph Stat 1999; 8: 109–120.

25.

Shan

. Optimal two-stage designs based on restricted mean survival time for a single-arm study. Contemp Clin Trials Commun 2021; 21: 100732.

26.

Betensky

, Lindsey

, Ryan

, et al. A local likelihood proportional hazards model for interval censored data. Stat Med 2002; 21: 263–275.

27.

Zhang

, Sun

, Zhao

, et al. Regression analysis of interval-censored failure time data with linear transformation models. Can J Stat 2005; 33: 61–70.

28.

Zhang

and Zhao

. Empirical likelihood for linear transformation models with interval-censored failure time data. J Multivar Anal 2013; 116: 398–409.

29.

Zeng

, Mao

and Lin

. Maximum likelihood estimation for semiparametric transformation models with interval-censored data. Biometrika 2016; 103: 253–271.

30.

, Liu

, Li

, et al. An efficient penalized estimation approach for semiparametric linear transformation models with interval-censored data. Stat Med 2022; 41: 1829–1845.

31.

Cho

, Jewell

and Kosorok

. Interval censored recursive forests. J Comput Graph Stat 2022; 31: 390–402.

32.

Efron

. The two sample problem with censored data. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability 1967; 4: 831–854.

33.

Yao

, Frydman

and Simonoff

. An ensemble method for interval-censored time-to-event data. Biostatistics 2021; 22: 198–213.

34.

Kaplan

and Meier

. Nonparametric estimation from incomplete observations. J Am Stat Assoc 1958; 53: 457–481.

35.

Shan

. Two-stage optimal designs based on exact variance for a single-arm trial with survival end- points. J Biopharm Stat 2020; 30: 797–805.

36.

Shan

, Wilding

, Hutson

, et al. Optimal adaptive two-stage designs for early phase II clinical trials. Stat Med 2016; 35: 1257–1266.

37.

Hothorn

, Lausen

, Benner

, et al. Bagging survival trees. Stat Med 2004; 23: 77–91.

38.

Shan

. Promising zone two-stage design for a single-arm study with binary outcome. Stat Methods Med Res 2023; 32: 1159–1168.

39.

Salerno

and Li

. High-dimensional survival analysis: methods and applications. Annu Rev Stat Appl 2023; 10: 25–49.

40.

Anderson-Bergman

. An efficient implementation of the EMICM algorithm for the interval censored NPMLE. J Comput Graph Stat 2017; 26: 463–467.

41.

Anderson-Bergman

icenReg: regression models for interval censored data in R. J Stat Softw 2017; 81: 1–23.

42.

Gonz'alez

, Gonzales

, Resch

, et al. Comprehensive evaluation of the Functional Activities Questionnaire (FAQ) and its reliability and validity. Assessment 2022; 29: 748–763.

43.

Shan

. Monte Carlo cross-validation for a study with binary outcome and limited sample size. BMC Med Inform Decis Mak 2022; 22: 270.

44.

Dodge

, Goldstein

, Wakim

, et al. Differentiating among stages of cognitive impairment in aging: Version 3 of the Uniform Data Set (UDS) neuropsychological test battery and MoCA index scores. Alzheimers Dement (N Y) 2020; 6: e12103.

45.

Monsell

, Dodge

, Zhou

, et al. Results from the NACC uniform data set neuropsychological battery crosswalk study. Alzheimer Dis Assoc Disord 2016; 30: 134–139.

46.

Shan

. A better confidence interval for the sensitivity at a fixed level of specificity for diagnostic tests with continuous endpoints. Stat Methods Med Res 2017; 26: 268–279.

47.

Shan

, Ritter

, Miller

, et al. Effects of dose change on the success of clinical trials. Contemp Clin Trials Commun 2022; 30: 100988.

48.

, Andersen

, Cosentino

, et al. Digitally generated trail making test data: analysis using hidden Markov modeling. Alzheimers Dement (Amst) 2022; 14: e12292.

49.

Arevalo-Rodriguez

, Smailagic

, Roqu'e-Figuls

, et al. Mini-Mental State Examination (MMSE) for the early detection of dementia in people with mild cognitive impairment (MCI). Cochrane Database Syst Rev CD 2021; 7: 010783.

50.

Stocker

, Perna

, Weigl

, et al. Prediction of clinical diagnosis of Alzheimer’s disease, vascular, mixed, and all-cause dementia by a polygenic risk score and APOE status in a community-based cohort prospectively followed over 17 years. Mol Psychiatry 2021; 26: 5812–5822.

51.

Raulin

, Doss

, Trottier

, et al. ApoE in Alzheimer’s disease: pathophysiology and therapeutic strategies. Mol Neurodegener 2022; 17: 72.

52.

Pang

, Kukull

, Sano

, et al. Predicting progression from normal to MCI and from MCI to AD using clinical variables in the National Alzheimer’s Coordinating Center Uniform Data Set Version application of machine learning models and a probability calculator. J Prev Alzheimers Dis 2023; 10: 301–313.

53.

Rahman

, Ambler

, Choodari-Oskooei

, et al. Review and evaluation of performance measures for survival prediction models in external validation settings. BMC Med Res Methodol 2017; 17: 60.

54.

Persson

and Khamis

. Bias of the Cox model hazard ratio. J Mod Appl Stat Methods 2005; 4: 10.

55.

Schratz

, Muenchow

, Iturritxa

, et al. Hyperparameter tuning and performance assessment of statistical and machine-learning algorithms using spatial data. Ecol Model 2019; 406: 109–120.

56.

Probst

, Wright

and Boulesteix

. Hyperparameters and tuning strategies for random forest. Wiley Interdiscip Rev Data Min Knowl Discov 2019; 9: e1301.