Abstract
Current technologies quantifying cerebrospinal fluid biomarkers to identify subjects with Alzheimer’s disease pathology report different concentrations depending on the technology and suffer from between-laboratory variability. Hence, lab- and technology-specific cut-off values are required. It is common practice to establish cut-off values on small datasets and, in the absence of well-characterized samples, to transfer the cut-offs to another assay format using ‘side-by-side’ testing of samples with both assays. We evaluated the uncertainty in cut-off estimation and the performance of two methods of cut-off transfer by using two clinical datasets and simulated data. The cut-off for the new assay was transferred by applying the commonly-used linear regression approach and a new Bayesian method, which consists of using prior information about the current assay for estimation of the biomarker’s distributions for the new assay. Simulations show that cut-offs established with current sample sizes are insufficiently precise and also show the effect of increasing sample sizes on the cut-offs’ precision. The Bayesian method results in unbiased and less variable cut-offs with substantially narrower 95% confidence intervals compared to the linear-regression transfer. For the BIODEM datasets, the transferred cut-offs for INNO-BIA Aβ1-42 are 167.5 pg/mL (95% credible interval [156.1, 178.0]) and 172.8 pg/mL (95% CI [147.6, 179.6]) with the Bayesian and linear regression methods, respectively. For the EUROIMMUN assay, the corresponding estimated cut-offs are 402.8 pg/mL (95% credible interval [348.0, 473.9]) and 364.4 pg/mL (95% CI [269.7, 426.8]). Sample sizes and statistical methods used to establish and transfer cut-off values have to be carefully considered to guarantee optimal diagnostic performance of biomarkers.
INTRODUCTION
Over the past two decades, numerous studies have assessed potential applications of cerebrospinal fluid (CSF) biomarkers in the field of Alzheimer’s disease (AD). There is general agreement across studies that an initial decrease in CSF amyloid-β (Aβ1-42), followed by an increase in total tau and/or phosphorylated tau, is a reflection of ongoing neuropathology (amyloidopathy, tauopathy) of the AD-type in the brain of affected subjects [1, 2]. In addition, CSF biomarker analysis was integrated in (research) criteria for diagnosis of AD [3, 4]. Positron emission tomography imaging has been approved by the Food and Drug Administration to identify subjects with ongoing amyloidopathy [5]. The European Medicines Agency qualified the combination of CSF Aβ1-42 and total tau for use as a tool for patient stratification and patient enrichment in clinical trials [6].
However, these guidance documents do not specify the use of a specific assay or technology, nor do they provide advice for the manufacturer on acceptance criteria for analytical and diagnostic performance requirements. A laboratory that desires integration of AD biomarker quantification in its portfolio has the choice among several commercially available assays, which differ with respect to their worldwide availability, ease-of-use, technology, design, critical raw materials (antibodies, calibrators), and regulatory status, as well as the level of validation [7].
At present, there is no reference standard or reference method that is fully representative for the endogenous peptide [8], and the same assay can generate different values across different laboratories [9–12]. Efforts to improve the assays’ between-lab variability are already ongoing [9, 14].
The use of a biomarker assay for patient classification implies the need to establish a cut-off value to assign patients to the desired categories (e.g., No-AD/AD). At present, these cut-offs cannot be derived universally and each laboratory has to establish its own cut-off for the assay of choice [15, 16]. These efforts are time-consuming and require well-characterized samples, preferably from subjects with autopsy-confirmed AD. The neuropathological examination is most helpful in ascertaining the true disease status if performed within a relatively short time after the analyses of the CSF biomarkers. This is often not the case in a chronic disease such as AD. Nonetheless, it has been shown that cut-off values derived by using the clinical diagnosis as the reference test lead to a shift in the cut-off value as compared to using the autopsy-confirmed diagnosis [17], with possibly suboptimal sensitivity and specificity as a result. However, well-characterized samples like those obtained from clinical trials or worldwide consortia are not widely available in quantities sufficient for repeated testing. Therefore, the datasets used to derive a cut-off value are often small, with a total of 100 to 200 measurements [18–22], resulting in cut-off values with high uncertainty. These precision aspects are typically ignored in practice [23] and, to our knowledge, have not been reported for AD CSF biomarkers.
When a laboratory would like to convert a test procedure for a specific analyte to a newer assay, the cut-off value has to be derived for this new assay. The best way to do this is by testing samples of well-diagnosed subjects. In the absence of these samples, however, it is common practice to test available samples ‘side-by-side’ with the current and new assay and to transfer the measurements and/or the cut-off value of the current assay to the new assay by means of a linear regression formula [24–28]. To our knowledge, the validity of this cut-off transfer method and its effect on the clinical performance of the biomarker measured with the new assay have not been studied.
In this paper, we study the properties of the linear-regression-based method of transferring the cut-off value of a current assay to a new assay. Moreover, we compare it to a novel, Bayesian method that we have developed. Toward this aim, we undertake a simulation study and apply the methods to two sets of data with Aβ1-42 measurements from the BIODEM lab of the University of Antwerp. In the process, we also evaluate the precision of the obtained cut-off estimates as a function of the size of datasets.
MATERIALS AND METHODS
Methods
When no reliable disease status of the subjects is available, the cut-off for a new assay has to be transferred from an assay that is already in use. For such a situation, we consider a novel Bayesian ‘Two-stage’ approach and the commonly-used linear-regression-based method. For the case when the information about the disease-status of the subjects is available, we consider direct non-parametric estimation of a cut-off value.
Bayesian two-stage cut-off transfer method
This approach is a novel method that allows transferring the current-assay cut-off value to a new assay when the disease status of the subjects for whom measurements with the current and new assays were collected is unavailable or is only based on clinical diagnosis. Figure 1 provides a schematic illustration of this new method.
In the first step, information about the distribution of the current-assay measurements is obtained by analyzing a dataset in which a reliable diagnosis is available. We will call this dataset the ‘current-assay cut-off’ dataset. More formally, let Xc be the biomarker values measured with the current assay and Xn the same biomarker measured with a new assay. Assume that the distribution of these biomarker values in the control population is bivariate normal with means μc,0 and μn,0, variances σ²c,0 and σ²n,0, and correlation ρ0 between Xc,0 and Xn,0. Similarly, in the AD population, the distribution of biomarker values is bivariate normal with means μc,1 and μn,1, variances σ²c,1 and σ²n,1, and correlation ρ1 between Xc,1 and Xn,1. In practice, this assumption of bivariate normality entails that there exists a monotonic transformation yielding a normal distribution for both assays’ biomarker values within each disease class. For instance, many biomarker measurements approximately follow a lognormal distribution (that is, skewed to the right), so that taking the natural logarithm of the biomarker data will approximately yield the necessary normal distribution.
The goal of the first stage is to obtain the posterior distributions of μc,0, σc,0, μc,1, and σc,1. Toward this aim, a normal distribution is fitted to the biomarker values of the ‘current-assay cut-off’ dataset for the AD and control groups (Fig. 1a). Prior distributions are defined to be as uninformative as possible in terms of the diagnostic accuracy of the biomarker [expressed by the area under the receiver operating characteristic (ROC) curve (AUC)]. The results (‘posterior distributions’) obtained from the model are used in the second step of the analysis.
In this second step, a ‘new-assay transfer’ dataset is analyzed. In this dataset, measurements of the current and new assays are available for all subjects, but there is no reliable clinical or autopsy-confirmed diagnosis. A Bayesian latent-class mixture model is fitted to the multivariate ‘new-assay transfer’ data. The ‘latent class’ model predicts the unknown (‘latent’) disease status (AD or control) of the subjects using the biomarker values obtained with both assays (Fig. 1b). In this stage, the Bayesian model is fitted with informative priors based on the posterior information from stage 1. In particular, the prior distributions for φ⁻¹(AUCc), μc,0, σc,0, and σc,1 are defined as normal distributions. The means and standard deviations of these normal distributions are taken to be the means and standard deviations of the posterior distributions of the corresponding parameters from stage 1. Because an unrestricted normal prior would assign probability to negative values of the standard deviations σc,0 and σc,1, the normal prior distributions for these parameters are lower-truncated at zero. Note that the method assumes that the biomarker values obtained with the current assay follow the same distribution in the ‘current-assay cut-off’ and ‘new-assay transfer’ datasets.
The model then estimates the normal-distribution parameters (means and variances) for biomarker values of both assays in the AD and control populations (Fig. 1c).
A new cut-off value, cn, can be obtained by using the estimates of the underlying biomarker distribution parameters (Fig. 1d) to establish a bi-normal ROC curve. The optimal cut-off of the new assay is then the value on this curve that maximizes the Youden index.
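One standard way to write this cut-off explicitly is sketched below. It is not necessarily the authors' exact expression: it assumes that lower biomarker values indicate AD (μn,1 < μn,0, as for Aβ1-42), so that a subject is classified as AD when the measurement is at or below c. Under the bi-normal model the Youden index J(c) is maximized where the two normal densities are equal, which gives a quadratic equation in c whose root between the two means is the cut-off.

```latex
% Youden index under the bi-normal model (assumption: AD values are lower)
J(c) = \mathrm{sens}(c) + \mathrm{spec}(c) - 1
     = \Phi\!\left(\frac{c-\mu_{n,1}}{\sigma_{n,1}}\right)
     - \Phi\!\left(\frac{c-\mu_{n,0}}{\sigma_{n,0}}\right)

% Maximizer for unequal variances (\sigma_{n,0} \neq \sigma_{n,1})
c_n = \frac{\sigma_{n,0}^{2}\,\mu_{n,1} - \sigma_{n,1}^{2}\,\mu_{n,0}
      + \sigma_{n,0}\,\sigma_{n,1}
        \sqrt{(\mu_{n,0}-\mu_{n,1})^{2}
        + (\sigma_{n,0}^{2}-\sigma_{n,1}^{2})
          \ln\!\left(\sigma_{n,0}^{2}/\sigma_{n,1}^{2}\right)}}
     {\sigma_{n,0}^{2}-\sigma_{n,1}^{2}}

% Limiting case for equal variances (\sigma_{n,0} = \sigma_{n,1})
c_n = \frac{\mu_{n,0}+\mu_{n,1}}{2}
```

In a Bayesian fit, this expression would be evaluated for every posterior draw of the four parameters, which yields a posterior distribution for cn.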
The median of the resulting posterior distribution of the new-assay cut-off (cn) is taken as the point estimate, while the standard deviation of this posterior distribution serves as the standard error of this point estimate. A credible interval for the cut-off estimate can be computed with this point estimate and the associated standard error. A credible interval is the Bayesian counterpart of a confidence interval, with a somewhat different interpretation. A 95% credible interval means that, given the observed data, there is a 95% probability that the true value of the cut-off falls within this interval. Further details on Bayesian inference are provided in Supplementary Material (Section I).
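The summary step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' OpenBUGS code: the function names are hypothetical, the posterior draws are simulated stand-ins for MCMC output, lower values are assumed to indicate AD, and the interval is an equal-tailed quantile interval, which is one common choice rather than a method confirmed by the text.

```python
import math
import random
import statistics


def youden_cutoff_binormal(mu0, sd0, mu1, sd1):
    """Youden-optimal cut-off for two normal distributions.

    Assumes controls (mu0, sd0) have higher values than AD subjects (mu1, sd1).
    """
    if math.isclose(sd0, sd1):
        return (mu0 + mu1) / 2.0
    v0, v1 = sd0 ** 2, sd1 ** 2
    root = math.sqrt((mu0 - mu1) ** 2 + (v0 - v1) * math.log(v0 / v1))
    return (v0 * mu1 - v1 * mu0 + sd0 * sd1 * root) / (v0 - v1)


def summarize_cutoff(draws, level=0.95):
    """Median, SD and equal-tailed interval of the cut-off over posterior draws.

    draws: iterable of (mu0, sd0, mu1, sd1) tuples, one per posterior sample.
    """
    cutoffs = sorted(youden_cutoff_binormal(*d) for d in draws)
    n = len(cutoffs)
    tail = (1.0 - level) / 2.0
    lower = cutoffs[math.floor(tail * (n - 1))]
    upper = cutoffs[math.ceil((1.0 - tail) * (n - 1))]
    return statistics.median(cutoffs), statistics.stdev(cutoffs), (lower, upper)


# Hypothetical stand-in for MCMC output (values are made up for illustration).
rng = random.Random(1)
fake_draws = [
    (rng.gauss(250, 8), abs(rng.gauss(40, 3)),
     rng.gauss(150, 8), abs(rng.gauss(30, 3)))
    for _ in range(5000)
]
median, sd, interval = summarize_cutoff(fake_draws)
print(median, sd, interval)
```

Computing the cut-off per draw, rather than plugging posterior means into the formula, carries the uncertainty of all four parameters through to the cut-off.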
Linear-regression-based transfer of the cut-off value
In this approach, which is used when the information about the disease-status of the subjects measured with the new assay is unavailable or only based on clinical diagnosis, the cut-off value of the new assay is obtained from the cut-off value of the current assay by using linear regression.
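In particular, the relation between the current and new assay is estimated by fitting a linear regression model to the measurements for both assays:
Xn = β0 + β1 Xc + ε. (1)
The cut-off value of the new assay, cn, is then obtained by substituting the cut-off value of the current assay, cc, into the fitted regression line:
cn = b0 + b1 cc, (2)
where b0 and b1 are the estimated intercept and slope of model (1).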
Note that the regression model (1) is fitted to the entire dataset, which contains a mixture of control and AD subjects. Thus, the method assumes that the same linear relationship holds in the control and AD populations. If this is not the case, however, model (1) is misspecified. Consequently, using its estimated coefficients in (2) may lead to a biased cut-off value for the new assay, which is further illustrated in Fig. 2. This hypothetical example shows that the linear-regression-based method may lead to biased results. In fact, it can be analytically shown that, assuming bivariate normality, unbiased results are obtained only if (i) the regression lines have the same intercept AND slope in the AD and control groups, or if (ii) the regression lines have the same slope in both groups AND the dataset contains an equal number of diseased and control subjects (see Supplementary Material, Section II).
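The linear-regression transfer can be sketched as follows. This is a minimal illustration, assuming ordinary least squares of the new-assay values on the current-assay values fitted to the pooled data; the function names and the paired measurements are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def transfer_cutoff_linear(current, new, cutoff_current):
    """Map the current-assay cut-off onto the new assay via the fitted line."""
    intercept, slope = fit_line(current, new)
    return intercept + slope * cutoff_current


# Made-up side-by-side measurements lying exactly on new = 20 + 0.5 * current.
current_assay = [400.0, 500.0, 600.0, 700.0, 800.0]
new_assay = [220.0, 270.0, 320.0, 370.0, 420.0]
print(transfer_cutoff_linear(current_assay, new_assay, 600.0))  # 320.0
```

Because a single line is fitted to controls and AD subjects together, the transferred value depends on the case mix of the side-by-side samples whenever the two groups follow different lines.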
Direct cut-off estimation from diagnostic information
A variety of ROC-curve estimation and cut-off selection methods exist [29] when the information about the disease status of the subjects is available. It is beyond the scope of this paper to compare the performance of all methodologies. We focused on the fully non-parametric (empirical) ROC method, in which the cut-off value is obtained by selecting the (observed) biomarker value with the highest Youden index (i.e., the value of sensitivity + specificity − 1) [30, 31]. This approach is often used to establish cut-off values for AD biomarkers [18–21]. It is appropriate because sensitivity and specificity are deemed of equal importance in AD diagnosis [16] and hence the prevalence of AD and the relative cost of a false-negative classification, as compared to a false-positive classification, are not included in the selection of an ‘optimal’ cut-off [23, 30, 32].
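The empirical selection rule can be sketched as below. The sketch assumes that lower values indicate disease (as for Aβ1-42), that a subject is classified as positive when the value is at or below the cut-off, and that ties in the Youden index are resolved in favor of the lowest candidate; the function name and the data are hypothetical.

```python
def empirical_youden_cutoff(controls, cases):
    """Return the observed value with the highest Youden index, and that index.

    Assumes lower values indicate disease: positive if value <= cut-off.
    """
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(controls) | set(cases)):
        sensitivity = sum(v <= c for v in cases) / len(cases)
        specificity = sum(v > c for v in controls) / len(controls)
        j = sensitivity + specificity - 1.0
        if j > best_j:  # strict '>' keeps the lowest cut-off among ties
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


# Made-up measurements (pg/mL) for illustration only.
controls = [700, 650, 800, 560, 900]
cases = [300, 450, 520, 610, 400]
print(empirical_youden_cutoff(controls, cases))
```

In this toy example the cut-offs 520 and 610 reach the same Youden index, which shows how flat the criterion can be in small samples.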
The SE of the estimated cut-off value was estimated by bootstrapping the dataset. Toward this end, 10,000 bootstrapped datasets of the same size as the original dataset were generated by resampling the dataset with replacement. Sampling with replacement implies that the same biomarker value can appear in a bootstrap sample multiple times. Next, the cut-off value was estimated for each resampled dataset, and the sample standard deviation of the resulting 10,000 cut-off values (representing the ‘empirical distribution’ of the cut-off values) was taken as the estimate of the SE. The (non-parametric) 95% confidence interval (CI) for the estimated cut-off value was obtained by selecting the 2.5th and 97.5th percentiles of the bootstrapped cut-off values.
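The bootstrap procedure can be sketched as follows. Several details are assumptions rather than statements from the text: subjects are resampled from the pooled dataset (not within each group), the rare resample that lacks one of the two groups is redrawn, percentiles use a simple nearest-rank rule, and lower values indicate disease. The names and the simulated data are hypothetical, and the example uses 500 replicates instead of 10,000 only to keep it fast.

```python
import math
import random
import statistics


def youden_cutoff(data):
    """data: list of (value, is_case). Observed value maximizing the Youden index."""
    cases = [v for v, is_case in data if is_case]
    controls = [v for v, is_case in data if not is_case]
    best_cutoff, best_j = None, -1.0
    for c in sorted({v for v, _ in data}):
        j = (sum(v <= c for v in cases) / len(cases)
             + sum(v > c for v in controls) / len(controls) - 1.0)
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff


def bootstrap_cutoff(data, n_boot=10000, seed=0):
    """Bootstrap SE and percentile 95% CI of the Youden cut-off."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    while len(estimates) < n_boot:
        sample = [data[rng.randrange(n)] for _ in range(n)]
        if len({is_case for _, is_case in sample}) < 2:
            continue  # redraw: resample contains only one group
        estimates.append(youden_cutoff(sample))
    estimates.sort()
    se = statistics.stdev(estimates)
    lower = estimates[math.floor(0.025 * (n_boot - 1))]
    upper = estimates[math.ceil(0.975 * (n_boot - 1))]
    return se, (lower, upper)


# Simulated stand-in data (pg/mL); not the BIODEM measurements.
rng = random.Random(42)
data = ([(rng.gauss(750, 150), False) for _ in range(40)]
        + [(rng.gauss(450, 150), True) for _ in range(40)])
se, ci = bootstrap_cutoff(data, n_boot=500)
print(youden_cutoff(data), se, ci)
```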
Model fitting
The analyses were performed using R [33] version 3.0.1 and OpenBUGS [34]. The OpenBUGS code to fit the Bayesian Two-stage models is provided as Supplementary Material (Section III). For the Bayesian models, five parallel chains were run, each chain consisting of 10,000 iterations (after 10,000 burn-in iterations). Convergence of the Markov Chain Monte Carlo (MCMC) algorithms was verified with the R-package ‘CODA’. After fitting the models, the median cut-off value was obtained from the posterior distribution.
Materials
To check the properties of the methods described above, we used a simulation study. We also analyzed data pertaining to two cohorts from BIODEM, the Reference Center for Biological Markers of Dementia (University of Antwerp, Belgium), and one dataset from EUROIMMUN AG.
Simulated data
To check the properties of the methods, we simulated two scenarios. In the first scenario, a similar linear relationship between the current- and new-assay measurements was assumed for the AD and control populations. The second scenario corresponded to the situation illustrated in Fig. 2. Further details about the simulation model and parameter values are provided in Supplementary Material (Section IV).
INNOTEST-EUROIMMUN data
This set of data consisted of two parts. The first part was a dataset with CSF Aβ1-42 values of 42 age-matched control and 42 autopsy-confirmed AD subjects measured with the ELISA kit INNOTEST® β-AMYLOID(1-42), tested in the BIODEM lab (referred to as unpublished CSF data in [19]). This is the ‘current-assay cut-off’ dataset, as it allows direct estimation of the cut-off value for the INNOTEST assay, given that the information on the autopsy-confirmed AD status of subjects is available. Figure 3a presents the histograms of the INNOTEST measurements for the control and AD groups.
The second part was a dataset consisting of CSF Aβ1-42 values for 64 samples, tested side-by-side with the INNOTEST (‘current’) assay and the EUROIMMUN AG (‘new’) assay (Fig. 3b). This is the ‘new-assay transfer’ dataset, because there is no reliable disease status for the samples; hence, the cut-off for the new assay has to be transferred from the current assay.
INNOTEST-INNO-BIA data
This set of data contained CSF Aβ1-42 values of 95 control and 51 autopsy-confirmed AD subjects, measured with the commercially available single-parameter ELISA kit INNOTEST® β-AMYLOID(1-42) (‘current’ assay) and the multiplex xMAP format (Luminex Corp, Austin, Texas) with INNO-BIA AlzBio3 (‘new’ assay). In this dataset (described in [20]), it is possible to directly estimate the cut-off value for the new assay (because the information about the autopsy-confirmed AD status of subjects is available), as well as to obtain it by transferring the cut-off value of the current assay. Moreover, it is possible to directly estimate sensitivity and specificity related to the different cut-off values.
Figure 4 presents the histograms of the INNOTEST and INNO-BIA measurements for the control and AD groups (Fig. 4a) and the scatter-plot of the measurements together with the estimated regression lines (Fig. 4b).
RESULTS
We first present the results of the simulation study to demonstrate the performance of the two transfer methods. Then we show the results obtained in the analysis of the INNOTEST-EUROIMMUN and INNOTEST-INNO-BIA datasets.
Simulations
We investigated the performance of the linear-regression-based method and the Bayesian approach using the simulated data.
In the first simulated scenario, a similar linear relationship between the current- and new-assay measurements in the control and AD populations was assumed. As expected, in this case, both methods result in unbiased estimates of the cut-off value, i.e., the estimated values are on average equal to the true value (Fig. 5). To make the results more interpretable, summary statistics of the sensitivity and specificity, corresponding to the estimated cut-off values, are presented in Figs. 6 and 7.
In the second simulated scenario, different linear relationships between the current- and new-assay measurements in the control and AD populations were assumed. In this case, the linear-regression-based method leads to biased estimates of the cut-off value, for the reasons illustrated in Fig. 2. On the other hand, the Bayesian approach provides unbiased estimates of the cut-off value (Fig. 5). Similar conclusions can be drawn for the sensitivity and specificity corresponding to the estimated cut-off values (Figs. 6 and 7).
For both simulation scenarios and for all sample sizes considered, the Bayesian method resulted in less variable estimates as compared to the linear regression transfer method (narrower CIs, Figs. 5–7).
INNOTEST-EUROIMMUN dataset
For the INNOTEST assay, the cut-off value was estimated directly from the diagnostic labels in the ‘current-assay cut-off’ dataset. In particular, the estimated value was equal to 638.5 pg/mL (SE 55.39, 95% CI [508.5, 728.0]), the same value as reported in [19]. The sensitivity and specificity corresponding to the estimated cut-off value were equal to, respectively, 0.87 (SE 0.083, 95% CI [0.64, 0.90]) and 0.62 (SE 0.069, 95% CI [0.52, 0.79]).
For the EUROIMMUN assay, the cut-off value was obtained by transferring the INNOTEST cut-off value by using the linear-regression-based method and the Bayesian approach. In particular, the cut-off value obtained by the linear-regression-based method was equal to 364.4 pg/mL (SE 39.48, 95% CI [269.7, 426.8]), while for the Bayesian approach it was equal to 402.8 pg/mL (posterior-distribution SD 31.68, 95% credible interval [348.0, 473.9]). The obtained cut-off values are quite different, but given their precision, they cannot be seen as statistically significantly different.
To visualize the importance of the uncertainty about the derived cut-offs for the INNOTEST and EUROIMMUN assays, the cut-off values and the accompanying 95% CIs were plotted on the scatter plot of biomarker values measured with both assays (Fig. 8). The 95% CI for the INNOTEST cut-off value ranges from 508.5 pg/mL to 728.0 pg/mL or, expressed on a relative scale, from –20% to +14% of the estimated cut-off value (vertical black dashed lines). The 95% CI for the EUROIMMUN-assay cut-off value obtained by the linear-regression-based method ranges from 269.7 to 426.8 pg/mL or from –26% to +17% of the estimated cut-off value (horizontal black dashed lines). The wide 95% CIs for both assays imply that, by assuming different cut-off values within the CIs, the Aβ1-42-based disease status could potentially be altered for 13 of 64 subjects (20%) for the INNOTEST assay and 17 of 64 subjects (27%) for the EUROIMMUN assay. The 95% credible interval for the EUROIMMUN-assay cut-off value obtained by the Bayesian Two-stage method ranges from 348.0 to 473.9 pg/mL or from –14% to +18% of the estimated cut-off value (horizontal grey shaded box). Ten of 64 subjects (16%) had values contained in this 95% credible interval.
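The relative interval bounds quoted above follow directly from the reported estimates, as the short sketch below reproduces. The function names are hypothetical, and counting subjects with inclusive interval bounds is an assumption; the sample values passed to the counting helper are made up.

```python
def relative_interval(estimate, lower, upper):
    """Interval bounds as rounded percentage deviations from the estimate."""
    rel_lower = round(100 * (lower - estimate) / estimate)
    rel_upper = round(100 * (upper - estimate) / estimate)
    return rel_lower, rel_upper


def count_within(values, lower, upper):
    """Number of measurements inside the interval (bounds assumed inclusive)."""
    return sum(lower <= v <= upper for v in values)


# Estimates and intervals as reported in the text (pg/mL).
print(relative_interval(638.5, 508.5, 728.0))  # (-20, 14)  INNOTEST, direct
print(relative_interval(364.4, 269.7, 426.8))  # (-26, 17)  EUROIMMUN, regression
print(relative_interval(402.8, 348.0, 473.9))  # (-14, 18)  EUROIMMUN, Bayesian

# Made-up measurements, only to show how the 'inconclusive' count is obtained.
print(count_within([300.0, 350.0, 410.0, 480.0], 348.0, 473.9))  # 2
```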
INNOTEST-INNO-BIA dataset
We first obtained cut-off values and their SEs for both assays using the fully non-parametric ROC curve (see Materials and Methods) estimated based on the available diagnosis information (Table 1, first two columns). Worth noting are the wide 95% CIs for the estimated cut-off values, which indicate substantial uncertainty due to the limited sample size of the dataset. This is similar to the case of the INNOTEST-EUROIMMUN ‘current-assay cut-off’ dataset (see Results).
The estimated cut-off value for INNOTEST Aβ1-42 (539.5 pg/mL) is the same as the published value [20] and somewhat smaller than the value obtained in the INNOTEST-EUROIMMUN ‘current-assay cut-off’ dataset (638.5 pg/mL). However, taking into account the considerable uncertainty associated with the estimates presented in Table 1, the difference could be either due to random variation or could be caused by changes over time in laboratory equipment or assay reagents.
Similarly to the 95% CIs for the cut-off values, the CIs for sensitivity and specificity, implied by the estimated cut-off values, are also wide. They indicate substantial uncertainty about the diagnostic performance of the assays.
In the next step, we estimated the cut-off value for the INNO-BIA assay by applying the linear-regression-based and Bayesian methods to the INNOTEST-INNO-BIA ‘new-assay transfer’ dataset (see Materials and Methods). The slopes of the linear relationship between assays were significantly different between the AD and control cohorts (p = 0.0374, regression lines shown in Fig. 4b). The results are presented in Table 1 (last two columns). The obtained cut-off values are equal to 172.8 and 167.5 pg/mL for the linear-regression-based method and the novel Bayesian approach, respectively. They are similar to the value of 159.1 pg/mL obtained by the direct estimation (see Table 1, column 2), especially taking into account the limited precision of the obtained estimates (see Table 1, columns 3 and 4).
DISCUSSION
The upcoming commercialization of a new generation of immunoassays for CSF AD biomarkers will include the full automation of tests, improved between-center and between-lot variability, a link to a reference method, and the availability of run-validation or proficiency panels [29, 35–37]. It is hoped that these improved assays will enable the introduction of universal cut-off levels for the CSF AD biomarkers [16]. However, due to the lack of leftover samples from the most important observational studies that have been used to document the value of the markers, it will be difficult or almost impossible to confirm the clinical utility of the new biomarker assays using samples that have been analyzed previously with the first generation of assays.
In this paper, we have proposed a novel, Bayesian approach to the problem of transferring a cut-off value to a new assay. Results of the simulation study suggest that the method performs better than the often-used linear-regression-based method. In particular, the latter requires that there exists a common linear relationship between the current- and new-assay measurements in the control and AD populations. If this assumption is violated, the method produces incorrect estimates of the cut-off value for the new assay. The validity of the common linear relationship cannot be verified if no reliable diagnostic information is available; yet, this is exactly the reason why a transfer of an existing cut-off value may be needed. Note that for the INNOTEST-INNO-BIA Aβ1-42 dataset, the assumption could be verified and was shown not to hold (Fig. 4b).
The proposed Bayesian approach does not make an assumption about the linear relationships. In addition, the Bayesian method results in unbiased and less variable estimates of the cut-off as compared to the linear-regression method. The comparison of the widths of the 95% CIs for different sample sizes (Fig. 5, Supplementary Table 2) demonstrates that the differences in precision between the two methods are substantial, with the precision of the Bayesian cut-off for the smallest sample sizes (84 and 64) almost equal to the precision of the linear regression cut-off for the largest sample sizes (300 and 300). The same holds true for the associated sensitivity and specificity of the estimated cut-offs (Figs. 6 and 7, Supplementary Table 2).
Given that the Bayesian method provides unbiased cut-off estimates regardless of the linear relationships between assay results and makes better use of the available data, it is preferred over the linear regression method when a cut-off needs to be transferred.
The Bayesian approach does require that the biomarker measurements (or a transformed version thereof) are normally distributed. The Box-Cox transformation as a way of normalizing biomarker values has been shown to perform well in the ROC context [21, 32]. The Box-Cox transformations are a family of power transformations that includes the logarithmic transformation. Many immunoassay measurements follow (approximately) a log-normal distribution. For the datasets considered here, Aβ1-42 biomarker values measured with the different assays were approximately normally distributed (Figs. 3a and 4a). Total Tau and Ptau-181p measurements could be normalized with a logarithmic transformation (data distributions not shown). If needed, the method could be adapted to a semi-parametric approach such as the mixture of Dirichlet processes, which was proposed in [38] to establish an optimal threshold using a Bayesian approach when the disease status is unknown.
When both transfer methods were applied to the INNOTEST-INNO-BIA data, they resulted in similar cut-off values, close to the direct estimate. In the INNOTEST-EUROIMMUN data, the differences were larger. However, in both cases, it could not be ruled out that the differences were due to chance (random variation), because the precision of the obtained estimates was low due to the limited sample size of the datasets.
It is worth noting that the directly estimated cut-off value for the INNOTEST assay (not transferred from another assay) also showed low precision, with a 95% CI that contained an important proportion (20%) of biomarker values (Fig. 8). The limited sample size for cut-off studies is a common issue in practice, because of the difficulties in accessing well-characterized clinical samples. Yet, as argued and illustrated here, it can have important consequences for the practical use of the cut-off values in patient management and enrichment in clinical trials. A careful consideration of sample sizes, driven by the necessary confidence about the cut-off estimate, is essential when planning studies aimed at establishing cut-off values for diagnostic assays. Acceptance criteria for the confidence could be expressed as a maximal width for the 95% CI for the cut-off value or a maximal proportion of the target population with biomarker values contained within this CI. Defining the 95% CI as a ‘grey’ zone, in which the biomarker values are considered inconclusive, could be a more realistic approach than the current practice of treating the estimated cut-off value as the true one.
The current paper describes a new approach to harmonize results between assays in the absence of clinically well-characterized samples. This approach provides a long-term solution to the field and will assist in speeding-up the integration and regulatory approval of new promising assay formats.
ACKNOWLEDGMENTS
The authors are most grateful to Prof. Dr. Sebastiaan Engelborghs (University of Antwerp, Reference Centre for Biological Markers of Dementia (BIODEM)) for sharing the valuable datasets with autopsy-confirmed diagnosis. The authors also thank Dr. Britta Brix, EUROIMMUN AG, for sharing the transfer dataset with INNOTEST and EUROIMMUN Aβ1-42 measurements.
