Abstract
The available evidence indicates a high performance of core cerebrospinal fluid (CSF) biomarkers in differentiating between Alzheimer’s disease (AD) and other dementias, and suggests that their characteristic alterations can be detected even at the prodromal stage of AD. On this basis, the ability of core CSF biomarkers to identify prodromal AD patients from pre-dementia of all causes can be postulated, a concept that is reflected in recent revisions of AD research criteria and a consensus statement. Following an overview on the role of biomarkers in the evolution of diagnostic criteria of AD in recent decades, this paper provides a critical review of the widely applied CSF biomarker study designs and evaluating approaches that address the ability of core CSF biomarkers to diagnose prodromal AD, with special focus on their potential limitations in terms of clinical interpretation and utility. The findings together raise the question of whether we are indeed able to establish a CSF biomarker-based diagnosis of AD at the prodromal stage.
INTRODUCTION
Alzheimer’s disease (AD) is known to be the most prevalent neurodegenerative disease worldwide, accounting for the highest proportion (∼60%) of all-cause dementia. The most representative pathological hallmarks of the disease were described by the German neuropathologist Alois Alzheimer as early as 1906, detecting neurofibrillary tangles and the extracellular formation of amyloid plaques together with the substantial shrinkage of the brain of a patient who died of a peculiar condition with a presenile deterioration of cognitive functions, especially affecting the memory. More than a century later, although substantial advances have been achieved in the understanding of the nature and pathophysiological background of the disease, we still do not have any therapeutic tool in hand with evidence to indicate that it is capable of even influencing the disease course. At the expense of an armada of clinical trials that have failed to prove the therapeutic effect of their candidates having been successful in preclinical settings, a novel concept has started to take shape as to how we should view AD and related disorders, and, more importantly, what we should regard as AD. This review paper summarizes the current understanding of the pathophysiology of AD with special focus on the biological markers (biomarkers) of core pathophysiological alterations and their effect on our view on patients with cognitive decline and dementia. A critical overview is given here of the most typical study designs and evaluation approaches as regards the diagnostic accuracy and potential of core cerebrospinal fluid (CSF) biomarkers in differentiating AD from other etiologies at both the dementia and pre-dementia (i.e., prodromal) stages.
HALLMARK PATHOPHYSIOLOGICAL ALTERATIONS
The most representative pathological alterations in AD include the region-selective synaptic and neuronal degeneration, deposition of extracellular amyloid consisting predominantly of an amyloid-β protein isoform with a length of 42 amino acids (Aβ42) responsible for the formation of neuritic plaques, diffuse plaques, cored plaques, subpial bands, and amyloid lakes, and the accumulation of hyperphosphorylated microtubule-associated protein Tau (pTau) in neuronal cells, leading to the formation of neurofibrillary tangles (NFTs) [1–3]. The preferentially affected brain territories include the entorhinal, hippocampal, temporal, and neocortical association areas, with the earliest and dominant psychological sign being the disturbance of episodic memory. While the association of the above changes in AD is apparent, the causative relationships between the alterations are subjects of extensive discussion.
The amyloid hypothesis holds that the increased presence of Aβ42 in the brain formed by the cleavage of amyloid-β protein precursor (AβPP) via the consecutive functions of β- and γ-secretases (this is also known as the amyloidogenic cleavage pathway) is the primary pathogenic factor in the cascade of events leading to NFT formation and subsequent neuronal degeneration [4]. Aβ42 is prone to self-aggregate to soluble oligomers of different sizes, which have been widely demonstrated to be toxic to synapses and neurons, accounting for the majority of amyloid-related toxicity [5], with mitochondrial dysfunction and glutamate-mediated excitotoxicity being heavily implicated [6, 7]. Aβ42 also readily aggregates to β-sheets to form insoluble fibrils and eventually plaques, which probably serve as a reservoir for toxic soluble forms and appear to be locally neurotoxic [8]. Furthermore, a body of experimental evidence supports the hypothesis that amyloid oligomers per se drive the hyperphosphorylation of Tau [9–13], providing a pathomechanistic rationale for Aβ being a primary etiological factor in the cascade of AD pathophysiological process. Notably, the plaque burden itself appears to correlate poorly with disease severity and cognitive impairment [14, 15], and Aβ plaque pathology is frequently found among the elderly without a symptomatic cognitive decline [16–23], also supporting an indirect role of amyloid deposition in neurodegeneration.
Microtubule-associated protein Tau is proposed to stabilize axonal microtubules and promote axonal function in a process regulated largely by the phosphorylation state of Tau by multiple phosphatases and kinases [24]. In AD, the rate of phosphorylation is abnormally high. Hyperphosphorylated Tau (pTau) is in turn prone to detach from microtubule proteins, resulting in the loss of axonal integrity and the cytosolic accumulation and aggregation of pTau in the form of paired helical filaments, which leads to the formation of NFTs and dystrophic neurites, ultimately rendering the affected neurons to degenerate and die [25]. The degree of neuronal loss and disease severity has generally been found to correlate better with Tau pathology than with amyloid plaque burden [14–16, 26]. Though alternative triggers such as mitochondrial dysfunction [27], oxidative stress [28], excitotoxicity, and neuroinflammation [29] have also been proposed, hyperphosphorylation of Tau is generally thought to be triggered by and therefore downstream of the amyloid pathology in the disease continuum, and the biochemical fingerprints of these pathologies are generally detectable in a timeline corresponding with this hypothesis [30]. However, recent publications of Braak and colleagues report a substantially earlier presentation of Tau histopathology especially in the subcortical areas of the brain as compared with the amyloid pathology [31, 32], whereas others have described a proportion of patients presenting with signs of neurodegeneration prior to the appearance of amyloid pathology via imaging modalities [33], observations which leave this question open for further discussion.
Although AD is characterized neuropathologically by the presence of amyloid plaques and NFTs in the predisposed brain areas affected by neurodegeneration, there is considerable evidence that elderly people can present with substantial amyloid as well as Tau pathology on autopsy without any signs of cognitive involvement detected antemortem [16–23]. Whereas such observations may theoretically suggest that the pathology defined as AD-type might not be sufficiently specific to AD, the currently available evidence indicates that such cases might represent preclinical (or clinically inappropriately phenotyped prodromal) stages of AD at death, which would have progressed into AD dementia had the individuals lived long enough [34]. This concept is similar to the one that regards incidental Lewy-body disease as a presymptomatic phase of Parkinson’s disease (PD) [35]. The picture has become even more complicated with the increasing recognition of the substantial heterogeneity of neuropathological alterations not only among the non-demented elderly [16], but also among patients with hippocampal-type dementia accompanied by a dominant AD-type pathology [1]. Indeed, neuropathological substrates of vascular dementia (lacunar infarctions and white matter lesions as the most frequent concomitants [36]), frontotemporal lobar degeneration (FTLD; differentially localized NFTs and TDP-43 inclusions), diffuse Lewy-body disease (DLBD; α-synuclein deposits), PD (α-synuclein deposits pathognomonically in the substantia nigra pars compacta), hippocampal sclerosis, and argyrophilic grain disease are those that most commonly coincide with AD-type pathology in brains with ‘probable AD’ clinical phenotype [1], with a proposed rate of neuropathologically ‘pure AD’ of less than 60% [37]. At least in part due to this underlying heterogeneity, the differential diagnosis of such conditions is often challenging, especially in cases of slowly progressive dementias with insidious onset.
The real-life importance of this issue is well illustrated by data reporting the positive predictive value of the clinical diagnosis of AD as 70–81% when the endpoint includes AD as well as concomitant pathological conditions, decreasing to 38–44% when the evaluation is restricted to ‘pure’ AD cases [38]. In a more recent study in which a permissive threshold of histopathological severity was used to define autopsy-confirmed AD, i.e., a level considered sufficient to account for dementia irrespective of concomitant findings, the positive predictive value of clinically ‘probable AD’ diagnosis was 62.2–83.3% with corresponding sensitivities and specificities of 70.9–76.6% and 59.5–70.8%, respectively (the values depended on the applied minimum threshold levels of histopathological severity, with more permissive neuropathological definitions resulting in higher predictive value and specificity, and lower sensitivity) [39].
The issue of low accuracy values for clinical diagnosis in AD is of crucial importance in the setting of clinical trials, where the enrollment of clinically misdiagnosed patients or those with mixed pathology 1) seriously biases the statistical analysis, decreasing the power of the study to confirm a therapeutic effect, 2) raises the expense of the trials by treating an unnecessarily high number of patients [40], and 3) gives rise to ethical concerns as patients with different etiological background should not hope for a remedy from treatment approaches selectively targeting AD-related pathomechanisms. All these difficulties underpin the critical need for markers that reflect the underlying pathology with high accuracy in vivo, and are facile, standardized, and cost-effective enough for research and eventually for clinical use. In the past two decades, extensive efforts have been made worldwide to meet this need.
BIOCHEMICAL FINGERPRINTS OF CORE PATHOLOGICAL ALTERATIONS IN AD
The increasing recognition that amyloid and Tau/pTau pathologies are leading hallmarks in the pathogenesis of AD led to the discovery of their biochemical correlates in the CSF some 20–22 years ago [41–46]. Indeed, the CSF level of Aβ42 has been found to be decreased by some 50%, and the levels of Tau and pTau to be elevated by some 250–300% in AD as compared with non-demented healthy individuals in multiple independent studies [47]. This constellation of alterations has often been referred to as ‘the AD signature’, ‘the AD CSF biomarker profile’, or briefly ‘the AD profile’, and the three markers are often referred to as ‘the core biomarkers’ of AD. Although the exact reason for the decreased CSF concentration of Aβ42 has not yet been fully elucidated, the increased formation of oligomers and their sequestration in the form of insoluble aggregates in the brain (thus the characteristic imbalance in the amyloid homeostasis) are generally thought to be responsible for the decrease in the monomeric form measured. The elevation of CSF Tau is thought to reflect axonal/neuronal degeneration and injury, whereas that of pTau most likely mirrors the kinase/phosphatase imbalance characteristic of the disease. The observed alterations appear to correlate well with autopsy findings [48–52], though contrasting reports have also been published [53]. In line with these, the diagnostic application of the above CSF alterations individually provides 79–86% sensitivity and 79–92% specificity when differentiating between AD subjects and healthy controls, with even higher values if used in combinations (85–94% sensitivity, 83–100% specificity) [54–56]. Notably, the individual specificity of these markers decreases substantially (to 66–86%) when the aim is to differentiate between AD and non-AD dementia (NONAD) [55].
Indeed, decreased CSF levels of Aβ42 have also been described in dementia with Lewy bodies (DLB) [57, 58], frontotemporal dementia (FTD) [59], and major depression [60], whereas elevated levels of Tau have been detected in multiple central nervous system (CNS) diseases associated with overt neuronal loss such as ischemic stroke [61], traumatic brain injury [62], DLB (though lower than in AD [57, 63]), FTD [64], normal pressure hydrocephalus [65], and most prominently in Creutzfeldt-Jakob disease (CJD) [66]. The elevation of pTau is considered to be more specific to AD [67–69], even though the cytosolic aggregation of pTau filaments leading to NFT formation is characteristic of all tauopathies. In addition, a number of studies have reported elevated levels of Tau proteins as well as alterations in Aβ42 levels in the CSF of patients with multiple sclerosis; these findings, however, could not be confirmed by our group, among others [70]. Notably, whereas the individual markers fail to provide sufficient specificity to accurately distinguish between different forms of dementia, their combined application demonstrates median specificity and sensitivity values > 85% across multiple studies [71–82] and in a recent systematic review [55], suggesting that it reaches the established minimum accuracy required of biomarkers for clinical differential diagnosis [83, 84]. While this is indeed an advancement relative to the lower specificity values obtained from the purely clinical diagnosis of ‘probable AD’ alone, the true merit of a marker (or a panel of markers) would be the accurate identification of individuals who are at risk of developing AD dementia, but are either in prodromal (with cognitive changes suspicious of being due to AD, not yet demented) or asymptomatic (without cognitive impairment) stages of the disease at the time of sampling.
This is of crucial importance as regards the design of clinical trials, as the pathology of patients with full-blown AD dementia might be too severe to be therapeutically influenced to a clinically meaningful extent. In line with this concept, current clinical trials tend to focus on patients with mild cognitive impairment (MCI) who are considered to be at risk of developing AD dementia in the future. It is reasonable to assume that the selective enrollment of MCI patients harboring the biochemical fingerprints of the underlying pathology of AD could decrease the bias due to the overlapping phenomenology of pre-dementias. In this respect, a huge effort has been invested in a series of longitudinal follow-up studies evaluating the performance of the individual and/or combined use of core CSF biomarkers in predicting conversion of MCI patients to dementia (i.e., reaching the threshold of interfering with daily functioning) during their follow-up periods. While some of these studies have demonstrated promising sensitivity and specificity values (>80–85%) for the combined use of core CSF biomarkers [85–89], there are several limitations which must be taken into consideration when interpreting or meta-analyzing their performance in distinguishing between AD and NONAD at the prodromal stage, which will be specifically addressed in the upcoming sections. However, important information can be gleaned from these analyses: patients with prodromal AD who develop CSF fingerprints of both amyloid dyshomeostasis (i.e., Aβ42 decrease) and neurodegeneration (i.e., Tau and pTau elevation) are at an advanced disease stage, and the expected time to develop a disabling condition (i.e., dementia) is rather short, generally a few years [90].
This concept is in accordance with the observation that CSF Aβ42 alteration may start earlier in the disease continuum: in a longitudinal study with a median follow-up of more than 9 years, the decrease in CSF Aβ42 was observable in both the converters (who progressed into dementia of the AD-type) and the non-converters within the MCI group, though to different extents, whereas substantially elevated levels of Tau or pTau were present only among early converters (conversion within 0–5 years), but not in late converters (conversion within 5–10 years) [89]. This appears to be consistent with findings on patients with autosomal dominantly inherited familial AD, in whom a decreased CSF Aβ42 and an elevated CSF Tau have been reported to precede the expected symptomatic onset by some 25 and 15 years, respectively [91].
THE EMERGENCE OF IMAGING BIOMARKERS: A BRIEF OVERVIEW
In parallel with the development of core biochemical markers in the CSF, potential biomarkers of different imaging modalities have been the subjects of extensive research. Among them, positron emission tomography (PET) CT scans involving the use of 11C-labeled Pittsburgh compound B (PiB) [92] or the more recently developed 18F radiotracers (florbetapir, flutemetamol, and florbetaben, among others [93]) as ligands are increasingly used to detect amyloid aggregate deposition in the brain, showing a rather good concordance with postmortem amyloid burden [94–97] and also with alterations related to CSF Aβ42 or Aβ42/(p)Tau ratios [98–107]. Furthermore, in a recent study, the accuracy of amyloid PET in differentiating prodromal AD patients from healthy controls was found to be comparable to that of CSF Aβ42/Tau or Aβ42/pTau ratios, with no additional benefit when the two modalities were used together [108]. As with amyloid pathology at autopsy, both positive PET findings and decreased CSF Aβ42 levels may be found in individuals without cognitive decline, who may be regarded as being in the preclinical phase of the AD continuum [107]. Notably, however, recent results suggest that CSF Aβ42 decrease and amyloid PET retention represent different aspects of amyloid pathology [105, 109] and actually measure different forms of amyloid, i.e., the monomeric form in the CSF versus aggregated fibrils bound by the tracers in the CNS. More recently, a number of PET ligands for the in vivo detection of Tau pathologies have also been developed, the diagnostic applicability of which is under extensive research [67]. Of note, the utility of 2-(1-{6-[(2-(18)F-fluoroethyl)(methyl)amino]-2-naphthyl}ethylidene)malononitrile (18F-FDDNP), a PET tracer previously widely used to visualize both amyloid and Tau pathologies in the brain, has recently been questioned [110].
Other forms of CT-based imaging modalities widely used in AD research include 18F-fluorodeoxyglucose (FDG) PET-CT to measure decreased glucose metabolism indicative of cellular dysfunction and loss [111, 112], and single-photon emission CT (SPECT) to measure cerebral hypoperfusion [113, 114]. In both modalities, the typical brain regions detected to be predominantly involved in AD are the temporoparietal cortices. Magnetic resonance imaging (MRI) technology is a widely available modality utilized to rule out concomitant vascular or inflammatory etiology and to assess the characteristic atrophy of the medial temporal lobe (MTL) [115], an alteration that reflects regional neuronal loss in AD. Although the MTL (more specifically the entorhinal cortex and the hippocampus proper) is a region classically associated with MRI alterations in AD, the significant involvement of subcortical gray matter structures [116–118] along with the alterations of white matter microstructure [119–122] have also been recently emphasized. The in-depth presentation of the different imaging modalities is beyond the scope of this paper, and they have been extensively reviewed by others [123].
THE EVOLUTION OF DIAGNOSTIC CRITERIA IN AD
Back in 1984, the National Institute of Neurological and Communicative Disorders and Stroke/Alzheimer’s Disease and Related Disorders Association (NINCDS-ADRDA) published the criteria for the definition of AD, which remained the most widely applied diagnostic criteria in clinical trials for some 27 years to come [124]. The NINCDS-ADRDA recognized AD as a dementia characterized by an amnestic syndrome of hippocampal type with an insidious onset, and postulated that the diagnosis is probabilistic while the patient is alive (probable AD), whereas a definite diagnosis can only be provided by autopsy (definite AD). The subsequent remarkable advances achieved in the fields of both biochemical and imaging biomarkers, as well as the serial failures of clinical phase II and III trials to confirm the therapeutic effect of preclinically successful agents, necessarily raised the demand for a revision of the long-standing clinical diagnostic criteria of AD. As a result, in 2007, the International Working Group (IWG) for New Research Criteria for the Diagnosis of Alzheimer’s Disease published a position paper with proposed revised research criteria for probable AD [125]. Its core clinical criterion is the presence of progressive specific episodic memory impairment, whereas the recommendation incorporated the abnormalities of core CSF biomarkers in the supportive criteria, together with the presence of MTL atrophy, a characteristic PET pattern or an established autosomal dominant mutation within the immediate family. The paper proposes that the diagnosis of AD can be established in the presence of the core clinical criterion and at least one of the supportive criteria, and in the absence of exclusion criteria [127]. The main novelty in this concept is that it regards AD as a disease continuum and it permits the diagnosis of AD even in a prodromal phase, potentially based upon the support of core CSF biomarkers.
A refinement of these criteria with a new lexicon of terms related to AD, including ‘presymptomatic AD’, ‘asymptomatic AD’, and ‘Alzheimer’s pathology’, was published by the same group in 2010 [126]. One year later, the National Institute on Aging–Alzheimer’s Association (NIA–AA) workgroups published an update on the clinical diagnostic recommendations of the NINCDS-ADRDA, incorporating CSF biomarkers in the guideline as well [127]. However, the guideline proposes that demented patients meeting the core clinical criteria of AD and having signs of AD pathophysiological process either in terms of alterations in core CSF biomarkers or as regards characteristic changes in PET and MRI can be regarded as ‘probable AD with evidence of AD pathophysiological process’, a feature that only increases the certainty that AD is the underlying etiology of the patients’ dementia, but does not per se support the diagnosis. In the same year, an update was published by the same workgroups on the diagnostic research criteria for MCI [128], postulating that the evidence of (either CSF or imaging) biomarkers for both amyloid deposition and neurodegeneration yields ‘a high likelihood’ that MCI is due to AD, whereas the likelihood is considered ‘intermediate’ when there is evidence for only one of these two biomarker categories. In contrast, the IWG published their most recent revision of the research criteria of AD in 2014 [129] in a position paper postulating that ‘typical AD’ can be diagnosed at any stage of the disease continuum (either prodromal or dementia stages) when the core clinical criteria are accompanied by in vivo evidence of AD, including either the presence of ‘the CSF AD signature’ (i.e., the AD profile), increased amyloid PET tracer retention, or a proven mutation of an autosomal dominant familial AD gene (structural MRI and FDG-PET alterations were no longer included due to insufficient specificity).
Focusing on core CSF biomarkers, the paper argues that the CSF AD signature has high accuracy in diagnosing AD at a prodromal stage, with ∼90% specificity and sensitivity in AD. In line with this, the Alzheimer’s Biomarker Standardization Initiative published a consensus paper stating that ‘changes in Aβ42, Tau, and pTau allow diagnosis of AD in its prodromal stage’, since ‘when all three classical AD CSF biomarkers are abnormal, a patient with MCI should be defined as having prodromal AD’ [130].
LIMITATIONS FOR CLINICAL INTERPRETATION
The following sections provide a critical review of the scientific background that promoted the evolution of the diagnostic criteria of AD, with special focus on the possible limitations of distinct types of CSF biomarker studies that aim to assess the differential diagnostic performance of core CSF biomarkers in the prodromal phase. Although they are not the focus of this review, we acknowledge the enormous efforts of the Alzheimer’s Association Quality Control program [131, 132], the Penn Biomarker Core of Alzheimer’s Disease Neuroimaging Initiative (ADNI) [30], the Alzheimer’s Biomarker Standardization Initiative [130, 133], the Global Biomarker Standardization Consortium (GBSC) [134], and the early cNEUPRO [135] in the elaboration and standardization of pre-analytical and analytical protocols of CSF biomarker measurements in AD for different analytical platforms, including the singleplex ELISA tests and the multiplex Luminex xMAP and INNO-BIA AlzBio3 immunoassays. Their joint efforts will certainly move biomarker development closer to overcoming current methodological limitations such as the significant inter-laboratory variability and the lack of CSF-based standard reference material, which will promote the establishment of the methodological basis for the research use and probably later the clinical utility of CSF biomarkers in the diagnostics of AD.
As described above, in recent updates of the research diagnostic criteria for AD, arguments can be found supported by numerous references that scientific evidence is available indicating that CSF biomarkers can distinguish AD patients from other dementias with high accuracy, even at the prodromal stage. To analyze the validity of these arguments, we have systematically reviewed the literature in this field, identified the main questions addressed, and critically analyzed the most frequent approaches to answer them in terms of their ability to provide appropriate answers.
CSF biomarker-related studies can generally be divided into three categories. The first, cross-sectional type of study examines differences between the target disease (i.e., AD) and healthy controls and estimates the diagnostic accuracy of biomarkers to distinguish between them; it is outside the scope of this section. The second (and from the current perspective more relevant) type of study examines differences between the target disease and related disorders, in our case between AD and NONAD(s), and estimates the diagnostic accuracy of biomarkers to distinguish between them. This type of cross-sectional study will be referred to throughout this chapter as ‘differential diagnostic studies’. The third main group of studies examines the diagnostic accuracy of biomarkers to identify patients with MCI who have an AD pathological background or are at risk of converting to AD within a certain period of time. These studies are often dedicated to assessing the possibility of the prodromal diagnosis of AD, which is a topic of special importance for adequate patient enrollment in clinical trials to come. As such longitudinal studies use the conversion to dementia as a dichotomized outcome within the defined follow-up period in MCI patients, they will be collectively referred to as ‘conversion studies’.
Differential diagnostic studies
The majority of studies report sensitivity and specificity data, and less frequently predictive values, likelihood ratios, C-indices, and the area under the receiver operating characteristic curve (AUROC) values to characterize the performance of CSF biomarkers in differentiating AD dementia from other dementias. Though such studies provide fairly high accuracy values and are therefore promising, they appear to have several limitations. First of all, a remarkable proportion of studies establish diagnostic groups based solely on clinical consensus diagnosis, without autopsy confirmation. Even if the diagnosis is blinded to the CSF results (which is not always the case), the approach of estimating accuracy values for biomarkers based on diagnoses uncertain enough to drive and urge the development of the same particular biomarkers is on the edge of circular reasoning. Secondly, specificity values from these studies are obtained from diverse comparator groups ranging from isolated diseases (e.g., FTD, DLB, subcortical vascular dementia, etc.) to NONAD as a whole, which makes their collective clinical interpretation rather difficult. From a clinical perspective, accuracy values obtained from one-to-one comparisons (performed by a remarkable proportion of studies) can be useful when the differential diagnosis of a certain case has already been narrowed to AD versus one particular other form of dementia; however, the true predictive values in the real clinical context should be estimated as values controlled for the distinct prevalence rates of AD and the respective comparator condition, and such adjusted values are usually not provided by the studies themselves (Fig. 1).
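The dependence of predictive values on prevalence can be made concrete with a minimal sketch based on Bayes' theorem. The function name and all numerical inputs below are hypothetical and chosen purely for illustration; they are not taken from any of the cited studies.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied in a population where the
    target condition (here: AD) has the given prevalence.

    All three arguments are proportions between 0 and 1.
    """
    if not all(0.0 <= v <= 1.0 for v in (sensitivity, specificity, prevalence)):
        raise ValueError("all inputs must be proportions between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos) if true_pos + false_pos > 0 else float("nan")
    npv = true_neg / (true_neg + false_neg) if true_neg + false_neg > 0 else float("nan")
    return ppv, npv


# Hypothetical test with 85% sensitivity and 85% specificity.
# Case 1: AD and the comparator are equally frequent (as in many study samples).
balanced = predictive_values(0.85, 0.85, 0.50)
# Case 2: AD is four times as frequent as the comparator condition.
ad_common = predictive_values(0.85, 0.85, 0.80)

print(f"balanced sample : PPV={balanced[0]:.3f}, NPV={balanced[1]:.3f}")
print(f"AD more frequent: PPV={ad_common[0]:.3f}, NPV={ad_common[1]:.3f}")
```

With identical sensitivity and specificity, the positive predictive value rises and the negative predictive value falls as AD becomes more frequent relative to the comparator, which is why predictive values taken from a sample with an arbitrary case ratio cannot be transferred directly to clinical practice.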
Because in a real clinical scenario the differential diagnosis often cannot be narrowed to two conditions, the real merit of CSF biomarkers would be to distinguish AD from all other relevant conditions potentially causing dementia, and accuracy values from studies examining AD versus NONAD would therefore be clinically helpful in the diagnosis (Fig. 1). In such a scenario, however, valid specificity and thus predictive values could be provided only if the NONAD group consisted of conditions represented in proportions reflecting their real-life prevalence rates; otherwise the obtained specificity as well as other ‘negative-side’-related parameters such as predictive values are biased and clinically less meaningful (Fig. 1). For example, the overrepresentation of CJD (as a rare differential diagnosis) within a NONAD group can falsely increase the specificity value of the combined use of CSF biomarkers, whereas the disproportionally low presence of vascular dementia, for instance (as a frequent differential diagnosis), could evoke the opposite effect. In fact, studies assembling NONAD groups from diverse conditions in proportions adequately reflecting their relative prevalence rates in the population are scarce. Once the comparator population is representative in terms of its constitution, the obtained predictive values should again be adjusted for the relative prevalence rates of AD versus the all-cause prevalence of the respective NONAD group to provide clinically meaningful and valid estimates.
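The case-mix effect described above follows from the fact that the specificity measured against a pooled NONAD group is the average of the condition-specific specificities weighted by the number of patients contributed by each condition. The sketch below illustrates this; the per-condition specificities and all case mixes are hypothetical values chosen only to reproduce the direction of the bias described in the text, and "VaD" is used here as a shorthand label for vascular dementia.

```python
def pooled_specificity(case_mix, specificity_by_condition):
    """Specificity against a pooled comparator group.

    case_mix maps each condition to its number of patients (or relative
    weight); specificity_by_condition maps each condition to the proportion
    of its patients correctly classified as 'not AD'.
    """
    total = sum(case_mix.values())
    if total <= 0:
        raise ValueError("case mix must contain at least one patient")
    correctly_negative = sum(
        weight * specificity_by_condition[name] for name, weight in case_mix.items()
    )
    return correctly_negative / total


# Hypothetical condition-specific specificities of a CSF biomarker panel.
specificity_by_condition = {"CJD": 0.98, "FTD": 0.80, "DLB": 0.70, "VaD": 0.90}

# Hypothetical case mixes (patients per condition, 100 patients each).
reference_mix = {"CJD": 2, "FTD": 18, "DLB": 30, "VaD": 50}   # assumed prevalence-like mix
cjd_heavy_mix = {"CJD": 30, "FTD": 14, "DLB": 21, "VaD": 35}  # CJD overrepresented
vad_light_mix = {"CJD": 2, "FTD": 33, "DLB": 50, "VaD": 15}   # vascular dementia underrepresented

reference = pooled_specificity(reference_mix, specificity_by_condition)
cjd_heavy = pooled_specificity(cjd_heavy_mix, specificity_by_condition)
vad_light = pooled_specificity(vad_light_mix, specificity_by_condition)

print(f"reference mix: {reference:.4f}")
print(f"CJD-heavy mix: {cjd_heavy:.4f}")
print(f"VaD-light mix: {vad_light:.4f}")
```

Although the biomarker panel performs identically within each condition in all three scenarios, the pooled specificity changes with the composition of the comparator group alone.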
Conversion studies
The main limitations of conversion studies partly overlap with those of differential diagnostic studies. In addition to the complete absence of autopsy-confirmed diagnoses, and the high variability of follow-up periods, a number of concerns are fundamentally related to study design. On the basis of the published conclusions, we have found that conversion-type studies typically address two questions (sometimes merged into one): 1) By how many years does the appearance of the complete (or partial) CSF AD profile precede the conversion to AD dementia in prodromal AD patients?; 2) With what accuracy can CSF biomarkers identify MCI patients who will eventually develop dementia due to AD (i.e., who have prodromal AD)?
While the two questions are related, they are in fact slightly different entities, the first being a disease course-oriented question with in part pathophysiological interest, whereas the second being a prodromal differential diagnosis-oriented question with clinical interest, and their adequate answering requires slightly different study designs and evaluation approaches.
As regards the first, disease course-oriented question, an ideal study design would enroll MCI patients with CSF samples obtained at baseline and document their latency to convert to AD (or any other form of dementia) during follow-up. Using autopsy as the standard of truth, it would exclude patients not meeting the criteria of AD (and include the presumably less frequent patients with an alternative clinical diagnosis who are diagnosed as having AD at autopsy), and would estimate the frequencies of patients with complete or partial AD-type biomarker profiles (i.e., sensitivities) within subgroups stratified by well-defined intervals of the latency to convert to AD. This descriptive approach also enables the estimation of overall as well as latency-to-convert-adjusted sensitivity values, which have different roles in the interpretation of the diagnostic performance of CSF biomarkers (Fig. 2). We are aware of a single study that had a sufficiently long follow-up period (up to almost 12 years) to allow a similar stratification; its clinical diagnoses, however, have not yet been autopsy-confirmed [89]. To our knowledge, no conversion studies have yet been published with autopsy-validated diagnoses. The vast majority of studies estimate sensitivities for the prediction of clinical conversion within significantly shorter, arbitrarily defined follow-up periods (usually 1–3 years).
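The stratified estimation outlined above can be sketched in a few lines. This is only an illustration under stated assumptions: the cohort is invented, the stratum boundaries (2 and 5 years) are arbitrary, and the function name `stratified_sensitivity` is hypothetical. Each record stands for one confirmed AD converter, with the latency to convert in years and whether the baseline CSF showed an AD-type profile.

```python
from bisect import bisect_right


def stratified_sensitivity(converters, edges):
    """Sensitivity of the CSF AD profile within latency-to-convert strata.

    converters: (latency_years, csf_positive) pairs for confirmed AD converters.
    edges: ascending stratum boundaries in years; [2, 5] gives <2, 2-5, >=5.
    Returns (per-stratum sensitivities, overall sensitivity).
    """
    labels = ([f"<{edges[0]}"]
              + [f"{lo}-{hi}" for lo, hi in zip(edges, edges[1:])]
              + [f">={edges[-1]}"])
    counts = {label: [0, 0] for label in labels}  # [CSF-positive, total]
    for latency, positive in converters:
        label = labels[bisect_right(edges, latency)]
        counts[label][0] += int(positive)
        counts[label][1] += 1
    per_stratum = {k: pos / n for k, (pos, n) in counts.items() if n}
    overall = sum(p for p, _ in counts.values()) / sum(n for _, n in counts.values())
    return per_stratum, overall


# Hypothetical converters: (latency to convert in years, AD-type CSF at baseline)
COHORT = [
    (1.0, True), (1.2, True), (1.5, True), (1.8, True),
    (2.5, False), (3.0, True), (4.0, True), (4.5, True),
    (6.0, True), (7.0, False), (8.5, False), (9.0, True),
]

per_stratum, overall = stratified_sensitivity(COHORT, edges=[2, 5])
print(per_stratum, overall)
```

In this invented cohort, a study that stopped after 2 years would observe only the first stratum and report its sensitivity, whereas the overall value over the full follow-up is lower; this is the distinction between latency-adjusted and overall sensitivity drawn in the text.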
As regards the second, prodromal differential diagnosis-oriented question, which aims to determine the accuracy of CSF biomarkers in predicting the diagnosis of AD in the prodromal phase, an ideal study design would enroll consecutive MCI patients with CSF samples obtained at baseline, follow them up through their conversion to different types of dementia (or until death if they remain stable), and confirm (or overrule) their clinical diagnoses by autopsy as the standard of truth. It would then estimate the diagnostic accuracy of CSF biomarkers in differentiating between those who converted to AD (MCI-AD) and those who converted to any other form of dementia (MCI-NONAD), the latter pooled with the patients who remained stable or, in infrequent cases, reverted (‘backwashed’) to normal until death (study design MCI-AD versus MCI-NONAD+MCI-permanently stable, Fig. 2). This design provides a realistic differential diagnostic situation in the prodromal phase, is free from the uncertainty of clinical diagnosis alone, and is theoretically free from the bias of a disproportionate representation of diagnoses within the MCI-NONAD group (a potentially significant bias addressed above for the cross-sectional ‘AD versus NONAD’ studies), because the development of different types of dementia from a heterogeneous group of consecutive MCI patients enrolled without any a priori filtering is ideally random and follows the natural prevalence rates of the diseases. A limitation of this design is the uncertainty of the relative contribution of a particular pathology in cases presenting with mixed pathology at autopsy, an issue that is especially relevant in cases with longer follow-up duration and higher age at death. We are not aware of any published studies with this design. Instead, studies addressing this question can essentially be divided into two subtypes (Fig. 3). 
Both subtypes work with arbitrarily set follow-up periods and without autopsy-validated diagnostic groups, as the majority of enrolled patients are still alive. The first subtype estimates the diagnostic accuracy of biomarkers in distinguishing between MCI patients who clinically convert to AD dementia (usually referred to as MCI-AD or MCI-C) and those who remain stable during the follow-up period (usually referred to as MCI-stable, MCI-NC, or MCI-MCI). Notably, this ‘MCI-AD versus MCI-stable’ design, used in the majority of studies widely cited in support of the putatively excellent accuracy of core CSF AD biomarkers in predicting the diagnosis of AD even in the prodromal phase [59, 136–148], has a severe and fundamental limitation in providing valid and clinically meaningful accuracy measures for prodromal differential diagnosis: it disregards the expectation that a remarkable proportion (∼20–40%) of converters would develop NONAD in a real-life situation, a group that is missing from these analyses. The specificity reported in studies using this design therefore reflects nothing more than the proportion of patients with a negative CSF profile among non-converters, and carries no information about the CSF profile of patients who develop other dementias over the same period. In other words, the ‘MCI-AD versus MCI-stable’ design does not in fact identify prodromal AD, but only provides sensitivity values for the detection of early converters (Fig. 3). The second and recently preferred way of estimating the accuracy of CSF biomarkers in identifying prodromal AD more closely resembles the ideal approach delineated above (Fig. 3). 
This approach recognizes three groups at the end of follow-up: converters to AD (MCI-AD), non-converters (MCI-stable), and converters to a dementia other than AD (MCI-NONAD). It analyzes them in a study design comparing MCI-AD versus MCI-stable+MCI-NONAD in the ROC analysis (the latter pooled group is occasionally referred to collectively as MCI-NONAD) [86, 149–154]. The study with the longest follow-up period published to date (median 9.2 years) reported the following distribution of diagnoses at evaluation: MCI-AD representing 77% of all dementia and 54% of all MCI; MCI-NONAD representing 23% of all dementia and 16% of all MCI (together an overall 70% conversion rate); and MCI-stable representing 30% of all MCI [89]. In contrast, another study group with an overall 35–38% conversion rate from MCI at baseline within 2–3-year follow-up periods described an 89–92% versus 8–11% representation for MCI-AD and MCI-NONAD, respectively [149, 150]. The remarkable differences in the rate of conversion, which naturally depends on the length of the follow-up period and the disease duration at baseline sampling, and in the distribution of converters between MCI-AD and MCI-NONAD together suggest a high inter-study variability in the predictive values of CSF biomarkers, independently of the sensitivity and specificity characteristics of the biomarkers themselves. This should be taken into consideration during meta-analysis and collective interpretation of the data (Fig. 3). This ‘MCI-AD versus MCI-NONAD+MCI-stable’ approach might indeed be useful and relevant when the aim is to enroll patients into clinical trials who are similar in terms of their expected latency to convert to dementia, and to identify prodromal cases in a late phase in which the CSF AD profile is established. 
It is also more appropriate than the ‘MCI-AD versus MCI-stable’ approach, as its values related to the negative side (i.e., specificity, predictive value, etc.) are clinically meaningful. Notably, however, the ability of this approach to accurately assess the differential diagnostic performance of biomarkers is still limited: owing to the heterogeneity of the MCI-stable group, a remarkable proportion of the pooled MCI-NONAD+MCI-stable comparator group may in fact have AD as the underlying pathology at a prodromal stage as well (a proportion that may be as high as 30–40%, depending on the size of the residual MCI-stable group and the length of follow-up). Briefly, this approach does not literally differentiate between prodromal AD and other pre-dementias, but differentiates prodromal AD cases in a fairly advanced stage from all other possible conditions, including late converters to AD (Fig. 3).
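The difference between the two comparator definitions can be made concrete with a small sketch. The group sizes below mirror the 54%/16%/30% distribution quoted above for a cohort of 100 MCI patients, but the numbers of CSF-positive patients in each group and the number of late AD converters hidden in the MCI-stable group are hypothetical, as are the names `Group`, `sensitivity`, and `specificity`.

```python
from dataclasses import dataclass


@dataclass
class Group:
    n: int             # patients in the outcome group at end of follow-up
    csf_positive: int  # of these, patients with an AD-type CSF profile at baseline


def sensitivity(target):
    """Share of the target group (MCI-AD) with an AD-type CSF profile."""
    return target.csf_positive / target.n


def specificity(*comparators):
    """Share of CSF-negative patients in the pooled comparator group(s)."""
    negatives = sum(g.n - g.csf_positive for g in comparators)
    return negatives / sum(g.n for g in comparators)


# Hypothetical cohort of 100 MCI patients; CSF counts are invented.
mci_ad = Group(n=54, csf_positive=46)
mci_nonad = Group(n=16, csf_positive=4)
mci_stable = Group(n=30, csf_positive=12)
latent_ad_in_stable = 15  # assumed late AD converters still labelled MCI-stable

sens = sensitivity(mci_ad)
spec_stable_only = specificity(mci_stable)        # 'MCI-AD versus MCI-stable'
spec_pooled = specificity(mci_stable, mci_nonad)  # 'MCI-AD versus MCI-stable+MCI-NONAD'
latent_share = latent_ad_in_stable / (mci_stable.n + mci_nonad.n)

print(f"sensitivity={sens:.3f}")
print(f"specificity, stable only={spec_stable_only:.3f}")
print(f"specificity, pooled={spec_pooled:.3f}")
print(f"comparator share with underlying AD={latent_share:.3f}")
```

The first specificity is computed without any patient who developed another dementia, so it cannot describe differential diagnostic performance. The second includes those patients, but under the assumed numbers about a third of its comparator group would still carry prodromal AD, which is the residual limitation described in the text.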
Minor but relevant additional concerns regarding conversion-type studies include the high chance that MCI patients who convert to dementia during an a priori defined follow-up period are significantly older than those who do not convert, and/or have a higher female/male ratio, age and female gender both being significant risk factors of AD dementia. Though only a few studies address these issues specifically, such scenarios are in fact quite common [85–87, 155], whereas adjustment for these confounders is usually performed in independent multivariate Cox regression analyses, if at all, and the diagnostic accuracy values themselves frequently remain uncontrolled (Fig. 3). Another potential limitation of conversion studies in terms of providing differential diagnostic estimates is the potentially false presumption that all dementia diseases have similar dynamics regarding the propensity to convert. Diseases with a slower conversion rate (or later dementia onset) than AD will be overrepresented in the MCI-stable group, and vice versa; consequently, the relative proportions of the different conditions within the MCI-stable group change dynamically during the follow-up period (and therefore differ between studies with different follow-up lengths). Together, these factors add further uncertainty to the composition of the MCI-stable group (Fig. 3).
ARE WE ABLE TO ESTABLISH A PRODROMAL DIAGNOSIS?
On the basis of the published data and recent systematic reviews suggesting a high accuracy of combined CSF biomarkers in differentiating between AD and other dementias and proposing that the CSF AD profile can be detected in AD patients at a prodromal stage, the indirect conclusion can logically be drawn that these markers should be able to differentiate prodromal AD patients from MCI patients with other etiological backgrounds. The need for a prodromal differential diagnosis of typical AD is indisputable, as it potentially represents a key to successful clinical trials. Indirect deductions, however, should be based on substantial evidence.
According to our critical review, diagnostic accuracy data on the performance of combined CSF biomarkers in distinguishing between AD and NONAD in the dementia phase in a cross-sectional design are biased to a certain extent, mainly owing to the paucity of autopsy validation and the frequently non-representative assembly of the NONAD groups in terms of real-life prevalence rates (Fig. 1). Nevertheless, there may be arguments suggesting that the diagnostic performance of CSF biomarkers in this respect may still be reassuringly high. Since AD represents the majority of dementia cases (∼60%; i.e., a random demented patient is more likely to have AD than all other diagnoses together), adjustment for the prevalence rates increases the positive predictive value. The report proposing that the clinical diagnosis fairly underestimates the diagnostic performance of CSF biomarkers compared with autopsy diagnosis is also supportive in this respect [156]; however, this observation was not confirmed by others [71].
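The dependence of predictive values on prevalence follows directly from Bayes' rule. As a worked illustration, with sensitivity $Se$, specificity $Sp$, and AD prevalence $p$ (the value $Se = Sp = 0.85$ used below is a hypothetical choice, not a reported estimate):

```latex
\[
\mathrm{PPV} = \frac{Se \cdot p}{Se \cdot p + (1 - Sp)(1 - p)},
\qquad
\mathrm{NPV} = \frac{Sp\,(1 - p)}{Sp\,(1 - p) + (1 - Se)\,p}
\]
% Hypothetical example with Se = Sp = 0.85:
\[
p = 0.5:\quad \mathrm{PPV} = \frac{0.425}{0.425 + 0.075} = 0.85,
\qquad
\mathrm{NPV} = \frac{0.425}{0.425 + 0.075} = 0.85
\]
\[
p = 0.6:\quad \mathrm{PPV} = \frac{0.51}{0.51 + 0.06} \approx 0.895,
\qquad
\mathrm{NPV} = \frac{0.34}{0.34 + 0.09} \approx 0.791
\]
```

At fixed sensitivity and specificity, moving from a balanced study sample to an AD prevalence of about 60% thus raises the positive predictive value, whereas the negative predictive value moves in the opposite direction.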
On the other hand, longitudinal conversion studies have likewise provided partly biased information about the predictive performance of the AD biomarker profile as regards early conversion to AD, mainly because of the omission of MCI-NONAD from the comparator group in the majority of studies addressing this question (‘MCI-AD versus MCI-stable’ design; Fig. 3). While respecting the incontestable clinical significance of studies using the ‘MCI-AD versus MCI-stable+MCI-NONAD’ design, it should be noted that such a design cannot specifically address the differential diagnostic accuracy, owing to the substantial heterogeneity of the comparator group (i.e., the ‘unstable’ MCI-stable group; Fig. 3). Strictly speaking, the true differential diagnostic performance of CSF biomarkers in the prodromal phase cannot be accurately estimated as long as residual MCI-stable cases with the potential to convert to AD later are present in the evaluation; the term ‘the accuracy of AD diagnosis at the prodromal stage’ should therefore be used with caution, as the values obtained from these studies refer at most to ‘the accuracy of identifying early converters to AD’. While this distinction may sound academic, the two terms are essentially different: although the combined use of core CSF biomarkers may indeed identify early converters to AD among all other possible outcomes, their overall differential diagnostic performance in the prodromal phase can be expected to be rather poor. Since Tau and pTau elevations in the CSF appear to be preferentially present in MCI patients within 5 years before clinical conversion to dementia (i.e., in early converters) and not in those who convert later (as opposed to the relatively stable presence of decreased CSF Aβ42 in MCI) [89], the frequency of an altered CSF profile in MCI-AD patients (i.e., the sensitivity) presumably decreases gradually as the latency to convert to dementia increases (Fig. 2). 
This suggests that the overall sensitivity of the biomarker profile in identifying MCI-AD cases among all MCI patients is lower than the level accepted as being of diagnostic value (i.e., 85%). This theoretical concept of gradually decreasing sensitivity is supported by the reported fall in sensitivity for the combination of Tau and Aβ42/pTau from an excellent 95% [86] to a diagnostically insufficient 82% when the median follow-up was extended by 4 years (from 5.2 to 9.2 years) [89], and, in another study, by a fall in sensitivity for the AD-like CSF pattern from 82.9% to 68.0% with a 2-year extension of the follow-up (from 1 to 3 years) [148]. It is also confirmed by the findings of a comprehensive recent meta-analysis of conversion studies that estimated the differences between studies with a follow-up of ≤ 1 year and those with a follow-up of > 1 year [90].
In addition to the limitations of studies addressing the prodromal diagnosis of AD discussed above, the greatest concern regarding arguments that core CSF biomarkers could identify AD in the prodromal phase with high scientific accuracy is that there is at present no meta-analytic study to support them. Indeed, in the past year, Ferreira et al. published a comprehensive meta-analysis of the available data and reported a good 85–86% sensitivity, but only a modest 60–79% specificity for the combined use of core CSF biomarkers in identifying prodromal AD, with the Aβ42/pTau ratio providing the highest diagnostic performance; the meta-analysis, however, jointly analyzed studies with ‘MCI-AD versus MCI-stable’ and ‘MCI-AD versus MCI-stable+MCI-NONAD’ designs [90]. This is in line with our own calculations, which included an even higher number of relevant and additional recent studies [85–89, 154] and yielded a mean sensitivity of ∼85% (range 80–100%), but a mean specificity as low as <70% (range 35–95%) for the combined use of core CSF biomarkers in identifying prodromal AD, with only a slight improvement in specificity when studies with the ‘MCI-AD versus MCI-stable+MCI-NONAD’ design were analyzed separately [86, 154] (Tables 1 and 2, see methods in Supplementary Material). Even though our calculations are not of meta-analytic value, these data together with the recent meta-analysis suggest an insufficient diagnostic accuracy of core CSF biomarkers in identifying prodromal AD, owing to low specificity.
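To give a sense of what a sensitivity of about 85% combined with a specificity of at most 70% means for an individual MCI patient, these figures can be converted into likelihood ratios and post-test probabilities. The sketch below is illustrative only: it takes 0.70 as an upper bound for the mean specificity quoted above, and it assumes a pre-test probability of 54%, borrowed from the share of MCI-AD among all MCI patients in the longest follow-up study; combining values from different sources in this way is an assumption made purely for the example.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a dichotomous test."""
    return sensitivity / (1 - specificity), (1 - sensitivity) / specificity


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


SENSITIVITY = 0.85  # approximate mean sensitivity quoted in the text
SPECIFICITY = 0.70  # upper bound of the "<70%" mean specificity quoted in the text
PRE_TEST = 0.54     # assumption: MCI-AD share of all MCI patients

lr_pos, lr_neg = likelihood_ratios(SENSITIVITY, SPECIFICITY)
p_ad_if_positive = post_test_probability(PRE_TEST, lr_pos)
p_ad_if_negative = post_test_probability(PRE_TEST, lr_neg)

print(f"LR+={lr_pos:.2f}, LR-={lr_neg:.2f}")
print(f"P(AD | CSF positive)={p_ad_if_positive:.3f}")
print(f"P(AD | CSF negative)={p_ad_if_negative:.3f}")
```

Under these assumptions, a positive CSF profile raises the probability of prodromal AD from 54% to roughly 77%, and a negative profile lowers it to roughly 20%, so a substantial residual uncertainty remains in both directions.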
CONCLUDING REMARKS
The available accuracy data in the literature suggest a high performance of the combined use of core CSF biomarkers in differentiating between AD and other dementias, and propose that their characteristic alterations can be detected even at advanced prodromal stages of AD. On this basis, it is tempting to presume their ability to differentiate prodromal AD patients from MCI patients of all causes, a concept reflected by the recent revisions of AD research criteria and a consensus statement. According to our critical review of the widely applied study designs and evaluation approaches, however, the available evidence on the accuracy of CSF biomarkers in differentiating between AD and other dementias, as well as in identifying MCI patients who convert to AD dementia, is biased mainly by a disproportionate representation of differential diagnoses within the NONAD group, the frequent non-adjustment for confounders such as age and gender, the omission of MCI-NONAD cases from the analysis, the potentially dynamic heterogeneity of the MCI-stable group, and, as a common source of confounding, the lack of autopsy confirmation of the clinical diagnosis. Though unbiased direct evidence on the performance of CSF biomarkers in distinguishing between prodromal AD and other pre-dementias is virtually absent, theoretical considerations in line with the reported data suggest that the overall sensitivity may fall below the acceptable value with the gradual extension of follow-up. While accurate identification of early converters to AD among MCI patients would per se be of outstanding clinical relevance, the calculated specificities from the currently available studies do not reach the level of diagnostic accuracy, in line with the results of a recent meta-analysis. 
While further prospective studies with an unbiased evaluation design and consecutive autopsy validation are eagerly awaited, at present there is no substantial scientific evidence to support the use of CSF biomarkers in the differential diagnosis of prodromal AD, either in research or in clinical settings.
Footnotes
ACKNOWLEDGMENTS
This project was supported by the Hungarian Brain Research Program - Grant No. KTIA_13_NAP-A-II/18., the European Union and the State of Hungary, co-financed by the European Social Fund in the framework of TÁMOP 4.2.4. A/2-11-1-2012-0001 ‘National Excellence Program’, TÁMOP-4.2.2/B-10/1-2010-0012, and TÁMOP-4.2.2.A-11/1/KONV-2012-0052. We are grateful to Dr. David Durham for proofreading the manuscript.
