Abstract
Diagnostic distinction of primary progressive aphasias (PPA) remains challenging, in particular for the logopenic (lvPPA) and nonfluent/agrammatic (naPPA) variants. Recent findings highlight that episodic memory deficits appear to discriminate these PPA variants from each other, as only lvPPA perform poorly on these tasks while having underlying amyloid pathology similar to that seen in amnestic dementias like Alzheimer’s disease (AD). Most memory tests are, however, language based and thus potentially confounded by the prevalent language deficits in PPA. The current study investigated this issue across PPA variants by contrasting verbal and non-verbal episodic memory measures while controlling for their performance on a language subtest of a general cognitive screen. A total of 203 participants were included (25 lvPPA; 29 naPPA; 59 AD; 90 controls) and underwent extensive verbal and non-verbal episodic memory testing, with a subset of patients (n = 45) with confirmed amyloid profiles as assessed by Pittsburgh Compound B and PET. The most powerful discriminator between naPPA and lvPPA patients was a non-verbal recall measure (Rey Complex Figure delayed recall), with 81% of PPA patients classified correctly at presentation. Importantly, AD and lvPPA patients performed comparably on this measure, further highlighting the importance of underlying amyloid pathology in episodic memory profiles. The findings demonstrate that non-verbal recall emerges as the best discriminator of lvPPA and naPPA when controlling for language deficits in high load amyloid PPA cases.
INTRODUCTION
Primary progressive aphasia (PPA) is a neurodegenerative clinical syndrome characterized by an isolated language disorder at presentation. International Consensus Diagnostic Criteria for PPA have been proposed [1] with three clinical variants of the condition being recognized: a semantic variant (svPPA), and two non-semantic variants termed nonfluent/agrammatic (naPPA) and logopenic (lvPPA) variants of PPA. Briefly, svPPA presents with loss of object and word meaning among grammatically correct and fluent speech. In contrast, naPPA presents with relatively preserved word comprehension but with apraxia of speech and/or grammatical errors in speech making these patients’ speech labored and ‘nonfluent’. lvPPA presents with anomia and marked difficulties in sentence repetition, amidst variable phonological errors in speech, preserved grammar (unlike naPPA), and single word comprehension (unlike svPPA) [1, 2].
Nevertheless, the distinction between the different PPA subtypes remains clinically challenging, in particular for lvPPA and naPPA. By contrast, clinical diagnosis of svPPA has been established for a long time [3] and its initial symptoms and pattern of progression stand out from the other two variants of PPA. The separation of lvPPA and naPPA is complex due to overlap in symptoms, in that the former make phonological errors and the latter also have errors in speech production on the basis of impaired articulatory planning [4]. In the absence of explicit guidelines on measuring these linguistic features, it becomes clear that there is a need to establish other cognitive biomarkers that can reliably distinguish both presentations.
Such cognitive biomarkers might be particularly relevant for detecting the underlying pathology of both presentations. This matters especially for lvPPA, as amyloid imaging studies with positron emission tomography (PET) have shown that lvPPA patients present with a similar prevalence of high amyloid as patients with Alzheimer’s disease (AD) [5]. By contrast, naPPA patients present with predominantly low amyloid burden, likely due to the prevalent underlying tauopathy. The difference in underlying pathology between naPPA and lvPPA raises the question as to whether lvPPA patients show deficits in episodic memory and orientation similar to AD, in addition to the prevalent language deficits. If so, can performance on particular tests of memory be employed as a proxy measure in discriminating an underlying amyloid (lvPPA) from a non-amyloid (naPPA) PPA syndrome? Two previous studies by our group investigated this issue [6, 7] and indeed showed that only lvPPA patients are impaired on episodic memory testing, whereas naPPA patients perform at a similar level as age-matched controls. Neither study specifically investigated whether the episodic performance in lvPPA might have been affected by the modality (verbal versus non-verbal) of the episodic memory tests employed. In particular, one should assume that verbal episodic test performance might be confounded by the language deficits in lvPPA, whereas non-verbal episodic memory might give a more accurate memory profile in lvPPA. Such findings would have immediate clinical implications, as these tests could possibly substitute for more expensive and invasive methods such as amyloid imaging and lumbar puncture to confirm underlying pathology.
The current study set out to address this issue by contrasting verbal and non-verbal episodic memory measures in lvPPA and naPPA, while controlling for language deficits. Episodic memory performance was also compared to AD and age-matched controls. Importantly, a subset of patients had undergone amyloid imaging with PET to corroborate our findings. We hypothesized that lvPPA would be impaired on episodic memory tests compared to naPPA, confirming previous results. We further hypothesized that lvPPA would perform on an episodic performance level similar to AD and that non-verbal memory measures would allow the highest discrimination between lvPPA and naPPA, when controlling for overall language deficits.
MATERIALS AND METHODS
Patients meeting the consensus criteria were recruited from the Frontotemporal Dementia Clinic at Neuroscience Research Australia in Sydney. Diagnosis of PPA (25 lvPPA, 29 naPPA) was based on the International Consensus Criteria [1] using the clinical methods described in detail elsewhere [5]. A diagnosis of AD was made based on the diagnostic consensus criteria [8]. Forty-five patients (17 lvPPA, 13 naPPA, 15 AD) underwent an amyloid imaging PET study with 11C-Pittsburgh compound B (PiB) at the Austin Health Centre for PET in Melbourne, Australia. PiB retention was determined as the standardized uptake value ratio (SUVR) using the cerebellar cortex as reference region [5]. Neocortical SUVR was defined as the average SUVR of frontal, superior parietal, lateral temporal, lateral occipital, and cingulate regions. A cut-off ratio of 1.5 was used to dichotomize between ‘high’ and ‘low’ neocortical SUVR. All AD and lvPPA cases showed high neocortical SUVR (‘PiB positive’), while all naPPA cases showed low neocortical SUVR (‘PiB negative’). Data from the PiB-PET subset of patients were used in a confirmatory manner, adding to the results obtained from the whole-group analysis. Healthy control subjects (n = 90) were selected from the Frontier volunteer panel and age- and education-matched to the patient groups. This panel comprises healthy elderly participants recruited from the community and screened via a general cognitive test, the Addenbrooke’s Cognitive Examination–Revised (ACE-R), which includes memory, language, visuospatial, and orientation subdomain scores [9]. At testing, these healthy participants also complete a detailed neuropsychological battery, standard questionnaires to screen for neuropsychiatric symptoms, and structural neuroimaging. All healthy controls performed within normal range on these measures. All patients were tested on a general cognitive screen, the ACE-R. 
In particular, the ACE-R language domain score, which was used as a covariate in the statistical analysis, comprises short language tasks covering single word and sentence comprehension, writing, single word and sentence repetition, and reading.
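The PiB-PET dichotomization described above (neocortical SUVR as the average of five regional SUVRs, split at a cut-off ratio of 1.5) can be illustrated with a minimal Python sketch. This is not the authors' imaging pipeline: the function names, the dictionary keys, and the handling of a value falling exactly on the cut-off are assumptions made for illustration only.

```python
from statistics import mean

# Regions averaged into the neocortical SUVR, as listed in the Methods.
# The dictionary key spellings are hypothetical.
NEOCORTICAL_REGIONS = (
    "frontal",
    "superior_parietal",
    "lateral_temporal",
    "lateral_occipital",
    "cingulate",
)
SUVR_CUTOFF = 1.5  # cut-off ratio stated in the text


def neocortical_suvr(regional_suvr):
    """Average the regional SUVRs (each already referenced to cerebellar cortex)."""
    return mean(regional_suvr[region] for region in NEOCORTICAL_REGIONS)


def pib_status(regional_suvr, cutoff=SUVR_CUTOFF):
    """Label a scan 'high' (PiB positive) or 'low' (PiB negative)."""
    # Assumption: a value exactly at the cut-off counts as 'high';
    # the text does not say how such ties were handled.
    return "high" if neocortical_suvr(regional_suvr) >= cutoff else "low"
```

With made-up regional values, a scan whose five regions all read 2.0 would be labeled 'high', and one whose regions all read 1.2 would be labeled 'low'.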
Carers of patients completed a neuropsychiatric screening questionnaire (Cambridge Behavioral Inventory revised, CBI-R [10]). This questionnaire comprises questions regarding abnormalities in the patient’s behavior, memory, mood, eating and sleeping habits, and was used to determine memory problems in the patient as gauged by the carer. Disease severity was assessed using the Rasch score of the Frontotemporal Dementia Rating Scale (FRS) [11]. This scale ranks dementia severity based on different behavioral changes and impairments in everyday functioning that characterize frontotemporal dementia, of which PPA syndromes can be variants, with lower Rasch scores indicating greater disease severity. The South Eastern Sydney and Illawarra Area Health Service and the University of New South Wales human ethics committees approved the study and written informed consent was obtained from the participant or the primary caregiver in accordance with the Declaration of Helsinki.
Neuropsychological tests
Verbal recall and recognition were assessed using the Rey Auditory Verbal Learning Test (RAVLT) [12] and components from the memory domain (address learning, delayed address recall, retrograde memory, and address recognition) of the ACE-R. In the RAVLT, participants are presented with a list of 15 unrelated words that is repeated over five trials. Participants are required to recall as many words as they remember at the end of each trial as well as after a 30-minute delay filled with non-verbal tasks. None of the tasks administered during the delay were non-verbal memory tasks. Following the delayed recall, a recognition component is administered where participants are required to correctly indicate intra-list words among foils. The RAVLT scoring yields the total number of words recalled in each trial, the number of words correctly recognized in the recognition component, as well as two retention indices expressed as percentages: Short Term Percentage Retention (STPR), calculated as immediate recall (A6) / final trial learning (A5) * 100, and Long Term Percentage Retention (LTPR), calculated as delayed recall (A7) / final trial learning (A5) * 100. Only the RAVLT STPR, LTPR, and recognition components were considered here. Non-verbal recall was assessed using a three-minute delayed recall of the Rey-Osterrieth Complex Figure (ROCF) [13]. Here, participants are shown a complex figure and asked to copy it. The figure and its copy are then hidden and participants are required to draw as much of the figure as they can recall, testing incidental learning. After 3 minutes, a delayed recall is also administered and scoring comprises the number of correct components of the complex figure that participants were able to recall [14]. Non-verbal recognition was assessed using the Doors component (set A) from the Doors and People Test [15]. Here, participants are shown colored photographs of 12 ‘target’ doors. 
Each is followed by a presentation of four doors (one target and three foils of the same general label (e.g., barn door)) where the subject picks out the target door. A maximum score of 12 can be attained and if subjects score more than 9 on set A, they proceed to a more difficult set B. Only a small percentage of our patient group managed to proceed to set B, therefore, scores from only set A are considered here. A trained neuropsychologist (E.F.) administered and scored all neuropsychological performances.
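As an illustration of the two RAVLT retention indices defined above, the following Python sketch computes STPR and LTPR from the trial scores A5, A6, and A7. The function names and the decision to raise an error when A5 is zero are assumptions for illustration; the text does not describe how such cases were handled.

```python
def percentage_retention(recall_score, final_learning_trial):
    """Express a recall score as a percentage of final trial learning (A5)."""
    if final_learning_trial <= 0:
        # Assumption: retention is treated as undefined when no words
        # were recalled on the final learning trial.
        raise ValueError("final learning trial (A5) must be greater than zero")
    return 100 * recall_score / final_learning_trial


def ravlt_retention(a5, a6, a7):
    """Return STPR (A6 / A5 * 100) and LTPR (A7 / A5 * 100)."""
    return {
        "STPR": percentage_retention(a6, a5),
        "LTPR": percentage_retention(a7, a5),
    }
```

For a hypothetical participant who recalled 10 words on A5, 8 on A6, and 5 on A7, `ravlt_retention(10, 8, 5)` gives an STPR of 80 and an LTPR of 50.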
Composite scores were calculated for verbal memory recall (comprising the address learning, delayed address recall, and retrograde memory measures of the ACE-R and the STPR and LTPR measures of the RAVLT), verbal memory recognition (comprising ACE-R address recognition and RAVLT list recognition), non-verbal memory recall (comprising the ROCF three-minute delayed recall score), and non-verbal memory recognition (comprising the Doors Test A raw score). Individual raw test scores were converted into percentages (dividing the test score by the maximum possible score and multiplying the result by 100), the tests were grouped into their respective verbal and non-verbal recall and recognition domains, and the percentage values within each domain were averaged.
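The composite score calculation described above (convert each raw score to a percentage of its maximum, then average within a domain) can be sketched in Python as follows. The function names and the input layout, a list of (raw score, maximum score) pairs per domain, are hypothetical, and the example scores are invented rather than taken from the study data.

```python
from statistics import mean


def to_percentage(raw_score, max_score):
    """Convert a raw test score to a percentage of the maximum possible score."""
    return 100 * raw_score / max_score


def composite_score(tests):
    """Average the percentage scores of all tests in one memory domain.

    `tests` is a list of (raw_score, max_score) pairs (hypothetical layout).
    """
    return mean(to_percentage(raw, max_score) for raw, max_score in tests)


# Non-verbal recall holds a single test: ROCF delayed recall (max = 36).
non_verbal_recall = composite_score([(18, 36)])
```

Because the non-verbal recall and non-verbal recognition domains each contain a single test, their composites reduce to that test's percentage score; here an invented ROCF delayed recall of 18 out of 36 yields a composite of 50.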
Statistics
Results were calculated using RStudio v2.13.1. Prior to any analysis, variables were plotted and checked for normality of distribution via Shapiro-Wilk tests. An analysis of variance (ANOVA) was used to compute mean differences for demographic data across groups. We performed an analysis of covariance (ANCOVA) to investigate how overall language deficits and disease severity impacted episodic memory performance. The ANCOVA thus computed group differences in patients for episodic memory data while controlling for disease severity (FRS Rasch score) and overall language impairment (ACE-R language score). The ACE-R language score was used as a covariate because this composite score, which comprises single word and sentence comprehension, writing, single word and sentence repetition, and reading tests, provides an overall impression of language performance in these patients. Post-hoc differences for demographic data were calculated using the false discovery rate method, whereas post-hoc differences between groups for neuropsychological data were calculated using Tukey’s HSD method. These statistical procedures were then repeated on patients divided into high and low neocortical SUVR groups. Logistic regressions were then performed in order to evaluate diagnostic accuracy for both composite and specific neuropsychological measures between PPA groups.
RESULTS
Demographics
Demographics, MMSE, ACE-R, and CBI-R scores for all participants are presented in Table 1. All groups were matched for age, gender, and education and all patient groups were matched for disease duration (all p values >0.1). There was a significant group effect of disease severity as measured by the FRS [F(2,81) = 7.62; p < 0.001]. Severity of disease was significantly greater in AD patients as compared to both naPPA (p < 0.001) and lvPPA (p < 0.05) groups. Importantly, the lvPPA and naPPA groups did not differ significantly on disease severity (p > 0.1). Within patient groups, no significant differences were noted on demographic or disease-related variables between participants who did not undergo and those who underwent PiB-PET scans (p > 0.1).
Cognitive screening
When corrected for disease severity, significant group effects emerged between patient groups for the ACE-R language [F(2,80) = 10.08; p < 0.001] and ACE-R memory [F(2,80) = 4.30; p < 0.05] subdomains, and for overall ACE-R performance [F(2,80) = 7.57; p < 0.001]. The lvPPA group performed more poorly than both AD and naPPA groups on the ACE-R memory and ACE-R total scores (both p values < 0.05), and more poorly than the AD group on the ACE-R language domain (p < 0.001). On the CBI-R memory component, significant group differences emerged [F(3,174) = 57.02; p < 0.001], wherein carers of patients with AD endorsed the greatest number of memory problems, significantly more than carers in both PPA groups (p < 0.01). Carers of patients with lvPPA endorsed significantly more memory complaints (p < 0.001) than carers of naPPA patients, and all patient groups had higher endorsement of memory problems than controls (p < 0.01).
Episodic memory: Composite scores
Composite scores for verbal and non-verbal recall and recognition corrected for disease severity and language impairment are presented in Fig. 1 and Table 2. Significant group effects emerged only for composite verbal recall [F(2,79) = 3.71; p < 0.05] and non-verbal recall [F(2,71) = 20.57; p < 0.001] performance. The naPPA group performed significantly better than the AD group on verbal memory recall (p < 0.05) and significantly better than both lvPPA and AD groups on non-verbal memory recall (p < 0.001). No significant differences were found between AD and lvPPA groups on any of the measures.
Episodic memory: Test specific scores
The composite findings were followed up by an ANCOVA (correcting for disease severity and language impairment) on all neuropsychological test measures to test overall and between-group differences (Table 3). Uncorrected and individually corrected (for disease severity and for language impairment) p-values for the whole group’s performance are depicted in Table 4.
On tests of verbal recall and recognition, all patient groups performed comparably except on the address recall task from the ACE-R [F(2,79) = 8.15; p < 0.001], on which the lvPPA group scored lower than the naPPA group (p < 0.001). Group differences also emerged on the non-verbal ROCF copy [F(2,72) = 3.59; p < 0.05] and recall [F(2,71) = 20.57; p < 0.001] measures. On the ROCF copy, the AD group performed more poorly than the naPPA group (p < 0.05), while on the ROCF recall measure, the lvPPA and AD groups performed comparably, but more poorly than the naPPA group (both p values < 0.001). Post-hoc tests on the non-verbal recognition measure (Doors test) revealed no significant differences between patient groups. No other significant differences between patient groups were found.
Amyloid – episodic memory relationship
Episodic memory performance for the amyloid subgroups (PiB positive versus negative) is depicted in Table 4. The analysis involving patients with PiB scans revealed an identical pattern to the whole-group analysis. When corrected for both disease severity and language impairment, group differences emerged on the ACE-R retrograde memory [F(2,31) = 5.57; p < 0.01] and ROCF delayed recall [F(2,31) = 7.03; p < 0.01] measures. The lvPPA group performed more poorly than the naPPA group only on the ACE-R retrograde memory and ROCF delayed recall measures (both p values < 0.05), and the lvPPA and AD groups performed comparably on all measures (all p values > 0.1). Within each patient group, no significant differences emerged on episodic memory data between participants who underwent PiB-PET and those who did not undergo amyloid imaging (all p values > 0.1).
Logistic regression and cut-off scores
Area under the curve was computed on lvPPA and naPPA patient groups to determine how well composite and specific verbal and non-verbal memory tests could discriminate both groups.
The non-verbal recall composite score emerged as the strongest discriminator between lvPPA and naPPA patients (distinguishing 81.3% of the sample) followed by verbal recall (distinguishing 73.9% of the sample) and verbal recognition (distinguishing 65.7% of the sample) composite scores. The weakest discriminatory measure was the non-verbal recognition composite score (distinguishing 47.9% of the sample).
Among individual tests, the two strongest discriminatory measures were the ROCF delayed recall and the ACE-R address delayed recall, which successfully discriminated 81.3% and 75.3% of the sample at clinic presentation, respectively. For lvPPA and naPPA patients with PiB-PET scans, these figures increased to 84.2% and 83.3% of the sample, respectively, successfully classified at clinic presentation solely on the basis of the ROCF delayed recall and ACE-R address delayed recall measures. Additionally, while the ACE-R retrograde memory measure discriminated 71.2% of lvPPA and naPPA patients in the overall group, this measure successfully discriminated 90.5% of lvPPA and naPPA patients in the PiB-PET subgroup, making it the strongest discriminator between lvPPA and naPPA patients with PiB-PET scans.
We established cut-off scores for the ROCF delayed recall and ACE-R address recall measures to determine their diagnostic specificity for lvPPA and naPPA. A cut-off score of 15 (max = 36) on the ROCF delayed recall and a cut-off of 3 (max = 7) on the ACE-R address delayed recall distinguished both groups significantly in a chi-square test (p < 0.05), with 90% of lvPPA patients and 33% of naPPA patients respectively, performing below both cut-offs.
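The cut-off rule above can be expressed as a short Python sketch that flags patients scoring below both cut-offs and reports the proportion of a group that does so. The function names and the example scores are hypothetical; the strict "less than" comparison follows the wording used in the Discussion, and the sketch is an illustration rather than a validated diagnostic tool.

```python
ROCF_DELAYED_RECALL_CUTOFF = 15  # ROCF delayed recall, max = 36
ACE_R_ADDRESS_RECALL_CUTOFF = 3  # ACE-R address delayed recall, max = 7


def below_both_cutoffs(rocf_delayed_recall, address_delayed_recall):
    """Return True when a patient scores below both cut-offs."""
    return (
        rocf_delayed_recall < ROCF_DELAYED_RECALL_CUTOFF
        and address_delayed_recall < ACE_R_ADDRESS_RECALL_CUTOFF
    )


def proportion_below_both(patients):
    """Share of (ROCF, address recall) score pairs that fall below both cut-offs."""
    flagged = sum(below_both_cutoffs(rocf, address) for rocf, address in patients)
    return flagged / len(patients)


# Invented scores for four patients: (ROCF delayed recall, address delayed recall).
example_group = [(10, 2), (20, 2), (8, 1), (15, 0)]
```

In this invented group, two of the four patients fall below both cut-offs; a score of exactly 15 on the ROCF does not count as below the cut-off.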
DISCUSSION
Our results indicate that episodic memory measures are reliable cognitive biomarkers to distinguish lvPPA and naPPA. After controlling for both disease severity and language impairment, only a delayed recall measure and a non-verbal recall measure appeared to be particularly strong in discriminating both PPA groups. More importantly, non-verbal episodic memory measures, such as the ROCF, were the most sensitive memory measures to dissociate amyloid-based PPA (i.e., lvPPA) from PPA with underlying tauopathy (i.e., naPPA). Thus, the modality of the episodic memory task emerges as an important factor in achieving the most sensitive and specific discrimination of PPA variants when controlling for overall language deficits.
In more detail, we replicated previous findings showing that episodic memory is significantly impaired in lvPPA compared to naPPA and can act as a potential diagnostic biomarker for the underlying amyloid pathology in lvPPA [6, 7]. More importantly, the study findings suggest that non-verbal recall provides the clearest distinction of naPPA and lvPPA patients when controlling for language deficits. Particularly, the ROCF delayed recall measure emerged as the most sensitive discriminator between lvPPA and naPPA, possibly because this measure may not have been directly impacted by language deterioration in the course of the disease. While one may argue that this could reflect a visuospatial deficit in copying and drawing in lvPPA [16], our results clearly indicate that the lvPPA group performed comparably to both naPPA and AD groups on the ROCF copy suggesting operation of a larger deficit in non-verbal recall rather than in visuospatial abilities in lvPPA. While our non-verbal recall measure emerged sensitive, the non-verbal recognition test (Doors Test) was not a powerful discriminator between PPA syndromes, echoing similar results found using this test in patients with AD and frontotemporal dementia [17].
Previously, diagnostic distinction of PPA subtypes has mainly relied on various assessments of language and connected speech to distinguish naPPA from lvPPA [18–21]. Based on these, some have accurately classified as high as 88% of these patients using measures of acoustic speech production [20], up to 87% on a short language battery [21] and additionally up to 81% based on automated structural magnetic resonance imaging algorithms [22]. Classifying lvPPA patients on the basis of language tests has also been acknowledged to produce a false positive rate as high as 14% [21] casting doubt on whether one should rely solely on heavily language-oriented tasks to aid discrimination between these conditions. There is also the problem that PPA patients show overlapping features with lvPPA producing phonological errors and naPPA producing articulatory or phonetic deficits on the basis of apraxia of speech [4]. Non-verbal memory tasks, on the other hand, have been less explicitly tested for their diagnostic utility with few studies reporting deficits on these tests [7, 23] although the current findings replicate prior results.
The finding that groups with underlying Alzheimer pathology (AD and lvPPA) performed comparably on the ROCF delayed recall measure and more poorly than the naPPA group suggests that a non-verbal episodic memory task may aid in discriminating an underlying Alzheimer from a non-Alzheimer pathology based PPA syndrome. This finding is strengthened by identical results from patients with confirmatory PiB-PET scans, where the ROCF delayed recall measure discriminated close to 85% of PPA patients. Further support comes from early evidence that amyloid pathology has strong links with scores on the memory and orientation domains of the ACE-R, deficits on which are especially notable in amyloid-related PPA syndromes [6]. One surprising finding was that the delayed address recall of the ACE-R also discriminated both groups very well, even when corrected for language deficits. Similarly, the ACE-R retrograde memory measure discriminated up to 90% of the PiB-confirmed PPA patients. Thus, retrieval delay and semantic memory might be other important factors, in addition to non-verbal test material, for episodic memory testing in PPA.
The current findings have both clinical and theoretical implications. When testing PPA patients, in addition to the suggested cut-off of 22 on the ACE-R memory domain (in [6]), if patients score less than 15 on the ROCF delayed recall and less than 3 on the ACE-R address delayed recall, it is likely that the patient has lvPPA, which can then be further confirmed using amyloid imaging. The suggested ROCF cut-off may also be particularly helpful for clinicians when assessing PPA patients who do not speak the same languages as them. Theoretically, amyloid deposition in the brain has been noted to be critical and strongly related to episodic memory performance, even in healthy adults [24]. Considering that both AD and lvPPA are characterized by high amyloid in the brain, lvPPA is likely to represent a language-onset variant of AD instead of a clinical variant of PPA [25]. This speculation needs further confirmatory evidence.
In summary, the findings suggest that a non-verbal episodic memory measure and a delayed recall measure would be useful as proxy measures for discriminating between non-semantic PPA syndromes as well as between Alzheimer and non-Alzheimer pathologies when used as part of a comprehensive neuropsychological test battery. Despite these promising findings, there were some limitations in this study. Only a subset of patients had amyloid imaging confirmation of pathology, although the PiB group did not differ from the remaining patients on the observed deficits. This may have reduced the power of the PiB subgroup analysis. Similarly, as a large number of exploratory analyses were performed here, the chance of type-1 errors could have increased. Another limitation of the current study was that we used percentages instead of raw or standardized scores in our composite score calculation. We also did not include svPPA patients or PPA patients with mixed or atypical features. Finally, we used the FRS as a measure of disease severity and, although this scale has been validated in PPA patients [11], it may not have accurately captured disease severity in AD. Future studies should address these issues and attempt to understand whether episodic memory deficits present early in the course of the disease could aid in differentiating lvPPA from AD. This would clearly have implications for more accurate diagnosis of lvPPA, as the current PPA diagnostic criteria exclude pronounced memory impairment in early stages of the disease. Future studies should also investigate: i) the relationship between AD, lvPPA and combinations of underlying tau and amyloid pathophysiology and ii) how neuroanatomical substrates of non-verbal memory deficits in these patients correlate with current imaging evidence demonstrating predominant left-sided temporoparietal atrophy in lvPPA.
ACKNOWLEDGMENTS
MH is supported by Alzheimer’s Research UK, the Wellcome Trust and the Newton Trust. JRH is supported by NHMRC program and ARC Centre of Excellence in Cognition and its Disorders. CEL is supported by DVC postdoctoral Fellowship, University of Sydney, Australia. VLV is a recipient of a NHMRC Research Fellowship and received speaker’s honoraria from GE Healthcare, AstraZeneca, and Piramal Imaging, and consulting honoraria from Novartis and Bayer Healthcare. CR is a recipient of grants from GE Healthcare, Avid Radiopharmaceuticals, Piramal Imaging, and Navidea, NHMRC research grant, and speaker’s honoraria from GE Healthcare, Piramal Imaging, and AstraZeneca.
