Abstract
Background
Few studies have examined written discourse in primary progressive aphasia (PPA), a clinical syndrome due to frontotemporal lobar degeneration (FTLD) or Alzheimer's disease (AD).
Objective
We aim to: (1) determine differences in written discourse in PPA variants and controls using three approaches for analyzing narratives, and (2) make recommendations regarding the clinical utility of these approaches.
Methods
Individuals with PPA and healthy controls wrote descriptions of the Boston Diagnostic Aphasia Examination Cookie Theft Picture (CTP). We hypothesized that written narratives would be characterized by: (1) fewer total words, lower percentages of CTP content units (CUs) and Core Lexicon Words, and lower communication efficiency in all PPA variants compared to controls; (2) fewer content words in semantic variant PPA (svPPA) than in logopenic variant PPA (lvPPA), nonfluent variant PPA (nfvPPA), and controls; and (3) fewer function words in nfvPPA than in lvPPA and controls.
Results
Participants with svPPA had significantly lower percentages of total CTP CUs, CTP noun and verb phrase CUs, and Core Lexicon Words than lvPPA. Overall content unit (CU) profile of lvPPA was more similar to controls than to the other two variants. Written narratives of participants with nfvPPA and svPPA had significantly lower percentages of particles, a class of function words, than lvPPA participants.
Conclusions
Part of speech analysis showed a deficit in function words in nfvPPA. Content unit analysis distinguished svPPA and lvPPA, and is easily incorporated into the clinical environment when spoken data are scarce or difficult to obtain because of speech production impairments.
Keywords
Introduction
Primary progressive aphasia (PPA) is a neurodegenerative clinical syndrome, characterized by the insidious onset of language impairments with relative sparing of other cognitive domains in the early stages of disease progression.1–3 Three variants are commonly recognized: semantic variant PPA (svPPA), nonfluent variant PPA (nfvPPA), and logopenic variant PPA (lvPPA). 4 PPA is diagnosed based on clinical features, patterns of atrophy in the left temporal, parietal, and/or frontal cortices, and underlying neuropathologies, typically frontotemporal lobar degeneration (FTLD) in svPPA and nfvPPA, and Alzheimer's disease (AD) in lvPPA.5–12
Spoken discourse has been investigated extensively in individuals with all PPA variants using various elicitation techniques, including picture description,13–22 semi-structured interviews,14,15,19,23 and wordless picture books.16,19,24–30 Analysis of spoken descriptions of the Cookie Theft Picture (CTP) from the Boston Diagnostic Aphasia Examination 31 revealed grammatical impairment in individuals with nfvPPA compared to other PPA variants, with shorter, less complex, and fewer well-formed sentences in nfvPPA than in lvPPA and svPPA.16,18 Similarly, in semi-structured interviews, there were more frequent grammatical errors in nfvPPA than in lvPPA and svPPA. 15 Themistocleous et al. 20 performed an automated analysis of spoken CTP descriptions, which revealed that participants with nfvPPA produced proportionally more nouns than participants with lvPPA and svPPA, and fewer verbs than svPPA, although the latter was not significant; however, the content-to-function word ratio distinguished individuals with nfvPPA from the other variants. Analysis of spoken descriptions of the Picnic Scene from the Western Aphasia Battery 32 revealed slow rate of speech, frequent phonemic errors, syntactic errors, reduced embeddings, and short mean length of utterance in nfvPPA. In contrast, there was normal rate of speech and few speech or syntactic errors, but increased proportions of closed class [versus open class (e.g., nouns)] words, and use of high frequency nouns in svPPA. In lvPPA, speech rate was intermediate between the other two variants, but distortions and syntactic errors were less common than in nfvPPA. The greatest numbers of false starts, filled pauses, and repaired sequences occurred in lvPPA compared to nfvPPA and svPPA. 13
In spoken descriptions of the Picnic Scene by Italian-speaking individuals, nfvPPA was distinguished from lvPPA by reduced total speech duration, fewer utterances, and a higher use of open class words. 33
Much less attention has been directed to written discourse (see Kim and colleagues 34 for review), even though written communication is increasingly vital in everyday life with the expanded use of email, text messaging, social media, and online shopping and banking, and despite the current diagnostic criteria specifying associated speech-language characteristics that are captured through written language, specifically surface dyslexia/dysgraphia (regularization of irregular words results in phonologically plausible errors, such as “ocean” spelled “oshun”) in svPPA. 4 Although beyond the scope of this paper, researchers have investigated whether spelling can characterize the PPA variants and have found that nfvPPA and svPPA have distinctive spelling profiles, with some exceptions and overlap of the spelling error types emerging with disease progression.35,36 Surface dysgraphia (reliance on sublexical phonology-to-orthography conversion, phonologically plausible errors, more difficulty spelling words with low phoneme-to-grapheme conversion probabilities, such as irregular words like “yacht”) is typical of svPPA; phonological dysgraphia (phonologically implausible errors, more difficulty spelling pseudowords than words, better performance on regular than exception phoneme-to-grapheme mapping) is characteristic of nfvPPA; a mixture of both spelling patterns is seen in lvPPA.35–38 Neophytou et al. 39 employed an automated analysis of spelling using pairwise comparisons to accurately classify 64% of their participants with PPA based on spelling profiles in a real-life scenario in which there was no prior knowledge of PPA variant type.
Few studies have examined written discourse in PPA.40–48 An advantage of studying written discourse is that writing is not confounded by motor speech impairments [i.e., apraxia of speech (AOS), dysarthria] that can be encountered in PPA. While comprehensive assessment of PPA requires evaluation in both spoken and written domains, evaluation of written discourse can enable identification of language impairments that may be masked in spoken discourse by speech production deficits. Apraxia of speech and agrammatism are core features of nfvPPA.3,4,26,49 Although agrammatism (progressive agrammatic aphasia50,51) and apraxia may present in isolation [primary progressive apraxia of speech (PPAOS52,53)], the most common presentation is concomitant AOS and agrammatism, 54 with AOS being more frequent or prominent than agrammatism and one of the first symptoms that contributes to timely diagnosis.55–59 Studies of nfvPPA, or PPA unspecified as to type, reported a median AOS prevalence of 78% across 162 aggregated cases, and in about 20% of cases, AOS was the primary and sometimes only deficit with little or no evidence of aphasia. 60 Mutism, the absence of speech or articulatory movement due to AOS, can present early in the disease course in nfvPPA while other cognitive functions are spared. 56 AOS can also occasionally present in lvPPA.13,59,61 Less commonly, dysarthria (typically spastic and hypokinetic subtypes) occurs in PPA.13,61–63 Motor speech impairment can complicate language-based analysis of spoken discourse because distinguishing phonetic (AOS) and phonemic speech sound errors can be challenging even for experienced clinicians, 64 and because speech intelligibility can be compromised due to dysarthria.
Most studies of written discourse employed picture scene descriptions. For example, Sitek et al. 41 compared descriptive writing using three picture scenes (the CTP, the picture scene from the Frenchay Aphasia Screening Test, 65 and the Warrington beach scene 66 ) in 10 Polish-speaking individuals with nfvPPA and 17 individuals with progressive supranuclear palsy (PSP). Sitek et al. 42 used this same approach to analyze written discourse in Polish-speaking patients (nine with lvPPA, 13 with AD, and 13 with MCI).
In three other studies, oral and written discourse were analyzed.38,41,42 Graham et al. 40 compared spoken and written CTP descriptions in 14 individuals with nfvPPA (relative to controls). Tetzloff et al. 43 compared spoken and written descriptions of the Picnic Scene from the Western Aphasia Battery (WAB) 32 in eight individuals with agrammatic primary progressive aphasia (agPPA) defined as predominant agrammatism, 21 with agrammatism in the context of dominant apraxia of speech (DAOS), and 13 with primary progressive apraxia of speech (PPAOS). Tetzloff et al. 48 subsequently compared written descriptions of the WAB Picnic Scene from 24 patients with AOS+ primary agrammatic aphasia (PAA), 24 patients with PPAOS, and 24 healthy controls. Josephy-Hernandez et al. 44 analyzed total units (i.e., words, non-words, false starts) and content units in spoken and written descriptions of the WAB Picnic Scene in healthy controls and individuals with PPA (28 nfvPPA, 30 lvPPA, 17 svPPA). Code et al. 45 conducted a longitudinal analysis of written language in an individual with nfvPPA. Heitkamp et al. 46 evaluated diary entries written over time by an individual of Swiss origin with svPPA. Hwang et al. 47 examined two works of historical fiction authored by an individual with svPPA.
The studies above yielded the following observations. With regard to nfvPPA, analysis of written picture scene descriptions reveals manifestations of agrammatism without the confounding factor of speech production deficits, such as apraxia of speech and dysarthria. For example, Graham et al. 40 reported that their patients had no evidence of agrammatism or reduced verb production in spoken discourse. However, they showed agrammatism, reduced verb production, and overall telegraphic output in written narratives. Sitek et al. 41 found that there were no differences in the distributions of nouns and verbs in the written picture descriptions in nfvPPA and PSP, but that participants with PSP wrote longer narratives and had more letter and diacritic mark omissions than participants with nfvPPA. Tetzloff et al. 43 found a lower mean length of sentence/utterance, fewer grammatical utterances, more non-utterances, more syntactic and semantic errors, and fewer complex sentences/utterances in nfvPPA (agPPA and DAOS) than PPAOS in written and spoken narratives, as well as fewer correct verbs and nouns in spoken than written narratives. Tetzloff et al. 48 found that AOS + PAA group showed significantly reduced number of words compared to PPAOS patients and controls, reduced number of sentences compared to controls, and grammatical deficits. Longitudinal analysis of picture description revealed a decline in the total number of words, verbs, and function words over time in nfvPPA. 45
With regard to the other PPA variants, Sitek et al. 42 found that written picture descriptions were comparable across lvPPA, AD, and MCI, but letter insertion errors distinguished lvPPA from AD and MCI, and frequent verb use distinguished lvPPA from AD. Analysis of text from authors’ books and their diary entries over time showed less varied vocabulary, increased use of high-frequency words, and the emergence of simplified sentence structure over time in svPPA.53,54
Josephy-Hernandez et al. 44 analyzed content units (words that are intelligible in context, accurate concerning the picture, and relevant to and informative about the content of the picture) rather than parts of speech, and compared spoken and written discourse. Participants with lvPPA and svPPA wrote fewer content units than controls. There were fewer total content units in written versus spoken narratives in all PPA variants. Participants with lvPPA and svPPA produced fewer written content units than spoken content units (see Table 1 for a summary).
Table 1. Summary of findings for oral and written discourse in PPA.
PPA: primary progressive aphasia; lvPPA: logopenic variant PPA; nfvPPA: nonfluent variant PPA; svPPA: semantic variant PPA; PSP: progressive supranuclear palsy; agPPA: agrammatic primary progressive aphasia; DAOS: agrammatism in the context of dominant apraxia of speech; PPAOS: primary progressive apraxia of speech; PAA: primary agrammatic aphasia.
In summary, only Josephy-Hernandez et al., 44 Heitkamp et al., 46 and Hwang et al. 47 included individuals with svPPA in their analyses (total n = 19). While Josephy-Hernandez et al. 44 compared total units and content units in spoken and written discourse in healthy controls and individuals with all PPA variants, they did not analyze parts of speech. Thus, little is known about written discourse in svPPA. No studies compare different written discourse measures (e.g., parts of speech, content units) in their capacity to distinguish between all PPA variants. Most previous studies have concentrated on nouns and verbs, leaving the analysis of other word classes relatively underexplored (e.g., particles that are relevant in this paper). Furthermore, it is not known whether spoken language deficits that characterize the variants (i.e., fewer function words in nfvPPA, fewer nouns in lvPPA and svPPA) are modality-specific. Findings regarding spoken discourse cannot be generalized to written discourse because these modalities, while coexisting and complementary, are developmentally and evolutionarily independent.67,68
In the present study, we aimed to: (1) address the gap in knowledge regarding deficits in parts of speech in all PPA variants manifested in written discourse, (2) expand the analysis to include commonly used approaches, such as the CTP content units (CUs), 69 and (3) make recommendations regarding the clinical utility of different analysis approaches. We examined several written discourse measures in PPA variants and neurotypical controls using three tools:
(i) lexical and morphological measures, including total words, content and function words, and part of speech; (ii) CTP CUs and communication efficiency [defined as syllables per content unit (CU), Syll/CU]69,70; and (iii) Core Lexicon Words. 71 We chose these measures because: (i) lexical and morphological measures will allow us to investigate how known differences among the PPA variants (i.e., fewer function words in nfvPPA, fewer nouns in lvPPA and svPPA) are manifested in written discourse; (ii) CTP CUs and communication efficiency will allow us to investigate the proficiency of written discourse, which, unlike proficiency in the spoken domain, is not compromised by motor speech impairment; and (iii) Core Lexicon Words will allow us to compare the PPA variants using a clinician-friendly tool that captures word retrieval, an impairment common to all PPA variants, albeit with differences in the nature and severity of word retrieval difficulty (i.e., prominent anomia in svPPA due to impaired semantic knowledge 4 ; greater verb naming deficits in nfvPPA72–77 and greater noun naming deficits in svPPA and lvPPA74–77). 78 Regarding aims 1 and 2, based on previous literature describing the speech and language profiles of the PPA variants, 4 we hypothesized that written picture narratives would be characterized by: (1) fewer total words (content and function words), lower percentages of CTP CUs and Core Lexicon Words, and lower communication efficiency (Syll/CU) in all PPA variants compared to controls who do not have speech and language impairments; (2) fewer content words in svPPA than in lvPPA, nfvPPA, and controls because anomia and associated impaired semantic knowledge are the predominant features of svPPA79,80; and (3) fewer function words in nfvPPA than in lvPPA and controls due to greater impairment for verbs versus nouns72–76,81 and agrammatism in nfvPPA. 4
Methods
Participants
Participants were individuals with PPA (n = 49) who were included in previous research studies or seen in an outpatient neurology clinic. Twenty-one participants (16 with lvPPA, five with nfvPPA) were enrolled in clinical trials (tDCS Intervention in Primary Progressive Aphasia, ClinicalTrials.gov Identifier NCT02606422; Targeting Language-specific and Executive-control Networks with Transcranial Direct Current Stimulation in Logopenic Variant PPA, ClinicalTrials.gov Identifier NCT03887481). Sixteen individuals with svPPA and 12 individuals with nfvPPA were seen in an outpatient neurology clinic between June 2012 and December 2017. Eighteen healthy controls participated in a longitudinal study of stroke recovery.
The research was planned, conducted, and recorded according to the principles of the Declaration of Helsinki, seventh edition. The participants in the clinical trials or their legally authorized representatives provided written informed consent. Ethical approval was obtained from the Johns Hopkins University School of Medicine Institutional Review Board (NA_00071337; NA_00042097; IRB00201027) and the University of South Florida Institutional Review Board (IRB005068).
For clinical patients, ethical approval for chart review, data extraction, and data entry into a secured, de-identified spreadsheet from the electronic medical record was obtained from the Johns Hopkins University School of Medicine Institutional Review Board which approved this study as exempt research for which consent is not required (IRB00241582). Overall, 49 individuals with PPA (mean age = 68.53 years ± 7.90; 51% women) completed written narratives of the CTP. This PPA group included 16 with lvPPA, 17 with nfvPPA, and 16 with svPPA. PPA was determined by their medical history, comprehensive neurological examination, neuroimaging (if available), and a battery of cognitive/language tests. Criteria from the international consensus were used to classify PPA. 4 The particular variant was diagnosed based on expressive and receptive language characteristics, including the key characteristics of impaired repetition in lvPPA, agrammatism and/or apraxia of speech in nfvPPA, and impaired word comprehension in svPPA. Written discourse, our measure of interest, was not used to classify the variants. Eighteen healthy controls (mean age = 58.83 years ± 10.34; 39% women) also completed CTP written narratives. These individuals had no history of cognitive changes, no subjective memory complaints, and no pathological findings on neuroimaging (if available). Performance was within normal limits on the Mini-Mental State Examination (mean = 29.22 ± 0.81; ≥27.6/30 indicates normal cognition). 82 There were no sex differences across PPA variants and healthy controls. 
After correcting for multiple comparisons, the PPA variants were not significantly different from one another in terms of age and years of education; however, control participants were significantly younger than individuals with nfvPPA and had significantly more years of education than participants with svPPA (Table 2; see Supplemental Table 1 for medians and mean ranks for age and education for PPA variants and participants overall).
Table 2. Age, education, and sex for PPA variants and participants overall.
F: female; SD: standard deviation; y: years; PPA: primary progressive aphasia; lvPPA: logopenic variant primary progressive aphasia; nfvPPA: nonfluent variant primary progressive aphasia; svPPA: semantic variant primary progressive aphasia; *p values were calculated using Kruskal-Wallis Test with Dunn's Post Hoc Comparisons for age and education and using Pearson chi-square for sex. **p values were adjusted for six comparisons (controls-lv, controls-nfv, controls-sv, lv-nfv, lv-sv, nfv-sv); pbonf = 0.05/6 = 0.008. § missing data for 1 subject; ‡ missing data for 2 subjects.
The PPA variant groups were compared on tests of confrontation naming, important for performance on the written discourse, and a measure of overall function. Individuals with svPPA scored lower on the short version of the Boston Naming Test (BNT),83,84 the Hopkins Action Naming Assessment (HANA), 85 and the Pyramids and Palm Trees Test (PPTT),86,87 than individuals with lvPPA and nfvPPA. They also scored lower on the Kissing and Dancing Test 78 than individuals with nfvPPA. The Global Score on the Clinical Dementia Rating (CDR)88,89 was significantly higher (reflecting greater impairment) for svPPA than nfvPPA; however, the mean scores for all variants reflected very mild/mild cognitive impairment (Table 3; see Supplemental Table 2 for medians and mean ranks for cognitive/language test scores for PPA variants).
Table 3. Cognitive/language test scores for PPA variants.
SD: standard deviation; PPA: primary progressive aphasia; lvPPA: logopenic variant primary progressive aphasia; nfvPPA: nonfluent variant primary progressive aphasia; svPPA: semantic variant primary progressive aphasia; *p values were calculated using Kruskal-Wallis Test with Dunn's Post Hoc Comparisons. **p values were adjusted for three comparisons (lv-nfv, lv-sv, nfv-sv); pbonf = 0.05/3 = 0.016. § missing data for 1 subject; ‡ missing data for 2 subjects; † missing data for 3 subjects.
Procedures
All participants completed a written description of the CTP. Written picture descriptions were obtained at the initial testing timepoint for those participants who were in clinical trials and at the baseline clinical evaluation for those participants who were seen in the outpatient neurology clinic. Participants were asked to wear their eyeglasses (if needed and available) for the task. They were shown the CTP and were instructed, “Please look at the picture and write a description of everything you see happening in this picture.” There were no time limits.
Handwritten picture descriptions were typed into Word documents by one co-author (KMS). A second co-author (DCT) with experience in language sample collection and analysis independently reviewed the transcriptions to ensure accuracy. Discrepancies were resolved by reviewing handwritten descriptions to achieve consensus. Written discourse was analyzed as described below.
Analysis 1: part of speech
Written narratives were analyzed and scored for measures of lexical, phonological, and morphological characteristics using an online platform (Open Brain AI; https://openbrainai.com). 90 Open Brain AI uses computational analysis of spoken and written language for research and clinical purposes. Automated lexical (i.e., total number of words, total number of content words, total number of function words), phonological (i.e., total number of syllables), and morphological measures (i.e., number of adjectives, adverbs, articles, conjunctions, nouns, particles, prepositions, pronouns, and verbs; percentage of adjectives, adverbs, articles, conjunctions, nouns, particles, prepositions, pronouns, and verbs of the total number of words) were extracted.
The part of speech tagging from Open Brain AI was verified manually and by analyzing the written discourse samples using a part-of-speech tagger tool from an online website (https://textinspector.com).
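The percentage measures described above can be illustrated with a minimal sketch. It assumes the words have already been tagged (by Open Brain AI or by hand); the tag names, the `pos_profile` function, and the sample sentence are hypothetical and are not part of the platform or of the participant data.

```python
from collections import Counter

# Word classes as grouped in this paper: content vs. function words.
CONTENT_TAGS = {"adjective", "adverb", "noun", "verb"}
FUNCTION_TAGS = {"article", "conjunction", "particle", "preposition", "pronoun"}


def pos_profile(tagged_words):
    """Summarize a tagged sample given (word, tag) pairs.

    Returns total words, content and function word counts, and the
    percentage of each part of speech out of the total number of words.
    """
    total = len(tagged_words)
    if total == 0:
        raise ValueError("sample contains no words")
    counts = Counter(tag for _, tag in tagged_words)
    return {
        "total_words": total,
        "content_words": sum(counts[t] for t in CONTENT_TAGS),
        "function_words": sum(counts[t] for t in FUNCTION_TAGS),
        "percentages": {tag: 100.0 * n / total for tag, n in counts.items()},
    }


# Hypothetical hand-tagged sample (not participant data).
sample = [
    ("the", "article"), ("boy", "noun"), ("is", "verb"),
    ("trying", "verb"), ("to", "particle"), ("get", "verb"),
    ("the", "article"), ("cookies", "noun"),
]
profile = pos_profile(sample)
```

In this illustrative sample, the infinitival "to" is the only particle, so particles make up one of eight words.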
Analysis 2: content units and communication efficiency
Written CTP descriptions were also evaluated using the analysis developed by Yorkston and Beukelman 69 and Craig et al. 70 These measures included total CTP CUs [concepts mentioned by at least one healthy control describing the CTP (a total of 52)], which capture the relevance of written output to the pictured scene, and communication efficiency (syllables per CU, Syll/CU), which measures digression and irrelevancy in written output. 91 CTP CUs were counted by reviewing each written sample and identifying any of the CUs from the list of 52. Repeated CUs were counted once. Percentages of CUs (out of 52) were calculated.
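As a worked illustration of the two measures, the sketch below computes the percentage of CUs (out of 52) and Syll/CU for a single sample. The function name `cu_measures`, the CU labels, and the syllable count are hypothetical; the actual CU list is the one given by the cited scoring sources.

```python
TOTAL_CTP_CUS = 52  # size of the CU list stated in the text


def cu_measures(identified_cus, total_syllables, total_cus=TOTAL_CTP_CUS):
    """Return (percentage of CUs, syllables per CU) for one written sample.

    identified_cus: CU labels found in the sample; repeats count once.
    total_syllables: syllable count for the whole sample.
    """
    unique_cus = set(identified_cus)
    percent = 100.0 * len(unique_cus) / total_cus
    # Syll/CU is undefined when no CU was produced.
    syll_per_cu = total_syllables / len(unique_cus) if unique_cus else None
    return percent, syll_per_cu


# Hypothetical sample: 13 distinct CUs (one repeated) and 65 syllables.
found = ["boy", "girl", "mother", "cookie", "jar", "stool", "sink",
         "water", "falling", "washing", "overflowing", "kitchen",
         "window", "boy"]
percent_cus, syll_cu = cu_measures(found, total_syllables=65)
```

A lower Syll/CU value means that fewer syllables were needed per relevant concept, that is, more efficient discourse.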
Analysis 3: core lexicon
Additionally, a Core Lexicon analysis was performed using a predetermined word list developed by Dalton et al. 71 The Core Lexicon Words are single-word lexical items commonly produced by healthy individuals in their descriptions of the CTP. The list consists of 26 words (16 content words and 10 function words). Each written discourse sample was manually analyzed for the presence or absence of each item, and the percentages of the 26 possible items were calculated.
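The presence/absence scoring can be sketched as follows. In the study, scoring was done manually; this automated version is only an approximation that matches exact word forms, and the short `placeholder_lexicon` below is invented for illustration and is not the 26-item list of Dalton et al.

```python
import re


def core_lexicon_percentage(text, core_lexicon):
    """Percentage of core lexicon items present at least once in the text."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    present = [word for word in core_lexicon if word in tokens]
    return 100.0 * len(present) / len(core_lexicon), present


# PLACEHOLDER list for illustration only; the actual 26-item list
# (16 content words, 10 function words) is given by Dalton et al.
placeholder_lexicon = ["boy", "cookie", "mother", "water",
                       "the", "and", "is", "on"]

pct, present = core_lexicon_percentage(
    "The boy is on the stool and the water is running.",
    placeholder_lexicon,
)
```

Because each item is scored as present or absent, repeated uses of a word do not raise the score.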
In summary, we measured:
For Analysis 1:
(i) number of total words, (ii) number of total syllables, (iii) number of all content words (i.e., adjectives, adverbs, nouns, verbs), (iv) number of all function words [i.e., articles, conjunctions, particles (a class of function words that includes adverbial particles, the infinitival particle “to”, discourse particles, negative particles), prepositions, pronouns], and (v) percentages of parts of speech (i.e., percentages of adjectives, adverbs, articles, conjunctions, nouns, particles, prepositions, pronouns, verbs), defined as the percentage of each part of speech out of the number of total words.
For Analysis 2:
(vi) percentage of CTP CUs (of 52 total CUs) (i.e., content units relevant to the CTP task according to Yorkston and Beukelman 69 and Craig et al. 70 ), (vii) percentages of adjective CTP CUs, noun CTP CUs, verb CTP CUs, prepositional phrase CTP CUs, noun phrase CTP CUs, and verb phrase CTP CUs (of 52 total CUs), and (viii) communication efficiency, defined as the number of syllables divided by the number of CTP CUs according to standardized scoring, 92 with lower syllables/CU (Syll/CU) reflecting more efficient discourse.
For Analysis 3:
(ix) percentage of Core Lexicon Words (of 26 Core Lexicon Words). 71
The Kruskal-Wallis rank test evaluated the differences in the distribution of age, education, and our measures of interest in the written narratives among the PPA variants and healthy controls. The Kruskal-Wallis rank test was also used in the analysis of single-word and phrase CUs. Pairwise comparisons were performed using Dunn's Post Hoc Comparisons. A Bonferroni correction was applied to account for multiple comparisons. The adjusted p-values for significance are included in the table legends. Differences in the distribution of sex among the PPA variants and healthy controls were tested with Pearson chi-square. JASP 0.18.3 was used for statistical analysis.
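For readers who want to see how Dunn's pairwise comparisons and the Bonferroni threshold fit together, the following is a minimal sketch using only the Python standard library. It is not the JASP implementation used in this study; the function names and the scores are hypothetical, and the z statistic follows the standard Dunn formula with a tie correction.

```python
from itertools import combinations
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def dunn_pairwise(groups, alpha=0.05):
    """Dunn's pairwise z tests on pooled ranks with a Bonferroni threshold."""
    names = list(groups)
    pooled = [v for name in names for v in groups[name]]
    n_total = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)
    mean_rank, start = {}, 0
    for name in names:
        size = len(groups[name])
        mean_rank[name] = sum(ranks[start:start + size]) / size
        start += size
    tie_term = sum(t**3 - t for t in tie_sizes) / (12 * (n_total - 1))
    variance = n_total * (n_total + 1) / 12 - tie_term
    pairs = list(combinations(names, 2))
    threshold = alpha / len(pairs)  # e.g., 0.05/6 for four groups
    results = {}
    for a, b in pairs:
        se = (variance * (1 / len(groups[a]) + 1 / len(groups[b]))) ** 0.5
        z = (mean_rank[a] - mean_rank[b]) / se
        p = 2 * (1 - NormalDist().cdf(abs(z)))
        results[(a, b)] = (z, p, p < threshold)
    return threshold, results


# Hypothetical scores (not study data) for the four groups.
scores = {
    "controls": [31, 28, 35, 30],
    "lvPPA": [27, 29, 26, 33],
    "nfvPPA": [15, 18, 12, 20],
    "svPPA": [10, 14, 9, 16],
}
threshold, results = dunn_pairwise(scores)
```

With four groups there are six pairwise comparisons, so each unadjusted p value is compared against 0.05/6, which matches the threshold given in the table legends.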
Results
Analysis 1: part of speech
As shown in Table 4, after adjusting for multiple comparisons, the total number of words, the total number of syllables, and the total number of content words were significantly lower for svPPA than lvPPA. There was a trend for fewer total content words for svPPA than controls (p = 0.010; pbonf = 0.05/6 = 0.008).
Table 4. Analysis 1 results for total words, total syllables, content words, function words, and part of speech.
SD: standard deviation; PPA: primary progressive aphasia; lvPPA: logopenic variant primary progressive aphasia; nfvPPA: nonfluent variant primary progressive aphasia; svPPA: semantic variant primary progressive aphasia; *p values were calculated using Kruskal-Wallis Test with Dunn's Post Hoc Comparisons. **p values were adjusted for six comparisons (controls-lv, controls-nfv, controls-sv, lv-nfv, lv-sv, nfv-sv); pbonf = 0.05/6 = 0.008.
The percentages of particles were significantly lower for nfvPPA and svPPA than lvPPA. There was a trend for lower percentages of verbs in nfvPPA than in controls (p = 0.011, pbonf = 0.05/6 = 0.008) and in lvPPA (p = 0.090, pbonf = 0.05/6 = 0.008). The percentage of verbs was lowest for nfvPPA (14.56%) followed by svPPA (16.18%), lvPPA (18.15%), and then controls (19.29%) (Table 4; Figure 1(a)–(d); see Supplemental Table 3 for medians and mean ranks for total words, total syllables, content words, function words, and part of speech).

Figure 1. Examples of written Cookie Theft Picture descriptions by a healthy control individual (a) and by individuals with PPA (b-d). Colors indicate different parts of speech. (a) Example from a Healthy Control Participant. (b) Example from a Participant with Logopenic Variant PPA. (c) Example from a Participant with Nonfluent Variant PPA. (d) Example from a Participant with Semantic Variant PPA (Color version available online).
Analysis 2: content units and communication efficiency
After adjusting for multiple comparisons, percentages of CTP CUs were significantly lower for nfvPPA and svPPA than controls, but that was not the case for lvPPA. There was also a significantly lower percentage of CTP CUs for svPPA than lvPPA. As expected, communication efficiency (Syll/CU) was significantly lower for lvPPA, nfvPPA, and svPPA than controls (Table 5; see Supplemental Table 4 for medians and mean ranks for content units, communication efficiency, and Core Lexicon).
Table 5. Analysis 2 results for content units and communication efficiency.
SD: standard deviation; PPA: primary progressive aphasia; lvPPA: logopenic variant primary progressive aphasia; nfvPPA: nonfluent variant primary progressive aphasia; svPPA: semantic variant primary progressive aphasia; *p values were calculated using Kruskal-Wallis Test with Dunn's Post Hoc Comparisons. **p values were adjusted for six comparisons (controls-lv, controls-nfv, controls-sv, lv-nfv, lv-sv, nfv-sv); pbonf = 0.05/6 = 0.008.
We expanded the Cookie Theft analysis to compare PPA variants and controls for single-word and phrase CUs.
We also compared the composite distributions, that is, the percentages of each type of CUs (nouns, verbs, adjectives, noun phrases, verb phrases, prepositional phrases) among healthy controls and the PPA variants (Figure 2(a)–(d)). The composite distributions were not significantly different for controls and all PPA variants (Pearson chi-square = 14.04, df = 15, p = 0.523); that is, among healthy controls and PPA variants, nouns comprised the largest percentage of content units in written picture descriptions, followed by verbs, prepositional phrases, verb phrases, noun phrases, and last, adjectives. Upon visual inspection, composite distributions of written CTP descriptions of participants with lvPPA closely resembled the CU type distribution of controls (Figure 2(a) and (b)). Composite distributions of written CTP descriptions of participants with nfvPPA were similar to the svPPA distribution (Figure 2(c) and (d)). There were higher percentages of nouns in written picture descriptions of participants with nfvPPA (53.81%) and svPPA (55.33%) than healthy controls (41.23%) and participants with lvPPA (44.66%), and lower percentages of verb phrases in written picture descriptions of participants with nfvPPA (3.95%) and svPPA (3.08%) than healthy controls (8.64%) and participants with lvPPA (8.61%). Analysis of these data supported visual inspection in part. The percentages of nouns were significantly higher for nfvPPA and svPPA versus healthy controls, and the percentages of verb phrases were significantly lower for svPPA than healthy controls and lvPPA (Table 5).
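The degrees of freedom reported for the chi-square comparison are consistent with a contingency table of four groups (controls, lvPPA, nfvPPA, svPPA) by six CU types, assuming the test was run on a table of that shape:

```latex
\[
df = (r - 1)(c - 1) = (4 - 1)(6 - 1) = 15
\]
```

Here $r$ is the number of groups and $c$ is the number of CU types.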

Figure 2. Content unit types in Cookie Theft Picture descriptions by healthy controls (a) and by individuals with PPA (b-d).
Analysis 3: core lexicon
Participants with svPPA had a significantly lower percentage of Core Lexicon Words than controls and lvPPA (Table 6; see Supplemental Table 4 for medians and mean ranks for Core Lexicon).
Table 6. Analysis 3 results for core lexicon.
SD: standard deviation; PPA: primary progressive aphasia; lvPPA: logopenic variant primary progressive aphasia; nfvPPA: nonfluent variant primary progressive aphasia; svPPA: semantic variant primary progressive aphasia; *p values were calculated using Kruskal-Wallis Test with Dunn's Post Hoc Comparisons. **p values were adjusted for six comparisons (controls-lv, controls-nfv, controls-sv, lv-nfv, lv-sv, nfv-sv); pbonf = 0.05/6 = 0.008.
Discussion
Analysis of written discourse in the three primary progressive aphasia variants
We found that the part of speech analysis showed a deficit in function words in nfvPPA, with significantly lower percentages of particles in the written output of participants with nfvPPA compared to lvPPA, and a trend for fewer verbs than in controls, and that the content unit analysis distinguished svPPA and lvPPA with significantly lower percentages of total CTP CUs, CTP noun phrase, and CTP verb phrase CUs in svPPA than lvPPA. When considering percentages of parts of speech, the percentages of particles were significantly lower for nfvPPA and svPPA than lvPPA, distinguishing lvPPA from the other two variants. Particles “do not fit conveniently into other classes of words,” 93 and do not change form through inflection. 94 Particles include function words (e.g., adverbial particles such as “back” in the phrase “come back,” the infinitival particle “to” in the phrase “to get the cookies,” discourse particles such as “anyway” in the sentence “Anyway he is falling”, negative particles such as “not” in the sentence “She is not paying attention”), and this finding is, in part, consistent with our hypothesis that participants with nfvPPA would generate fewer function words than lvPPA and controls. There was a strong trend for participants with nfvPPA to have the lowest percentage of verbs compared to controls and lvPPA in written output.
Participants with svPPA had significantly fewer total words, total syllables, and content words than participants with lvPPA. This finding may be expected given that participants with svPPA were more impaired in word semantics as evidenced by lower scores on tests of confrontation naming than participants with lvPPA and nfvPPA. Participants with lvPPA generated more total words, total syllables, content words, and function words than controls, although this content was not always relevant to the picture stimulus with significantly lower communication efficiency for lvPPA than controls [communication efficiency (Syll/CU) 4.77 for lvPPA versus 0.34 for controls, see Table 5].
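To make the direction of the Syll/CU measure concrete, the following minimal sketch assumes that the ratio is simply the total number of syllables written divided by the number of content units produced; the exact computation used in the study is defined in its methods and may differ. The function name and the two sample counts are hypothetical and are not study data. Under this assumption, a higher ratio means more syllables were spent per content unit, that is, lower communication efficiency.

```python
def syllables_per_content_unit(total_syllables, content_units):
    """Syll/CU under the stated assumption: syllables divided by content units.

    A higher value means more syllables were written per content unit,
    i.e., lower communication efficiency.
    """
    if content_units <= 0:
        raise ValueError("content_units must be positive")
    return total_syllables / content_units


# Hypothetical counts for two writing samples (not study data).
verbose_sample = syllables_per_content_unit(total_syllables=96, content_units=24)
concise_sample = syllables_per_content_unit(total_syllables=60, content_units=30)

print(verbose_sample)  # 4.0
print(concise_sample)  # 2.0
```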
The content unit analysis distinguished nfvPPA and svPPA from controls with significantly lower percentages of CTP CUs (single word and phrase CUs) in nfvPPA and svPPA than controls, but significantly higher percentages of noun CUs in these variants than controls. Percentages of noun and verb phrases were significantly lower in svPPA than in lvPPA; the percentage of verb phrases was significantly lower in svPPA than in controls. Overall, written CTP descriptions of participants with svPPA were most dissimilar to controls whereas those of lvPPA were more closely aligned. As expected, communication efficiency (Syll/CU) was poorer in the PPA variants than in controls.
Like the part of speech analysis, the Core Lexicon analysis distinguished svPPA from controls and lvPPA. The significantly lower percentage of Core Lexicon Words in svPPA than controls and lvPPA is consistent with the finding that participants with svPPA generated fewer total content words than controls and fewer total words, total syllables, and content words than participants with lvPPA.
In summary, our results indicate the following key findings for each PPA variant:
The written output in nfvPPA is characterized by significantly fewer particles than lvPPA, and a trend for fewer verbs than controls. The written output in svPPA is more limited, with significantly fewer total words, content words, CTP total CUs, CTP noun CUs, and Core Lexicon Words than lvPPA, and is syntactically more simplistic, with significantly fewer phrases than lvPPA and controls. The written narrative composite profile in lvPPA is similar to that of controls but is also characterized by a significantly higher percentage of verb phrase CUs than in svPPA and controls, as well as irrelevant content manifested by a high syllable-to-content-unit ratio (Syll/CU).
Difficulty naming, or anomia, is present in all three PPA variants. Some have reported differences when considering word-class deficits. Greater verb naming deficits have been found in nfvPPA72–77 and greater noun naming deficits have been reported in svPPA and lvPPA.74–77,78 These conclusions are primarily drawn from performance on confrontation naming tasks rather than connected speech. Performance on written discourse is congruent in part with these core deficits seen in confrontation naming. All the PPA variants were less efficient in their written picture descriptions than controls. We did not compare noun and verb naming; however, there was a trend for fewer verbs in the written narratives of nfvPPA than controls, whereas the percentages of nouns were not significantly different among the PPA variants and healthy controls. Anomia is prominent in svPPA due to impaired semantic knowledge. 4 In the written output of participants with svPPA, impaired semantic knowledge was manifested as fewer total words, fewer content words, fewer CTP total CUs, fewer CTP noun CUs, and fewer Core Lexicon Words compared to lvPPA and controls. The written narratives of participants with svPPA were characterized primarily by single words (nouns) and limited phrases. This was somewhat unexpected as individuals with svPPA are thought to have relatively preserved phonological and syntactic abilities. 95 A possible explanation is that impaired semantic knowledge characteristic of svPPA, rather than a pure syntactic deficit, compromised the ability to combine words into informative, meaningful phrases.
Intersections with other studies of written discourse
We found a trend toward a lower percentage of verbs in nfvPPA compared to controls. This finding is consistent with Graham et al., 40 who found reduced verb production in CTP descriptions in nfvPPA compared to healthy controls, and Code, 45 who found a decline in the number of verbs over time in sentence-writing tasks. Sitek et al. 42 reported frequent verb use in lvPPA compared to AD. We found higher percentages of verbs in lvPPA than in nfvPPA and higher percentages of CTP verb phrase CUs in lvPPA compared to svPPA.
Comparison of spoken and written discourse
Analyses of oral CTP descriptions generally have revealed grammatical impairment in nfvPPA compared to lvPPA and svPPA.16,18 Themistocleous et al. 20 found that patients with nfvPPA produced fewer conjunctions, determiners, particles, and prepositions than patients with svPPA and lvPPA in spoken picture descriptions; however, the statistical models for these categories did not show statistically significant differences. Our finding of fewer particles and verbs in the written output of participants with nfvPPA (significantly fewer particles than lvPPA, and a trend for fewer verbs than controls) aligns with the findings of Themistocleous et al., 20 reflecting grammatical impairment in nfvPPA. Grammatical impairment in nfvPPA is characterized by decreased use of verbs, which are fundamental to sentence structure, 96 and the loss of function words, which affects sentence formation and results in telegraphic output. 13 Limited attention has been given to the use of adjectives in oral and written discourse in PPA. Interestingly, we found the lowest percentage of adjective CUs in the written narratives of participants with nfvPPA (0.49% adjectives, Table 5) and a low percentage of adjectives in the part of speech analysis for nfvPPA (1.27%, Table 4). Our finding that participants with nfvPPA produced few adjectives in written discourse appears consistent with Walenski et al., 30 who found significantly fewer attributive adjectives in oral story narrations of individuals with agrammatic PPA and agrammatic post-stroke aphasia than healthy controls. An analysis of contemporaneous spoken and written narratives is needed to discover similarities and differences in these domains.
Clinical implications
Incorporation of analyses of spoken and written connected speech, as a supplement to standard cognitive/language tests commonly used in clinical settings, is generally advocated to capture the impact of aphasia on functional communication despite concerns regarding the lack of standardization of elicitation techniques, data collection, and analysis.34,97 The time required to transcribe and analyze discourse has been identified as the most significant barrier to the implementation of discourse analysis into the clinical environment. 98 Incorporation of written discourse may overcome this barrier to clinical implementation because clinicians do not need to transcribe the samples and can review writing samples for specific items from the CTP CU list or from compendiums of core lexicon checklists, perform main concept analyses, or calculate derived communication efficiency scores. These non-transcription-based discourse measures can reduce the amount of time required for discourse analysis, making clinical utilization a reality. 99 We expanded the CTP CU analysis to consider single words and phrases separately as a reflection of overall narrative complexity, an approach that is also easily incorporated into the clinical setting. In addition, online platforms, such as Open Brain AI, 90 which we used in this paper, can improve the efficiency of discourse analysis.
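As an illustration of how a checklist review of a writing sample might be automated, the sketch below computes the percentage of checklist items that appear in a typed sample. It is a simplified example under stated assumptions, not the scoring procedure used in this study: the eight-item checklist is a hypothetical stand-in for a published content unit or core lexicon list, the function name is our own, and matching is by exact lowercase word form with no lemmatization or handling of misspellings, both of which would matter in real samples from people with PPA.

```python
import re


def percent_items_present(sample, checklist):
    """Return the checklist items found in the sample and their percentage.

    Matching is exact on lowercase word forms; inflected or misspelled
    forms (e.g., "cookies" for "cookie") are not counted in this sketch.
    """
    if not checklist:
        raise ValueError("checklist must not be empty")
    tokens = set(re.findall(r"[a-z']+", sample.lower()))
    found = [item for item in checklist if item in tokens]
    return found, 100.0 * len(found) / len(checklist)


# Hypothetical mini-checklist (not the published CTP content unit list).
checklist = ["boy", "girl", "mother", "cookie", "jar", "stool", "sink", "water"]
sample = "The boy is on the stool. Water is running over the sink."

found, pct = percent_items_present(sample, checklist)
print(found)  # ['boy', 'stool', 'sink', 'water']
print(pct)    # 50.0
```

A clinician-facing tool would need to accept spelling variants and inflected forms, so manual review of the sample remains necessary alongside any automated count of this kind.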
The distinction between these variant types can be challenging to determine in clinical practice due to shared language characteristics among variants (e.g., anomia), speech and language features that obscure differential diagnosis, and variability in clinical presentation, particularly in nfvPPA and lvPPA. 64 Our findings suggest that analysis of written narratives should focus on particles, verbs, total words, content units, and overall complexity of the narratives to aid in diagnosis. Analysis of written narratives focusing on particles and verbs may aid in distinguishing nfvPPA and lvPPA, with infrequent use of particles and verbs in nfvPPA and frequent use of verbs in lvPPA. Differential diagnosis may also be facilitated by considering the total number of words, with fewer overall words consistent with svPPA and fewer phrases than single words in nfvPPA and svPPA, although for different underlying reasons as indicated above.
Limitations
A limitation of our study may be the elicitation stimulus. The narratives derived from the CTP may be restricted in syntactic variety and spontaneity compared to story narration and structured interviews. However, CTP is the most widely used picture stimulus in studies of discourse in populations with neurogenic communication impairments. 100 Strengths of picture description, in general, include its sensitivity to lexico-semantic abilities in PPA,14,15 and availability of a standardized administration procedure and scoring system with published content units for the CTP.69,70 A list of 64 content units is available for the WAB Picnic Scene, although this resource may be less well-known. 101
In keeping with the original instructions for spoken descriptions of the CTP, there was not a prescribed time limit for the written CTP task because this task is intended to elicit as complete and accurate a response as possible. However, the unlimited duration of the task might introduce bias in the analysis since impairments in verbal and visual memory, processing speed, executive function, and visuospatial function may be mitigated and this might vary across PPA variants since their neurocognitive profiles differ. 102
Although we used percentages (except for total words, total syllables, and total content words) in each of our three analyses (part of speech, content units and communication efficiency, and Core Lexicon), another limitation may be the greater cognitive/language impairment in the svPPA group compared to the lvPPA and nfvPPA groups. To address this concern, we conducted a post hoc one-way analysis of covariance (ANCOVA), with CDR scores as a covariate. This allowed us to compare the PPA variants for those findings that showed a statistically significant difference among the PPA variants while removing the contribution of overall cognitive severity in explaining those differences. Specifically, we performed this analysis for: total words, total syllables, total content words, percentage of particles, percentage of content units (out of 52), percentage of noun phrase content units, percentage of verb phrase content units, and percentage of Core Lexicon words (out of 26). CDR scores were significant only for noun phrases. The results using CDR scores to control for differences in cognitive/language impairment in the PPA groups were identical to our original findings (Table 7).
Post hoc ANCOVA analysis of significant findings.
df: degrees of freedom; lvPPA: logopenic primary progressive aphasia; nfvPPA: nonfluent variant primary progressive aphasia; svPPA: semantic variant primary progressive aphasia; p values were adjusted for three comparisons (lv-nfv, lv-sv, nfv-sv); pbonf = 0.05/3 = 0.016.
The bold type indicates significant values for easier reading in this very detailed table.
Future directions
A longitudinal study of written discourse may have the potential to facilitate early detection of PPA and change over time, particularly lvPPA which is typically associated with AD pathology. Analyses of writing samples have been a fruitful avenue of research in nfvPPA 45 and svPPA.46,47 Longitudinal analysis of writing samples of famous authors diagnosed with AD uncovered subtle changes that reflected the onset of AD103–105 and findings from the Nun Study revealed changes in idea density which were strongly related to dementia and AD in later life. 106 Picture description, despite its limitations, is well-suited to the assessment of change over time because the stimulus is readily accessible, repeatable, and easily analyzed using published criteria and automated approaches. In addition, comparison of spoken and written discourse using the same stimulus has received little attention to date. Spoken and written picture descriptions have only been compared in nfvPPA and healthy controls by Graham et al. 40 and in all PPA variants and healthy controls by Josephy-Hernandez et al. 44
Analysis of written discourse may serve as a potent measure of the effect of an intervention on functional communication. Written communication is an increasingly important activity of daily living because of the expanded application of technology in everyday life. Treatment success is commonly measured at the impairment level on trained and untrained confrontation naming tasks. However, since the ultimate goal of speech-language treatment is to facilitate successful communication in daily life (e.g., function-based: asking for directions, talking on the phone), assessment of the generalization of treatment at the functional level is vital and consistent with the domains of the International Classification of Functioning, Disability and Health: body functions and structure (impairment-based), and activity and participation (function-based). 107 Connected speech samples, such as picture descriptions, are thought to more closely reflect communication as it occurs in everyday life and can provide compelling evidence of treatment effect at a life participation level.
Footnotes
Acknowledgements
We wish to acknowledge funding from NIH to support this work. We extend our sincerest gratitude to the participants, their families, and referring physicians for their dedication and interest in our study.
Ethical considerations
The research was planned, conducted, and recorded according to the principles of the Declaration of Helsinki, seventh edition. Ethical approval was obtained from the Johns Hopkins University School of Medicine Institutional Review Board (NA_00071337; NA_00042097; IRB00201027) and the University of South Florida Institutional Review Board (IRB005068). For clinical patients, ethical approval for chart review, data extraction, and data entry into a secured, de-identified spreadsheet from the electronic medical record was obtained from the Johns Hopkins University School of Medicine Institutional Review Board which approved this study as exempt research for which consent is not required (IRB00241582).
Consent to participate
The participants in the clinical trials or their legally authorized representatives provided written informed consent to participate in the studies.
Consent for publication
Not applicable.
Author contributions
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Salary support for authors (DCT, KN, JG, BR, KT) was from awards NIH/NIA R01AG075404, R01AG068881, and R01AG075111 (National Institutes of Health, National Institute on Aging) to KT. For authors AEH and HK, salary support was provided from award NIH/NIDCD R01 DC05375 to AEH. No funding was received for this work for KS and CT.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
The data supporting the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.
Supplemental material
Supplemental material for this article is available online.
References
