Abstract
Background:
Differentiating early behavioral variant frontotemporal dementia (bvFTD) and primary psychiatric disorders (PPD) is complex and biomarkers have limited accuracy, leading to inaccurate diagnoses.
Objectives:
Develop a simple bedside clinical tool to differentiate bvFTD from PPD.
Methods:
A checklist of clinical features differentiating bvFTD from PPD was developed based on literature and clinical experience. The checklist was completed prospectively for 29 consecutive patients (Montreal Neurological Hospital) with late-onset (≥ age 40) behavioral changes suggestive of bvFTD. The checklist was subsequently completed retrospectively on the baseline visit (N = 137) of the Late-Onset Frontal Lobe study (Amsterdam). In both cohorts, patients were followed for 2 years to establish a final best clinical diagnosis, categorizing patients into Probable FTD (N = 46), Possible FTD (N = 8), Other Cognitive Disorder (N = 36), Other Neurological Disorder (N = 10), or PPD (N = 66).
Results:
All items distinguished the two groups except “duration more than 5 years”, which was removed to create a final 17-item version. Mean checklist scores were significantly different across all groups (Oneway ANOVA F(4,161) = 27.462, p < 0.001). The PPD group had lower scores than all other dementia categories, with the largest difference between Probable FTD (
Conclusions:
Although further prospective validation is required, the “FTD vs PPD Checklist” could provide a simple tool to improve diagnostic accuracy, particularly in non-specialized settings.
INTRODUCTION
Frontotemporal dementia (FTD) is one of the most common forms of early-onset dementia [1, 2]. The behavioral variant (bvFTD) presents with personality and behavioral changes such as loss of empathy, apathy, and disinhibition. The clinical diagnosis of bvFTD is challenging because there is a striking overlap with symptoms of primary psychiatric disorders (PPD) including schizophrenia, major depressive disorder, bipolar disorder, obsessive-compulsive disorder, autism spectrum disorders, and some personality disorders [3, 4]. In addition, cortical atrophy, a key marker of bvFTD, is often minimal in the initial phase of the disease [5]. Consequently, a large percentage of patients are initially diagnosed with a PPD prior to the identification of FTD [6]. In some cases this constitutes an erroneous diagnosis, while in others the psychiatric disorder constitutes a prodromal phase to bvFTD [2, 3]. It has also been shown that patients with PPD can be wrongly diagnosed with bvFTD, particularly in community settings [7].
Expert clinicians across the world have developed various approaches to identify patients with FTD among patients presenting with late-onset (≥40 years old) behavioral changes, but there is no consensus approach, and evidence suggests a low rate of diagnostic accuracy. Indeed, the Late-Onset Frontal Lobe (LOF) study [8] demonstrated that in a cohort of mixed neuropsychiatric cases (i.e., representative of clinical practice) the application of current diagnostic criteria for possible bvFTD has poor specificity [9, 10]. In addition, in this clinically representative population, the sensitivity of standard MRI is insufficient (70%), while the specificity of alternatives such as [18F]FDG-PET is low (68%) [5]. Moreover, a significant proportion of patients with PPD had abnormal findings on [18F]FDG-PET, increasing the risk of false bvFTD diagnosis [5, 10]. Further, cognitive deficits on standard cognitive tests were often found to be more severe in PPD, and therefore cannot be used alone to differentiate PPD from bvFTD [11].
A multi-disciplinary approach including a specialized neurology and neuropsychiatric assessment is recommended to improve the diagnostic process [7, 12], but the field’s capacity to predict long-term progression remains limited [10]. A few approaches have shown potential to improve the identification of bvFTD in patients with late-onset behavioral changes, including the systematic application of clinical scales such as the Frontal Behavioral Inventory (FBI) and Stereotypy Rating Inventory (SRI) [13], in addition to social cognitive batteries such as the Mini-SEA [14]. Biomarkers hold considerable promise to improve diagnostic accuracy in the early stage, including cerebrospinal fluid (CSF) markers such as neurofilament light chains and phosphorylated tau [15], in addition to MRI morphometric processing with machine-learning classifications [16, 17]. However, the sample sizes of these studies tend to be small and restricted to a single site. Similarly, promising Tau-PET tracers have so far shown limited binding in non-AD tauopathies [18].
Given these limitations of biomarkers, careful clinical characterization is still a major factor in reaching an accurate diagnosis [10, 19]. There is a need to improve and formalize this clinical approach to the differential diagnosis between bvFTD and PPD, particularly in contexts where access to specialized multi-disciplinary teams is limited. We hypothesized that a bedside questionnaire focusing on key clinical differentiating factors could constitute a simple clinical tool to improve diagnostic accuracy, particularly in non-specialized settings. In this study, we present the scale development, reduction process, and pilot validation in two cohorts following the STARD reporting guidelines.
MATERIALS AND METHODS
Checklist development
We aimed to develop a checklist of clinical features differentiating bvFTD from PPD in patients presenting with late-onset (≥ age 40) behavioral changes suggestive of bvFTD (i.e., apathy, disinhibition, loss of empathy, stereotypies, or personality changes with executive dysfunction). The literature was reviewed to identify clinical factors more strongly associated with one or the other diagnostic category [3, 19]. The 18-item initial version of the “FTD vs PPD Checklist” was built based on this literature and collaborative expert clinical experience (SD, BCD, HC). The intent was to include only simple yes/no questions that could be rapidly assessed in clinical practice. We decided not to include an item on family history of psychiatric disorders, as this was deemed lower yield than a personal psychiatric history and carries a documented risk of introducing an unnecessary bias toward psychiatric disorders [6]. The scale was designed to capture the most common psychiatric differential diagnoses (i.e., mood disorders, anxiety disorders, personality disorders) in addition to other situational/relational contributors (e.g., relational problems, malingering) [10, 19]. We initially included a question about prolonged duration of symptoms to capture (among other diagnoses) developmental autism spectrum disorder, although this has been shown to be relatively rare even among phenocopy cases [20]. Apathy, loss of empathy, and disinhibition were not included, as the goal of the checklist was to provide a more in-depth assessment of features associated with PPD in patients presenting clinically with those symptoms.
The preliminary 18-item checklist version is presented in Table 1. The thirteen items in part A include clinical features that are more suggestive of PPD. The five items in part B include features strongly associated with bvFTD. The “FTD vs PPD Checklist” is completed by the evaluating clinician, ideally at the time of the assessment (although it can be done retrospectively). The score (maximum 18) is obtained by adding the answers that are more suggestive of FTD, i.e., ‘no’ responses in part A and ‘yes’ responses in part B.
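The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `checklist_score`, the boolean encoding (True for a ‘yes’ answer), and the example patient are assumptions made here, not part of the published checklist.

```python
from typing import Sequence


def checklist_score(part_a: Sequence[bool], part_b: Sequence[bool]) -> int:
    """Count the answers that point toward FTD.

    Each entry is True when the clinician answered 'yes' to the item.
    'No' answers in part A and 'yes' answers in part B each add one point.
    """
    no_in_a = sum(1 for answer in part_a if not answer)
    yes_in_b = sum(1 for answer in part_b if answer)
    return no_in_a + yes_in_b


# Hypothetical patient on the preliminary 18-item version (13 + 5 items).
example_part_a = [False] * 10 + [True] * 3          # ten 'no' answers in part A
example_part_b = [True, True, False, False, True]   # three 'yes' answers in part B
example_score = checklist_score(example_part_a, example_part_b)
print(example_score)
```

With this encoding, a patient answering ‘no’ to every part A item and ‘yes’ to every part B item reaches the maximum score of 18.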
Questions from the Initial 18-item Preliminary Version of the “FTD vs PPD Checklist”
Data collection
The checklist was piloted in two cohorts of patients presenting with behavioral changes at ≥40 years of age. It was initially obtained prospectively in a Neuropsychiatry clinic at the Montreal Neurological Hospital (MNH) to explore concept validity. We subsequently tested the scale retrospectively in a well characterized longitudinal cohort (LOF study) to increase the sample size and further develop validation of the scale.
Montreal Neurological Hospital cohort
In a first step, the “FTD vs PPD Checklist” was completed prospectively in consecutive adult patients with late-onset behavioral changes (defined as apathy, disinhibition, stereotypies, and/or personality changes with executive dysfunction in patients ≥40 years old, without an upper age limit) referred to a specialized neuropsychiatry clinic at the MNH to assess for FTD. Patients with a diagnosis of FTD (any subtype) beyond the mild stage were excluded. This study was performed after obtaining approval from the McGill University Health Centre Research Ethics Board, as part of a broader database. Patients were evaluated by a UCNS board-certified neuropsychiatrist with expertise in FTD (SD). The checklist was completed prospectively at the baseline evaluation for a total of 29 patients (age range 41–84) between March 2015 and September 2016. Patients were followed longitudinally for 18–24 months until the establishment of a final diagnosis. Diagnoses were based on the DSM-5 for PPDs and on the International bvFTD Criteria Consortium criteria for bvFTD [21]. At the last follow-up, patients were classified into the following categories: Probable bvFTD, Possible bvFTD, Other Cognitive Disorder, Other Neurological Disorder, or PPD (including situational problems). The final diagnosis of possible bvFTD was maintained if clinical criteria were met without supportive imaging or genetics, and in the absence of a PPD or other cognitive disorder better explaining the symptoms.
Late-onset frontal lobe study cohort
In a second step, the scale was retrospectively completed on the baseline visit of 137 patients from the Late-Onset Frontal Lobe (LOF) study [8]. In the LOF study, patients were recruited through the Alzheimer’s center of the Amsterdam University Medical Centers and the department of Old Age Psychiatry of GGZinGeest, in Amsterdam, between April 2011 and June 2013. Inclusion in this cohort was limited to patients between the ages of 45 and 75 presenting with late-onset apathy, disinhibition, and/or compulsive/stereotypical behavior. A total score of ≥11 on the Frontal Behavioral Inventory (FBI) or a score of ≥10 on the Stereotypy Rating Inventory was required. All subjects underwent neuropsychological testing and an MRI scan, were diagnosed based on current diagnostic criteria, and were evaluated 2 years later for establishment of a final diagnosis by a multi-disciplinary committee.
A student and a psychiatry resident involved with the LOF study (LPD, FG) were trained to rate the checklist based on charts and study documents. Information gathered at the first evaluation was used to retrospectively complete the checklist for each patient. Questions that had not been asked directly in the baseline assessment were completed as follows. For question 3, “emotional distress” was defined as any score ≥50 (out of 100) on the patient-rated level of suffering caused by behavioral change, as evaluated by the Visual Analogue Scale (VAS). For question 4, “expression of guilt, self-blame, or suicidal thoughts” was marked “yes” if the patient received a score ≥3 (out of 6) on the Montgomery-Asberg Depression Rating Scale question 9 (pessimistic thoughts) or 10 (suicidal thoughts). For question 6, the threshold was a score ≥25 on the patient-rated level of change on the VAS. Question 12, “Is there a legal or compensation issue associated with the case?”, was marked “yes” in cases where the patient had lost their job due to their symptoms, since a diagnosis of FTD would then justify compensation to the benefit of the patient. For question 15, language complaints were defined as a score ≥3 on the FBI question 10 (aphasia and verbal apraxia). Finally, for question 18, “abnormalities on elemental neurological examinations” was marked “yes” if rigidity, tremors, or abnormal eye movements were present on the documented neurological examination performed for all patients. Based on the final diagnosis at the 2-year follow-up, all patients were assigned to the same categories as in the MNH cohort.
Statistical analyses
All statistical analyses were performed using IBM SPSS Statistics for Windows, version 24. Descriptive statistics were calculated for demographic variables and group distribution.
Checklist reduction
Before proceeding to the analyses of scale performance, several steps were taken to determine which checklist items should be kept for scoring. First, the percentage of positive responses (patients whose answer indicated FTD) for each question was compared between the probable bvFTD and PPD diagnostic groups to identify any questions that did not differentiate diagnoses in the expected direction. Any item that did not show a percentage of responses in the direction of FTD (i.e., a higher percentage of ‘no’ in part A and of ‘yes’ in part B in patients with a final diagnosis of probable bvFTD) was eliminated from the final analyses of scale performance. This led to the elimination of 1 item, leaving a final scale of 17 items (see Results: Scale Reduction Section). We also performed chi-square tests to identify single items that showed statistically significant differences between probable bvFTD and PPD, although we did not require a significant difference to keep items in the scale because our a priori hypothesis was that the total score, rather than each individual question, would differentiate the two groups. Next, phi coefficients were calculated for each pair of questions to look for near-perfect correlations (i.e., higher than 0.9) between items, which would make them redundant in the checklist. Finally, we tested the internal consistency of the checklist with Cronbach’s alpha.
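For readers less familiar with the two item-level statistics named above, the following Python sketch computes a phi coefficient for a pair of binary items and Cronbach's alpha for a set of items. The analyses in this study were run in SPSS; the function names and the small toy data set below are hypothetical and serve only to show the formulas.

```python
from math import sqrt
from statistics import variance


def phi_coefficient(x, y):
    """Phi coefficient between two binary (0/1) item vectors of equal length."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # Undefined when one item has no variation at all.
    return (n11 * n00 - n10 * n01) / denom if denom else float("nan")


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of patient scores per item."""
    k = len(items)
    totals = [sum(patient_answers) for patient_answers in zip(*items)]
    sum_item_variances = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


# Toy data (hypothetical): 3 items scored 0/1 for 6 patients.
toy_items = [
    [1, 0, 1, 1, 0, 1],
    [1, 0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0, 1],
]
toy_phi = phi_coefficient(toy_items[0], toy_items[1])
toy_alpha = cronbach_alpha(toy_items)
print(round(toy_phi, 3), round(toy_alpha, 3))
```

A pair of items with phi above 0.9 would be flagged as redundant under the rule described above.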
Checklist performance
After this process of scale reduction, “FTD vs PPD Checklist” scores were obtained for all subjects. A one-way analysis of variance (ANOVA) was performed to compare mean total scores across diagnostic groups for both clinical cohorts combined. Statistical significance was set at p ≤ 0.05, uncorrected, given the relatively small number of tests performed.
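As an illustration of the comparison described above, a one-way ANOVA F statistic can be computed from group scores as in the sketch below. The function name and the toy scores are hypothetical, and this is not the SPSS procedure used in the study. With five diagnostic groups and 166 patients, the degrees of freedom are 5 − 1 = 4 and 166 − 5 = 161.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of per-group score lists."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy scores (hypothetical) for three groups of three patients.
toy_groups = [[12, 13, 11], [8, 9, 10], [6, 7, 5]]
toy_f, toy_df_between, toy_df_within = one_way_anova(toy_groups)
print(toy_f, toy_df_between, toy_df_within)
```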
Optimal cutoff scores for bvFTD and PPD identification were determined using receiver operating characteristic (ROC) curve analysis comparing probable FTD to PPD. Given that the objective of the checklist is to identify a specific diagnostic category rather than to serve as a screening tool for at-risk patients, when determining optimal checklist score thresholds we placed more emphasis on specificity and positive predictive value than on sensitivity. Crosstabs analyses were performed to obtain the positive predictive value for the proposed cutoff scores.
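The quantities weighed when choosing a threshold can be made concrete with a short sketch. Each candidate cutoff corresponds to one point on the ROC curve, and sensitivity, specificity, and positive predictive value (PPV) follow from the resulting 2×2 table. The function name and the ten toy scores below are hypothetical and are not study data.

```python
def metrics_at_cutoff(scores, is_target, cutoff):
    """Treat score >= cutoff as a positive call for the target diagnosis.

    Returns (sensitivity, specificity, ppv).
    """
    pairs = list(zip(scores, is_target))
    tp = sum(1 for s, t in pairs if s >= cutoff and t)
    fp = sum(1 for s, t in pairs if s >= cutoff and not t)
    fn = sum(1 for s, t in pairs if s < cutoff and t)
    tn = sum(1 for s, t in pairs if s < cutoff and not t)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # PPV is undefined when no patient scores at or above the cutoff.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, specificity, ppv


# Toy example (hypothetical): five target and five non-target patients.
toy_scores = [14, 12, 11, 10, 9, 11, 8, 7, 6, 5]
toy_is_target = [True] * 5 + [False] * 5
for candidate in (9, 11, 13):
    print(candidate, metrics_at_cutoff(toy_scores, toy_is_target, candidate))
```

Raising the cutoff in this toy example trades sensitivity for specificity and PPV, which is the trade-off prioritized in the approach described above.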
RESULTS
Demographics
This study included a total of 166 patients, including 120 men and 46 women (Table 2). On average, symptom onset occurred at 58 years of age, with a mean age of 62 at initial diagnosis. The mean duration of symptoms prior to the baseline assessment was 4 years.
Demographics
LOF, Late-Onset Frontal Lobe study cohort; MNH, Montreal Neurological Hospital cohort; SD, standard deviation.
Final diagnoses
The most prevalent final diagnosis was PPD (N = 66, 40%), followed by Probable bvFTD (N = 46, 28%) (Table 3). The most commonly diagnosed PPD was Major Depressive Disorder. Other psychiatric diagnoses most commonly included Anxiety Disorders, Bipolar Disorder, Relationship Problems, Schizophrenia, and Autism Spectrum Disorder. A substantial proportion of patients received a diagnosis of Other Cognitive Disorder (N = 36, 22%). These patients presented most commonly with other dementias, including Alzheimer’s disease, dementia with Lewy bodies, or progressive supranuclear palsy. A number of patients were diagnosed with Other Neurological Disorders, referring to neurological diseases that are not primarily classified as neurocognitive disorders (N = 10, 6%). This included cases of multiple sclerosis (2), post-anoxic encephalopathy (2), paraneoplastic/autoimmune encephalitis (2), epilepsy-related behavior (1), sleep apnea (1), Parkinson’s disease (1), and cerebrovascular accident (1).
Distribution of Final Diagnoses (N=166)
LOF, Late-Onset Frontal Lobe study cohort; MCI, mild cognitive impairment; MNH, Montreal Neurological Hospital cohort; PPD, Primary Psychiatric Disorder.
Scale reduction
As seen in Table 4, all but one item on the checklist had a response distribution in the expected direction to distinguish patients with Probable bvFTD from those with PPD. Item 7, “Is the duration of neuropsychiatric symptoms longer than 5 years?”, did not adequately distinguish the two groups given that, contrary to our hypothesis, the percentage of positive responses was higher in the probable bvFTD group than in the PPD group. We therefore analyzed the performance of the scale without Item 7 in addition to the full scale.
Percent of patients whose response on single items indicated FTD in patients with probable bvFTD compared to primary psychiatric disorders
Chi-square tests performed on each item score showed a significant difference between the FTD and the PPD groups on 12 items (items 1, 2, 3, 4, 6, 8, 9, 10, 11, 13, 16, 17) out of 18 (see Table 4). As expected, many items were correlated with each other, but no pair of the original items showed a near-perfect correlation. The highest correlation was found between questions 9 and 10 (φ = 0.686). The weakest correlations were found between questions 4 and 12 (φ = –0.003), as well as questions 7 and 13 (φ = –0.003). With item 7 removed, the scale had moderate internal consistency (Cronbach’s alpha = 0.570).
FTD vs PPD checklist scores
Table 5 provides the mean global checklist scores for each diagnostic subgroup across the combined sample (MNH and LOF). The mean total score was significantly different across groups for both the full scale (one-way ANOVA F(4,161) = 24.578, p < 0.001) and the reduced scale without item 7 (one-way ANOVA F(4,161) = 27.462, p < 0.001). As predicted, the highest score was found in probable bvFTD, while the lowest score was found in subjects with PPD at the last follow-up. Post-hoc analyses showed that the PPD group had significantly lower scores than all other dementia categories, but the difference with the ‘other neurological disorders’ category was not significant (p = 0.204). The largest difference was found between the mean scores of patients with Probable bvFTD (
Mean total scores on the “FTD vs PPD Checklist” of the MNH and LOF cohorts combined
Mean full checklist scores for the 18-item version and the reduced 17-item checklist without item 7 compared across diagnostic categories at the final follow-up for both clinical samples combined. *Mean scores significantly below the Probable bvFTD group mean score. ∼: Mean scores significantly above the PPD group mean score. bvFTD, behavioral variant frontotemporal dementia; LOF, Late-Onset Frontal Lobe study cohort; MCI, mild cognitive impairment; MNH, Montreal Neurological Hospital cohort; PPD, primary psychiatric disorder.

Distribution of scores across diagnoses. Distribution of total checklist scores in patients separated by diagnostic groups for the two cohorts apart (A: Montreal Neurological Hospital; B: Late-Onset Frontal lobe study) and combined (C). bvFTD, behavioral variant frontotemporal dementia; Dx, diagnosis; PPD, Primary psychiatric disorder; SD, standard deviation.
The ROC curves for both versions of the “FTD vs PPD Checklist” are shown in Fig. 2, with Probable bvFTD as the positive actual state. The area under the curve was 0.895 for the entire checklist and 0.906 for the checklist without item 7, supporting the revised scale with item 7 removed as the most accurate version. There was no single cutoff that showed sufficient accuracy to distinguish probable bvFTD from PPD. Therefore, two cutoff scores were established using the results from the ROC curve aiming to maximize diagnostic specificity and positive predictive value for each category. The coordinates of the ROC curve used to establish these cutoff values are provided in Fig. 2. A score ≥11 was found to be strongly indicative of bvFTD (specificity 93.9%, sensitivity 71.1%, PPV 89.2%). Scores ≤8 were found to be strongly indicative of a PPD (specificity 91.3%, sensitivity 77.3%, PPV 92.7%). Patient scores of 9–10 are considered indeterminate.
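The two-cutoff interpretation of the 17-item score can be written as a small helper, sketched below. The function name and the returned labels are illustrative choices, and the cutoffs are those reported in this pilot study, which still require prospective validation before any clinical use.

```python
def interpret_score(score: int) -> str:
    """Map a 17-item checklist total to the pilot-study interpretation bands."""
    if not 0 <= score <= 17:
        raise ValueError("17-item checklist scores range from 0 to 17")
    if score >= 11:
        return "suggestive of bvFTD"
    if score <= 8:
        return "suggestive of PPD"
    # Scores of 9 or 10 fall between the two cutoffs.
    return "indeterminate"


for example in (6, 9, 10, 11, 15):
    print(example, interpret_score(example))
```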

ROC Curves for Probable bvFTD versus PPD (N = 112). ROC curves for the 18-item (left) and the 17-item (right) “FTD vs PPD Checklist”, with Probable bvFTD as the positive actual state. Coordinates of the ROC curve used to determine cutoff values for each curve are provided below their respective figure. ROC, receiver operating characteristic; bvFTD, behavioral variant frontotemporal dementia; PPD, Primary psychiatric disorders.
DISCUSSION
The “FTD vs PPD Checklist” was developed based on literature and clinical expertise as a rapid and simple bedside tool to facilitate the identification of patients with bvFTD among the wider group of patients presenting with behavioral changes such as apathy, disinhibition, and loss of empathy. Although further prospective validation will be required, results suggest that this 5-minute 17-item checklist is promising as a simple clinical tool to improve diagnostic accuracy, particularly in non-specialized settings. A score of 11 and above was found to have an 89.2% PPV for bvFTD, while a score of 8 and below had a PPV of 92.7% for either PPD or situational problems. Patients with scores of 9 or 10 are considered in the indeterminate zone, as these scores were seen across all diagnostic groups, including other neurocognitive disorders.
One item of the original scale (question 7: duration longer than 5 years) was eliminated in the scale reduction process, reducing the initial 18-item checklist to 17 items. Contrary to our a priori hypothesis, duration of symptoms longer than 5 years was found to be slightly more common in patients with probable bvFTD than in those with PPD in our cohorts. Although this was unexpected, slowly progressive cases of bvFTD have been well documented [22]. This could also be related to prolonged delays prior to the initial consultation in bvFTD, versus a stronger personal desire to consult in patients with PPD. In addition, it is often difficult to establish the exact period of symptom onset in bvFTD, as subtle personality changes can often be traced back several years prior to the moment when concerns for dementia emerged.
Different clinical approaches have been developed in an attempt to differentiate bvFTD from PPD. Scores >12 on the positive section of the FBI and >5 on the SRI were found to significantly increase the probability of bvFTD at longitudinal follow-up (odds ratios of four to five) [13]. However, these scales essentially capture symptoms of bvFTD in a structured fashion, and are therefore limited in identifying subjects with more ambiguous presentations. To our knowledge, the “FTD vs PPD checklist” is the first scale that specifically focuses on factors differentiating bvFTD from PPD categories, rather than quantifying symptoms of one or the other category. Approaches relying on standardized social cognition tests have also shown interesting sensitivity and specificity to differentiate patients with well-characterized bvFTD from those with major depressive disorder [14], and recent work supports the use of the Ekman 60 faces test in real-life clinical settings [23].
The “FTD vs PPD checklist” has several advantages. First, it is brief and can be completed by a clinician within the first visit. Second, items are yes/no categorical responses rather than severity ratings, which should increase inter-rater and intra-rater reliability. Third, it can be applied by clinicians who have less experience with this clinical population, which could reduce the rate of false bvFTD diagnosis in community settings [7]. The specificity of the checklist above and below the cutoffs was over 90%, which could potentially reduce the need for costly investigations such as FDG-PET, which was found to lack specificity in patients with late-onset behavioral changes [5].
This pilot study has several weaknesses that must be acknowledged. First, in the LOF study we had to complete the checklist retrospectively. While this is not ideal, the LOF baseline assessment was very systematic, allowing us to reliably complete the scale following a structured approach. The most problematic item was item 12 (Is there a legal or compensation issue associated with the case?), which was meant to target litigious compensation in a North American context but did not translate as well to the Dutch setting. Second, we did not perform an inter-rater reliability assessment, as patients were either seen by a single clinician (MNH cohort) or the checklist was completed retrospectively (LOF cohort). We also did not assess intra-rater reliability; however, given that the checklist is based on rather simple yes/no answers, we do not anticipate problems at this level. Third, the scale did not clearly differentiate patients with final diagnoses of probable bvFTD from patients with stable possible bvFTD diagnoses at the 2-year follow-up (i.e., phenocopies versus very slow progressors); however, the sample size for this group was small. It is also important to note that the gold standard used to validate the scale was a 2-year multi-disciplinary clinical diagnosis rather than a pathological diagnosis confirmed at autopsy. While this approach is in line with current best practice, it cannot be excluded that a small number of patients with final PPD diagnoses might still be in the prodromal phase of bvFTD. Further, the internal consistency of the scale was moderate (Cronbach’s alpha = 0.570). We believe this is related to the goal of this scale, which is to capture various differences between bvFTD and several PPDs, as opposed to measuring a unitary concept such as the severity of a specific symptom. Finally, the study had a moderate sample size, particularly in the MNH cohort, for which the scale was completed prospectively. This limited the possibility of determining the accuracy of the scale in distinguishing bvFTD from other cognitive disorders. Because of these weaknesses, the checklist, including the cutoff scores, will have to be validated prospectively in larger cohorts with inter-rater reliability assessment before clinical use can be recommended.
Conclusion
The “FTD vs PPD Checklist” is a simple 17-item checklist that shows promising properties to identify patients suffering from bvFTD among real-life clinical cohorts of patients with late-onset behavioral changes. The checklist was developed to be a quick complementary tool to a comprehensive neuropsychiatric assessment, along with other validated severity scales and biomarkers. The final version of the scale is provided in Fig. 3 for reference purposes, as further prospective validation is required before the checklist (including the cutoff scores) can be used clinically. In terms of future directions, it will be interesting to determine whether the checklist can also assist in differentiating patients with PPD from those with other major neurocognitive disorders, and in differentiating probable bvFTD from bvFTD non-progressors (i.e., “phenocopies”) [24].

FTD versus PPD Checklist (17-item Final Version). The total score is obtained by adding all the ‘No’ responses in section A and the ‘Yes’ responses in section B. Scores ≥11 are suggestive of bvFTD. Scores ≤8 are suggestive of a primary psychiatric disorder. Scores of 9–10 are indeterminate.
ACKNOWLEDGMENTS
Dr. Ducharme receives salary and operating funds from the Fonds de Recherche du Québec –Santé. Research at the Amsterdam University Medical Centre Alzheimer Centre is part of the neurodegeneration research program of the Neuroscience Campus Amsterdam. Amsterdam University Medical Centre Alzheimer Centre is supported by Alzheimer Nederland and Stichting VUmc funds.
