Abstract
Background:
Semantic dementia (SD) is a subtype of frontotemporal lobar degeneration characterized by semantic loss, with other cognitive functions initially preserved. SD requires differential diagnosis with Alzheimer’s disease (AD) and behavioral variant frontotemporal dementia (bvFTD). Semantic knowledge can be evaluated through different tests; however, most of them depend on language.
Objective:
We describe the development of a brief drawing task that may be helpful for the differential diagnosis of SD.
Methods:
Seventy-two patients, including 32 AD, 19 bvFTD, and 21 SD were asked to draw 12 items with different age of acquisition and familiarity, belonging to four different semantic categories. We employed the drawings of healthy volunteers to build a scoring scheme.
Results:
Turtle, strawberry, train, and envelope were the items of each category that best discriminated between groups and were selected for the Brief drawing task. The discriminatory power of the Brief drawing task between SD versus AD and bvFTD patients, estimated through the area under the curve was 0.84 (95% CI = 0.72–0.96, p = 0.000007). In a logistic model, the Brief drawing task (p = 0.003) and VOSP “number location” subtest (p = 0.016) were significant predictors of the diagnosis of SD versus AD and bvFTD after adjustment by the main covariates. The Brief drawing task provided clinically useful qualitative information. SD drawings were characterized by loss of the distinctive features, intrusions, tendency to prototype, and answers like “I don’t know what this is”.
Conclusion:
The Brief drawing task appears to reveal deficits in semantic knowledge among patients with SD that may assist in the differential diagnosis with other neurodegenerative diseases.
INTRODUCTION
Semantic dementia (SD) is a subtype of frontotemporal lobar degeneration (FTLD) in which the main symptom is a specific loss of semantic memory, in both the verbal and non-verbal domains, with initial preservation of other cognitive functions [1, 2]. The initial features of SD are anomia and single-word comprehension deficits [1], with poor understanding of low-frequency words and initial preservation of higher-frequency words. However, grammatical knowledge is preserved, there is no apraxia of speech, and verbal repetition is maintained. SD patients show relative preservation of visuospatial capacity, executive functions, and memory, particularly visual memory [3]. Although the semantic deficit is the main clinical characteristic, changes in behavior and personality are also observed.
Alzheimer’s disease (AD) and behavioral variant frontotemporal dementia (bvFTD) are the main differential diagnoses of SD, especially at early disease stages. Clinically, typical AD is characterized by a progressive cognitive decline, especially affecting episodic memory. Some researchers have reported that semantic memory is also impaired early in AD, resulting in verbal fluency and naming difficulties. The semantic loss in AD can often occur several years before diagnosis [4]. On the other hand, bvFTD is characterized by progressive and insidious behavioral alterations associated with cognitive impairments, principally involving executive dysfunction with relative preservation of memory and visuospatial functions [5]. Semantic deficits are not part of the diagnostic criteria; however, in clinical practice they can sometimes be documented objectively.
Diagnosis of SD can be challenging because there is no gold standard test, and methodological differences between centers may pose a barrier to study comparisons [6]. Screening tests such as the Mini-Mental State Examination, in which the words to be named are very common, are not sensitive enough to detect early SD. More comprehensive neuropsychological assessments, including tests of low-frequency words, are usually needed [7, 8]. Traditionally, language-based tests such as confrontation naming (e.g., the Boston Naming Test, BNT), single-word comprehension, word-picture matching, description of objects, and category fluency have been used to assess the loss of semantic knowledge in SD patients [9, 10]. Another way to evaluate semantic memory is the Pyramids and Palm Trees test, specifically created to assess this domain; however, a caveat of this test is that it is closely related to language and culture. Drawing tasks have been used in several studies to investigate the patient’s ability to retrieve semantic information through a non-linguistic modality [11–15].
Drawing is a complex task; when we are drawing, different abilities and cognitive processes intervene in a joint way, among them visuospatial skills, attentional mechanisms, different mental representations of space, conceptual knowledge, motion planning, and control mechanisms, as well as spatial manipulation abilities [16]. Cohn (2012) defends a very close relationship between language and symbolic graphic representations [17]. This author maintains that when people are painting, they store hundreds of thousands of mental models in their long-term memory that combine to create different graphic representations. This process is very similar to language, where different lexical elements are stored and combined to generate new and infinite productions. Free drawing requires access to the semantic store, while the copy responds to functions of attention, working memory, as well as visuoperceptive and visuomanipulative skills, without requiring semantic knowledge of the model.
Drawing is a technique extensively used in cognitive assessments and is present in many screening tests, as well as in different neuropsychological tasks. However, in regular neuropsychological assessments, patients are typically asked to copy a figure, for example, the Rey-Osterrieth Complex Figure Test (ROCF), or to draw what they have previously copied or seen (e.g., delayed recall ROCF test or Benton visual retention test). The task of drawing without visual references as a tool to assess semantic knowledge does not usually form part of the neuropsychological batteries that are routinely administered in dementia units.
Publications that report its use as a tool in clinical practice are mostly single cases where this technique explores category-specific deficits [11, 13–15]. Probably, the most comprehensive study investigating the structure of conceptual knowledge through drawings in a sample of patients with SD was carried out by Bozeat et al. in 2003 [12].
Drawing is an easy tool to administer, simple to understand, and inexpensive. It can be applied to almost any age range and it offers very valuable information on semantic knowledge. When a person is asked to draw a turtle, for example, it is necessary to have semantic knowledge of what it is. That a turtle is an animal that is characterized by having a shell, shell pattern, and legs, head, and tail is part of the knowledge we all share about what a turtle is. If we ask someone to draw a turtle, their drawing should at least be composed of these five elements.
We consider that drawing without visual references could be a good technique to evaluate semantic knowledge in clinical practice, with the advantage that it is not confounded by the language component. As far as we know, no work has analyzed drawing as a tool in the differential diagnosis of SD versus AD and bvFTD. The aim of this study is to design, describe, and propose a drawing tool for the diagnosis of SD.
METHODS
Patients
We recruited SD, AD, and bvFTD patients from the University Hospital Marqués de Valdecilla (UHMV) Memory Unit in Santander, Spain. All patients met the diagnostic criteria for AD, bvFTD, and SD [5, 18–20], respectively. Amyloid positron emission tomography (PET) with Pittsburgh Compound-B (PiB) and 2-[18F] Fluoro-2-Deoxy-D-Glucose (FDG) PET images were acquired in all patients with AD and bvFTD and in 90.47% of patients with SD. We recruited patients older than 50 years with a typical clinical presentation of AD and a positive PiB-PET scan. BvFTD patients with no evidence of brain atrophy and disease progression were excluded to avoid potential non-progressive “phenocopy” cases [21]. SD cases presented with anterior temporal lobe atrophy on structural MRI, a negative PiB-PET, and an FDG-PET compatible with SD. Diagnoses were established by a clinical committee formed by three neurologists (PSJ, ERR, and CL) and one neuropsychologist (AP). All patients were assessed by a multidisciplinary team to exclude other neurological or psychiatric etiologies. Only patients scoring 4 or 5 on the Global Deterioration Scale were included to ensure a comparable degree of disease severity. We selected a group of healthy volunteers between 55 and 94 years of age. The study was approved by the ethical committee of UHMV, and written informed consent was obtained from all the patients.
Neuropsychological assessments
All patients underwent a comprehensive neuropsychological battery by a trained neuropsychologist (AP) that explored the main cognitive functions (memory, language, praxis, visual perception, and frontal functions). All neuropsychological scores were adjusted for age and educational level according to normative neuropsychological data from the NEURONORMA PROJECT [22]. Cognitive cut-off scores were defined by NEURONORMA project, and results were considered abnormal if they were more than 1.5 standard deviations below the mean.
As screening tests, the Mini-Mental State Examination test [23] and the Memory Alteration Test [24] were used. Episodic verbal memory was assessed by the Free and Cued Selective Reminding Test and the non-verbal episodic recall was measured using the recall of the ROCF test [25] after 30 min. Semantic memory was assessed by category fluency and description of objects. Language was assessed by the BNT [26] (confrontation naming, verbal comprehension and verbal repetition of BNT). The constructive praxis was assessed by the ROCF test or CERAD battery. The subtest “number location” from the Visual Object and Space Perception Battery (VOSP) evaluated visuospatial and visuoperceptual functions. Frontal functions were assessed by phonemic fluency (“P-words”), Trail Making Test – parts A and B, Digit Span and Digit Symbol of the WAIS III battery, and Stroop test.
Drawing task design
Items were selected from the list of stimuli included in the Cambridge Semantic Memory Battery, with associated values of concept familiarity and age of acquisition [27]. We chose 12 items belonging to four different semantic categories representing living things and artifacts. For each category, there were 3 items to draw, each one with progressively increasing age of acquisition and conversely, a decreasing familiarity (see Supplementary Material 1). The list of items included: dog, duck, turtle (animals category); banana, pear, strawberry (fruits category); plane, motorcycle, train (vehicles category); and comb, scissors, envelope (objects category).
Patients and healthy volunteers were provided with a pencil and a blank paper and were asked to produce drawings of the 12 selected items with unlimited time. They did not have any visual reference: they were asked to draw what they had in their mind.
We employed the drawings of the healthy volunteers to build a scoring scheme. Two independent raters (AP, MG) examined the drawings and assembled a list of all features present for each item. These features reflected the attributes that most of the normal subjects included in their drawings, regardless of their pictorial ability or other variables that may differ from one individual to another. All features that were present in more than 75% of the control group drawings for each item were included in the final scoring scheme. It should be noted that the maximum possible score for each item varies depending on the number of features produced in the control drawings (for example, the drawing of a dog should consist of 6 features, while the drawing of the strawberry should consist of only 3). Patients’ drawings were evaluated according to the presence of these features, which served as a gold standard (see Supplementary Material 2). Finally, in order to design a clinically oriented task as brief and informative as possible, we selected only the item from each category that showed the highest potential for discriminating SD from AD and bvFTD.
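The construction of the scoring scheme described above can be illustrated with a minimal Python sketch. It is not the authors' software: the function names, the representation of each drawing as a set of feature labels, and the toy control data are assumptions made for illustration only. The turtle features are taken from the Introduction, and the extra "eyes" feature is hypothetical.

```python
from collections import Counter


def build_scoring_scheme(control_drawings, threshold=0.75):
    """Keep the features present in more than `threshold` of control drawings.

    `control_drawings` is a list of sets, one per healthy volunteer, each
    holding the feature labels the raters identified in that drawing.
    """
    if not control_drawings:
        raise ValueError("at least one control drawing is required")
    n = len(control_drawings)
    counts = Counter(f for drawing in control_drawings for f in set(drawing))
    return {f for f, c in counts.items() if c / n > threshold}


def score_drawing(drawing_features, scheme):
    """Score a drawing as the number of scheme features it contains."""
    return len(set(drawing_features) & scheme)


# Hypothetical control drawings of a turtle (toy data, not study data).
controls = [
    {"shell", "shell pattern", "legs", "head", "tail"},
    {"shell", "shell pattern", "legs", "head", "tail", "eyes"},
    {"shell", "shell pattern", "legs", "head", "tail"},
    {"shell", "legs", "head", "tail", "eyes"},
    {"shell", "shell pattern", "legs", "head", "eyes"},
]

turtle_scheme = build_scoring_scheme(controls)
max_score = len(turtle_scheme)

# A hypothetical patient drawing with a generic body and no shell.
patient_score = score_drawing({"head", "legs", "tail", "body"}, turtle_scheme)
print(sorted(turtle_scheme), max_score, patient_score)
```

In this toy example "eyes" appears in only 3 of 5 control drawings (60%) and is therefore excluded, which is why the maximum score differs from item to item.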
Statistical analyses
The Statistical Package for the Social Sciences (SPSS 10.0.1) was used for the data analysis, applying parametric and non-parametric tests according to the data distribution. In the univariate analysis, differences among groups on demographic variables and neuropsychological measurements were analyzed with one-way analysis of variance (ANOVA) and Tukey’s post hoc contrasts. ROC curve analysis was performed to evaluate the discriminatory power of the drawing task. Spearman’s coefficient was used to assess correlations between the main neuropsychological tests. Binary logistic regression models were constructed to adjust for the covariates. Those variables that were statistically significant in the univariate analysis were included in the models. The inter-rater agreement was evaluated through Pearson’s correlation and Cohen’s Kappa.
RESULTS
Demographic data and neuropsychological test scores
We recruited 72 patients, including 32 AD (71.9% women; mean age at diagnosis 67.81±6.15; range 53–80 years), 19 bvFTD (21.1% women; mean age at diagnosis 72.68±7.32; range 55–87 years), and 21 SD (42.9% women; mean age at diagnosis 71.24±9.44; range 50–82 years). We selected 74 age-matched healthy volunteers (59.5% women; mean age 70.35±8.22; range 55–94 years).
Patients’ demographic data and their main neuropsychological test scores are summarized in Table 1. There were no significant differences in age, disease duration, category fluency, or ROCF score. However, there were statistically significant differences across the three diagnostic groups for the BNT and the VOSP “number location” subtest. SD patients obtained lower scores in the BNT, with statistically significant differences compared with AD patients (p = 0.000001) and with bvFTD patients (p = 0.001). The VOSP “number location” subtest (visuospatial and visuoperceptual function) was impaired in the AD group compared with the SD group (p = 0.003) but not compared with the bvFTD group.
Table 1. Demographic data and neuropsychological test scores
AD, Alzheimer’s disease; bvFTD, behavioral variant frontotemporal dementia; SD, semantic dementia; BNT, Boston Naming Test; ROCF, Rey-Osterrieth Complex Figure Copy. In bold, p < 0.05.
Qualitative analysis of drawing in dementia patients
SD
“Animals” was the category that SD patients most frequently refused to draw: 23.8% of the patients declined to draw any of the three animals, most often arguing that they had no idea what the word meant. “Turtle” was the item most frequently left blank, with many patients saying “I don’t know what this is”. Regardless of the animal, drawings tended to consist of a square or circular body, a head, and legs, generally four. The animals all looked similar and did not exhibit their distinctive features (e.g., ducks had no feathers or webbed feet, and turtles had no shell), resulting in drawings that were more “prototypical” than the target items. In five cases, intrusions were observed, that is, added features not specific to the item, for example, 4 legs on the duck or a turtle with a dog’s body (see Fig. 1). In one case, a patient produced a semantic error, drawing a crocodile instead of a turtle. In the “fruits” category, the drawings tended to have rounded or elongated shapes without the specific characteristics of each fruit, and in many cases the three drawings were very similar. No intrusions were observed in this category. In the “vehicles” category, some patients represented the items as elongated boxes with wheels, while others reused the body shape of their animal drawings and added wheels. The loss of distinctive features was also evident: it was difficult to find planes with wings or trains with wagons. In addition, certain intrusions, such as 4 wheels on motorcycles or legs on planes and trains, were present (see Fig. 1). In the last domain, “objects”, the representations were simpler and the drawings were more clearly distinguishable from one another. The drawings tended to be elongated forms (especially comb and envelope), while the scissors were in many cases drawn as two crossing lines. No intrusive elements were observed in this category.

Fig. 1. SD drawings. Note the loss of distinctive properties (ducks without beaks or feathers, turtles without shells), the tendency to prototype (all animals and all vehicles were drawn with the same basic shape), and the inclusion of intrusions (4-legged ducks, windows on a motorbike).
AD
Only one patient refused to draw some of the items (animals and vehicles), citing her poor pictorial abilities. The great majority of the drawings, around 90%, were schematic representations that nevertheless included most of the essential features needed to identify the items (turtles with shells, ducks with beaks and two legs, planes with tails and wings, trains with wagons and a locomotive, or spots on the banana). The rounded and square non-specific shapes seen in SD patients were not observed, nor were any intrusive elements found in their drawings. In the remaining 10%, the drawings were not recognizable in the “animal” category, but correct and recognizable features were seen in the rest of the categories (fruits, vehicles, and objects). In general, in the domain of non-living things (vehicles and objects), the drawings included most of the necessary features and were more recognizable.
bvFTD
None of the bvFTD patients refused to draw any item, and all of them completed the task. Qualitative analysis showed that, for half of the patients, the drawings were very well executed, included most of the distinctive features in all domains, and were comparable to the control group drawings. Of the other half, 5 patients (27.7%) produced recognizable drawings that lacked certain distinctive details; the “animal” category showed the worst execution and the “object” category the best. In addition, intrusions were observed, such as a dog shaped like a turtle, a duck with 4 legs, and a turtle with many legs. In the remaining 22.2%, the drawings tended to be very primitive and the items were not recognizable, except in the artifact domain (vehicles and objects), in which most of the required features were present.
Figure 2 shows a representative example of the comparison of the drawings among the three groups of patients.

Fig. 2. Comparison of drawings produced by SD, AD, and bvFTD patients in the Brief drawing task.
Item selection for a Brief drawing task
Table 2 shows the comparisons between SD versus AD and bvFTD for each of the items included in the test. The Mann-Whitney test yielded significant differences for turtle (p = 0.010), banana (p = 0.024), strawberry (p = 0.001), train (p = 0.000107), and envelope (p = 0.001). In the pairwise analysis of SD versus AD, all of these items were significantly different except for the banana; the item pear (p = 0.037) was also significantly different between AD and SD. In the SD versus bvFTD comparison, all of these items remained significantly different except envelope. Therefore, turtle, strawberry, train, and envelope were the items of each category that best discriminated between groups. Those four items were also the ones with the lowest familiarity and the latest age of acquisition in each category.
Table 2. Comparison of all items across patients
AD, Alzheimer’s disease; bvFTD, behavioral variant frontotemporal dementia; SD, semantic dementia; AD+bvFTD, pooled AD and bvFTD patients. In bold, p < 0.05.
Preliminary data on Brief drawing task performance for the differential diagnosis of SD
Based on our previous analysis, turtle, strawberry, train, and envelope were included in our Brief drawing task. Next, we assessed the reliability of the task and its scoring scheme between two raters (AP and MG). Inter-rater agreement was high, both by Pearson’s correlation (r = 0.98; p < 0.001) and by Cohen’s Kappa (0.72).
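For readers unfamiliar with Cohen’s Kappa, the following Python sketch shows how the statistic corrects raw agreement for agreement expected by chance. The source does not state the unit on which Kappa was computed; the sketch assumes binary present/absent judgments per feature, and the two rating vectors are invented toy data, not study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical judgments (1 = feature present, 0 = absent) for ten features.
rater_1 = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
rater_2 = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]

kappa = cohens_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

In this toy case the raters agree on 8 of 10 judgments, but because about half of that agreement would be expected by chance, Kappa is noticeably lower than the raw agreement.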
ROC analyses were performed to evaluate the diagnostic usefulness of the Brief drawing task. Its discriminatory power between SD and the other two groups of patients, estimated through the area under the curve (AUC), was 0.84 (95% CI = 0.72–0.96, p = 0.000007). Additionally, we calculated the AUC for those cognitive tests that were statistically significant in the univariate analysis. The BNT demonstrated a similar AUC to the Brief drawing task (AUC = 0.84, 95% CI = 0.74–0.95, p = 0.000005). The VOSP “number location” subtest showed an AUC of 0.75 (95% CI = 0.61–0.90; p = 0.004). The Brief drawing task had 90.48% sensitivity (69.62%–98.83%) and 66.00% specificity (51.23%–78.79%) to differentiate SD versus AD and bvFTD with a cut-off score of 9 out of 13. The positive and negative predictive values were 52.78% and 94.29%, respectively. The Brief drawing task had 90.48% sensitivity (69.62%–98.83%) and 100.00% specificity (95.14%–100.00%) to differentiate SD versus healthy volunteers. The positive and negative predictive values were 100% and 97.37%, respectively.
Scores of the BNT were positively correlated with those of the Brief drawing task (r = 0.44, p = 0.0001). Category fluency (animals) also showed a positive correlation with the Brief drawing task (r = 0.30; p = 0.010). However, the VOSP “number location” subtest and ROCF test were not significantly correlated with the Brief drawing task (r = –0.06; p = 0.63 and r = –0.06; p = 0.65, respectively).
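The relationship between the reported sensitivity, specificity, and predictive values can be checked with a short Python sketch. The 2×2 counts below are assumptions: the source does not report them, and they were back-calculated here as the smallest counts that reproduce the published percentages (they imply 50 rather than 51 comparison patients, so they should be read as illustrative only). The function names and the rule that a total score of 9 or below counts as a positive screen follow the cut-off described in the text.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, PPV and NPV as percentages."""
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
    }


def screens_positive(total_score, cutoff=9):
    """A total score at or below the cut-off (out of 13) suggests semantic loss."""
    return total_score <= cutoff


# Assumed counts, back-calculated from the reported percentages (not source data).
sd_vs_dementia = diagnostic_metrics(tp=19, fn=2, tn=33, fp=17)  # SD vs AD + bvFTD
sd_vs_controls = diagnostic_metrics(tp=19, fn=2, tn=74, fp=0)   # SD vs healthy volunteers

for label, metrics in (("SD vs AD+bvFTD", sd_vs_dementia),
                       ("SD vs controls", sd_vs_controls)):
    summary = ", ".join(f"{name} {value:.2f}%" for name, value in metrics.items())
    print(f"{label}: {summary}")
```

The sketch makes the screening profile explicit: few SD patients are missed (high sensitivity and negative predictive value), while roughly one in three AD or bvFTD patients also scores at or below the cut-off.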
Finally, to assess the diagnostic performance of the Brief drawing task, adjusted by the main covariates, we built a logistic regression model including all relevant demographics and neuropsychological variables that were statistically significant in the univariate analysis. Table 3 shows that after adjusting for age, gender, months of evolution, and education, the Brief drawing task was a significant predictor of SD diagnosis versus AD and bvFTD (p = 0.003). Moreover, the VOSP “number location” subtest was a significant predictor of SD diagnosis as well (p = 0.016).
Table 3. Results of binary logistic regression analyses for SD diagnosis, including, in addition to the Brief drawing task, statistically significant predictors and demographic covariates
AD, Alzheimer’s disease; bvFTD, behavioral variant frontotemporal dementia; SD, semantic dementia; AD+bvFTD, pooled AD and bvFTD patients. In bold, p < 0.05.
DISCUSSION
Our study shows that the Brief drawing task, a simple test independent of spoken language and composed of four drawings without visual reference, appears to reveal deficits in semantic knowledge among SD patients that may assist in the differential diagnosis with AD and bvFTD.
When we are asked to draw a train, we depend on semantic knowledge to include all the characteristics that define a train. To draw a train well, we also need many other functions, such as visuospatial and visuoperceptive abilities and constructive praxis, to work properly. However, our results show that, despite the selective impairment of these domains, AD patients at early stages can draw semantically correct representations, which are significantly better than those of SD patients, and this difference can be used as a clinical test for differential diagnosis between these diseases. Therefore, regardless of pictorial ability, better or worse visuospatial and visuoperceptive function, or constructive praxis, in most of our AD and bvFTD patients the drawing of a train would include wheels, a locomotive, wagons, and an elongated shape. These would be the common features we would all draw for a train. That would not be the case in SD patients, whose drawings typically lack some of these basic characteristics.
In our study, the total number of features that patients included in their 4 drawings was not correlated with their visuospatial and visuoperceptive capacity, measured through the VOSP “number location” subtest, or with the task that evaluated constructive praxis (ROCF test). However, as expected, the total number of features included did correlate positively with tasks that measure semantic knowledge through language (BNT and category fluency), so patients with lower BNT scores and lower category fluency were those who incorporated fewer details in their drawings. Based on these results, we consider that the Brief drawing task is targeting semantic knowledge and not other skills related to drawing. Our results are in line with those of Bozeat et al. (2003), who found significant correlations between impaired performance in the drawing assessment and in object naming and word-to-picture matching, pointing out that all those deficiencies were due to selective damage to central conceptual knowledge [12]. In our multivariate analysis, the Brief drawing task and the VOSP “number location” subtest predicted the SD diagnosis. Patients with SD scored better in the VOSP “number location” subtest than bvFTD and AD patients, with statistically significant differences with the latter. The logistic regression model showed that the Brief drawing task predicted SD diagnosis independently of VOSP and all other covariates, which also supports the hypothesis that our test is independent of and complementary to visuospatial function assessment.
We propose that the Brief drawing task could be a useful clinical tool for the differential diagnosis of SD versus other dementias, mainly AD and bvFTD. A low score in the Brief drawing task, reflecting the lack of essential features drawn by patients with SD, correlated with loss of semantic knowledge measured through classical language-based tests. The drawings of patients with AD and bvFTD were composed of a significantly greater number of features, had more details, and were sometimes very similar to those of healthy volunteers. The AD and bvFTD groups did not differ statistically in the number of features included in their drawings. Bozeat et al. (2003) already observed this loss of semantic knowledge through the analysis of drawings in 6 patients with SD; however, in that work patients were compared only with a healthy control group [12]. The drawings of the SD patients in that study were also characterized by a loss of distinctive features, intrusions, and a tendency to prototype. In that study, patients were asked to draw an object with the model present; afterward, they were asked to draw the same object immediately after the model was removed. Finally, after an interval of time, they were asked to draw the same object from memory. In contrast, in our study, we did not provide patients with a visual reference.
We performed ROC curves to evaluate the discriminatory power of the test in what is probably the most demanding clinical scenario, in which clinicians need to distinguish between SD, AD, and bvFTD at their early stages. In our sample, the AUC of the Brief drawing task score (0.84) was comparable with that of the BNT, which is an established and fundamental tool in SD evaluation. Based on ROC curves, we defined a cut-off point of 9. This value yielded high sensitivity and a high negative predictive value; thus, the Brief drawing task would be best placed as a screening test, especially useful in patients whose severe language deficits make the BNT difficult to perform. In patients with scores equal to or lower than 9, it is worth exploring semantic memory with more specific tests such as the Pyramids and Palm Trees test.
In our experience, the Brief drawing task is an attractive task for most of our patients; it is quick, does not take more than 3 minutes on average, is easy to administer and to score, and does not need specific materials other than paper and pencil. Importantly, the information it provides about semantic knowledge does not depend on spoken language, which is often compromised in dementia patients; thus, it could play a complementary role to that of the BNT.
Another advantage of the Brief drawing task is that, in addition to an objective score, it also provides clinically rich qualitative information about semantic loss. A salient characteristic of SD drawings was the presence of intrusions, that is, the inclusion of elements that do not belong to the item, such as ducks with 4 legs, turtles with a dog’s shape, or motorcycles with several wheels (see Fig. 1 for examples). Intrusions were also found in some of the bvFTD patients, likely related to their perseverative component, but notably they were not present in any of the AD patients. Another characteristic feature of SD patients’ drawings was the tendency to prototype: turtles were drawn the same as dogs, with head, body, 4 legs, and tail, and drawings of the three vehicles were almost identical, all of them similar to a car. Finally, some of the answers given by SD patients (“I don’t know what this is”, “I’ve never heard that”) were very suggestive of a deep deterioration of the underlying concepts.
In summary, the main aspects that characterized SD patients’ performance during the Brief drawing task were: 1) answers such as “I don’t know what this is”, “it sounds familiar, but I don’t know”, or “I’ve never heard it”; 2) loss of distinctive properties; 3) inclusion of intrusions; and 4) tendency to prototype.
The deterioration of conceptual knowledge observed when the Brief drawing task was administered to our patients followed patterns similar to those described by other authors: 1) Patients obtained better results with more frequent or familiar items than with less frequent or familiar items [15, 29]. In our sample, the most informative item for each category was the one with the lowest degree of familiarity; 2) Patients often retained a modest knowledge of more prototypical items and their properties; however, their knowledge of less prototypical items was more deteriorated (e.g., SD patients drew turtles with dog bodies or trains with car shapes) [30, 31].
Variables such as premorbid pictorial skills, motivation for drawing tasks, and even personality traits, for instance, obsessive and perfectionist personalities versus passive personalities who draw quickly and without details, might influence results and be considered limitations.
Our study has shown the potential utility of a very simple and quick drawing task as a screening test for semantic knowledge. In order to validate it as a clinical tool, further studies are ongoing to replicate our findings and cut off points in independent samples of patients and controls to establish inter-center variability and test-retest reliability.
DISCLOSURE STATEMENT
Authors’ disclosures available online (https://www.j-alz.com/manuscript-disclosures/19-0660r1).
