Abstract
Based on knowledge from art therapy, we developed a new neuropsychological drawing test to identify individuals with mild cognitive impairment (MCI) as well as dementia patients and healthy controls (HC). By observing a variety of drawing characteristics of 92 participants with a mean age of 67.7 years, art therapy and dementia experts discriminated HC from MCI, early dementia of the Alzheimer-type (eDAT), and moderate dementia of the Alzheimer-type (mDAT) through process analysis of tree drawings made on a digitizing tablet. The art therapists’ average categorical rating of healthy versus MCI or demented individuals matched the clinical diagnosis in 88% of cases. In a first small study, we analyzed interrater reliability, sensitivity, specificity, and negative and positive predictive values of our tree drawing test (TDT) in comparison with the clock drawing test (CDT). Similar values of moderate interrater reliability were found for the TDT (0.56) and the CDT (0.54). A high sensitivity of 0.9 was demonstrated on this binary impairment scale (HC versus impaired or demented). A substantial specificity (0.67) was obtained, which however remains below the perfect value of the CDT (1.0). Among the 31 individuals who received the clinical diagnosis “impaired or demented”, the TDT showed a higher recognition rate for the MCI group than the CDT. Furthermore, in 8 of 12 borderline cases of clinical diagnosis, the outcome of the TDT diagnosis was consistent with the final clinical result.
INTRODUCTION
Currently, 30 million people worldwide suffer from Alzheimer’s disease (AD), and the World Health Organization predicts that this number will triple over the next 20 years [1]. Therefore, the early diagnosis of AD plays an increasingly important role for our society and can reduce the enormous burden on the health care system [2]. Many studies suggest that early treatment can slow down disease progression and might increase the quality of life of patients and of their caregivers, who suffer emotionally [3]. Because nearly 50% of dementia cases are not diagnosed [4], it is important to search for new diagnostic methods for the primary care environment. Individuals with mild cognitive impairment (MCI) have an increased risk of developing dementia and are a target group for preventive measures [5]. Thus, research aims at diagnosing MCI at an early stage with short, simple, and easily administrable screening tests.
This study represents an entirely new approach to the development of a new hardware and software tool that works with the creative abilities of AD patients and of patients with MCI to detect impairment of cognitive functioning. Artistic ability seems to be maintained until the advanced stage of AD, and the positive self-experience during the creative process provides a sense of achievement instead of reducing the patient to cognitive deficits [6, 7]. Many of the established neuropsychological tests (NPTs) cause substantial stress and pressure to perform, which could be one reason for the considerable delay in diagnosis. Stress is a potent modulator of cognitive function in general [8, 9]. During all of the established NPTs, the patient is placed in an examination situation in which he or she is required to answer questions and solve cognitive tasks provided by a physician acting as a kind of assessor. We assume that this aspect may cause stress in some cases and influences whether adequate NPT results are achieved. Brain research studies show that brain functions are negatively affected by stress, which could distort the results of NPTs [10].
The influence of the well-being produced by artistic creativity on neuropsychological tasks has not been previously investigated. We propose that combining the enjoyment of being creative with the diagnostic purpose will help to obtain valid results and encourage patients to undergo a medical check at an early stage of AD symptoms.
The use of handwritten tasks for dementia diagnosis is well established [11]. Our study is based on the method of the clock drawing test (CDT), which is currently used quite frequently by neurologists, psychiatrists, and general practitioners [12, 13]. Previous studies and reports have shown that the CDT screens for early dementia due to AD (eDAT) but does not achieve the quality needed to screen individuals for MCI [14, 15].
Corresponding to art therapeutic methodology we developed a tree drawing test (TDT) and investigated free creative tree drawings of healthy individuals as well as patients with MCI, eDAT, and moderate dementia (mDAT).
In contrast to the single color and single pen thickness of the CDT drawing, the digitizing tablet allowed us to collect a variety of information by offering a 12-color palette and three line thicknesses, and to gather data by observing and recording the whole drawing process. Drawing characteristics in the areas of contour, dynamics of pencil guiding, and coloring were considered for the discrimination of MCI from eDAT and mDAT. We compared our diagnosis with the result of the CDT that was conducted on the same patient. We propose that our method provides the ability to detect all stages of AD as well as patients with MCI and healthy individuals.
METHODS
Participants
A subsample of 92 participants (58 females and 34 males), which represents the first part of an ongoing study, had a mean age of 67.7 years. They were recruited from the Memory Clinic of the Department of Psychiatry and Psychotherapy at the University Hospital of Tübingen (including healthy controls). All of them had normal visual ability and sufficient hearing. No participant was physically impaired or expressed reservations or anxiety about using the digital pad, so all were able to perform the drawings. In the course of the geriatric assessment, all participants completed the German 15-item version of the Geriatric Depression Scale (GDS) [16].
The local ethical committee at the University Hospital of Tübingen approved the study. All participants received patient information and signed a consent form.
Patients with MCI, eDAT, and mDAT
Patients with MCI, eDAT, and mDAT underwent a larger assessment with physical, neurological, neuropsychological, and psychiatric examinations as well as brain imaging.
According to the experience of art therapists specialized in dementia, patients with MCI due to AD [17] and patients with MCI due to depression show nearly the same characteristics in their drawing process. We therefore combined the two MCI types for the rating.
Table 1. List of all investigations and analyses, indicating whether they concern the CDT, the TDT training stage, or the TDT experienced stage, together with the related figures
The diagnostic criteria for the eDAT and mDAT groups were determined according to the NINCDS/ADRDA (National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer’s Disease and Related Disorders Association) [18, 19]. eDAT patients had a score of 3 or 4 and mDAT patients a score of 5 on the Global Deterioration Scale [20].
Healthy control group
HC individuals had no history of neurological or psychiatric disease or any sign of cognitive decline, as confirmed by a clinical interview. All healthy participants were recruited by advertisement in a newspaper.
Procedures
The drawings were performed and assessed with a Microsoft Surface Pro 3 digitizer and a handheld stylus pen. Participants were asked to draw a clock showing a specified time and, after this task, to paint a tree from memory. For the TDT, the participants could use the full 12-color palette and choose between three line thicknesses during the drawing process, whereas for the CDT they were limited to black and one given line thickness.
A specialized psychologist guided the CDT and the TDT and formulated the drawing instructions and the copy conditions in a standard way (CDT: “I want you to draw the face of a clock, putting in all the numbers where they should go, and set the hands at 10 after 11”; TDT: “Please paint a tree of your choice!”).
CDT ratings were made by the psychologist as one minor part of a detailed medical analysis comprising neuropsychological assessment and physical examination. The neuropsychological examination included the German version of the modified Consortium to Establish a Registry for Alzheimer’s Disease neuropsychological test battery (CERAD [21]) including the Mini-Mental State Examination (MMSE [22]).
The tree drawings were rated by three art therapists after viewing and discussing the drawing process. The raters were blinded to the participants’ clinical diagnosis, gender, demographic information, and CDT result.
The art therapy and dementia experts analyzed the tree drawings within six meetings that were grouped into a training stage (meeting 1–3) and an experienced stage (meeting 4–6). One rater (rater 3) started to rate only from the 3rd meeting within the training stage.
The hit rate of the art therapists’ diagnoses compared to the clinical diagnosis was calculated on a dichotomous basis, with the diagnoses MCI, eDAT, and mDAT grouped as “impaired or demented”. We differentiated between “hits”, where the art therapist’s diagnosis and the clinical diagnosis of a subject were both “impaired or demented” or both “healthy”, and “false diagnoses”, where one diagnosis was “healthy” and the other “impaired or demented”. For each meeting, the hit rates (number of hits divided by number of subjects) of the therapists were averaged. The rates of false diagnoses represent the cases in which the diagnoses of all raters differed from the clinical diagnosis.
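The two rates described above can be illustrated with a minimal Python sketch. The function names and the five-subject example are hypothetical and are not taken from the study data; the sketch only assumes that each rater assigns one of the four categories to each subject.

```python
from statistics import mean

HEALTHY = "healthy"
IMPAIRED = "impaired or demented"


def dichotomize(diagnosis):
    """Group MCI, eDAT, and mDAT into a single 'impaired or demented' class."""
    return HEALTHY if diagnosis == "HC" else IMPAIRED


def hit_rate(rater, clinical):
    """Number of hits divided by number of subjects for one rater."""
    hits = sum(dichotomize(r) == dichotomize(c) for r, c in zip(rater, clinical))
    return hits / len(clinical)


def false_diagnosis_rate(raters, clinical):
    """Share of subjects whose dichotomized rating is wrong for every rater."""
    missed = sum(
        all(dichotomize(r[i]) != dichotomize(c) for r in raters)
        for i, c in enumerate(clinical)
    )
    return missed / len(clinical)


# Hypothetical meeting with five subjects and three raters (not study data).
clinical = ["HC", "MCI", "eDAT", "mDAT", "HC"]
raters = [
    ["HC", "eDAT", "eDAT", "mDAT", "MCI"],
    ["HC", "HC", "MCI", "mDAT", "MCI"],
    ["MCI", "MCI", "eDAT", "eDAT", "eDAT"],
]

mean_hit_rate = mean(hit_rate(r, clinical) for r in raters)
false_rate = false_diagnosis_rate(raters, clinical)
print(f"mean hit rate: {mean_hit_rate:.2f}, false diagnoses: {false_rate:.2f}")
```

Note that a rating of eDAT for a clinically diagnosed MCI patient counts as a hit here, because only the dichotomized categories are compared.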
An overview of all investigations and analyses, indicating whether they concern the CDT, the TDT training stage, or the TDT experienced stage, together with the related figures, can be found in Table 1.

Fig. 1. Workflow of the study.

Fig. 2. Distribution of hit rates and false diagnoses of the art therapists over the six meetings.
Statistical analyses
To assess the agreement between the clinical diagnoses and the TDT or CDT results, Cohen’s kappa coefficients with 95% confidence intervals were calculated.
Cohen’s kappa coefficients were also applied in order to measure the inter-rater reliability of the TDT.
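As an illustration of the coefficient, the following Python sketch computes Cohen’s kappa from a 2×2 table of test outcome against clinical diagnosis. The function name is hypothetical, and the example counts are not taken from a published table: they are back-calculated from figures reported in this paper for the CDT in the experienced stage (39 subjects, 31 clinically impaired or demented, 23 of them detected, specificity 1.0) and should be read as an assumption. Confidence intervals are not covered by this sketch.

```python
def cohen_kappa(tp, fn, fp, tn):
    """Cohen's kappa for a 2x2 table of test outcome versus clinical diagnosis.

    tp: impaired by both, fn: clinically impaired but test 'healthy',
    fp: clinically healthy but test 'impaired', tn: healthy by both.
    """
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Chance agreement expected from the marginal totals of both ratings.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n**2
    return (observed - expected) / (1 - expected)


# Assumed reconstruction of the CDT table in the experienced stage.
kappa_cdt = cohen_kappa(tp=23, fn=8, fp=0, tn=8)
print(f"kappa = {kappa_cdt:.2f}")
```

Under these assumed counts the result rounds to the kappa of 0.54 reported for the CDT in the experienced stage.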
After classifying the diagnoses into “healthy” and “impaired or demented”, the performances of the TDT and CDT were determined by calculating the specificities, sensitivities, and the positive (PPV) and negative predictive values (NPV) with 95% confidence intervals.
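The four performance parameters can be sketched in Python as follows. The study used SAS and does not state which interval method was applied, so the Wilson score interval used here is an assumption, as are the function names and the example counts, which are purely illustrative.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed method, z=1.96 for 95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def performance(tp, fn, fp, tn):
    """Return each parameter as (estimate, (lower, upper)).

    Assumes every denominator is non-zero.
    """
    pairs = {
        "sensitivity": (tp, tp + fn),  # impaired correctly identified
        "specificity": (tn, tn + fp),  # healthy correctly identified
        "ppv": (tp, tp + fp),          # test 'impaired' that is truly impaired
        "npv": (tn, tn + fn),          # test 'healthy' that is truly healthy
    }
    return {name: (k / n, wilson_interval(k, n)) for name, (k, n) in pairs.items()}


# Hypothetical counts for illustration only (not study data).
result = performance(tp=18, fn=2, fp=3, tn=7)
for name, (estimate, (low, high)) in result.items():
    print(f"{name}: {estimate:.2f} (95% CI {low:.2f} to {high:.2f})")
```

With small groups, such as a handful of healthy controls, the intervals become wide, which is worth keeping in mind when comparing specificities between tests.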
All statistical analyses were calculated with the SAS statistical software program, version 9.4.
Fig. 1 shows a schematic workflow of the data used in this study.
RESULTS
Clinical and demographic characteristics of the participants
Only four participants could be recruited for the mDAT group, whereas the MCI and eDAT groups were equal in size and a smaller number of HC participated. The MCI and eDAT groups did not differ significantly in age and education and showed a lower mean level of education than the HC group (Table 2).
As expected, patients with MCI due to AD or depression had a higher mean GDS score, which was just above the cut-off score of 6 that indicates mild depression. A steady decrease of mean MMSE scores from HC to mDAT was observed.
Art therapeutic rating in diagnosis compared to the clinical diagnosis
Table 2. Clinical and demographic characteristics of HC individuals, patients with MCI, eDAT, and mDAT
Note: Values are expressed as mean (standard deviation). n, number; y, year; HC, healthy participants; MCI, mild cognitive impairment due to Alzheimer’s disease or depression; eDAT, mild Alzheimer-type dementia; mDAT, moderate Alzheimer-type dementia; M/F, male/female; GDS, Geriatric Depression Scale (higher score indicates more severe depressive symptoms, maximum 15, a cut-off of 6 indicates mild depression); MMSE, Mini-Mental State Examination.
The art therapeutic hit rate, i.e., the mean percentage of subjects correctly diagnosed as “healthy” or “impaired or demented”, increased substantially after the 3rd meeting (the last meeting of the training stage) and remained fairly constant during the experienced stage (Fig. 2). In contrast to a hit rate of 64% in the first rating session, a hit rate of 88% was reached in the last session. In parallel, the percentage of false diagnoses, i.e., the percentage of subjects misdiagnosed by all raters, decreased substantially in the 4th meeting, at the beginning of the experienced stage. In the last session, a value of only 8% was reached (Fig. 2).
Visual comparison of tree and clock drawings
In the tree drawings of MCI, eDAT, and mDAT patients, an increase in infringements of the design laws from MCI through eDAT to mDAT could be observed. In comparison with the clock drawings of the same individuals, it became evident that slight infringements of the design laws in the drawings of MCI patients are recognizable with the TDT. These slight infringements, which were detected by observing drawing characteristics in the areas of contour, dynamics of pencil guiding, and coloring, could not be determined in the CDT, because it does not contain any information about the creative drawing process (Fig. 3). A range of important characteristics of the drawing process shows a possible association with the diagnosis “impaired or demented”; their absolute frequency distribution for the diagnoses healthy (HC), impaired (MCI), and early demented (eDAT) is given in Fig. 4.

Fig. 3. Clock and tree drawings performed by four individuals with the clinical diagnosis healthy (HC), impaired (MCI), early dementia of the Alzheimer-type (eDAT), and moderate dementia of the Alzheimer-type (mDAT) (from left to right).

Fig. 4. A range of important characteristics of the TDT-drawing process and the absolute frequency distribution for the diagnosis healthy (HC), impaired (MCI), or demented (eDAT).
Cohen’s kappa coefficient and test performance parameters
Table 3. Numbers (N) and percentages (%) of clinical, TDT- and CDT-diagnoses of “healthy” and “impaired and demented” separated by training and experienced stage
Note: Rater 3 started only at the 3rd meeting of the training stage and therefore rated 32 out of 53 subjects.
The results of the six meetings were divided into two stages: the training stage, in which drawing characteristics of the rating categories HC, MCI, eDAT, and mDAT have been collected, and the experienced stage, in which all collected characteristics helped to discriminate between the four rating categories.
To compare the performance of the CDT with the TDT, all diagnoses were grouped in “healthy” and “impaired or demented” (Table 3).
To assess the agreement between the clinical diagnosis and the corresponding TDT or CDT outcome, Cohen’s kappa (k) coefficients were calculated. Considering the diagnoses “healthy” and “impaired or demented”, there was low agreement between the TDT and the clinical diagnosis within the training stage (rater 1: k = 0.14, rater 2: k = 0.19; rater 3 could not be considered) and moderate agreement within the experienced stage (rater 1: k = 0.57, rater 2: k = 0.59, rater 3: k = 0.53). Cohen’s kappa coefficients of the CDT (training stage: k = 0.30; experienced stage: k = 0.54) also indicated slight to moderate agreement with the clinical results (Fig. 5).

Fig. 5. Cohen’s kappa coefficient with 95% confidence interval, measuring the level of agreement between the clinical diagnoses and the diagnoses of the clock drawing test (CDT) and the tree drawing test (TDT), separately for training and experienced stage.
The TDT specificity in the experienced stage was 0.75 (rater 1) and 0.63 (raters 2 and 3). These results remained below the specificity of 1.0 for the CDT. The TDT might therefore be characterized by a lower probability of correctly classifying a person as “healthy” than the CDT (Fig. 6).

Fig. 6. Specificity with 95% confidence interval of the clock drawing test (CDT) and the tree drawing test (TDT), separately for training and experienced stage.
In contrast, the sensitivity, i.e., the proportion of clinically impaired or demented persons correctly identified as “impaired or demented”, ranged between 0.87 and 0.94 for the TDT in the experienced stage and was therefore higher than the corresponding CDT sensitivity of 0.74 (Fig. 7).

Fig. 7. Sensitivity with 95% confidence interval of the clock drawing test (CDT) and the tree drawing test (TDT), separately for training and experienced stage.
The PPV revealed high values (≥0.9) for both tests in the experienced stage (Fig. 8), whereas the NPV was higher for the TDT (values between 0.6 and 0.71) than for the CDT (NPV = 0.5) (Fig. 9).
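The reported CDT values for the experienced stage can be checked against each other with a short Python sketch. The 2×2 counts below are not given explicitly in the text; they are an assumed reconstruction from the sample size of the experienced stage (n = 39), the 31 clinically impaired or demented persons, the 23 of them detected by the CDT, and the CDT specificity of 1.0. Variable names are hypothetical.

```python
from fractions import Fraction

# Figures reported in the paper for the experienced stage.
n_total = 39           # subjects in the experienced stage
n_impaired = 31        # clinical diagnosis "impaired or demented"
tp = 23                # impaired persons correctly identified by the CDT
specificity = Fraction(1)  # reported CDT specificity of 1.0

# Assumed reconstruction of the remaining cells of the 2x2 table.
n_healthy = n_total - n_impaired   # clinically healthy subjects
tn = int(specificity * n_healthy)  # healthy and classified as healthy
fp = n_healthy - tn                # healthy but classified as impaired
fn = n_impaired - tp               # impaired but classified as healthy

sensitivity = Fraction(tp, tp + fn)
ppv = Fraction(tp, tp + fp)
npv = Fraction(tn, tn + fn)

print(f"sensitivity = {float(sensitivity):.2f}")
print(f"PPV = {float(ppv):.2f}, NPV = {float(npv):.2f}")
```

Under this reconstruction the CDT misses as many impaired persons as there are healthy controls, which is why its NPV is only 0.5 despite a perfect specificity.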

Fig. 8. Positive predictive value (PPV) with 95% confidence interval of the clock drawing test (CDT) and the tree drawing test (TDT), separately for training and experienced stage.

Fig. 9. Negative predictive value (NPV) with 95% confidence interval of the clock drawing test (CDT) and the tree drawing test (TDT), separately for training and experienced stage.
In the experienced stage, 31 persons had a clinical diagnosis of “impaired or demented” (Table 3). A Venn diagram analysis showed that 26 of them were correctly diagnosed by all three TDT raters and additionally four by at least one rater (Fig. 10). Compared to these results, the CDT correctly identified only 23 persons as “impaired or demented”. One person was neither identified by the CDT nor by the TDT.

Fig. 10. Experienced stage: Venn diagram of the total number of subjects being correctly diagnosed as impaired or demented. The four circles demonstrate the overlaps between the three raters of the tree drawing test (TDT, rater 1 to 3) and the clock drawing test (CDT).
Venn diagram analysis of the experienced stage, conducted separately for the clinical diagnosis MCI (Fig. 11), eDAT (Fig. 12), and mDAT (Fig. 13), revealed a higher ability for the TDT in detecting MCI individuals compared to the CDT. The CDT detected six out of 11 persons with a clinical MCI-diagnosis, whereas all three TDT raters classified nine persons correctly and one additional person was identified by rater 3 (Fig. 11). These results point out a potentially increased capacity to recognize MCI patients with the TDT. One patient with MCI was neither identified by the CDT nor the TDT.
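The overlap counts for the MCI group can be expressed with set operations, as in the following Python sketch. The subject IDs are hypothetical placeholders, and the text does not report which six persons the CDT detected, so their placement among the TDT-detected subjects is an assumption; only the counts follow the description above.

```python
# Hypothetical subject IDs for the 11 persons with a clinical MCI diagnosis.
mci = set(range(1, 12))

# Correct classifications per rater, arranged to match the reported counts.
rater1 = set(range(1, 10))   # 9 persons
rater2 = set(range(1, 10))   # the same 9 persons
rater3 = set(range(1, 11))   # the 9 plus one additional person
cdt = set(range(1, 7))       # 6 persons; which six is an assumption

all_three_raters = rater1 & rater2 & rater3
any_rater = rater1 | rater2 | rater3
only_one_rater = any_rater - all_three_raters
missed_by_both_tests = mci - any_rater - cdt

print(len(all_three_raters), len(only_one_rater), len(cdt))
print(sorted(missed_by_both_tests))
```

The intersection corresponds to the central region shared by the three TDT circles of the Venn diagram, and the final difference to the person detected by neither test.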

Fig. 11. Experienced stage: Venn diagram of the number of individuals with mild cognitive impairment (MCI) being correctly diagnosed as impaired or demented. The four circles demonstrate the overlaps between the three raters of the tree drawing test (TDT, rater 1 to 3) and the clock drawing test (CDT).

Fig. 12. Experienced stage: Venn diagram of the number of individuals with early dementia (eDAT) being correctly diagnosed as impaired or demented. The four circles demonstrate the overlaps between the three raters of the tree drawing test (TDT, rater 1 to 3) and the clock drawing test (CDT).

Fig. 13. Experienced stage: Venn diagram of the number of individuals with moderate dementia (mDAT) being correctly diagnosed as demented. The four circles demonstrate the overlaps between the three raters of the tree drawing test (TDT, rater 1 to 3) and the clock drawing test (CDT).
Among the 17 persons who were clinically diagnosed with early dementia, the TDT also showed a higher detection ability (Fig. 12). Fourteen persons were correctly identified by the CDT and all 17 by the TDT, 12 of them by all three raters and five by two raters.
All persons with the clinical diagnosis mDAT (n = 3) were correctly classified by both tests (Fig. 13).
Diagnosis of borderline cases
In twelve borderline cases, in which further examinations were necessary after the first clinical diagnosis to reach a sufficiently accurate diagnosis, we investigated the conformity of the 2nd clinical diagnosis with the TDT results. The latter were already available after the first clinical diagnosis, which had to be corrected by the 2nd clinical diagnosis. We found total agreement of the 2nd clinical diagnosis with the TDT in four cases, i.e., in 33%. In four further cases, the 2nd clinical diagnosis moved towards the TDT result: for example, if the first clinical diagnosis stated MCI and the TDT, by contrast, dementia (mDAT), the 2nd clinical diagnosis of “eDAT” moved in the direction of the demented stage already determined by the TDT. Taken together, in 8 of the 12 borderline cases (67%), the TDT was more successful than the first clinical diagnosis. In the four other cases, the TDT could not contribute to an accurate diagnosis because the 2nd clinical diagnosis moved away from the TDT result.
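The three outcome categories for borderline cases can be made explicit with a small Python sketch. Treating the diagnoses as an ordinal scale (HC, MCI, eDAT, mDAT) and measuring movement as a change in distance on that scale is an assumption made here for illustration; the function name and category labels are hypothetical.

```python
# Assumed ordinal scale of diagnoses, from healthy to moderate dementia.
STAGES = ["HC", "MCI", "eDAT", "mDAT"]


def classify_borderline(first, second, tdt):
    """Compare the 2nd clinical diagnosis with the TDT result."""
    f, s, t = (STAGES.index(d) for d in (first, second, tdt))
    if s == t:
        return "agreement"
    if abs(s - t) < abs(f - t):
        return "moved towards TDT"
    return "moved away or no closer"


# Example from the text: first diagnosis MCI, TDT mDAT, 2nd diagnosis eDAT.
example = classify_borderline(first="MCI", second="eDAT", tdt="mDAT")
print(example)

# Reported counts: 4 agreements, 4 movements towards the TDT, 12 cases.
agreement_pct = round(100 * 4 / 12)
combined_pct = round(100 * (4 + 4) / 12)
print(agreement_pct, combined_pct)
```

The combined percentage groups full agreement and movement towards the TDT result, as in the description above.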
DISCUSSION
While a variety of brief dementia tests are available, few are widely used, and many have limited evidence regarding their performance [23]. In a survey of the International Psychogeriatric Association, 20 brief cognitive instruments are described, with the MMSE [22] as the most common, followed by the CDT [24]. Several studies investigated the predictive values and accuracy of these NPTs and found a low capacity to identify individuals with cognitive impairment [15, 23]. To achieve higher sensitivity, many NPTs are used in combination [25].
In another study we suggest that modern digitizing devices offer the opportunity to measure a broad range of visuoconstructive abilities that may be used as a fast and easy instrument to screen for the early detection of cognitive impairment and dementia in primary care [26].
Regarding the CDT, Kim and Chey [27] discuss a limitation for older individuals with low educational attainment and argue that it is not sensitive in detecting dementia in illiterate or poorly educated elderly persons. Furthermore, the CDT as a rapidly administered test is influenced by age, gender, education, language, and ethnicity [13, 28–30]. We sought another approach to develop a new method that is independent of education level, language, and stressors by combining the patient’s enjoyment of a creative nonverbal process with the physician’s or therapist’s insight into cognitive impairments. It has been shown that therapeutic creative activities have stabilizing effects on individuals by reducing distress and increasing self-reflection, self-awareness, and well-being [31].
The images produced in art therapy by dementia patients differ greatly from those of healthy individuals. Hacking and colleagues, in their Diagnostic Assessment of Psychiatric Art (DAPA), described the differences between patients with mental disorders and control subjects without any psychological medical history [32, 33]. With our newly developed TDT, we could underpin these results by comparing the tree drawings of patients in three stages of dementia. With a collection of a variety of drawing characteristics, we could discriminate paintings of individuals with cognitive impairment and dementia from those of healthy individuals. The reliability analysis showed moderate agreement with the clinical diagnosis, and Cohen’s kappa coefficients comparable to those of the CDT were achieved.
Moreover, the fact that the sensitivity of the TDT (between 0.87 and 0.94) is superior to that of the CDT (0.74) emphasizes the suitability of our method and procedure.
In contrast to these findings, the specificity of the TDT (between 0.63 and 0.75) remains below that of the CDT (1.0). In three cases, the drawings of healthy persons showed several characteristics that were assigned to the diagnosis “impaired or demented”. We suppose that this could stem from various reasons, e.g., considerable inexperience in drawing and writing or reservations about handling a digital device. Whereas the CDT visualizes the cognitive concept of time, the TDT requires creative expression. These different concepts of the two tests might explain the lower specificity of the TDT for cognitive impairment.
Given that other studies investigating the reliability and validity of the CDT describe insufficient sensitivity and specificity for the identification of MCI [34], the association of age, gender, and level of education with TDT scores needs further examination. In the further development of our TDT, the specificity needs to be improved by determining the causes of the outliers. Compared with the large number of participants (n = 253) in the study of Seigerschmidt and colleagues [34], who evaluated the CDTs of patients with cognitive impairment and questionable dementia, we could present a high specificity for the TDT even with a low number of participants (experienced stage n = 39). This suggests that only a small number of TDT rating experts is sufficient to outline relevant criteria for the TDT.
Consideration of the scores for the different groups (HC, MCI, eDAT, and mDAT) within the experienced stage of this study confirmed that the TDT is more sensitive in detecting individuals with MCI than the CDT. In the two stages of dementia, the TDT showed a slightly higher ability (eDAT) or the same ability (mDAT) compared to the CDT. The fact that the MCI group contains patients with early AD as well as patients with depression could be a reason why the TDT shows a higher sensitivity, since depression-related symptoms might affect creative expression reasonably early.
Our findings support the view that the CDT lacks sufficient sensitivity and specificity for individuals with cognitive impairment and indicate a higher ability for the TDT. Both drawing tests, CDT and TDT, measure executive control functions, which are cognitive processes that coordinate simple ideas and actions into complex goal-directed behavior. However, the TDT requires a creative implementation of the object “tree”, which presumably augments the individual’s executive control functions. The diverse subject “tree” needs to be invented by the creative brain, whereas the clock, as a well-defined object, could be accessed as a simple symbol. We assume that the versatile subject and the number of colors made available by the TDT might enhance creative cognition [35] and emotional memory [36], in addition to the visual and kinesthetic perception, motor planning, eye-hand coordination, visual-motor integration, and manual skills stimulated by the drawing and handwriting processes [37]. These further brain functions stimulated by the creative process might give us a larger spectrum with which to distinguish cognitively impaired persons from healthy individuals. In a recent study, Callahan and coworkers [38] found that the type of emotional information remembered by patients with amnestic MCI at immediate recall depends on the presence or absence of depressive symptoms and suggest that the cognitive profile of amnestic MCI patients with concomitant depressive symptoms differs from that of amnestic MCI patients without depressive symptoms as well as from that of patients with late-life depression. Further studies about the cognitive profile of these groups might help to clarify the impact of cognitive impairment on the drawing character of individuals.
The evaluation of the CDT provides a range of 6 scores to determine the degree of cognitive ability [39]. Considering the overall TDT process, we detected many drawing criteria that the art therapists could link to the drawing style of a cognitively impaired or demented person. These criteria should be further examined with regard to their frequency distribution and their co-occurrence.
In some cases the CDT is used as a stand-alone screen for dementia, but usually it is incorporated into longer test batteries or included in composite cognitive screens such as the Mini-Cog, which performs better than the CDT alone [40]. In the same way as the CDT is combined with other NPTs, the use of the TDT as part of a combined test and as a useful supplement in primary care has to be discussed.
Regarding the clinical setting, our results for patients whose first diagnosis was changed after further examinations suggest that the TDT could serve as an important instrument providing information for the consideration of further investigations. In 67% of the twelve investigated borderline cases of medical diagnosis, we could show that the TDT outperformed the first clinical diagnosis. The use of the CDT in screening for AD and MCI in clinical practice has recently been studied [41]. Further investigations are necessary to strengthen the abilities of the TDT in clinical practice and primary care.
In conclusion, these preliminary results on the development of a novel neuropsychological drawing test show a new and effective opportunity to detect cognitive impairment and dementia in clinical settings and primary care. By combining a modern digitizing device with a low-stress, resource-oriented method for our TDT, we obtained a higher sensitivity than for the CDT assessed in parallel. Thus, the TDT may become a handy and practicable tool for the early detection of AD.
ACKNOWLEDGMENTS
All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (ethics committee Eberhard Karls University Tübingen) and with the Helsinki Declaration of 1975, as revised in 2008.
The research was funded by the BMWI, KF 3332301.
