Abstract
Background:
The Smart Aging Serious Game (SASG) is an ecologically-based digital platform used in mild neurocognitive disorders. Considering the higher risk of developing dementia for mild cognitive impairment (MCI) and vascular cognitive impairment (VCI), their digital phenotyping is crucial. A new understanding of MCI and VCI aided by digital phenotyping with SASG will challenge current differential diagnosis and open the perspective of tailoring more personalized interventions.
Objective:
To confirm the validity of SASG in detecting MCI from healthy controls (HC) and to evaluate its diagnostic validity in differentiating between VCI and HC.
Methods:
161 subjects (74 HC: 37 males, 75.47±2.66 mean age; 60 MCI: 26 males, 74.20±5.02; 27 VCI: 13 males, 74.22±3.43) underwent a SASG session and a neuropsychological assessment (Montreal Cognitive Assessment (MoCA), Free and Cued Selective Reminding Test, Trail Making Test). A multi-modal statistical approach was used: receiver operating characteristic (ROC) curves comparison, random forest (RF), and logistic regression (LR) analysis.
Results:
SASG captured the specific cognitive profiles of MCI and VCI well, in line with the standard neuropsychological measures. ROC analyses revealed high diagnostic sensitivity and specificity of SASG and MoCA (AUCs > 0.800) in detecting VCI versus HC and MCI versus HC conditions. An acceptable to excellent classification accuracy was found for MCI and VCI (HC versus VCI: RF 90%, LR 91%; HC versus MCI: RF 75%, LR 87%).
Conclusion:
SASG allows the early assessment of cognitive impairment through ecological tasks and potentially in a self-administered way. These features make this platform suitable for being considered a useful digital phenotyping tool, allowing a non-invasive and valid neuropsychological evaluation, with evident implications for future digital-health trials and rehabilitation.
INTRODUCTION
As the world’s general population becomes older, with the number of people aged 60+ expected to reach more than 2 billion (one in five/six individuals) in 2050 [1, 2], scientific interest in the aging process becomes crucial to face the social, health, and economic demands brought by aging-related issues. With aging, metabolic and neurobiological brain changes challenge the individual’s capacity to maintain previous levels of cognitive performance [3–5]. In this situation, the brain can reveal important neural capacities to functionally reorganize itself and to redistribute resources in order to maintain the same performances as in youth [3–5], in a phenomenon called “neurocognitive aging” [5]. However, not all individuals present with the same resources and a proportion of older adults may experience an unsuccessful aging process [2]. Neurocognitive disorders are conditions characterized by major or mild decline from previously attained levels of cognitive functioning [6, 7]. These conditions might have different biological substrates and are subtyped according to their etiology, with amyloid-β deposition and cerebrovascular diseases as potential causes [6].
Major neurocognitive disorders are characterized by significant cognitive decline associated with substantial interference with autonomy, as in dementia [6, 7]. Given the aging of the population, dementia is heavily affecting healthcare systems worldwide, with a forecast of 115–150 million people affected by dementia in 2050 [8, 9]. This evidence makes dementia a public health priority [9]. Prevention and early intervention are thus mandatory in social and health policy, mainly acting on pre-clinical conditions such as the mild neurocognitive disorders. These conditions are characterized by a modest decline in one or more cognitive functions and by a maintained independence in everyday activities [6, 7]. Mild neurocognitive disorders are estimated to be present in about 20% of non-demented older adults aged over 60 years, and the percentage increases in older individuals [10].
Among the mild neurocognitive disorders, mild cognitive impairment (MCI) is a condition affecting 6.7%–25.2% of the population (60–80+ years) [11]. MCI lies on the continuum between healthy aging and dementia (especially Alzheimer’s disease, AD), and it is characterized by the maintenance of personal autonomies even in the presence of mild cognitive deficit in the amnestic (amnesic MCI, aMCI) and/or non-amnestic sphere [7, 12]. Individuals with MCI most commonly experience deficits in episodic memory (especially those subjects most likely to develop AD), but they can also present with difficulties in executive functions, attention, language, or visuospatial skills [13]. The neurobiological substrate is represented by the accumulation of amyloid plaques and neurofibrillary tau protein tangles in medial temporal lobe structures typically involved in AD pathology [14, 15]. People with MCI older than 65 years present with a cumulative incidence for dementia of about 15% when followed for two years [11].
Vascular cognitive impairment (VCI) refers to mild neurocognitive disorders with a dominant (if not exclusive) vascular etiology [6], going from ischemic white matter hyperintensities to infarcts [6, 16]. From a neuropsychological perspective, people with a diagnosis of VCI maintain their complete autonomy, but they might present with deficits primarily affecting processing speed, attention, and executive functions (well captured by timed tests of executive functioning), while other cognitive abilities remain relatively intact [17]. With the accumulation of vascular insults, the worsening of the cognitive profile and the loss of personal autonomy, the diagnosis of VCI can meet the criteria for vascular dementia (VaD), the second most common form of dementia [8]. This makes VCI a risk factor for the development of dementia [8].
Considering the higher risk for dementia of both MCI and VCI, their timely diagnosis is becoming a clinical imperative, as they represent the ideal candidates for early rehabilitative intervention and care [18–22]. Moreover, the possibility to make a differential diagnosis between the two conditions by digital phenotyping opens the perspective of tailoring the intervention based on their neurobiological basis.
Nowadays, the international guidelines ask for the use of specific biomarkers for the diagnosis of AD and VaD as clinical-biological entities. Whereas for VaD the presence of neuroimaging data on vascular insults remains the only biomarker available [6], the diagnosis of AD might require the use of specific biomarkers of amyloid-β deposition, tau protein accumulation, and/or brain neurodegeneration [23]. Despite their good diagnostic value, these diagnostic methods are invasive and expensive and thus are not suitable for large-scale screening of the general population. For this reason, in clinical practice, neuropsychological pencil-and-paper tests still play a crucial role in the diagnosis of dementia. However, the majority of these instruments are often not effective in the detection of slight deviations from healthy aging, typical of the earliest phases of neurodegenerative diseases [24], such as MCI and VCI. The recent development of new digital technologies opens new diagnostic perspectives. Digital biomarkers can support clinicians in the timely diagnosis of pre-clinical conditions by assessing cognitive functions using sensitive and ecologically valid tools resembling everyday activities in a realistic fashion, and by collecting potentially large amounts of data in each usage, whose analysis can offer more sensitive clinical profiles [24], allowing personalized intervention in order to decrease the probability of developing dementia [25, 26].
Among new technological modalities, virtual reality (VR) and serious games (SGs) are promising tools for the assessment of functionality loss in pathological aging [27–30]. They are also used to predict outcomes and to strengthen randomized controlled trial results [31]. One of the major advantages of SGs is the possibility to integrate VR-based real-life routines in the evaluation process, making them ideal tools for the reliable evaluation of functioning in everyday living. SGs, such as those hosted in digital applications, are now able to integrate VR environments to assess cognitive functions with different levels of complexity [19, 32–35], assuring the ecological validity of the measurements, an aspect often lacking in conventional neuropsychological tools, also in neurological conditions [30, 36–38]. In this line, the higher sensitivity of VR-based measures, compared to pencil-and-paper tests, has been demonstrated for the detection of cognitive impairment [39]. Moreover, VR has been previously presented in the literature not only to investigate and assess, but also to rehabilitate cognitive abilities. A recent six-year review of the effects of VR-based neurorehabilitation on multiple functional domains pointed out the promising beneficial results of the application of VR-based tools for the treatment of acute neurological and neurodegenerative conditions [40]. In terms of the effects on multiple cognitive domains, VR has been shown to impact people’s engagement with the assessment, a crucial aspect to enhance the effects of long-term rehabilitation on cognitive domains [41, 42]. Hence, the systematic adoption of these digital measures in older adults could lead to the efficient detection and treatment of pre-clinical conditions and therefore to the effective prevention of dementia cases.
The Smart Aging Serious Game (SASG) [32, 43] is a digital platform for the evaluation of cognitive impairment in aging, able to collect both active (performance) and passive (time of execution) data while the subject performs ecologically-based and multi-domain tasks in a 3D VR scenario. The possibility of collecting potentially large amounts of execution data makes this platform a valuable ally for the future implementation of big-data statistical approaches in neurodegenerative conditions, including early disease stages. The SASG has been demonstrated to be valid for application in healthy aging [43] and in the detection of aMCI [44], especially considering the significant relation between SASG score and the degree of hippocampal injury associated with the pathology [45]. These findings make SASG suitable as a digital tool for the pre-clinical stage of AD.
The assessment of several neurocognitive domains and the ecological value of the SASG make it an ideal candidate also for the diagnosis of other pre-clinical conditions, such as VCI. The present study has a twofold aim: 1) to confirm the validity of SASG in distinguishing people with MCI from a condition of neurocognitive aging; 2) to evaluate the diagnostic validity of this tool in differentiating between VCI and healthy aged subjects. We used a multi-modal statistical approach to achieve these goals, including receiver operating characteristic (ROC) curves comparison, random forest, and logistic regression analysis.
METHODS
Participants
Eighty-eight people with cognitive decline were consecutively recruited for the present study. Among those, 61 subjects had a diagnosis of MCI due to AD (mean age = 74.26±5.00, 26 males) and 27 subjects had a diagnosis of VCI (mean age = 74.22±3.43, 13 males). Seventy-seven healthy controls were included as well (HC, mean age = 75.36±2.83, 38 males). Four subjects (n = 1 MCI, n = 3 HC) were excluded from analysis due to technical issues during the SASG administration session. Patients were invited to take part in the research during periodical neurological screening by physicians at the IRCCS Don Gnocchi Foundation of Milan, the IRCCS Don Gnocchi Foundation of Florence and the Memory Clinic of University Hospital of Careggi of Florence (Italy). The study was approved by the Ethics Committee of the Don Gnocchi Foundation and all subjects gave written and informed consent.
Inclusion criteria were: 1) diagnosis of MCI due to AD according to the criteria of Albert et al. [13] or of VCI according to the Hachinski criteria [46]; 2) evidence of cognitive impairment at neuropsychological evaluation, with a Mini-Mental State Examination [47] score ≥ 24; 3) age between 65 and 80 years; 4) years of education ≥ 5. Exclusion criteria were: 1) cognitive decline due to an acute or general medical disorder; 2) severe behavioral and/or psychiatric disturbances; 3) substance abuse disorder; 4) auditory or visual disturbances able to affect performance on neuropsychological tests; 5) stable pharmacological treatment with cholinesterase inhibitors, memantine, antidepressant, and antipsychotic drugs.
All subjects read and signed the written informed consent before taking part in the research.
After enrollment, participants were divided into two groups (MCI due to AD and VCI) following the classification criteria of the National Institute for Neurological Disorders and Stroke and the Canadian Stroke Network [46] reported below:
MCI due to AD: the presence of a biomarker of amyloid-β deposition or neuronal injury (hippocampal atrophy, FDG-PET imaging or CSF tau/phosphorylated-tau) compatible with the MCI due to AD condition, Hachinski ischemic score ≤ 4, absence of other causes of dementia (as revealed by CT or MRI), Fazekas score < 2 [48], and amnestic presentation.
VCI: evidence (MRI/CT) of white matter lesions due to small vessel disease, lacunar state and ischemic lesions, Hachinski ischemic score > 4, Fazekas score ≥ 2 [48].
The HC were an age-, gender-, and education-matched cohort selected from the study of Cabinio et al. [45]. All HC were independently living and were native Italian speakers, in the absence of a major neurological complaint, and with a Mini-Mental State Examination score≥28.
Procedure
Each participant performed a neuropsychological evaluation using conventional paper-and-pencil tools (around 45–60 min) and a session of SASG (around 20–4 min), in a single session.
Measures
Conventional neuropsychological measures
Participants’ cognitive functions were assessed through the following tests. The Montreal Cognitive Assessment (MoCA) [49] was used for global cognitive screening; the total score, ranging 0–30, was adjusted for age and years of education following Conti et al. [49]. The Free and Cued Selective Reminding Test (FCSRT) [50] was used to evaluate episodic memory in terms of both immediate and delayed recall; the scores of Immediate Free Recall (IFR, range 0–36), Delayed Free Recall (DFR, range 0–36), Immediate Total Recall (ITR, range 0–12), and Delayed Total Recall (DTR, range 0–12) were considered for the present study, and, according to Frasson et al. [50], IFR and DFR scores were corrected for sex, age, and years of education. The Trail Making Test (TMT) [51] was used for the evaluation of executive function, mental flexibility, visual search, and processing speed; Parts A and B (TMTA and TMTB) were administered, and total scores were adjusted for age and years of education according to Giovagnoli et al. [51].
The Smart Aging Serious Game
The SASG is a serious game conceived for the screening of cognitive functions in older adults, who are required to perform tasks in a VR scenario resembling a real-life setting. This innovative tool has been amply presented in previous works and validated as an effective measure of cognitive functions in healthy aging [32, 43]. Recently, its validity in the detection of hippocampal degeneration has been reported [45], demonstrating its potential as a digital biomarker. Briefly, SASG offers a screening of multiple cognitive domains by assessing working, prospective and long-term memory, spatial orientation, executive functions, and divided and selective attention. By interacting with a touch-screen, subjects virtually navigate in a 3D scenario resembling a loft fitted with a kitchen, a dining room, and a bedroom. The subject is required to accomplish five tasks, in a fixed order, consisting of functional activities of everyday life (e.g., make a phone call, remember to turn the TV on after an activity, identify and remember the location of objects in the kitchen). In detail, in Task 1 (T1) individuals have to recall the position of objects previously shown in the kitchen. Task 2 (T2) is a dual task in which subjects have to listen to the radio and press a button every time they hear the word “sun” and, at the same time, water the flowers. In Task 3 (T3) subjects have to make a phone call and to remember to turn on the TV after the call. Task 4 (T4), the only task with a 2D graphical appearance, consists of a memory task in which 24 images are shown and the subject has to recognize the 12 objects shown in T1. Finally, in Task 5 (T5), subjects are asked to navigate in the kitchen and search for all the objects shown in T1.
To familiarize participants with the technological system, a 10-min interactive demo was used.
During the SASG session, different indices were automatically recorded for each Task, among these: (A) Performance time: milliseconds from the beginning to the end of each Task; (B) Performance accuracy: total number of objects correctly remembered by the subject (for Tasks 1, 3, 4) or total number of correct actions performed (for Tasks 2, 5). Moreover, additional data are collected by the platform (but were not used in the present research) in the different Tasks, such as: (i) Clicks: number of clicks; (ii) Errors: erroneously recognized objects or omissions; (iii) Distance: distance covered within the loft.
In this study, according to [43, 45], only Performance Time (A) and Performance Accuracy (B) data were used, and all the raw scores were converted to z-scores (using the mean and standard deviation of the HC sample). This transformation both highlighted performances falling more than 2 standard deviations below the HC mean and allowed the computation of three composite scores: SASG-total-acc (accuracy), SASG-total-time (time), and SASG-total (total).
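The z-score conversion described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `to_z_scores`, the sample values, and the use of the sample standard deviation are assumptions; only the choice of the HC group as reference and the 2-standard-deviation cut-off come from the text.

```python
from statistics import mean, stdev


def to_z_scores(raw_scores, hc_scores):
    """Standardize raw scores against the healthy-control (HC) reference sample."""
    hc_mean = mean(hc_scores)
    hc_sd = stdev(hc_scores)  # sample SD: an assumption, the text does not specify
    return [(x - hc_mean) / hc_sd for x in raw_scores]


# Hypothetical accuracy scores (number of correct items), for illustration only.
hc_accuracy = [10, 11, 12, 11, 10, 12, 11, 11]
patient_accuracy = [11, 8, 6]

z = to_z_scores(patient_accuracy, hc_accuracy)
# Flag performances at least 2 SD below the HC mean.
flagged = [score <= -2.0 for score in z]
print(z, flagged)
```

For timing indices, where larger values mean slower performance, the sign convention would have to be chosen accordingly; the text does not detail that step, so it is left out of this sketch.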
Statistical analysis
Demographical statistics and neuropsychological assessment
Statistical analyses were performed using IBM SPSS Statistics software (Version 26), JAMOVI and JASP (Version 0.14). Normal distribution of variables was verified through the Kolmogorov–Smirnov normality test and parametric or non-parametric tests were considered accordingly.
Descriptive statistics were run for the demographical description of the sample and the performance of conventional and non-conventional neuropsychological measures using IBM SPSS Statistics software (Version 26). Chi-squared test and univariate ANOVA were utilized to verify that HC, VCI and MCI groups were balanced in terms of gender, age, and educational level. Parametric (one-way ANOVA) or non-parametric (Kruskal-Wallis H test) tests were run as appropriate to compare the groups’ performance on standard neuropsychological tests (MoCA, FCSRT, TMT) and SASG-total. Post-hoc comparisons were adjusted with Bonferroni’s correction for multiple tests (significance was considered when p < 0.006).
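Bonferroni's correction divides the nominal significance level by the number of comparisons. The sketch below shows the arithmetic only. The nominal level of 0.05 and the count of 8 comparisons are assumptions chosen for illustration (the text reports only the resulting threshold); with these values the threshold is approximately 0.00625, close to the reported 0.006.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-test significance threshold under Bonferroni's correction."""
    return alpha / n_comparisons


def is_significant(p_value, alpha=0.05, n_comparisons=8):
    """True when p_value falls below the corrected threshold.

    alpha=0.05 and n_comparisons=8 are illustrative assumptions,
    not values stated in the text.
    """
    return p_value < bonferroni_threshold(alpha, n_comparisons)


threshold = bonferroni_threshold(0.05, 8)
print(threshold)
print(is_significant(0.001), is_significant(0.03))
```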
ROC curve comparison
Furthermore, according to Cabinio et al. [45], to compare the discriminant capacity of SASG-total with those of conventional pencil-and-paper tests (MoCA, IFR, DFR, TMTA, and TMTB), ROC curve analyses with the software Jamovi (Version 1.2.25.0; https://www.jamovi.org) were performed considering two conditions: 1) discriminate VCI from HC, and 2) discriminate MCI from HC. The area under the ROC curve (AUC) provides a measure of overall prediction accuracy and corresponds to random chance when AUC is equal to 0.5 and represents perfect accuracy when AUC is 1. According to the literature [52], we considered accuracy values 0.7 to 0.8 as acceptable, 0.8 to 0.9 as excellent, and more than 0.9 as outstanding.
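As an illustration of the AUC and of the interpretation bands given above, the following sketch computes the AUC through its rank-based definition: the probability that a randomly chosen patient obtains a more "impaired" score than a randomly chosen control. It is not the Jamovi procedure used in the study; the function names, the toy scores, the score orientation (higher means more impaired), and the handling of values falling exactly on a band boundary are assumptions.

```python
def auc(patient_scores, control_scores):
    """AUC as P(patient score > control score), counting ties as one half.

    Assumes scores are oriented so that higher values indicate impairment.
    """
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_scores) * len(control_scores))


def interpret_auc(value):
    """Qualitative bands from the text; boundary handling is an assumption."""
    if value > 0.9:
        return "outstanding"
    if value >= 0.8:
        return "excellent"
    if value >= 0.7:
        return "acceptable"
    return "below acceptable"


# Hypothetical scores, for illustration only.
example_auc = auc([3, 4, 5], [1, 2, 3])
print(example_auc, interpret_auc(example_auc))
```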
Classification of healthy controls or patients
To further investigate the overall discriminant capacity of SASG-total and neuropsychological pencil-and-paper tests in predicting the two conditions (1. VCI versus HC and 2. HC versus MCI) and the weight of each potential predictor, random forest (RF) analysis and logistic regression (LR) implemented in JASP (Version 0.14) were adopted. The classification models included all the variables considered in the previous ROC analysis together (see section on “ROC curve comparison”: MoCA, IFR, DFR, TMTA, TMTB, and SASG-total). RF was run with the default parameter values set in JASP. Specifically, the data set was partitioned into a training (60%), validation (20%), and test set (20%), and the number of trees (maximum 100) was optimized with respect to the out-of-bag accuracy. LR was performed using the enter method. Classification accuracy represents the proportion of instances that were classified correctly. Performance of the classification models was also evaluated by carrying out a ROC analysis. Precision represents the proportion of true positives among all the instances classified as positive, Recall represents the proportion of true positives among all the actual positive instances, and F1 Score indicates the harmonic mean of Precision and Recall.
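The evaluation metrics named above follow directly from the counts of a two-class confusion matrix. The sketch below is a minimal illustration with hypothetical counts; the function name and the example values are assumptions and do not reproduce the study's results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn: true positives, false positives, false negatives,
    true negatives. Zero denominators are not handled in this sketch.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=9, fp=3, fn=1, tn=7)
print(metrics)
```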
RESULTS
Participants
A total of 161 subjects were included in the analyses: 74 HC, 60 people with amnestic multidomain MCI and 27 people with VCI. The three groups of participants were balanced in terms of age, sex, and level of education distribution (see Table 1 for details).
Demographic characteristics of the sample
F, females; HC, healthy control group; M, mean; Ma, males; MCI, mild cognitive impairment group; N, number; sd, standard deviation; VCI, vascular cognitive impairment. ∧Kruskal-Wallis H test; °Chi-squared test.
Conventional and non-conventional neuropsychological measures results
Table 2 reports summary statistics and the three groups’ comparison of the assessment of the cognitive functions through the battery of standard neuropsychological tools. Compared to HC, both MCI and VCI showed a reduction in global cognitive function, as assessed by MoCA test (p < 0.001) and in tasks assessing memory (IFR, ITR, and DTR; p < 0.001). Moreover, in the DFR task only MCI showed a reduced performance compared to HC (DFR, p < 0.001). When evaluating executive function, both VCI and MCI did not differ from HC in TMTA, whereas only VCI but not MCI presented a reduced performance compared to HC in TMTB (p < 0.001).
Standard neuropsychological assessment battery groups comparison results
ES, Equivalent Score; F, ANOVA analysis; H, Kruskal-Wallis H test; HC, healthy control group; IR, interquartile range; Me, median; MCI, mild cognitive impairment group; MoCA, Montreal Cognitive Assessment; IFR, Immediate Free Recall; ITR, Immediate Total Recall; DFR, Delayed Free Recall; DTR, Delayed Total Recall; TMTA, Trail Making Test-Part A; TMTB, Trail Making Test-Part B; VCI, vascular cognitive impairment; ∧Kruskal-Wallis H test, *ANOVA analysis; p < 0.05 were reported in bold. Post-hoc comparisons were adjusted with Bonferroni’s correction.
Comparisons between the three groups on SASG composite and total scores were performed (Table 3). Results showed large significant differences among groups in all SASG task scores. In particular, both MCI and VCI differed significantly from HC in equal measure in T2 and T5 (p < 0.001). Instead, a lower VCI score compared to the MCI score was observed in T1 and T3 (p < 0.001). Furthermore, only MCI presented a lower score than HC in T4 (p = 0.001). Finally, we observed that both VCI and MCI differed significantly from HC in SASG total score (p < 0.001), with a lower performance of VCI than MCI. This trend was confirmed in the SASG-total-time score, while MCI and VCI performed equally lower than HC in SASG-total-acc.
SASG’s groups comparison results
Data reported are z-scores. HC, healthy control group; MCI, mild cognitive impairment group; SASG-total-acc, Smart Aging Serious Game accuracy; SASG-total-time, Smart Aging Serious Game time; SASG-total, Smart Aging Serious Game total; VCI, vascular cognitive impairment; ∧Kruskal-Wallis H test. Post-hoc comparisons were adjusted with Bonferroni’s correction; *ANOVA analysis.
Diagnostic sensitivity and specificity of the Smart Aging Serious Game
ROC curve analysis was carried out to investigate the diagnostic sensitivity and specificity of the SASG and the conventional neuropsychological measures in discriminating between VCI and HC, and between MCI and HC (see Fig. 1). We registered a high diagnostic sensitivity and specificity of SASG and MoCA (in all cases AUC > 0.800) in detecting the VCI versus HC and MCI versus HC conditions (see Table 4). The paired-sample t-test showed a statistically significant difference between the areas under the ROC curves of SASG and those of conventional neuropsychological measures including IFR, DFR, TMTA, and TMTB (all p < 0.05). No statistically significant difference was observed between SASG and MoCA in either case (VCI versus HC: p = 0.171; MCI versus HC: p = 0.064).

ROC curve for the diagnostic sensitivity and specificity of SASG and conventional neuropsychological tests in two conditions: VCI versus HC, MCI versus HC.
ROC curve results
AUC, Area Under the Curve; CI, confidence interval; J, Youden index; DFR, Delayed Free Recall; IFR, Immediate Free Recall; MCI, mild cognitive impairment group; MoCA, Montreal Cognitive Assessment; TMT, Trail Making Test; SASG, Smart Aging Serious Game; SE, standard error; VCI, vascular cognitive impairment. P-values refer to a paired t-test between SASG and each pencil-and-paper conventional test.
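The Youden index J listed among the ROC results is defined as sensitivity + specificity − 1, and the cut-off that maximizes J is commonly taken as the optimal threshold. The sketch below illustrates this definition on hypothetical scores; the function name, the toy data, and the convention that a score at or above the cut-off classifies a subject as a patient are assumptions, and the code is not the procedure used in the study.

```python
def youden_best_cutoff(patient_scores, control_scores):
    """Return (cutoff, J, sensitivity, specificity) maximizing J = Se + Sp - 1.

    A subject is classified as a patient when score >= cutoff
    (orientation assumed: higher score means more impaired).
    """
    best = None
    for cutoff in sorted(set(patient_scores) | set(control_scores)):
        sensitivity = sum(s >= cutoff for s in patient_scores) / len(patient_scores)
        specificity = sum(s < cutoff for s in control_scores) / len(control_scores)
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best


# Hypothetical scores, for illustration only.
best = youden_best_cutoff([4, 5, 6, 7], [1, 2, 3, 5])
print(best)
```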
Classification of healthy controls or patients
Classification analyses revealed excellent accuracy for the classification model built to identify HC versus VCI (RF: 90%; LR: 91%) and an acceptable to excellent accuracy for the model built to classify HC versus MCI (RF: 75%; LR: 87%). Regarding variable importance, in both RF classification models SASG performance was associated with the highest mean decrease in accuracy (see Fig. 2).

Mean decrease in accuracy of the RF classification model considering conventional neuropsychological tests and SASG total scores to classify VCI versus HC (on the left) and MCI versus HC (on the right).
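The idea behind a mean decrease in accuracy can be illustrated with a generic permutation scheme: the values of one predictor are shuffled, and the resulting drop in classification accuracy is averaged over repetitions. Random-forest implementations typically compute this measure on out-of-bag samples, which this sketch does not reproduce. The function names, the toy classifier, and the data are hypothetical and unrelated to the study's model.

```python
import random


def accuracy(predict, rows, labels):
    """Proportion of rows whose predicted label matches the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def mean_decrease_accuracy(predict, rows, labels, n_features, n_repeats=20, seed=0):
    """Average accuracy drop after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_predict(row):
    """Hypothetical classifier that uses only feature 0 (a z-score)."""
    return 1 if row[0] < 0 else 0


# Hypothetical data: feature 0 is informative, feature 1 is noise.
rows = [[-2.0, 0.3], [-1.5, -0.1], [-1.0, 0.8], [0.5, 0.2], [1.0, -0.4], [1.5, 0.6]]
labels = [1, 1, 1, 0, 0, 0]
importances = mean_decrease_accuracy(toy_predict, rows, labels, n_features=2)
print(importances)
```

In this toy setting the unused feature has an importance of exactly zero, because shuffling it cannot change any prediction.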
By adopting an LR approach, significant predictors able to discriminate between HC and the two mild neurocognitive disorders were SASG (HC versus VCI: p < 0.001; HC versus MCI: p = 0.036) and MoCA (HC versus VCI: p = 0.012; HC versus MCI: p < 0.001).
Table 5 shows the predictive performances of RF and LR classification models in terms of Precision, Recall, F1 score and AUC. Precision was above 60% for each class of individuals.
A) Evaluation metrics of RF and LR classification models. B) Confusion matrices referred to the two classification models, values are reported as percentages (subjects count)
Area Under Curve (AUC) is calculated for every class against all other classes. MCI, mild cognitive impairment; VCI, vascular cognitive impairment; RF, random forest; LR, logistic regression.
Table 6 shows estimates of the LR classification models based on conventional neuropsychological tests and SASG to classify VCI versus HC and MCI versus HC.
Logistic regression model classifying HC versus VCI and HC versus MCI based on neuropsychological tests and SASG
df, degrees of freedom; DFR, Delayed Free Recall; IFR, Immediate Free Recall; MCI, mild cognitive impairment group; MoCA, Montreal Cognitive Assessment; TMT, Trail Making Test; SASG, Smart Aging Serious Game; VCI, vascular cognitive impairment.
DISCUSSION
In recent years, the timely and precise diagnosis of conditions at risk of major neurocognitive disorders, such as MCI and VCI, has become pivotal. Early rehabilitative interventions and longitudinal monitoring of the course of the disease can reduce the probability of developing major neurocognitive disorders such as dementia [18, 54], improving the quality of life of those individuals and their families, and reducing social and healthcare costs. On this premise, the present study aimed to test the validity of the SASG platform in distinguishing mild neurocognitive disorders, such as MCI and VCI, from successful aging.
Our results showed that SASG well captures the specific cognitive profiles of aMCI and VCI, in line with the standard neuropsychological measures. A prominent deficit in the pure memory task (T4) was observed in typical aMCI while slower performance times (SASG-total-time) were recognized in VCI. These results together with our ROC, LR, and RF findings highlighted the validity of SASG in discriminating between unsuccessful and successful aging, as a robust digital neuropsychological tool and support the evidence that SASG presents with an equal or higher diagnostic validity than conventional neuropsychological tests.
The integration of digital health tools in standard clinical practice is a recent challenge. It has become increasingly clear that neuropsychology is experiencing a crisis, considering that it has essentially remained stable over time [55] and that several potential advantages can be afforded by greater integration of technology [56]. Indeed, neuropsychology traditionally adopts labor-intensive methods and allows data collection in a slow, potentially inefficient, and expensive way, with a relatively poor estimate of human behavior outside of laboratory conditions [57, 58]. During the last few years, there were significant innovative changes in the neuropsychological field, characterized by a review leading to a transformation from Neuropsychology 1.0 to 2.0 [59]. In this new framework, neuropsychologists evolved from the utilization of subjective measures of cognitive functioning (with restricted norms) to the adoption of validated assessment tools with more sophisticated psychometric properties and normative procedures. A large-scale initiative of harmonization of assessment procedures in the United States is the Uniform Data Set (UDS) promoted by the University of Washington’s National Alzheimer’s Coordinating Center (https://www.alz.washington.edu/web/forms_uds.html). This initiative was aimed at developing standardized methods and setting up a shared neuropsychological battery (UDS 3.0 neuropsychological battery) to capture the continuum of cognitive decline from normal cognition through AD and to create a normative calculator available online [60].
In line with this transformation, the integration of technologies in the neuropsychological assessment can represent a tremendous advantage in the collection of a great amount of data, with the novel challenge of integrating them into a cohesive patient profile. Accordingly, our innovative tool, SASG, opens the perspective of recording not only simple measures such as accuracy (operationalized in terms of the number of correct responses), but also additional quantitative features such as execution times, click counts, navigation tracking coordinates about movements in the virtual scenario, etc. All of these features could allow a more precise patient profile also in studies exploiting machine learning algorithms and big datasets [27, 62], with the potential for better classification of clinical populations.
Our RF analyses revealed that SASG was the most important variable in classifying MCI and VCI versus HC (highest mean decrease in accuracy). This finding could be explained by the features that differentiate the digital tool from conventional neuropsychological tests. The ecological stimuli of the SASG scenario, presented with increased perceptual complexity associated with the VR-3D presentation, represent an enriched environment for the subject compared to conventional neuropsychological tests. This environment is harder for the subject to learn and approach than pencil-and-paper tools and makes it more difficult to develop compensatory strategies able to “hide” subtle cognitive impairment [63, 64]. In terms of this graphical feature, multi-dimensional tasks were presented within this 3D environment, constituting functional-led measures of the cognitive domains in a daily life ecological context. The assessment in an ecologically-valid and real-life-mimicking setting could have assured advantages such as triggering a feeling of immersion and offering the closeness of daily life stimuli [65], overcoming the evaluation paradigm of the standard pencil-and-paper tests [33, 67]. Accordingly, recent contributions revealed how virtual reality assessment tools result in a precise detection of people at risk of major neurocognitive disorders also thanks to their ecological validity [30]. Moreover, SASG presented a multi-dimensional cognitive assessment, fundamental in the evaluation of neurocognitive disorders, that can involve several of the six neurocognitive domains [6]. It is well demonstrated that functional-led tasks are more sensitive in registering cognitive impairment [63, 64]. Accordingly, we reported a higher sensitivity and specificity of the digital tool compared to single-domain conventional measures, such as FCSRT and TMT. Coherently, the sensitivity and specificity of SASG were comparable to those of the MoCA test, which, like SASG, screens different domains of cognition.
Overall, our study opens the way to the integration of digital biomarkers in the detection of neurocognitive aging, adopting valid, non-invasive, and potentially self-administered methods [31]. The current gold-standard biomarkers consist of amyloid-β deposition, tau protein accumulation, and/or brain neurodegeneration [23]. Although standard neuropsychological measures can recognize the presence of cognitive deficits suggesting an initial unsuccessful aging process, differential clusters of mild to major neurocognitive disorders are conventionally diagnosed only by resorting to standard biomarkers, such as the amyloid-β load and tau protein [68]. This procedure often requires expensive and invasive examinations and is therefore not usually available in a timely manner, with critical consequences in terms of the lack of prompt, targeted interventions. Our contribution, in line with previous work [45], presented SASG as a promising non-invasive tool for the early detection of MCI and VCI conditions. In addition, SASG meets the prerequisites for self-administration outside the clinic, including at the patient’s home, in a real-life setting. The implications are substantial: a more democratic delivery of services that is not affected by physical barriers, lower costs and resource consumption, and the availability of remote assessment outside the hospital. This application would make it possible to capture cognitive and motor changes in daily life that precede the frank manifestation of major neurocognitive disorders [69]. The literature has reported valid examples of digital systems for monitoring people’s functioning in a real-life setting, such as measures obtained from smartphone use as a surrogate for laboratory-based neuropsychological assessment [25].
Further innovative evidence comes from the Altoida machine learning platform, a well-validated digital biomarker that combines data from hand movements, gait, posture, eye tracking, visuo-spatial navigation, voice, and dual-task activities during daily living [27, 62] to outline cognitive deficits before people seek clinical investigation.
This study is not without limitations, and some elements should be taken into consideration when planning future research using SASG. In our study, we focused on a limited set of features among the output data available from SASG. Furthermore, we compared SASG validity only with standard pencil-and-paper measures, without including a digital version of conventional neuropsychological tests. However, pencil-and-paper cognitive tests still represent the gold standard of clinical neuropsychological measures. Moreover, SASG, like other digital devices, requires a familiarization session with the technology, which extends the administration time. Nevertheless, it is plausible to assume that future generations of older adults, such as millennials, will no longer need to familiarize themselves with digital devices. Also, our VCI cohort consisted of a small group of patients, especially compared with the group of MCI subjects, which prevented us from exploring the capacity of SASG to differentiate between MCI and VCI conditions. Future studies need to replicate our findings with a larger sample of VCI patients.
To conclude, our results demonstrate that SASG is a valid ecologically-based tool for digital phenotyping in aging, able to distinguish MCI and VCI from successful aging. The implications are substantial: efficient treatments for most neurodegenerative diseases are hindered by the fact that detection occurs at late stages, when the integrity of the nervous tissue is severely compromised. SASG allows for the early assessment of cognitive impairment through functional tasks and potentially in natural everyday life. These features make this tool a suitable digital candidate for assessing and monitoring subjects, with evident implications for future digital-health trials.
Footnotes
ACKNOWLEDGMENTS
The authors thank the GOAL Working Group (alphabetical order): Valeria Blasi¹, Francesca Borgnis¹, Monica Di Cesare¹, Camilla Ferrari², Tommaso Migliazza³, Chiara Pagliari¹, Federica Rossetto¹, Federica Savazzi¹, Andrea Stoppini³. ¹IRCCS Fondazione Don Carlo Gnocchi ONLUS; ²Università degli Studi di Firenze, Dipartimento di Neuroscienze, Psicologia, Area del Farmaco e Salute del Bambino, Firenze, Italy; ³Consorzio di Bioingegneria e Informatica medica–CBIM, Pavia, Italy.
This research was funded by the ITALIAN MINISTRY OF HEALTH, Ricerca Corrente and by BANDO FAS SALUTE 2014 from the Tuscany Region (Italy).
