Abstract
Repeated measurements could be helpful to identify patients with early cognitive decline. We compare the variation of cognitive performance over one year in patients with mild cognitive impairment (MCI) and healthy individuals using the Brain on Track self-applied computerized test (BoT). The study was initiated with 30 patients with probable MCI and 377 controls from a population-based cohort, who performed the BoT test from home every three months for one year. The scores were compared using a linear mixed-effects model. All participants increased their scores in the first tests; after 120 days, patients with MCI started to decline, at a significantly higher rate than controls. The area under the curve to detect MCI was 0.94. We identified a significant decline in cognitive performance over one year in patients with MCI using BoT, and the test presented a high discriminative ability.
INTRODUCTION
Cognitive performance is expected to decline with aging, like many other biological functions. Results from cross-sectional studies suggest a gradual age-associated decline in most cognitive functions in normally aging elders [1]. However, there is still limited research on the use and interpretation of repeated measurements to identify longitudinal trends of cognitive performance, particularly as a screening/diagnostic approach to identify individuals with cognitive decline based on these trajectories [1]. Such data would allow for a better understanding of the pattern and rate of age-associated cognitive changes, and repeated measurement is also a promising strategy to identify pre-symptomatic or early symptomatic cognitive decline.
A comprehensive neuropsychological battery performed by a trained professional is the gold standard for detecting cognitive impairment [2]; however, it is not a cost-effective strategy for periodic cognitive testing in large groups of individuals. The brief cognitive screening tools currently available lack the desired discriminative ability to identify mild cognitive impairment (MCI) and still require a trained external evaluator [3, 4]. Furthermore, most tasks included in both comprehensive batteries and brief screening tests were not specifically designed to minimize the practice effects of repeated testing [5]. Computerized cognitive tests have the potential to overcome these limitations by allowing the use of multiple test versions and self-administered testing, and have shown psychometric parameters similar to those of traditional tests [6, 7]. Additionally, they offer the potential for easier implementation of adaptive testing, in which test difficulty can be tailored to individual performance [8]. Nevertheless, most of the existing computerized cognitive tests were designed to mirror pen-and-paper testing [9]; they require a trained professional, do not take advantage of the potential for adaptive testing, and are not intended for monitoring longitudinal cognitive performance [10].
The Brain on Track test (BoT) is a computerized cognitive test developed for self-administered web-based longitudinal cognitive screening and monitoring [11]. It was previously shown to have good reproducibility, significant correlation with existing cognitive tests, ability to identify clinically relevant differences for MCI and early dementia and high test-retest reliability when performed from home [11].
In this work, we aimed to further develop and improve the BoT test by expanding the domains assessed and by developing subtests with levels of difficulty adjusted to the expected individual performance. The main objective of this study is to describe the variation of cognitive performance over one year in patients with MCI and individuals from a population-based cohort using this improved version of BoT. We also assessed the ability of this new version of BoT to discriminate between patients with MCI and healthy controls in single and repeated uses.
METHODS
The Brain on Track test
The development and validation of the BoT test resulted in a version with seven subtests [11]. After critical review of these results by the neurologists and neuropsychologists of the development team, it was decided to expand the assessment of memory, executive functions and information processing speed, due to the lack of subtests assessing those domains, and four additional subtests were added: Color Interference Task (executive functions), Delayed Verbal Memory Task (delayed verbal memory), Verbal Memory Task II (immediate verbal memory), and Attention Task III (sustained attention, information processing speed). Furthermore, in previous work, the test showed a better discriminatory ability in individuals with middle and higher education than in individuals with less than four years of schooling. Therefore, the Delayed Verbal Memory Task and Attention Task III were designed with three different levels of difficulty, adapted to the expected baseline cognitive performance of the participants, based on educational attainment [11]. These changes resulted in the revised version of BoT used in this study, which was expected to improve the discriminatory ability of the test. The total duration of the BoT test is 24 minutes, and a detailed description of the subtests is provided in the Supplementary Material.
Study design and protocol
This is a longitudinal study in which a group of patients with probable MCI and a group of individuals from the general population were monitored with BoT for one year. Overall inclusion criteria were: a) ≥18 years of age; b) access to a computer at home and c) being able to use a computer and mouse interface without external help.
The individuals from the general population represent a subset of the EPIPorto population-based cohort [12]. This cohort was assembled between 1999 and 2003 as a representative sample of adult (≥18 years) dwellers from the city of Porto. Participants were selected by random digit dialing of landline telephones [12]. In the 2013-2015 re-evaluation of the cohort, the first 300 consecutive participants were invited to participate in the test-retest study of the first version of BoT [11], while the remaining participants who attended the re-evaluation (n = 676) were invited to participate in the present study. Of the latter, 75 refused to participate and 289 were excluded because they did not have continuous access to a computer connected to the internet at home (n = 182) or because they could not use a computer and mouse interface without external help (n = 107). Therefore, a total of 312 participants were enrolled, of whom 259 completed the one-year follow-up (83.0%). Participants from the EPIPorto cohort who presented impairment in any domain in the neuropsychological assessment were also evaluated by a neurologist to verify whether they met the criteria for MCI; one participant from the cohort was considered to have MCI, due to probable Alzheimer’s disease, and was included in the MCI group for data analysis.
Patients with probable MCI were recruited in the Memory Outpatient Clinic of Centro Hospitalar de Entre Douro e Vouga. Eligibility criteria included the presence of progressive cognitive complaints over a period of at least six months, as reported by the patient or family members, impairment in at least one cognitive domain in a neuropsychological evaluation, and no limitation in daily activities [13]. Eligible patients who attended the Neurological outpatient clinic in the second semester of 2015 and complied with the inclusion criteria were consecutively invited to participate in the study. We recruited 30 patients with a clinical diagnosis of probable MCI from a memory clinic, of whom 24 completed the one-year follow-up (80.0%). Of these, 16 were confirmed as having a progressive clinical deterioration compatible with MCI in the one-year clinical re-assessment (nine due to probable Alzheimer’s disease, six due to probable vascular cognitive impairment and one due to probable Lewy body disease), while in seven the final clinical diagnosis was anxiety/depression and in one it was obstructive sleep apnea syndrome. Those without MCI were included in the general population group for the analyses. The final analysis comprises a total of 17 patients with confirmed MCI and a total of 267 healthy individuals.
All participants from the EPIPorto cohort and all patients with probable MCI performed the same baseline neuropsychological evaluation, using a battery of cognitive tests validated for the Portuguese population, including the Montreal Cognitive Assessment (MoCA) test [14], the Mini-Mental State Examination (MMSE) [15], the Wechsler Memory Scale III [16], Trail Making Test A and B [17], Stroop Test [18], Clock Drawing Test [19], and Token test (short version) [20]. Impairment in any given cognitive domain was defined as a performance more than 1.5 standard deviations (SD) below age- and education-adjusted norms in tests assessing that domain. All patients with probable MCI underwent brain imaging and laboratory studies and were re-evaluated at the end of the one-year follow-up, by repeating the neuropsychological evaluation and the clinical observation by a neurologist, both blinded to the results of BoT.
Assessment with BoT
All participants underwent the first testing with BoT in the hospital clinic or research laboratory; the test was self-administered, though under the observation of a member of the research team. This session had two main goals: a) teaching the participant how to log in to the BoT website and familiarizing the participant with the user interface and b) guaranteeing that the participant understood the instructions and mechanics of each subtest, to minimize learning effects in subsequent testing. One week after the training session, the participants were asked, by e-mail and mobile text messaging, to access the website from their home computer and to perform the test autonomously. They were asked to repeat the test every three months for one year.
Statistical analysis
Final test scores of the BoT were calculated by summing the subtests’ z-scores (standardized using the mean and standard deviation (SD) of the general population sample as the reference), and then standardizing this sum to a t-score (using the mean and standard deviation of the general population sample as the reference, and then multiplying by 10 and adding 50). Student’s t test for independent samples was used to compare the differences in age, education, and test scores between MCI patients and controls, since all variables presented a normal distribution (p > 0.05 in Kolmogorov–Smirnov test).
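The scoring rule described above can be illustrated with a minimal Python sketch. It assumes raw subtest scores are available as one list per participant; the function name `bot_t_scores` and the toy numbers are hypothetical and are not taken from the study data or from the BoT software.

```python
from statistics import mean, stdev


def bot_t_scores(participants, reference):
    """Convert raw subtest scores to a BoT-style t-score.

    participants, reference: lists of per-person lists of raw subtest scores.
    The reference sample plays the role of the general population sample.
    """
    n_subtests = len(reference[0])
    # Mean and SD of each subtest in the reference sample.
    ref_means = [mean([p[k] for p in reference]) for k in range(n_subtests)]
    ref_sds = [stdev([p[k] for p in reference]) for k in range(n_subtests)]

    def z_sum(person):
        # Sum of subtest z-scores, standardized against the reference sample.
        return sum((person[k] - ref_means[k]) / ref_sds[k]
                   for k in range(n_subtests))

    # Standardize the sum against the reference sample, then rescale
    # to mean 50 and SD 10.
    ref_sums = [z_sum(p) for p in reference]
    m, s = mean(ref_sums), stdev(ref_sums)
    return [50 + 10 * (z_sum(p) - m) / s for p in participants]


# Hypothetical reference sample: three people, two subtests each.
reference = [[1, 10], [2, 20], [3, 30]]
print([round(t, 1) for t in bot_t_scores(reference, reference)])
# [40.0, 50.0, 60.0]
```

When the reference sample is scored against itself, the resulting t-scores have a mean of 50 and an SD of 10 by construction.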
Linear mixed-effects models (LMEM) fit by restricted maximum likelihood were built to describe and compare BoT scores between patients with MCI and individuals from the general population over one year. To build the model, we included, a priori, the variables age, education, and MCI versus non-MCI in the model, and separately tested linear and quadratic factors of time, retaining them in the model if they reached statistical significance (p < 0.05). Then, we separately tested interaction factors between all the variables in the model (MCI, age, education, the linear and quadratic terms for time) retaining the interaction factors in the final model if they reached statistical significance (p < 0.05).
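As a reading aid, the model-building step can be written out as a candidate model. This is a sketch only: the random-effects structure of this model is not stated in the text, so the random intercept $b_i$ per participant is an assumption, and for brevity only the candidate interactions between MCI and time are shown, not necessarily those retained.

```latex
\begin{aligned}
y_{ij} &= \beta_0 + \beta_1\,\mathrm{MCI}_i + \beta_2\,\mathrm{age}_i + \beta_3\,\mathrm{edu}_i
          + \beta_4\,t_{ij} + \beta_5\,t_{ij}^2 \\
       &\quad + \beta_6\,\mathrm{MCI}_i\,t_{ij} + \beta_7\,\mathrm{MCI}_i\,t_{ij}^2
          + b_i + \varepsilon_{ij}, \\
b_i &\sim \mathcal{N}(0,\sigma_b^2), \qquad \varepsilon_{ij} \sim \mathcal{N}(0,\sigma^2)
\end{aligned}
```

Here $y_{ij}$ is the BoT t-score of participant $i$ at test $j$ and $t_{ij}$ is the time since the first test. A negative quadratic coefficient corresponds to a trajectory that rises and then falls.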
To estimate the discriminative ability of a screening strategy for early cognitive impairment in individuals with memory complaints based on the BoT test, we performed a direct comparison between patients with MCI and age- and education-matched controls and estimated the area under the curve (AUC) of the BoT test scores to identify MCI, using a two-gate design to estimate the diagnostic accuracy [21]. We selected the best-matched healthy controls for each patient with MCI using nearest neighbor matching, with the distance measure estimated by logistic regression [22]. Age and education were selected as the matching variables, as these were the only variables associated with BoT performance in this and in previous samples [11].
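Nearest neighbor matching can be sketched as follows. This is a simplified illustration, not the authors' procedure: it assumes the distance measure (for example, a propensity score from a logistic regression on age and education) has already been computed, uses greedy 1:1 matching without replacement, and processes patients from the highest score down. The matching ratio and the ordering are assumptions, and all identifiers and values are hypothetical.

```python
def nearest_neighbor_match(case_scores, control_scores):
    """Greedy 1:1 nearest neighbor matching without replacement.

    case_scores, control_scores: dicts mapping an ID to a scalar
    distance measure (e.g. an estimated propensity score).
    Returns a dict mapping each case ID to its matched control ID.
    """
    available = dict(control_scores)  # controls not yet used
    matches = {}
    # Assumption: cases are processed from the largest score down.
    for case_id in sorted(case_scores, key=case_scores.get, reverse=True):
        if not available:
            break  # more cases than controls
        score = case_scores[case_id]
        best = min(available, key=lambda c: abs(available[c] - score))
        matches[case_id] = best
        del available[best]  # without replacement
    return matches


# Hypothetical propensity scores.
patients = {"p1": 0.80, "p2": 0.30}
controls = {"c1": 0.75, "c2": 0.35, "c3": 0.10}
print(nearest_neighbor_match(patients, controls))
# {'p1': 'c1', 'p2': 'c2'}
```

Greedy matching is order dependent, so a different processing order can produce a different matched sample.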
To estimate the AUC to distinguish between MCI and controls based on the 12-month follow-up with BoT, we first built an LMEM in the matched sample to estimate the trend over time of BoT scores in patients with MCI versus matched controls, using natural cubic splines with one knot (fixed for the whole sample) and random effects for the intercept and for each spline by individual [23]. To estimate the fixed knot that allowed for the best fit of the data, a one-dimensional optimization function based on the Bayesian information criterion was employed [23]. Then, the random effects of the LMEM were used to predict the probability of MCI. These probability measures were used to define the AUC.
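The final step, turning predicted probabilities into an AUC, can be sketched in Python. The sketch uses the rank-based (Mann-Whitney) definition of the AUC, which is a standard equivalent of the area under the ROC curve; it does not reproduce the spline LMEM itself, and the probabilities below are hypothetical.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (MCI) case
    has a higher score than a randomly chosen negative (control) case,
    counting ties as one half.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical predicted probabilities of MCI.
mci_probs = [0.9, 0.8, 0.4]
control_probs = [0.7, 0.3, 0.2, 0.1]
print(round(auc(mci_probs, control_probs), 3))
# 0.917
```

If lower values indicate disease, as with raw BoT scores, the same function can be used after negating the scores.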
Two cut points were defined for the first BoT test: all the subjects scoring above the high cut point would be considered probably not affected and dismissed from further testing; subjects scoring below the low cut point would be classified as probably affected and immediately referred to a Memory Clinic; subjects scoring between these two points would be monitored through regular repetitions of the test. The high cut point was defined based on the AUC to reach the highest possible sensitivity, so that no affected subject was dismissed, and the low cut point for a specificity of 85%, so that those immediately referred to the Memory Clinic have a high probability of being affected. For the 12-month monitoring strategy with BoT, a single cut point was defined, with the highest possible sensitivity, to guarantee that no affected subject was ruled out.
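The two-cut-point rule can be sketched as follows. This is an illustration under stated assumptions: lower BoT scores indicate worse performance, scores at or below a cut point count as test-positive (the handling of ties is an assumption), and the scores are hypothetical.

```python
def sens_spec(cut, mci, controls):
    """Sensitivity and specificity when scores <= cut are test-positive."""
    sensitivity = sum(s <= cut for s in mci) / len(mci)
    specificity = sum(s > cut for s in controls) / len(controls)
    return sensitivity, specificity


def rule_out_cut(mci):
    """High cut point: lowest threshold that still gives 100% sensitivity."""
    return max(mci)


def rule_in_cut(mci, controls, min_specificity=0.85):
    """Low cut point: highest observed score keeping specificity >= target."""
    candidates = sorted(set(mci) | set(controls))
    ok = [c for c in candidates
          if sens_spec(c, mci, controls)[1] >= min_specificity]
    return max(ok) if ok else None


def triage(score, low, high):
    """Three-way decision for a single baseline score."""
    if score <= low:
        return "refer"
    if score > high:
        return "dismiss"
    return "monitor"


# Hypothetical baseline t-scores.
mci = [30, 35, 38, 42, 47]
controls = [36, 44, 46, 48, 50, 52, 53, 55, 57, 60]
low, high = rule_in_cut(mci, controls), rule_out_cut(mci)
print(low, high)                           # 42 47
print(sens_spec(low, mci, controls))       # (0.8, 0.9)
print(sens_spec(high, mci, controls))      # (1.0, 0.7)
print([triage(s, low, high) for s in (40, 45, 50)])
# ['refer', 'monitor', 'dismiss']
```

Choosing the rule-out threshold at the highest score observed among affected subjects maximizes specificity while keeping sensitivity at 100% in the sample at hand.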
Statistical analysis was performed using the R statistical package.
Ethics
The study was performed in accordance with the Helsinki Declaration of 1975. The research protocol was approved by the institutional ethics committees. The web-based system for data collection of the Brain on Track test is encrypted and anonymized, and its use has been approved by the Portuguese Data Protection Authority. All subjects provided written informed consent for participation.
RESULTS
At baseline, patients with MCI were older and less educated, and had worse performance in cognitive screening tests than healthy controls. The performance in BoT was also significantly worse in patients (Table 1). The matched sample of healthy controls selected by propensity score presented no significant differences in age, sex, or education when compared with MCI patients, while having a significantly better performance in BoT (Table 1).
Table 1. Participant demographics and test scores at baseline
*p < 0.05 when compared to patients with MCI.
When analyzing the performance in BoT using the LMEM, patients with MCI presented, on average, an overall significantly worse performance than healthy individuals. There was also a significant association of older age and lower education with lower average scores on BoT (Table 2). There was a significant trend toward a linear increase in performance over time in both patients and controls, with a slope that did not differ significantly between the groups (p = 0.34 for interaction). The quadratic term for the effect of time on cognitive performance also reached statistical significance, even after including the linear effect of time in the model (p < 0.001). This quadratic term presented a negative concavity, denoting a decrease in performance after the initial increase. Moreover, there was a significant interaction between the quadratic term for time and having MCI, implying that in patients with MCI the decrease is significantly more pronounced than in healthy controls. There was no significant interaction of time (linear or quadratic) with education or age.
Table 2. Linear mixed-effects model for the test scores of the Brain on Track test over one year
MCI, mild cognitive impairment.
In Fig. 1, the predicted model scores are depicted, comparing the performance over one year in patients with MCI and healthy controls. The peak in performance in patients with MCI is at around 100 days (coinciding approximately with the 3rd test), with a decline after that, while in controls the performance tends to stabilize at around 180 days (coinciding approximately with the 4th test).

Fig. 1. Trajectories of cognitive performance over one year in the Brain on Track test in patients with mild cognitive impairment and healthy individuals.
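The relation between the quadratic term and the timing of the peak can be made concrete with a short Python sketch. For a trajectory of the form score = a + b·t + c·t², with c < 0, the maximum is reached at t = −b / (2c). The coefficients below are hypothetical and chosen only for illustration; they are not the estimates from the fitted model.

```python
def peak_day(linear, quadratic):
    """Day at which a quadratic trajectory a + b*t + c*t**2 peaks.

    Returns None when the curve has no maximum (quadratic >= 0).
    """
    if quadratic >= 0:
        return None
    return -linear / (2 * quadratic)


# Hypothetical coefficients (score change per day and per day squared).
linear = 0.04              # shared linear slope in both groups
quad_controls = -0.0001    # quadratic term in controls
quad_mci_extra = -0.0001   # additional quadratic term for MCI (interaction)

print(round(peak_day(linear, quad_controls)))                   # 200
print(round(peak_day(linear, quad_controls + quad_mci_extra)))  # 100
```

With the same linear slope, a more negative quadratic term moves the peak earlier and makes the subsequent decline steeper, which is the pattern described for the MCI group.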
Concerning the diagnostic accuracy of BoT for single use, the AUC to identify MCI was 0.862. Based on this, we propose a rule-in cut point, for immediate referral to a Memory Clinic, with a specificity of 88.3% and a sensitivity of 76.5%, and a rule-out cut point, for dismissing subjects from further testing, with a sensitivity of 100% and a specificity of 47.0%. Using the data collected over one year in the monitoring strategy, the AUC increased to 0.944, while the single cut point for rule-out would have a sensitivity of 100.0% and a specificity of 73.0% (Table 3).
Table 3. Sensitivity and specificity for single use and for repeated use of the Brain on Track test
BoT, Brain on Track; ROC, receiver operating characteristic. *Probability of MCI defined using a linear mixed-effects model to estimate the trend in time of BoT scores using natural cubic splines with one fixed knot and random effects for intercept and splines by individual.
DISCUSSION
In this study, we were able to implement a cognitive monitoring strategy based on the BoT computerized self-applied test in healthy individuals from a subset of a population-based cohort and in patients with probable MCI from a memory clinic. After an initial increase in test scores in all participants, patients with MCI presented a significant cognitive decline, when compared with controls, after a peak at 120 days. The repeated BoT measurements reached an AUC of 0.94 in the one-year monitoring, compared with 0.86 for single use.
One of the biggest challenges faced in clinical practice and dementia research is to distinguish age-associated cognitive decline from the early onset of dementia, particularly in patients with memory complaints but without the interference in daily performance or social activities that defines dementia. These results highlight the potential of a screening and monitoring strategy to identify patients with MCI from the pool of elderly individuals with early memory complaints. Nevertheless, there are still some issues concerning its use. One important potential limitation of all monitoring strategies is practice effects. These are a major concern in longitudinal cognitive monitoring because individuals are able to learn and adjust, and therefore perform better on cognitive function tests with repeated testing, which interferes with the interpretation of the results [24–26]. This can be illustrated by the few studies in which the MoCA test was applied repeatedly, at different time intervals, in patients with MCI. While in a follow-up of 3.5 years 42% of MCI patients declined in the MoCA, by an average of 1.7 points [27], in shorter time spans, such as 12 months, the MoCA test result increases, demonstrating important practice effects [28]. Taking this limitation into account, it is important to know the factors that can minimize or enhance this effect. One of these factors is task familiarity [29]. We tried to optimize this by starting the monitoring strategy with a self-administered BoT test in the hospital clinic or research laboratory, under the observation of a member of the research team, who repeated the instructions in case of any difficulty. Another strategy to minimize this problem is the use of alternate forms [29]. The BoT subtests are designed with a wide variety of elements and different combinations of these elements, so that each trial is different from test to test.
The frequency of the evaluations is also an important factor. A previous study in healthy individuals compared two groups with high (baseline, weeks 2-3, week 6, week 9, and month 3) and low (month 6 and month 12) cognitive test frequency over one year, with the high-frequency group presenting prominent practice effects [25]. In our study, we opted for an intermediate frequency (every 3 months for 1 year), which we considered to be low enough to minimize practice effects but high enough to allow efficient monitoring and to detect changes in the cognitive status of the participants over time [25]. Despite the implemented measures, our results show that practice effects probably played a role in the performance of both groups in the initial evaluations. The initial slope of the linear increase was similar in patients and controls, but subsequently the MCI group started to decline, following a parabola-like trajectory that was significantly different from that of the healthy controls, who maintained a more stable performance. We cannot rule out that, at least in part, the apparent practice effect in patients with MCI could be due to a cognitive improvement secondary to the effects of anti-dementia medication, as most patients in the sample had started cholinesterase inhibitors and/or memantine close to the start of the cognitive monitoring. Ultimately, some degree of learning effect is unavoidable; for that reason, the existence of control groups that undergo the same protocol is essential [29], as it allows a direct comparison between the two groups for each successive trial.
Another key point for the efficiency of cognitive testing is addressing the individual pre-morbid differences in cognitive performance, known as cognitive reserve. A possible solution for this problem is the application of adaptive testing, in which the difficulty of a question is determined by the performance in the previous question, therefore adapting the test to the patient’s abilities. Several authors have argued in favor of that strategy and proposed theoretical models of adaptive testing in the cognitive assessment of the elderly [8, 30]. However, although adaptive tests have already been developed to monitor the development of young children [31], such tools have never been used in the monitoring of cognitive changes over time in an elderly population. In this study, we performed a first step towards adaptive testing by adjusting the difficulty of some subtests to the expected performance of the participants, based on academic achievement, making the evaluation process more adapted to each individual. This could be a crucial feature for successful long-term cognitive monitoring, by limiting ceiling and floor effects, allowing shorter testing sessions without sacrificing precision, and making it possible to monitor patients with some degree of previous impairment. The inclusion of additional subtests in BoT resulted in an increase in the diagnostic accuracy in single use, with the AUC improving from 0.75 in the previous version [11] to 0.86 in the present version. We aim to further explore this strategy in future studies.
There are some limitations to this study. The number of individuals enrolled with probable MCI who had anxiety/depression and not a neurologic disorder was higher than expected, resulting in a relatively small sample of patients with definitive MCI. Furthermore, it would be interesting to better characterize the pattern of cognitive performance over time of the patients with anxiety/depression and compare them with the MCI patients and healthy controls, but the small number of patients in this group prevented any meaningful analysis.
Adherence to the monitoring strategy was high (83.0% in the population-based cohort and 80.0% in the memory clinic sample), similar in both settings, and represents an interesting proof of concept for the feasibility of monitoring patients with cognitive impairment. Nevertheless, 42% of the general population sample did not participate in the study because they did not have access to a computer with an internet connection or lacked familiarity with this interface. This is still a considerable number, but it is expected to decrease as the penetration of technology increases and as the younger, more educated strata of the population reach older age.
Conclusions
Overall, the results of this study suggest that the BoT test could be a suitable tool for early identification and monitoring of cognitive impairment in elderly individuals, and could hopefully improve the current approaches to managing individuals with early memory complaints in the primary care setting and their referral for specialized care. Additionally, this tool could prove useful to identify candidates for future pre-symptomatic or early symptomatic treatments for Alzheimer’s disease. Pre-symptomatic cognitive decline has been demonstrated in unimpaired presenilin-1 carriers using a composite score of neuropsychological tests over five years of follow-up [32]. If, as hoped, pharmaceutical treatments for Alzheimer’s disease, currently under phase II and III clinical trials [33], prove successful in the pre-symptomatic phase, monitoring the population at risk with BoT could effectively identify individuals with probable early cognitive impairment, who would then undergo more expensive confirmatory imaging or molecular biomarker tests to demonstrate amyloid-β pathology, and start treatments with the potential to delay or avoid the evolution to dementia.
Footnotes
ACKNOWLEDGMENTS
The authors gratefully acknowledge the participants enrolled in EPIPorto for their kindness, all members of the research team for their enthusiasm and perseverance, and the participating hospitals and their staff for their help and support. The authors would also like to thank Rute Costa for her important role in collecting patient data.
This research project was partly financed by the Programa Operacional Competitividade e Internacionalização (POCI) (POCI-01-0145-FEDER-016867), the Programa Operacional Regional do Norte 2014/2020 (NORTE 2020) (NORTE-01-0145-FEDER-000003) and by the Fundação para a Ciência e a Tecnologia, in the context of the Epidemiology Research Unit - Instituto de Saúde Pública da Universidade do Porto (EPIUnit) (POCI-01-0145-FEDER-006862; FCT UID/DTP/04750/2013), and the PhD Grant SFRH/BD/119390/2016 (Natália Araújo), co-funded by the FCT and the POCH/FSE Program.
VTC and JP have a shareholder position at Neuroinova, Lda, a start-up company that conceived Brain on Track and holds its registered trademark and commercialization rights.
