Abstract
Neuropsychological assessment using the Boston Process Approach (BPA) holds that the strategy or process by which tasks and neuropsychological tests are completed, and the errors made during test completion, convey much information regarding underlying brain and cognition and are as important as overall summary scores. Research over the last several decades employing an analysis of process and errors has been able to dissociate between dementia patients diagnosed with Alzheimer’s disease, vascular dementia associated with MRI-determined white matter alterations, and Parkinson’s disease; and between mild cognitive impairment subtypes. Nonetheless, BPA methods can be labor intensive to deploy. The recent availability of digital platforms for neuropsychological test administration and scoring now enables reliable, rapid, and objective data collection. Further, digital technology can quantify highly nuanced data that were previously unobtainable and can define neurocognitive constructs with high accuracy. In this paper, a brief review of the BPA is provided. Studies that demonstrate how digital technology translates BPA into specific neurocognitive constructs using the Clock Drawing Test, Backward Digit Span Test, and a Digital Pointing Span Test are described. Implications for using data-driven, artificial intelligence-supported analytic approaches to create more sensitive and specific detection/diagnostic algorithms for putative neurodegenerative illness are also discussed.
INTRODUCTION
Neuropsychology is a scientific discipline that seeks to understand brain and cognition. Other disciplines, such as speech pathology, psychiatry, and neurology, share this goal. However, the methods germane to neuropsychology are unique and usually revolve around the application of controlled experimental paradigms, and the administration of well-constructed neuropsychological tests. During the latter part of the 20th century, clinical neuropsychology tended to coalesce around several ‘schools’ of neuropsychological assessment. Indeed, there was fierce debate on how clinical neuropsychology should best be practiced. One of these ‘schools’ is the Boston Process Approach (BPA). As reviewed below, the BPA has its origins in late 19th century and early 20th century European psychology and neurology. One of the goals of this paper is to show how the methods suggested by the BPA can be applied using digital assessment technology.
Origins of the process approach
The originator of the BPA is Edith Kaplan [1, 2]. The roots of the Process Approach can be traced to early 20th century Gestalt psychology (see [1]). Kaplan was greatly influenced by the work of her mentor, Heinz Werner. In his highly influential paper, Process and Achievement: A Basic Problem of Education and Developmental Psychology [3], Werner suggested that it is erroneous to assume that a single summary test score provides full information regarding underlying brain and cognition. Werner maintained that neurocognitive activity involves an “unfolding process” or a microgenesis. Thus, complex neurocognitive activity is best understood by appreciating how underlying, constituent goal-directed activity emerges, and how this activity is integrated over the time necessary to complete the task at hand [4, 5]. These notions underscore the importance of observing behavior, ‘en route,’ to its final solution.
Any complex neurocognitive operation can be completed using a wide variety of strategies. Therefore, an analysis of process, i.e., observing and quantifying neurocognitive operations in real time as tasks are brought to fruition, can convey critical information regarding brain and behavior. Thus, embedded within, say, the total time necessary to complete a task, is the need to document both when and how multiple neurocognitive resources are recruited and properly integrated for successful test completion. Along with process, Kaplan [4, 5] stressed how an analysis of errors made on neuropsychological tests conveys considerable information about underlying brain-behavior relationships.
An important example of how the process approach is able to clarify brain-behavior relations is the case description of patient PJK [6]. Patient PJK had undergone the surgical removal of a large left frontal lobe glioblastoma. At that time, there was an interest in the phenomenon of pure motor apraxia, a syndrome thought to be associated with a grasp reflex. Kaplan found that PJK, indeed, presented with a striking grasp reflex, but only on the right side. When asked to write with the right hand, motor problems were present, but letters and written output were correctly formed. Unexpectedly, when using the left hand, language-related output was both aphasic and apractic. Moreover, the patient could not name objects placed in the left hand. Norman Geschwind correctly interpreted this unique and interesting clinical presentation to be caused by a disconnection syndrome. As this case came to press, the patient died, and an autopsy revealed a left anterior cerebral artery infarct as well as thinning of the anterior two thirds of the corpus callosum. Patient PJK turned out to be the first modern case of a disconnection syndrome. Summary scores were able to document the presence of cognitive impairment. However, it was the analysis of process and errors made on neuropsychological tests that illustrated the precise brain-behavior relationships in this case.
In 1998, as part of the celebration of the first 50 years of the American Academy of Neurology, this research was cited as a landmark paper [7]. At that time, Kaplan wrote “ . . . we considered this paper to be of historic interest because it represented the first “modern” case of a verified callosal disconnection syndrome in a human being. Later, we considered it important because it served as a neurobehavioral model for the investigations carried out by Drs. Vogel, Sperry, Bogen, Gazzaniga, et al., on patients who had been callosally sectioned for epilepsy. It is noteworthy that our paper also prompted Dr. Geschwind to publish his two seminal papers on “Disconnexion syndromes in animals and man” in Brain in 1965. In addition, our work led to Dr. Geschwind’s contributions to concepts of cerebral asymmetries, as well as presaging contemporary views of neural networks.”
The Process Approach and dementia
The origins of BPA revolved around clinical syndromes associated with focal lesions, often due to stroke. Throughout the 1980s and 1990s, interest emerged in dementia associated with neurodegenerative diseases, such as Alzheimer’s disease (AD). Moreover, the availability of magnetic resonance imaging (MRI) technology during this time resulted in a renewed interest in vascular dementia (VaD). In a series of papers, Libon and colleagues [8–10] undertook a meticulous analysis of process and errors produced by dementia patients with AD and VaD associated with MRI subcortical vascular lesions (i.e., leukoaraiosis [11]). These analyses found that the dysexecutive behavior produced by VaD patients can be characterized as context independent in that executive dysfunction can be found across all neurocognitive domains that were assessed. By contrast, the neurocognitive disabilities produced by AD patients can be characterized as context dependent in that errors tended to be subordinate to amnestic and lexical access impairments, neurocognitive disabilities well known to typify AD.
For example, in context independent syndromes graphomotor perseveration [12] was found to be associated with poor performance on motor and other executive tests. On the ‘animal’ fluency test [13] output was clearly diminished compared to controls, but the semantic organization of the exemplars that were generated tended to be intact, suggesting that low output could be primarily due to executive difficulty. The errors produced on tests that assessed verbal concept formation [14] suggest an inability to establish the requested mental set such that patients might convey how test items are different rather than alike (e.g., dog-lion: “one barks and one roars”). These out-of-set errors were found to be associated with poor performance on other executive tests. Finally, on a serial list learning delayed recognition test, false positive foils often came from the list B interference test condition, suggesting the presence of context independent retrieval [15, 16], behavior associated with dysexecutive difficulty.
By contrast, in context dependent syndromes the analysis of process and errors suggests that problems tended to be more circumscribed and related to language and memory problems. Thus, graphomotor perseverations [12] tended to be associated with language/lexical access deficits. Output on the ‘animal’ fluency test demonstrated very little semantic organization [13]. The errors produced on verbal concept formation tests suggest that the ability to establish mental set was intact, but that output was concrete (dog-lion: “I like them”). On a verbal episodic delayed serial list learning recognition test condition, list B, semantic, and unrelated foils were all endorsed, consistent with clinical amnesia.
Many of these patterns of behavior were subsequently found in patients with mild cognitive impairment (MCI) typified by mixed/dysexecutive and amnestic impairment [17, 18]. All of these data suggest that an analysis of process and errors provides a considerable amount of information regarding brain, behavior, and neurocognition [19].
Enabling the Boston Process Approach with digital technology
Widespread adoption of BPA methods for clinical use and research has been precluded because of the time- and labor-intensive methods necessary to reliably quantify behavior, as well as the need for extensive training. However, the emerging availability of digital technologies to capture behavior such as eye movement, postural and facial expressions, as well as drawing, written, and spoken output, coupled with automated computerized methods for scoring BPA related features, has created a breakthrough opportunity. This opportunity is all the more important given that the underlying biological substrate associated with dementia originates years before prominent symptoms are present, and given the urgent need to identify nascent dementia related illness before actual clinical symptoms emerge.
Combining neuropsychological assessment using the BPA with digital technology provides the means to capture, record, and quantify behavior and errors not otherwise obtainable in real time, that is to say, ‘en route,’ in order to identify subtle, but potentially meaningful neurocognitive impairment. For example, in the analysis of visuoconstructional behavior, say, in matching blocks to a model or copying a complex geometric figure, Kaplan [4, 5] advocated the use of ‘flow charts’, i.e., duplicating how block constructions or figure copies were actually accomplished. Flow charting is a way to memorialize behavior as it unfolds in the context of the total time necessary to generate a response. Errors, and a sense of the strategies patients employ in the generation of their behavior, can be captured for a deeper analysis of the overall gestalt of the patients’ response. In this sense, the analysis of process and errors might provide the basis for identifying neurocognitive biomarkers before patients meet criteria for clinical syndromes such as subtle cognitive impairment (SCI) [20] or MCI [21].
Libon and colleagues [22] have commented that for digital technology to realize its maximum value, several requirements need to be met. First, digitally administered tests should be able to uncover and measure behavior not otherwise obtainable using standard paper and pencil tests. Second, this behavior should be used to operationally define key neurocognitive constructs known to underlie clinical syndromes associated with pre-dementia and dementia syndromes. For example, in pre-dementia syndromes, the time necessary to generate a correct response might signal the eventual emergence of a neurodegenerative illness. On verbal tests, digital technology measuring prosody, productive, and non-productive speech may afford new ways to operationally define word finding problems.
Third, the detected abnormalities using digitally administered tests scored using BPA methods should lead to actionable recommendations to mitigate the effects of illness. Described below are some recent data obtained from commonly administered tests including the Clock Drawing Test (CDT), the Backward Digit Span Test (BDST), and a digital pointing span test.
THE TRADITIONAL CLOCK DRAWING TEST
The CDT is one of the most widely used neuropsychological tests [23, 24]. Historically, problems with clock drawing have been linked to disorders such as constructional apraxia [25–28], i.e., an inability to assemble parts into a meaningful whole. The CDT has also been used to assess for problems associated with concept formation [29–32]. For example, Head [29] thought problems related to concept formation were associated with the difficulty involving time setting among patients with aphasia. Past research confirms that clock drawing is associated with an array of underlying neurocognitive operations and a variety of brain regions [33–36].
The CDT consists of two separate, but interrelated test conditions: clock drawing to command and clock drawing to copy. A large number of analog administration and scoring procedures have been described [37–46]. Most researchers follow Kaplan’s [4, 5] suggestion and ask patients to “draw the face of a clock, put in all of the numbers, and set the hands for ten after eleven” [39]. Indeed, Kaplan is well known to have said, “If I had time to administer only a single test, I would administer the Clock Drawing Test” (personal communication). The reason for this enthusiasm stems from the multifactorial nature of the test, and the diverse types of errors produced by patients.
Libon and colleagues [41, 47] conducted two studies, primarily examining clock drawing errors made by patients with AD and VaD associated with MRI white matter alterations. This research found that a wide range of neurocognitive operations are associated with both clock drawing test conditions. Between-group analyses found that patients with VaD produced smaller drawings in the command test condition and often perseverated and subsequently failed to improve from the command to copy test condition. These behaviors were interpreted within the context of the greater graphomotor/executive impairment that typifies VaD patients.
In a follow-up study, Cosentino and colleagues [48] re-grouped dementia patients diagnosed clinically with either AD or VaD based on MRI white matter alterations (MRI-WMA) and compared these groups to dementia patients with Parkinson’s disease (PD). Groups did not differ on Mini-Mental State Examination [49] test performance. The analyses reported in this research confirmed the work of Libon and colleagues [41] such that patients presenting with minimal to mild MRI-WMA continued to improve from the command to copy test conditions, that is, produce fewer errors in the copy versus the command test condition, compared to patients with moderate to severe MRI-WMA and PD patients. Interestingly, errors produced in the command condition were correlated with overall dementia severity and tests assessing semantic knowledge. By contrast, errors produced in the copy condition were correlated with poor performance on executive tests. These findings were subsequently replicated by Price and colleagues [50].
In sum, an analysis of process and errors using a traditional analog CDT is able to document how each clock drawing test condition, command and copy, contributes to successful performance. Moreover, these analyses suggest how constituent clock drawing components are associated with specific neurocognitive operations, i.e., executive abilities and semantic knowledge.
THE DIGITAL CLOCK DRAWING TEST
The traditional CDT has recently been engineered using digital technology employing new, innovative software, in conjunction with digital pens, smart paper, and tablets for the creation of the digital Clock Drawing Test (dCDT) [51–55]. As described below, this technology is able to precisely measure all drawing and graphomotor output; time-based parameters associated with when various clock elements are drawn as a function of total time to completion; and a variety of kinematic parameters.
dCDT: Pen strokes, size, and placement
Digital technology is able to provide precise measurement of the construction of all clock drawing constituent parts including the number of pen strokes, size of the clock face, and size/position of the numbers within the clock face and clock hands. For example, data from the Framingham Heart Study [56], and three additional studies found that community dwelling research participants require approximately 25 pen strokes to complete a command or copy clock drawing [57–59].
The size of the clock face produced by community dwelling volunteers has also been quantified [58], showing that the clock face drawn as part of the copy condition is often approximately 40% smaller than that drawn in the command test condition, behavior previously described by Libon and colleagues [41, 46]. Davoudi and colleagues [57] have provided some normative data for these and other clock drawing features including the size of the clock face and the location of the clock hands and digits within the clock face.
dCDT: Total time to completion – ‘Thinking’ versus ‘Inking’
As described above, the BPA suggests that there is considerable value in the analysis of behavior ‘en route.’ In this context, one of the most interesting findings using digital technology is how clock drawing time to completion can be decomposed into a variety of underlying subcomponents. For example, an analysis of data from the Framingham Heart Study [56] found that command and copy clock drawing total time to completion can be divided into time spent actually drawing or putting ink on the page, i.e., ‘ink time’, versus non-drawing time or ‘think time’. Indeed, Piers and colleagues [56] found that in the command test condition, across all age groups (age 20–90), approximately 60% of the total time necessary to complete clock drawings was spent not actually drawing, termed ‘think time’. By contrast, only 40% of the total time to complete drawings was spent actually drawing or putting ink on the page, i.e., ‘ink time’. In the copy test condition, participants continued to spend the majority of their time ‘thinking’ rather than ‘inking’ (non-drawing = 55% versus drawing = 45%). This behavior has been observed in community-dwelling younger and older adults with depression [60], patients with multiple sclerosis [61], and patients with mild cognitive impairment [62].
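The partition of total completion time into ‘ink time’ and ‘think time’ can be illustrated with a minimal Python sketch. It is not the scoring software used in the cited studies. It assumes, purely for illustration, that each pen stroke is recorded with a pen-down and pen-up timestamp in seconds, that strokes do not overlap, and that total time runs from the first pen-down to the last pen-up. The `Stroke` class, its field names, and the example timestamps are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Stroke:
    start: float  # pen-down time in seconds (hypothetical field name)
    end: float    # pen-up time in seconds (hypothetical field name)


def ink_think_split(strokes):
    """Return (ink_time, think_time, percent_think) for one drawing.

    Assumes non-overlapping strokes; total time is measured from the
    first pen-down to the last pen-up.
    """
    if not strokes:
        raise ValueError("at least one stroke is required")
    ordered = sorted(strokes, key=lambda s: s.start)
    total = ordered[-1].end - ordered[0].start
    if total <= 0:
        raise ValueError("total drawing time must be positive")
    ink = sum(s.end - s.start for s in ordered)  # time with pen on the page
    think = total - ink                          # pauses between strokes
    return ink, think, 100.0 * think / total


# Illustrative timestamps only; not data from any study.
example = [Stroke(0.0, 1.5), Stroke(3.0, 4.0), Stroke(6.5, 7.5), Stroke(9.5, 10.0)]
ink, think, percent_think = ink_think_split(example)
print(ink, think, percent_think)
```

In this made-up example, 4 of the 10 seconds are spent with the pen on the page, so 60% of the total time counts as ‘think time’.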
dCDT: Intra-component latencies
In addition to partitioning clock drawing total time to completion into drawing and non-drawing time, a number of key intra-component latencies have been identified as participants transition from one portion of their drawing to the next [61]. For example, the Post-Clock Face Latency (PCF-L) is the time measured after drawing the clock face to the production of the next pen stroke. The Pre-First Hand Latency (PFH-L) is the time between the production of the first clock hand and the stroke that immediately preceded the first clock hand. Similarly, the Pre-Second Hand Latency (PSH-L) is the time between the production of the second clock hand and the immediately preceding stroke. Additional intra-component latencies can be calculated before and after the clock face center dot is drawn, i.e., the time between the center dot and the strokes that were produced immediately before and immediately after it. The PCF-L, PFH-L, and PSH-L are termed intra-component major as they are present in the vast majority of drawings. The pre-center dot and post-center dot latencies are termed intra-component minor, as many participants do not produce a center dot in either test condition.
Taken as a whole, all intra-component latencies are best understood as decision-making latencies. That is to say, upon completing one portion of the test, intra-component latencies appear to be providing a measure of how participants deploy necessary neurocognitive resources to complete the next portion of the test.
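The three major intra-component latencies defined above can be sketched in Python as follows. This is a simplified illustration, not the published scoring algorithm. It assumes that every stroke has already been labeled with the clock component it belongs to, that each clock hand is drawn as a single stroke, and that the clock face is not retouched later in the drawing. The `Stroke` class, the label strings, and the example timestamps are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Stroke:
    label: str    # hypothetical labels: "clock_face", "digit", "hand", "center_dot"
    start: float  # pen-down time in seconds
    end: float    # pen-up time in seconds


def major_latencies(strokes):
    """Return PCF-L, PFH-L, and PSH-L in seconds (None when not computable)."""
    ordered = sorted(strokes, key=lambda s: s.start)
    latencies = {"PCF-L": None, "PFH-L": None, "PSH-L": None}
    hands_seen = 0
    for i, stroke in enumerate(ordered):
        if stroke.label == "clock_face" and i + 1 < len(ordered):
            # Gap from the end of the clock face to the next pen stroke.
            # If the face spans several strokes, the last face stroke wins.
            latencies["PCF-L"] = ordered[i + 1].start - stroke.end
        elif stroke.label == "hand":
            hands_seen += 1
            if i > 0 and hands_seen <= 2:
                key = "PFH-L" if hands_seen == 1 else "PSH-L"
                # Gap between the preceding stroke and the start of this hand.
                latencies[key] = stroke.start - ordered[i - 1].end
    return latencies


# Illustrative timestamps only; not data from any study.
drawing = [
    Stroke("clock_face", 0.0, 2.0),
    Stroke("digit", 3.5, 4.0),
    Stroke("digit", 4.5, 5.0),
    Stroke("hand", 7.0, 7.5),
    Stroke("hand", 8.0, 8.5),
]
result = major_latencies(drawing)
print(result)
```

The minor (pre- and post-center dot) latencies could be added in the same way by testing for the `"center_dot"` label.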
dCDT: Between-group differences in latency and construction
Cohen and colleagues [60] looked at dCDT behavior in community dwelling research participants with depression. Despite equivalent total time to completion, there was a significant interaction such that younger, depressed patients spent a smaller proportion of time actually drawing, relative to non-drawing time, compared to the older depressed group. Also, in the command and copy test conditions, percent of time spent not drawing was negatively correlated with neuropsychological tests that assess attention/information processing speed. This latter finding is especially provocative to the extent that ‘think’ time may be providing an operational definition of patients’ real time capacity to marshal or deploy necessary neurocognitive resources for successful test performance.
Libon and colleagues [61] studied patients with multiple sclerosis (MS) and normal controls and assessed how intra-component and quartile latencies provide additional information in conjunction with traditional, analog clock drawing scoring [41]. In the command test condition, impaired MS patients produced slower selected intra-component latencies (PFH-L) and slower 3rd and 4th quartile latencies compared to non-impaired MS patients and controls. Greater difficulty was noted in the copy test condition. Moreover, impaired MS patients continued to produce slower intra-component latencies (PCF-L). Also, impaired MS patients produced slower latencies in all four quartiles compared to control participants; and slower latencies only for the 3rd and 4th quartiles compared to non-impaired MS patients. Clinical experience suggests when copying a model of a clock there is considerable visual scanning from patients’ drawing to the model that is provided. The differences seen in quartile latency measures in the copy, compared to the command, test condition, might be providing a measure of this behavior.
Dion and colleagues [62] examined MCI versus non-MCI participants. Among MCI patients, command total time to completion was longer, the clock face was smaller, and greater ‘think’ versus ‘ink’ time was required. Also, longer command and copy intra-component latencies (PCF-L & PFH-L) were negatively associated with working memory, processing speed, and language test performance. Dion et al. [58] also investigated PD-dysexecutive, PD-amnestic, and PD-well participants. Analyses found that the production of clock drawings to command using fewer strokes was associated with higher odds of a PD diagnosis. By contrast, a larger clock face in the copy test condition, i.e., less evidence for micrographia, was associated with lower odds of a PD diagnosis. When PD phenotypes were examined, slower PCF-L and a smaller clock face in the command test condition were associated with higher odds of being PD-dysexecutive versus PD-well. A larger clock face was associated with higher odds of classification into the PD-amnestic versus the PD-well group.
The differences between PD phenotypes using the dCDT demonstrate the tremendous power of digital technology to capture very granular and clinically meaningful information. Schejter-Margalit and colleagues [63] found that dCDT parameters were superior to paper and pencil test performance in classifying PD and healthy controls into their respective groups. Finally, Zhao and colleagues [64] administered the dCDT to a small group of non-demented patients with moderate/severe MRI subcortical vascular disease (SVD), mild MRI SVD, and normal controls. dCDT performance to command found that patients with severe SVD presented with greater air-time percentage (i.e., ‘think time’) and lower handwriting/drawing pressure on their tablet than patients with mild SVD and healthy controls. Interestingly, greater air-time percentage was associated with reduced performance on both choice reaction and the Digit Symbol Substitution tests.
The quantification of such discrete and granular behavior offers an unparalleled opportunity to extract meaningful neurocognitive biomarkers that may predict dementia related illness years before obvious clinical symptoms emerge.
dCDT: Patient classification
Recent studies have addressed the question regarding how well behavior extracted from the digital clock test is able to classify patients into their respective groups. Using machine-learning algorithms, Souillard-Mandar and colleagues [65] found that digitally obtained clock features were better able to classify patients into their respective clinical groups than traditional paper and pencil scoring methods including criteria suggested by other researchers [44–45, 47]. In this research a wide number of machine learning classifiers were generated including classification and regression tree (CART), C4.5 machine learning, random forest, boosted decision trees, and regularized logistic regression. Stratified cross-validation was used to divide the data into 5 folds to obtain training and testing sets. Optimal area under the curve (AUC) for group classification was obtained using random forest.
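The idea behind stratified 5-fold cross-validation can be shown with a short, standard-library Python sketch. It is not the pipeline used in the cited study, and it stops at fold construction without fitting any classifier. Within each diagnostic group, cases are shuffled and dealt out to the folds in turn, so that every fold keeps roughly the same group proportions as the full sample. The function names, the group labels, and the sample sizes are hypothetical. In practice, a library routine such as scikit-learn's `StratifiedKFold` would normally be used instead.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign case indices to folds while preserving class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class out to the folds in turn, like dealing cards.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


def train_test_splits(labels, n_folds=5, seed=0):
    """Yield (train_indices, test_indices); each fold is the test set once."""
    folds = stratified_folds(labels, n_folds, seed)
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield sorted(train), sorted(test)


# Hypothetical sample: 10 patients and 20 controls.
labels = ["patient"] * 10 + ["control"] * 20
splits = list(train_test_splits(labels))
for train, test in splits:
    print(len(train), len(test), sum(labels[i] == "patient" for i in test))
```

Each of the five test folds here holds 2 patients and 4 controls. A classifier would be trained on the remaining 24 cases and its AUC computed on the held-out 6, with the five AUC values then summarized.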
Bianco and colleagues [66] also analyzed digitally obtained clock features using machine-learning algorithms to see how well patients with MCI and AD could be classified into their respective groups. These analyses found that neural networks employing information theoretic feature selection approaches were able to achieve the best 2-group classification, at or above 83%, between patients diagnosed with AD versus MCI; between amnestic versus mixed/dysexecutive MCI; and between non-MCI versus amnestic or mixed/dysexecutive MCI subtypes.
In a third study Davoudi and colleagues [67] extracted digital clock drawing kinematic, time-based, and visuospatial features and examined how well these features could classify AD, VaD, and normal control participants into their respective groups. Optimal area under the curve was achieved using models that combined both command and copy features. Subsequent follow-up analyses using a combination of command and copy variables found that groups could be dissociated based on a combination of kinematic (mean pen pressure, ratio of pen pressure to velocity), intra-component latency (PCF-L), and clock hand placement features. Other researchers have been successful in classifying dementia, MCI and healthy controls into their respective groups using dCDT parameters [68–70].
dCDT: Association with neuroanatomy and neuropathology
An important question to be addressed is the association between digital clock behavior, neuroanatomy, and neuropathology. Three recent studies have examined this issue. Behavior often seen on the clock drawing test includes the tendency of patients to initiate the drawing of numbers inside the clock face using anchor digits (i.e., the numbers “12, 6, 3, 9”). Lamar and colleagues [59] studied a group of non-demented/non-depressed adults grouped based on whether anchor digits were initially drawn before other digits. Digital technology easily captures this behavior. Tract-based structural connectome analytics of MRI neuroimaging data found that participants who initially anchored digits in their clock drawings had higher local efficiency involving the left medial orbitofrontal and transverse temporal cortices; the right rostral anterior cingulate; and superior frontal regions versus participants not employing anchor digits. These data suggest that the prospective strategy of using anchor digits is associated with a higher degree of modular integration involving heteromodal regions of the ventral visual processing stream.
Dion and colleagues [71] examined both MCI and non-MCI patients with the dCDT, a comprehensive neuropsychological protocol, and brain MRI. Digit misplacement and total completion time were acquired for command and copy conditions. The a priori fMRI seed was the bilateral nucleus of Meynert (BNM). Greater digit misplacement was associated with less connectivity between the BNM and the anterior cingulate cortex (ACC). Also, command digit misplacement was negatively associated with performance on tests assessing semantic, visuospatial, and visuoconstructional operations.
Rentz and colleagues [72] were interested in how well digital clock behavior might be associated with biomarker evidence of amyloid and tau pathology in clinically normal controls, and a smaller group of MCI and mild AD patients. A total command/copy clock drawing score was able to classify participants into respective groups. Also, among normal control participants, digital clock drawing behavior was associated with greater amyloid and tau burden, and demonstrated better discrimination compared to paper and pencil screening tests.
dCDT: Surgical outcome
Recent research using digital assessment technology in the perisurgical arena has demonstrated how patterns of performance on digital neuropsychological tests contribute to patient care (see Fig. 1).

Fig. 1. Digital clock drawing pre-surgery (left) and post-surgery (right). Note the deterioration in the representation of ‘10 after 11’.
For example, Amini and colleagues [73] found that an analysis of errors from the dCDT copy test condition independently predicted length of hospital stay after transcatheter aortic valve replacement [74]. Moreover, these findings point to the relevance of preoperative executive and visuoconstruction functioning in predicting postoperative complications. In additional research, Hizel and colleagues [75] studied older adults electing total knee arthroplasty. These patients’ clock drawings tended to be slower and smaller than those of their peers at three weeks and three months after surgery. These studies show that an analysis of subtle behavior appears to have considerable predictive power in this particularly important clinical environment. Buckley and colleagues [76, 77] made similar observations.
The precision with which dCDT behavior can be measured, along with the ability of dCDT parameters to predict membership into clinically meaningful diagnostic groups, and the relations between dCDT behavior and fluid biomarkers associated with dementia suggest that the dCDT is an excellent instrument that could screen for neuropsychological disabilities in primary care.
DIGIT SPAN AND POINTING SPAN
The term ‘digit span’ can be used to describe neurocognitive constructs related to frontal systems operations and the tests designed to measure these constructs [78] (see Richardson [79] for a review). The origins of digit span as a psychological construct derive from Gottfried Leibniz (1646–1716) who suggested that there is a finite capacity to hold information in mind, termed span of apperception. As a psychological test, James McKeen Cattell [80] (cited in Richardson [79]) was perhaps the first psychologist to include a test of span (for consonants) in his corpus of mental tests. Subsequently, an assessment of span has been included in many compendiums of tests [81, 82].
Until recently, the Wechsler corpus combined performance on the digits forward and digits backward tasks into a single score. However, clinical observation has long suggested that these tests measure parallel but differing underlying constructs. Kaplan and colleagues [83] tended to view the digits forward test condition as measuring auditory span or how many bits of information can be held in mind, whereas the digits backwards test condition was viewed as measuring the capacity for mental manipulation or mental control as suggested by Wechsler [82] or what is now understood as working memory. Kaplan and colleagues [83] also called attention to a wide variety of errors produced by patients on the digit span test. Kaplan’s prior work was the impetus for the creation of the BDST.
The Backward Digit Span Test
The BDST [84, 85] consists of seven trials at each of the 3-, 4-, and 5-digit span lengths, for a total of 21 trials. All 4- and 5-span trials were constructed so that contiguous numbers were placed in strategic positions. This strategy was employed to examine the capacity of patients to resist the temptation to erroneously group contiguous numbers together (e.g., for 15679, responding “95671” rather than the correct “97651”). The BDST is administered using standardized Wechsler procedures except that there is no discontinuation rule.
Lamar and colleagues [84, 85] assessed performance on the BDST by calculating percent ANY ORDER and percent SERIAL ORDER recall. ANY ORDER recall tallies the sum of digits correctly recalled regardless of their serial position. By eliminating serial position, ANY ORDER recall is thought to measure immediate storage, span, and rehearsal mechanisms. By contrast, SERIAL ORDER recall tallies the sum of digits correctly recalled in exact serial position. This metric is thought to measure demanding executive abilities related to mental manipulation, temporal re-ordering, and disengagement. In addition to ANY and SERIAL ORDER recall, the BDST quantifies a large variety of between and within trial capture errors, perseverations, and transposition errors (see [86] for full details).
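The two recall metrics described above can be illustrated with a minimal sketch. This is not the published BDST scoring procedure; it assumes that digits are presented and recalled as strings, that ANY ORDER recall is counted as the overlap between presented and recalled digits, and that omissions and intrusions simply earn no credit. The function names `score_trial` and `percent_recall` are hypothetical.

```python
from collections import Counter


def score_trial(presented, response):
    """Return (any_order, serial_order) digit counts for one backward trial.

    Assumption: both arguments are strings of digits, e.g. "15679".
    """
    target = presented[::-1]  # the correct backward sequence
    # ANY ORDER: digits recalled regardless of serial position
    # (multiset overlap, so each presented digit is credited at most once).
    any_order = sum((Counter(target) & Counter(response)).values())
    # SERIAL ORDER: digits recalled in the exact serial position.
    serial_order = sum(t == r for t, r in zip(target, response))
    return any_order, serial_order


def percent_recall(trials):
    """Return (percent ANY ORDER, percent SERIAL ORDER) over (presented, response) pairs."""
    total_digits = sum(len(presented) for presented, _ in trials)
    any_sum = 0
    serial_sum = 0
    for presented, response in trials:
        any_count, serial_count = score_trial(presented, response)
        any_sum += any_count
        serial_sum += serial_count
    return 100 * any_sum / total_digits, 100 * serial_sum / total_digits


# The grouping error from the text: 15679 recalled as "95671".
print(score_trial("15679", "95671"))  # (5, 3)
```

In this example all five digits are recalled, so ANY ORDER recall is intact, but only three digits occupy the correct serial position. This is the dissociation between storage and mental re-ordering that the two metrics are intended to capture.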
The BDST: Performance in dementia and MCI subtypes
Lamar and colleagues [84] assessed patients with AD and patients with VaD associated with MRI-determined periventricular and deep white matter alterations. No difference was found for ANY ORDER recall, suggesting an equal capacity for storage, span, and rehearsal. However, VaD patients scored worse for SERIAL ORDER recall, suggesting greater working memory impairment. In a follow-up study, Lamar and colleagues [85] examined regional MRI white matter disease in relation to ANY and SERIAL ORDER recall and found that reduced SERIAL ORDER recall was associated with left inferior parietal white matter disease. These data are consistent with prior research suggesting that a visual imagery mechanism may underlie successful backward digit span performance [87].
Among patients with MCI, lower SERIAL ORDER recall appears to be driven by an attenuated recency effect and the production of transposition and dysexecutive errors [86]. Bezdicek and colleagues [88] administered the BDST to PD-MCI patients, PD-well patients, and normal control participants. Resting state MRI scans were obtained. PD-MCI patients demonstrated greater deficits involving SERIAL ORDER versus ANY ORDER recall compared to the other groups. This was accompanied by significant bilateral dorsolateral prefrontal disruption. All of these data support the validity of these parameters as providing meaningful measures of underlying neurocognitive constructs.
The BDST: Digital assessment and latency
Emrani and colleagues [89] were interested in learning more about serial order position effects among non-demented memory clinic patients when repeating digits backward. An iPad version of the BDST was administered to patients meeting criteria for MCI [21] and to non-MCI, cognitively normal patients, i.e., patients not meeting statistical criteria for MCI. Analyses were confined to correct 5-span test trials. Digital technology, therefore, was used for an analysis of discrete behavior ‘en route.’ Average time to completion for correct 5-span test trials did not differ across groups. However, even though recall on all analyzed test trials was 100% correct, between-group analysis of latencies yielded very different patterns of responding.
Non-MCI patients produced longer latencies on initial (position 2) and latter (position 4) correct serial order responses. By contrast, patients with MCI produced a longer latency for middle serial order responses (i.e., position 3). Emrani et al. [89] interpreted this behavior to suggest that the longer initial latency (position 2) was evidence that non-MCI patients were better able to marshal, deploy, and subsequently monitor the necessary neurocognitive resources for optimal mental manipulation and mental re-ordering early in the test trial. By contrast, longer latency by MCI patients toward the middle of the test trial could suggest the need for more time to deploy the necessary neurocognitive resources for successful responding.
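A minimal sketch of how total time to completion can conceal different latency profiles is shown below. It assumes that the digital platform records the onset time of each spoken digit in seconds from the end of stimulus presentation, and that the latency for a serial position is the interval since the preceding response. Neither assumption is taken from Emrani et al. [89]. The function names and the timing values are hypothetical.

```python
from statistics import mean


def inter_response_latencies(onsets, trial_start=0.0):
    """Latency of each response: its onset minus the previous onset.

    Assumption: `onsets` holds response onset times in seconds,
    measured from the end of stimulus presentation.
    """
    previous = [trial_start] + list(onsets[:-1])
    return [onset - prior for onset, prior in zip(onsets, previous)]


def mean_latency_by_position(trials):
    """Average latency at each serial position across equal-length trials."""
    per_trial = [inter_response_latencies(onsets) for onsets in trials]
    return [mean(position) for position in zip(*per_trial)]


# Two illustrative (made-up) correct 5-span trials with the same total time.
trial_a = [1.0, 2.5, 3.0, 4.5, 5.0]
trial_b = [1.0, 1.5, 3.0, 3.5, 5.0]
print(inter_response_latencies(trial_a))  # [1.0, 1.5, 0.5, 1.5, 0.5]
print(inter_response_latencies(trial_b))  # [1.0, 0.5, 1.5, 0.5, 1.5]
```

Both illustrative trials finish at 5.0 seconds, yet the first shows longer latencies at positions 2 and 4 and the second a longer latency at position 3. A summary score based on total time alone would not distinguish them.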
In follow-up research, Emrani and colleagues (unpublished data) found that individual BDST latencies were associated with different underlying neurocognitive activity. For the first response, longer latencies were associated with better performance on tests of verbal working memory and visuospatial ability. For the third response, shorter latencies were associated with better performance on tests assessing graphomotor information processing speed and visuospatial ability. Thus, the analysis of the process by which correct responding is accomplished demonstrates that the neurocognitive operations underlying successful executive test performance are quite nuanced and that specific neurocognitive abilities appear to be associated with specific time epochs.
Pointing span: Assessment of latency
In companion research, Emrani and colleagues [90] investigated whether the pattern of responding described on the BDST might also be present using a different test paradigm. Thus, the Wide Range Assessment of Memory and Learning-2 (WRAML-2) Symbolic Working Memory Test, a pointing span test, was administered using an iPad. MCI and non-MCI memory clinic patients were recruited. On this test, participants were asked to re-order numbers from lowest to highest. Pointing behavior was recorded on an iPad touch screen. The block of 4-span test trials was examined. Similar to data reported above using the BDST, only correct trials were analyzed. As seen with the BDST, no between-group differences were seen when average time to completion for the entire trial block was calculated. Nonetheless, MCI patients produced longer latencies, i.e., took more time to produce the 1st and 3rd responses. Regression analyses using all participants found that longer latency to generate the 1st response was associated with better language and verbal episodic memory test performance. By contrast, shorter latency to generate the 4th or last response was associated with better verbal working memory test performance. These analyses show how, in real time, different but complementary neurocognitive resources are recruited in order to produce a correct response.
These data extend prior findings using the BDST. The summary scores provided by tests such as the BDST and the WRAML-2 Symbolic Working Memory/pointing span tests are traditionally interpreted as assessing executive abilities. The data described above using digital technology do not obviate this interpretation. Nonetheless, underlying this superordinate executive capacity, the data described above suggest that, ‘en route’ to a final correct response, specific response latencies or test epochs are associated with a variety of discrete underlying neurocognitive behaviors.
SUMMARY AND FUTURE DIRECTIONS
Dementia is a worldwide public health problem. As pharmacological treatment continues to evolve, there is considerable interest in identifying and treating modifiable medical and environmental risks associated with dementia [91, 92] so as to prevent or slow the onset of dementia and dementia-related illness. Rather than relying on the specialty memory clinic, screening for emergent neuropsychological problems, i.e., the assessment of cognitive vital signs, should become routine in primary care [93–95]. Given their ease of administration, brevity, and automatic scoring, digital neuropsychological tests could be effectively deployed in primary medical care settings to screen for neuropsychological disabilities. Longitudinal assessment in primary care using digital neuropsychological tests could identify persons at risk for developing neurodegenerative illness. Indeed, artificial intelligence analytic approaches may assist in developing sensitive and specific detection/diagnostic algorithms to flag emergent illness (see Battista and colleagues [96]).
The Boston Process Approach derives from theoretical constructs grounded in Gestalt psychology and suggests that complex neurocognitive activities are comprised of a multitude of underlying abilities. Within the context of the total time necessary to complete a task, the BPA suggests the need to document when underlying neurocognitive resources are recruited, and how these resources are properly integrated for successful goal-directed activities. Digital assessment technology provides the means to attain these goals.
The research reviewed above demonstrates the tremendous capacity of digital technology to extract graphomotor, time-based, and kinematic parameters. These observations have considerable theoretical importance regarding brain and neurocognition. In addition to the tests and data reviewed above, recent research has shown how meaningful digital biomarkers can be extracted from other neuropsychological tests, including the analysis of speech [97] and graphomotor information processing speed [98–100]. Lyon and colleagues [101] have commented that most assessment of either cognitive or functional abilities occurs in the clinic, where only a small sample of behavior is obtained. These researchers are developing technology that is deployed in the home and monitors behavior continuously on a 24-h basis. Gait, mobility, sleep, and a host of other behaviors are monitored. From these data, potential problems involving motility and cognition can be assessed. Moreover, using this technology, it may be possible to capture speech and conversation in a naturalistic setting. These digital biomarkers, like the clinic-based biomarkers described above, have the potential to flag early emergent illness.
The advent of digital assessment technology represents a paradigmatic shift in the construction and analysis of neuropsychological tests. Total time to completion, one of the traditional methods by which test behavior is assessed, can now be partitioned into motor versus non-motor activity. The analyses of clock drawing, digit span, and pointing span behavior discussed above suggest that there is a rich underlying array of neurocognitive behavior associated with non-motor activity, now easily captured using digital technology. The tests described above have been traditionally viewed as assessing different neurocognitive abilities. However, the digital data obtained from these tests and other digital assessment techniques suggest that a through line underlying these data may be conceptualized as a construct called the ‘Allocation of Time.’ A task for future research will be to examine how total time to completion can be partitioned into meaningful constituent components along with concomitant underlying neuropsychological activity, and how these data may provide the means to predict emergent neurodegenerative illness.
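One simple way to picture the partition of total time to completion into motor and non-motor activity is sketched below. The sketch assumes that the digital platform reports non-overlapping intervals during which the pen or finger is in contact with the screen, and it treats all remaining time as non-motor activity. The function name `partition_time` and the timing values are hypothetical and are not taken from any of the platforms reviewed above.

```python
def partition_time(contact_intervals, task_start, task_end):
    """Split total completion time into motor and non-motor components.

    Assumption: `contact_intervals` is a list of non-overlapping
    (start, end) times in seconds during which the pen or finger
    touches the screen, all within [task_start, task_end].
    """
    motor = sum(end - start for start, end in contact_intervals)
    total = task_end - task_start
    return {"total": total, "motor": motor, "non_motor": total - motor}


# Illustrative (made-up) values: two strokes within a 5-second task.
print(partition_time([(0.5, 1.5), (2.0, 4.0)], task_start=0.0, task_end=5.0))
# {'total': 5.0, 'motor': 3.0, 'non_motor': 2.0}
```

Under these assumptions, two examinees with the same total time can differ in how much of that time is spent in non-motor activity, which is the kind of distinction the proposed ‘Allocation of Time’ construct is meant to address.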
In 1998, Kaplan noted that the description of patient PJK in 1962 resulted in new “concepts of cerebral asymmetries, as well as presaging contemporary views of neural networks.” The data described above suggest that digital technology may very well herald a new context by which relationships between brain and neurocognition are studied.
