Abstract
The aim of the present study was to investigate the reliability and validity of a brief standardized assessment of children’s working memory, Lucid Recall. Although there are many established assessments of working memory, Lucid Recall is fully automated and can therefore be administered in a group setting. This makes it well suited to large-scale screening or research purposes. The findings indicated suitable test–retest reliability. Scores were also correlated with children’s scores on the Wechsler Intelligence Scale for Children–IV working memory subtests, scholastic attainment, and ratings of children’s working memory behaviors. Working memory scores also distinguished between children with and without special educational needs. The findings are discussed in terms of practical implications for practitioners.
Working memory is a system responsible for the processing and storage of information (e.g., Baddeley, 1986). There are several theoretical models of working memory (e.g., Conway, Jarrold, Kane, & Towse, 2007; Miyake & Shah, 1999), but according to one widely accepted model (Baddeley, 2000; Baddeley & Hitch, 1974), working memory consists of four components. At the heart of working memory is a central executive system, a domain-general limited capacity system likened to a mechanism of attentional control (e.g., Engle & Kane, 2004; Unsworth & Engle, 2007). The central executive is supported by two domain-specific storage components: the phonological loop and the visuo-spatial sketchpad, which are responsible for the maintenance of auditory and visuo-spatial information, respectively. Baddeley (2000) identified the episodic buffer as a further subcomponent of working memory, responsible for integrating information from the subcomponents of working memory and from long-term memory.
Working memory is highly predictive of a number of scholastic skills during childhood, including literacy, mathematics, and comprehension (e.g., Alloway & Passolunghi, 2011; DeStefano & LeFevre, 2004; Engel de Abreu, Gathercole, & Martin, 2011; Seigneuric, Ehrlich, Oakhill, & Yuill, 2000). Between the ages of 7 and 14 years, children who perform poorly on measures of working memory also typically perform below expected standards in national curriculum assessments of English, mathematics, and science in England (e.g., Gathercole, Brown, & Pickering, 2003; Gathercole, Pickering, Knight, & Stegmann, 2004; St Clair-Thompson & Gathercole, 2006). Working memory deficits have also been implicated in many learning difficulties (e.g., Alloway, Gathercole, Willis, & Adams, 2005).
Each of the subcomponents of the multiple component model of working memory (Baddeley, 2000; Baddeley & Hitch, 1974) shares close links with learning and attainment. The phonological loop has been associated with learning vocabulary in native and foreign languages (e.g., Engel de Abreu et al., 2011; Majerus, Poncelet, Greffe, & van der Linden, 2006; Masoura & Gathercole, 2005), and also shares links with mathematical problem solving (e.g., Alloway & Passolunghi, 2011; Seyler, Kirk, & Ashcraft, 2003). The visuo-spatial sketchpad has been related to arithmetic and mathematics (e.g., Alloway & Passolunghi, 2011; Holmes, Adams, & Hamilton, 2008), and the central executive has been associated with learning and attainment across a number of domains, including syntax and reading development (e.g., Engel de Abreu et al., 2011), arithmetic and mathematical performance (e.g., Alloway & Passolunghi, 2011; Swanson & Kim, 2007), and overall academic attainment (Alloway & Alloway, 2010; St Clair-Thompson & Gathercole, 2006).
The subcomponents of working memory can be assessed using a range of measures. The phonological loop is assessed using the immediate recall of verbal information, for example, word recall or digit recall, and the visuo-spatial sketchpad is assessed using the immediate recall of visual or spatial information, such as in pattern recall (e.g., Pickering & Gathercole, 2001). The central executive is assessed using working memory measures in which participants engage in processing while maintaining information. For example, in counting span (Case, Kurland, & Goldberg, 1982) participants count the number of items in a series of arrays and then recall the successive tallies of each array. Within the multiple component model of working memory (Baddeley, 2000; Baddeley & Hitch, 1974), it is thought that processing draws on the central executive, whereas maintenance draws on the domain-specific storage systems (e.g., Baddeley & Logie, 1999; Duff & Logie, 2001).
There are several assessments of working memory which have been used within educational and clinical settings. These include the working memory subtests in the Wechsler Intelligence Scale for Children (WISC-IV; Wechsler, 2004): digit span forward and backward, and letter–number sequencing. Although widely used, the WISC-IV does not assess the visuo-spatial domain (e.g., Dehn, 2008). Another widely used assessment is the Wechsler Memory Scale (WMS-III; Wechsler, 1997), one of the most commonly used memory assessments in clinical settings (Rabin, Barr, & Burton, 2005). This includes spatial span in addition to digit span and letter–number sequencing. More comprehensive batteries have been used in educational settings, including the Working Memory Test Battery for Children (WMTB-C; Pickering & Gathercole, 2001) and the Automated Working Memory Assessment (AWMA; Alloway, 2007). The WMTB-C is a noncomputerized assessment designed for individual administration. Research has revealed suitable reliability and validity (e.g., Pickering & Gathercole, 2001). The AWMA is a computerized battery comprised of three measures each of verbal and visuo-spatial short-term memory. Although working memory is widely conceptualized as comprising a domain-general processing component (e.g., Alloway, Gathercole, & Pickering, 2006; Kane et al., 2004), the AWMA also includes working memory tasks in both the verbal and visuo-spatial domains. Again, research has shown suitable reliability and validity (e.g., Alloway, Gathercole, Kirkwood, & Elliott, 2008). However, like the WMTB-C, the AWMA is designed for individual administration and therefore requires extensive teacher or researcher time.
Recently, researchers have also developed tools for assessing working memory behaviors. Behaviors typically associated with a poor working memory include children losing their place in complex tasks with multiple steps, or requiring regular repetition of instructions in the school classroom. Such behaviors can be examined using the Working Memory Rating Scale (WMRS; Alloway, Gathercole, & Kirkwood, 2008), on which teachers are asked to rate how typical each of 20 behaviors of a child is using a four-point scale. Teacher ratings have been found to be significantly related to cognitive assessments of children’s working memory (e.g., Alloway, Gathercole, Kirkwood, & Elliott, 2009; St Clair-Thompson, 2011). However, research has also revealed a distinction between cognitive and behavioral aspects of working memory (e.g., Alloway et al., 2009), and thus cognitive assessments are recommended in addition to the WMRS (e.g., Alloway et al., 2009).
The absence of a cognitive assessment of working memory which does not require teacher or practitioner input motivated the construction of Lucid Recall (St Clair-Thompson, 2013). This is a brief assessment of working memory comprised of three tasks, one measure each of the phonological loop, visuo-spatial sketchpad, and central executive components of working memory. The episodic buffer component of working memory (Baddeley, 2000), and tasks used to assess this component, are not yet well understood. Thus, no task was included to assess this component of working memory. The tasks included in Lucid Recall are based upon well-established measures of working memory. However, the uniqueness of Lucid Recall lies in its potential to be used in group settings. It is fully automated and therefore requires little input, and it also allows for large groups of children to complete the assessment at any one time. It is therefore particularly suited to large-scale screening or for research purposes. The present study aimed to assess the reliability and validity of Lucid Recall. All testing was carried out in groups, to establish the appropriateness of the assessments for group testing situations.
Subgroups of children were tested on Lucid Recall on two separate occasions, allowing for a calculation of test–retest reliability. Predictive validity was explored by examining the relationships between performance on Lucid Recall and children’s attainment in school. The diagnostic utility was examined by comparing profiles of children with special educational needs with those of age-matched controls. Relationships between scores on Lucid Recall and scores on the WISC-IV working memory subtests and the WMRS (Alloway, Gathercole, & Kirkwood, 2008) were then examined to establish convergent validity.
Method
Participants
Participants were recruited from a larger sample of children who had taken part in the standardization of Lucid Recall. Ten schools in the North of England had been involved in the standardization. They were selected to include urban and rural schools, representing a range of socio-economic backgrounds. In the United Kingdom, all schools are regularly inspected by the Office for Standards in Education (Ofsted). In their most recent report, three of the schools had been rated as outstanding, and five had been rated as good or satisfactory. The remaining two schools had been in special measures within the last 3 years. The proportion of children eligible for free school meals was at or lower than the national average in four of the schools, and above the national average in six of the schools. The proportion of children with special educational needs was above the national average in four of the schools and either average or lower than the national average in the remaining six.
Reliability and validity data were gathered from subgroups of these children. A summary of the analyses and the number and age of the children in each analysis are given in Table 1. No information about intelligence, economic status, race, or other individual difference variables was sought.
Table 1. Summary of Analyses and Number of Children Tested.
Note. WISC = Wechsler Intelligence Scale for Children; WMRS = Working Memory Rating Scale.
Materials and Procedure
All participants completed Lucid Recall (St Clair-Thompson, 2013). This is comprised of three measures of memory, designed to tap the phonological loop, visuo-spatial sketchpad, and central executive components of working memory.
The phonological loop task is word recall. Participants are asked to recall, in the same order, sequences of monosyllabic words presented aloud through computer headphones. There is a stimulus set of 210 high-frequency words and words are randomly selected from this set on each trial, with the constraint that two words within any one trial cannot rhyme with one another. This method was chosen rather than using standardized lists of stimuli to eliminate the possibility that children can benefit from seeking assistance from peers completing the assessment at the same time. The words are presented at the rate of one per second. Participants then use the computer mouse to select the targets (written words) from among a number of distracters. Although this is a recognition procedure the task demands the phonological loop, particularly due to the requirement for serial order. On each trial the targets and distracters are displayed in a 3 × 3 matrix. Instructions and an example are provided by the computer, and then after two practice trials testing begins with a maximum of six trials with two words to remember. The number of words is then increased by one if a child successfully recalls the words in four trials at a list length. When more than two lists in one block are recalled incorrectly, testing is terminated. The score is calculated as the number of trials on which the words are recalled correctly.
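The block structure described above (up to six trials per list length, moving up after four correct trials, stopping once more than two trials in a block are failed, and scoring the number of correct trials) can be sketched as follows. This is a minimal illustration rather than the Lucid Recall implementation: the function name, the `trial_is_correct` callback, and the `max_length` ceiling are assumptions introduced here, since the text does not state an upper limit on list length.

```python
from typing import Callable


def run_span_task(
    trial_is_correct: Callable[[int], bool],
    start_length: int = 2,
    max_length: int = 9,        # assumed ceiling; not stated in the text
    trials_per_block: int = 6,
    correct_to_advance: int = 4,
    errors_to_stop: int = 3,    # "more than two" failed trials in a block
) -> int:
    """Return the number of trials recalled correctly (the task score)."""
    score = 0
    for length in range(start_length, max_length + 1):
        correct = 0
        errors = 0
        for _ in range(trials_per_block):
            if trial_is_correct(length):
                correct += 1
                score += 1
            else:
                errors += 1
            if errors >= errors_to_stop:
                return score  # discontinue testing
            if correct >= correct_to_advance:
                break  # move on to the next list length
    return score


# Hypothetical child who recalls every list of up to three items and none longer.
example_score = run_span_task(lambda length: length <= 3)
```

With six trials per block, one of the two criteria is always met by the sixth trial, so a block never ends undecided. Only trials actually presented and recalled correctly are counted, following the scoring rule as worded in the text.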
The visuo-spatial sketchpad is assessed using pattern recall. Participants are presented with a series of matrix patterns, for 2 s each. Following presentation of each pattern a blank matrix is presented and participants have to recreate the pattern by using the computer mouse to click on the squares to be filled. Following instructions and two practice trials, testing begins with a maximum of six trials with two filled squares in the matrix. The number of filled squares is then increased by one if a child successfully recalls the pattern in four trials of one matrix size. Discontinuation and scoring criteria are the same as those for word recall.
The complex working memory task is counting recall, in which participants count the number of items in a series of arrays and then recall the successive tallies of each array. The target items are red circles, which are presented among distracters which are red squares and blue circles. The distracters thus share a feature (either color or shape) with the target items. This is known to impose increased attentional demands (e.g., see St Clair-Thompson, 2007). For each counting array, participants use the computer mouse to select the total number of red circles at the bottom of the computer screen. The next counting array is then presented. At the end of each trial they again use the mouse to recall the count totals in the same order that they were presented. The structure of testing and the discontinuation criteria are the same as those for the word recall task described above. However, due to the complex nature of this task additional instructions are provided. Participants first hear instructions and practice the counting part of the task. They then move on to hear instructions and practice the complete task involving counting and later recall of the count totals. Consistent with many other complex working memory tasks (e.g., the AWMA, Alloway, 2007; see also Conway et al., 2005), the score awarded is the number of trials on which the count totals recalled at the end of the trial match the count totals given during the trial. Thus, if a child makes a counting mistake and then remembers this mistake at the recall phase, performance is still scored as correct. Each of the working memory tasks is depicted in Figure 1 (using black and white).
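The scoring rule for counting recall can be made concrete with a short sketch. It assumes that a trial is represented by two sequences, the totals the child entered while counting and the totals the child later recalled; the function name and this representation are illustrative assumptions, not part of Lucid Recall.

```python
from typing import Sequence


def counting_trial_correct(
    entered_counts: Sequence[int],
    recalled_counts: Sequence[int],
) -> bool:
    """A trial is correct when recall matches the child's own counts, in order."""
    return list(entered_counts) == list(recalled_counts)


# Hypothetical trial: the arrays held 4 and 6 red circles, but the child counted 4 and 5.
true_counts = [4, 6]
entered = [4, 5]

miscount_remembered = counting_trial_correct(entered, [4, 5])    # scored correct
recalled_true_totals = counting_trial_correct(entered, [4, 6])   # scored incorrect
wrong_order = counting_trial_correct(entered, [5, 4])            # serial order matters
```

Note that `true_counts` plays no part in the comparison: under the rule described in the text, recall is checked against what the child entered, so counting accuracy and memory for the totals are kept apart.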

Figure 1. Depiction of the Lucid Recall tasks, with presentation of items shown in the left panels and recall on the right.
All children completed the tasks in the computer classroom in school. The software was loaded on to the school computers by the computer technician. Children then entered the classroom on a class basis (the largest class was comprised of 28 pupils). Each child sat at a computer and was given instructions on how to enter their name and date of birth. They were then instructed to wear headphones which had been provided, and to listen to the instructions given by the computer. The tasks were then automated, so no further experimenter involvement was required until all children had completed the tasks, at which point the results files were accessed. Two schools then agreed to allow some children to complete the assessments again to establish test–retest reliability. These children completed Lucid Recall again after a period of 6 weeks.
The schools supplied National Curriculum attainment levels in reading, writing, and mathematics for each pupil. Rather than being generated from standardized assessments, these comprised teachers’ assessments of children’s progress. It is common practice for teachers to rate each child’s progress according to the level they have achieved on the national curriculum each academic term. Previous research has revealed that these teacher ratings of children’s performance (in addition to performance on standardized tests) are closely related to children’s working memory (e.g., St Clair-Thompson & Sykes, 2010). A subgroup of children also completed the Group Reading Test II (NFER Nelson, 1992) sentence completion form A. This consists of 48 items, with an introductory set of picture recognition questions followed by sentences which have to be completed in a multiple choice format. There is no time limit for responding. Scores are calculated as the number of questions answered correctly, and then converted to a standardized score. Internal consistency of the Group Reading Test II is .87 (NFER Nelson, 1992).
Two schools were also asked to provide the names of children who had completed the assessments and were registered as having special educational needs. This included children who had received a statement of special educational needs, or were at either the School Action or School Action Plus stages. School Action and School Action Plus are levels of support that are available in mainstream schools in the UK. School Action is used when there is evidence that a child is not making progress at school and may need additional support from teachers or require different learning materials, equipment, or strategies. School Action Plus is used where School Action has not been able to help. At this stage external advice is sought, for example, from an Educational Psychologist. For the purposes of analysis we did not distinguish between children with a statement of special educational needs, or children at the School Action or School Action Plus stages of support. The majority of children were recognized as at risk of learning disabilities rather than emotional or behavioral difficulties.
A subgroup of children were then administered the working memory subtests from the WISC-IV (Wechsler, 2004): digit span and letter–number sequencing. Both these tasks were administered individually to each child. In the digit span task, a child hears a sequence of digits and is required to repeat the sequence in either the same order that it was presented, or in backward order. The test is discontinued if the child is unable to recall two sequences correctly at a span length. In the letter–number sequencing task, the child hears a sequence of letters and numbers and then has to recall the numbers in ascending order followed by the letters in alphabetical order. The test is discontinued if a child fails to recall three sequences at a span length. For digit span and letter–number sequencing the raw scores are converted into scaled scores with a mean of 10 and an SD of 3. The test–retest reliability of each of the subscales is .83.
Two teachers were also asked to complete the WMRS (Alloway, Gathercole, & Kirkwood, 2008) for each child in their class. The WMRS requires teachers to rate how typical 20 behaviors of a child are on a four-point scale. The behaviors include, for example, “The child raised his hand but when called upon, he had forgotten his response” and “The child had difficulty remaining on task.” The ratings are 0 (not typical at all), 1 (occasionally), 2 (fairly typical), and 3 (very typical). A total score was then computed for each child.
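As an illustration of how a WMRS total might be computed from the ratings described above, the following sketch assumes that the total is the simple sum of the 20 item ratings, which would give a possible range of 0 to 60. The text says only that a total score was computed, so the summation rule and the function name are assumptions.

```python
from typing import Iterable

N_ITEMS = 20                  # number of rated behaviors
VALID_RATINGS = (0, 1, 2, 3)  # not typical at all ... very typical


def wmrs_total(ratings: Iterable[int]) -> int:
    """Sum the item ratings for one child (assumed scoring rule)."""
    ratings = list(ratings)
    if len(ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} ratings, got {len(ratings)}")
    if any(r not in VALID_RATINGS for r in ratings):
        raise ValueError("each rating must be 0, 1, 2, or 3")
    return sum(ratings)


# Hypothetical child rated 1 on ten behaviors and 2 on the other ten.
example_total = wmrs_total([1] * 10 + [2] * 10)
```

Higher totals indicate more frequent problem behaviors, which is why negative correlations with the memory tasks would be expected.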
Results
Reliability
Test–retest reliability for each subtest was computed using the Pearson’s product–moment correlation coefficient. Correlations above .70 are usually considered to indicate good reliability (e.g., Maltby, Day, & Macaskill, 2010). However, test–retest reliability is expected to reduce over time and therefore the 6-week interval used in the present study was a fairly stringent test of reliability. The resulting reliability estimates were .71 and .68 for word recall (children aged 7-9 and 10-12 years, respectively), .69 and .79 for pattern recall, and .59 and .76 for counting recall.
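For readers who wish to reproduce this kind of analysis, test–retest reliability can be computed as the Pearson correlation between scores from the two testing occasions. The sketch below uses only the standard library; the function name and the scores are hypothetical and are not data from the present study.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between paired scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for four children at time 1 and again 6 weeks later.
time_1 = [1, 2, 3, 4]
time_2 = [1, 3, 2, 4]
retest_r = pearson_r(time_1, time_2)
```

Each child contributes one pair of scores, so children missing either occasion would need to be excluded before the correlation is computed.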
Validity
The first step taken in exploring the validity was to compute the correlations between scores on the working memory subtests and children’s National Curriculum levels. The correlations, and the number of children in each group, are shown in Table 2. There were statistically significant correlations between scores on each subtest of Lucid Recall and children’s scholastic attainment.
Table 2. Correlations Between Lucid Recall Scores and National Curriculum Levels.
*p < .05. **p < .01.
To ensure that scores on Lucid Recall were related to standardized test performance in addition to teacher ratings of national curriculum levels, correlations were then computed between scores on Lucid Recall and scores on the Group Reading Test II (NFER Nelson, 1992). Performance on each subtest of Lucid Recall was significantly related to reading scores, with r(66) = .50, p < .01; r(66) = .48, p < .01; and r(66) = .35, p < .01 for the word recall, pattern recall, and counting recall tasks, respectively.
The second step in examining validity involved comparing the working memory profiles of children with special educational needs with the profiles of age-matched children at the same school without special educational needs. The performance of the two groups of children is shown in Table 3. Children with special educational needs performed significantly more poorly than children without special educational needs on the pattern recall task, F(1, 92) = 12.23, p < .01, η² = .12, and the counting recall task, F(1, 83) = 11.27, p < .01, η² = .12. The difference between the two groups was not statistically significant for word recall, F(1, 91) = 1.10, p = .30, η² = .01. A logistic regression analysis further confirmed that pattern recall and counting recall scores significantly predicted special educational needs (p < .05 in each case), correctly classifying 69% of the cases. Furthermore, each measure was capable of individually predicting special educational needs; when counting recall was entered first, pattern recall continued to significantly predict group membership (p < .05), and when pattern recall was entered first, counting recall remained a significant predictor (p < .05).
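The effect sizes reported with each F test above are consistent with eta squared recovered from the F ratio and its degrees of freedom, using eta squared = (F × df effect) / (F × df effect + df error) for a single-factor comparison. The sketch below is a check on the reported values rather than a description of how the original analysis was run; the function name is illustrative.

```python
def eta_squared_from_f(f_value: float, df_effect: int, df_error: int) -> float:
    """Eta squared for a single-factor design, recovered from F and its df."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics and degrees of freedom as reported in the text.
reported = {
    "pattern recall": (12.23, 1, 92),
    "counting recall": (11.27, 1, 83),
    "word recall": (1.10, 1, 91),
}

effect_sizes = {
    task: round(eta_squared_from_f(*stats), 2) for task, stats in reported.items()
}
```

Rounded to two decimal places, the three values agree with those reported for pattern recall, counting recall, and word recall.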
Table 3. Lucid Recall Scores in Children With and Without Special Educational Needs.
The relationships were then explored between scores on Lucid Recall and scores on the digit span and letter–number sequencing subtests of the WISC-IV. The correlations are shown in Table 4. Performance on each Lucid Recall subtest was significantly related to scores on the WISC-IV subtests.
Table 4. Correlations Between Scores on Lucid Recall and Scores on the WISC-IV Working Memory Subtests (n = 91).
Note. WISC-IV = Wechsler Intelligence Scale for Children–IV.
*p < .05. **p < .01.
Relationships between scores on Lucid Recall and scores on the WMRS (Alloway, Gathercole, & Kirkwood, 2008) were then examined. Performance on each memory subtest was significantly negatively related to teacher ratings on the WMRS, with correlations of −.52, −.53, and −.46 for word recall, pattern recall, and counting recall, respectively. Negative correlations were expected because on the Lucid Recall subtests a higher score indicates a better working memory, whereas on the WMRS a higher score indicates more problematic behaviors.
Discussion
The aim of the present study was to examine the reliability and validity of Lucid Recall for use as a working memory assessment in group testing situations. Overall the results suggested that Lucid Recall is a reliable and valid assessment of children’s working memory.
The results revealed adequate test–retest reliability after a period of 6 weeks. As test–retest reliability is expected to reduce over time, a 6-week interval provided a stringent test of reliability. Reliability could also reasonably be expected to be lower in group testing situations. It is, however, important to note that reliability for counting recall was lower in children aged 7 to 9 years than in children aged 10 to 12 years, reaching only .59 in the younger group. Previous research has also revealed low test–retest reliability values for counting recall in children. For example, Pickering and Gathercole (2001) reported a reliability of .48 for children aged 9 to 11 years. In that case, retesting took place after only 2 weeks, so the reliability would be expected to be higher than that in the present study. In the present study, the lower reliability of counting recall in the younger age group could be attributed to young children finding it difficult to grasp this task, as it imposes both processing and storage demands. This would have been particularly true at the first time of testing, when the children were attempting a new, unfamiliar task. Thus, future research may benefit from exploring the instructions provided for counting recall, to ensure that young children fully understand them, and thus that performance depends only upon working memory.
The results of the present study indicated good predictive validity of Lucid Recall. Scores on each subtest were significantly related to children’s national curriculum attainment levels. This finding is consistent with previous research using other working memory measures (e.g., Gathercole et al., 2003; Gathercole et al., 2004; St Clair-Thompson & Gathercole, 2006), and the strength of the correlations observed in the present study was also comparable with that in previous research. Scores were also significantly related to performance on the Group Reading Test, a standardized assessment of reading. It is, however, interesting to note that for children aged 7 to 9 years, the strongest correlations were consistently found for counting recall, a pattern that did not emerge for children aged 10 to 11 years. Therefore, practitioners may benefit from further developmental research using Lucid Recall.
The results also indicated suitable diagnostic utility of Lucid Recall for identifying children who are likely to require special educational provision. Children with special educational needs performed significantly more poorly on the measures of the central executive and visuo-spatial sketchpad components of working memory relative to age-matched controls. The central executive and visuo-spatial sketchpad tasks were also able to predict membership of the special educational needs group. This is consistent with previous studies. For example, Gathercole and Pickering (2000) compared children with low achievement on the national curriculum with children with average achievement, and found that the groups differed significantly in performance on measures of the central executive and visuo-spatial sketchpad, but not the phonological loop. Gathercole and Pickering (2001) revealed a similar pattern of findings in children who had been identified as having special educational needs.
It is, however, important to note that the children with special educational needs in the present study were not compared with control children on other measures such as IQ. Future research may benefit from a more detailed examination of the diagnostic utility of Lucid Recall. Different working memory profiles may also be expected in children with different categories of special educational needs. For example, Pickering and Gathercole (2004) found that children with problems in the area of language showed impaired performance on measures of the phonological loop and central executive, whereas children with general learning difficulties performed poorly on tasks assessing each component of working memory. In the present study, the children with special educational needs were likely to have had a diverse set of learning disabilities. Future research would therefore benefit from an examination of the Lucid Recall profiles of children with specific learning difficulties or with neurodevelopmental disorders.
Scores on Lucid Recall were also significantly related to scores on the WISC-IV working memory subtests, which are well-established measures of working memory. The correlation between scores on word recall and digit span was particularly high (.78). This is not surprising as word recall and digit recall are widely assumed to assess the phonological loop (e.g., Alloway, 2007; Pickering & Gathercole, 2001). Close relationships have also been found between word recall and backward digit recall (e.g., Pickering & Gathercole, 2001). Scores on the other subtests of Lucid Recall were also significantly related to digit span and letter–number sequencing. This provides evidence for suitable convergent validity. It is also important to note that the WISC-IV subtests were administered individually and Lucid Recall was administered in a group setting. Therefore, the results suggest that working memory can be reliably assessed in group settings, with no teacher or researcher input. However, future research would also benefit from exploring the convergent validity of scores on Lucid Recall with scores on other cognitive assessments of working memory, such as tasks included in the WMTB-C (Pickering & Gathercole, 2001) and the AWMA (Alloway, 2007).
The present study also revealed significant correlations between scores on Lucid Recall and teacher ratings of behavior on the WMRS (Alloway, Gathercole, & Kirkwood, 2008). This not only provides further evidence for the validity of Lucid Recall, but also supports previous findings of close relationships between cognitive and behavioral aspects of children’s working memory (see also Alloway et al., 2009; St Clair-Thompson, 2011).
The findings of the present study therefore have important practical implications. Educators, clinicians, and researchers now have access to a suitable computerized assessment of children’s working memory that can be administered in group settings. The assessment will be widely available for commercial acquisition (see www.lucid-research.com). Children in the present study (and those in the standardization sample for Lucid Recall) completed the working memory assessment with no researcher input. They also completed the assessment in group settings, with up to 28 children being tested in a classroom at any one time. The present study demonstrated good reliability and validity of the assessment in such group testing situations. Although the tasks included in Lucid Recall to assess the phonological loop, visuo-spatial sketchpad and central executive are not new, and are closely based on well-established measures of working memory, Lucid Recall is unique in that it is the only working memory assessment specifically designed for group testing. In this way it is a useful and convenient tool for educators, clinicians, and researchers. The curriculum in schools places many demands upon teachers, and requiring children to leave school lessons to complete assessments on an individual basis can be disruptive to the learning process. Using Lucid Recall to screen large groups of children will minimize this disruption to normal school routine.
Lucid Recall not only requires minimal input and can be administered in group settings, but is also a brief assessment that can be completed in approximately 20 min. The assessment provides standardized scores, and displays graphical profiles, and therefore clearly indicates to children (or their teachers or parents) if they have a low, average, or high working memory. This will assist with the effective management of working memory deficits in clinical settings, and in particular in school classrooms. Lucid Recall is therefore an easy as well as an effective method for assessing children’s working memory.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
