Abstract
Background:
Some authors report steeper slopes of forgetting in early Alzheimer’s disease (AD), while others do not. Contrasting findings are thought to be due to methodological inconsistencies or variety of testing methods, yet they also emerge when people are assessed on the same testing procedure.
Objective:
We aimed to assess if forgetting slopes of people with mild cognitive impairment due to AD (MCI-AD) are different from age-matched healthy controls (HC) by using a prose paradigm.
Methods:
Twenty-nine people with MCI-AD and twenty-six HC listened to a short prose passage and were asked to freely recall it immediately and after delays of 1 h and 24 h.
Results:
Generalized linear mixed modelling revealed that, compared to HC, people with MCI-AD showed poorer encoding at immediate recall and steeper forgetting up to 1 h in prose memory as assessed by free recall and with repeated testing of the same material. Forgetting rates between groups did not differ from 1 h to 24 h.
Conclusion:
The differences observed in MCI-AD could be due to a post-encoding deficit. These findings could be accounted for either by a differential benefit from retrieval practice, whereby people with MCI-AD benefit less than HC, or by a working memory deficit in people with MCI-AD, which fails to support their memory performance from immediate recall to 1 h.
INTRODUCTION
The extent to which the slopes of forgetting in Alzheimer’s disease (AD) and in mild cognitive impairment (MCI) differ from those of healthy people is still a matter of debate. Previous research involving AD or MCI cohorts has provided contrasting findings, either in support of [1–4] or against [5–9] the presence of faster forgetting in this clinical population. Divergent findings have been accounted for in terms of methodological inconsistencies across studies [10] and the variety of testing methods [11].
However, contrasting findings are reported even within the same procedure. Here we consider prose memory. Walsh et al. [3] observed a greater decline in free prose recall among MCI patients within 30 min and 1-week delays, after equating their learning performance with that of the healthy controls (HC). In another recent cross-sectional study involving people with pre-symptomatic autosomal dominant AD, Weston and colleagues [4] reported faster forgetting on free recall of both verbal (wordlist and story) and visual (figures) material after 1 week, as compared to immediate and 30 min recall. Similarly, Zimmermann and Butler [12] observed faster forgetting on free recall of a wordlist after 1 week in a cohort of asymptomatic people at sporadic genetic risk (i.e., not inherited) of developing AD. On the other hand, Stamate et al. [9] employed a 70% correct encoding criterion on a task assessing cued recall for prose material and observed forgetting rates similar to those of controls, concluding that memory impairment in AD originates from an encoding deficit.
In the current study, we aimed to assess whether or not the forgetting slopes in a group of people with MCI due to AD (MCI-AD) are different from age-matched control participants by using a prose paradigm [13]. The term MCI due to AD refers to the symptomatic predementia phase of AD. In this phase of AD, people might experience a gradually progressive cognitive decline resulting from the build-up of AD pathology in the brain. Therefore, Albert et al. [14] considered this phase of impairment as a continuum, and they outlined the core clinical and research criteria for MCI-AD (see also [15, 16]).
METHODS
Participants
A group of 33 people with a clinical diagnosis of MCI-AD [14], a Mini-Mental State Examination (MMSE) [17] score between 24 and 30, and unimpaired Independent Activities of Daily Living [18] was recruited for the experiment. All these outpatients underwent general and neurological examinations, brain magnetic resonance imaging (MRI) (or computed tomography where MRI was unfeasible due to contraindications), [F-18] fluorodeoxyglucose positron emission tomography (FDG-PET), complete blood and urine screening tests, and a full neuropsychological assessment (see Table 1) within 6 months before the experimental procedure. The presence of leukoaraiosis and/or small white matter vascular lesions was not considered as an exclusion criterion. Among these 33 participants, 22 (67%) underwent either amyloid PET (n = 6, 18%) or cerebrospinal fluid (CSF) biomarker assay (n = 16, 49%) and had either abnormal amyloid PET or CSF markers. For the amyloid PET scan, they showed the presence of brain amyloidosis based on validated criteria as indicated in the instructions provided by the manufacturer, whereas for the CSF analysis they had pathologic values of both amyloid-β (Aβ42/Aβ40 ratio) and p-Tau181 according to the vendors’ guidelines. They were therefore considered as having a high likelihood of conversion to AD [14]. The remaining 11 (33%) participants had an intermediate likelihood of converting to AD [14] based on positive neurodegeneration biomarkers (as assessed by MRI and FDG-PET [20]).
Mean scores on standardized neuropsychological tests achieved by people with MCI-AD (first column) and the number of them scoring below 1.5 Standard Deviations (second column)
The full neuropsychological battery included cognitive screening (MMSE) [17]; attention and executive functions – Trail Making Test, A & B [23], Symbol Digit [23], Stroop test [24]; memory – Digit Span forward [25, 26], Corsi Span forward [25, 26], Babcock Story Recall [27], Selective Reminding Test [28]; visuo-constructional abilities – Praxis [29], Clock Drawing Test [30]; and language – Semantic and Phonological fluency [31]. *Scores on each neuropsychological test were corrected for age and education for every participant (whenever possible).
Four participants (2 with intermediate and 2 with high likelihood for conversion to AD) were excluded due to floor performance at immediate recall (see below, Procedure section). We defined ‘floor’ as memory performance equal to a score of 0, in line with previous studies that used the same experimental paradigm [13, 19]. Therefore, the performance of the 29 people with MCI-AD (15 women, age range: 66–84, M = 77.24, SD = 4.91) was considered for the analyses. Their years of formal education ranged 5–18 (M = 10.44, SD = 3.99). These participants were also assessed on functional scales, including Independent Activities of Daily Living [18] (M = 6.75, SD = 1.32, cut-off:≤4) and the Geriatric Depression Scale [21] (M = 2.77, SD = 2.87, cut-off:≥11).
The control participants were 26 age- and education-matched healthy volunteers (15 women, age range: 67–85, M = 75.65, SD = 5.78; education range 5–18, M = 10.76, SD = 4.45) who gave their informed consent to participate. Their healthy condition was carefully checked by means of general medical history and they were assessed with the Italian Telephone version of the Mini-Mental State Examination (ITel-MMSE) [22] to exclude an incipient cognitive decline. The cut-off score for the ITel-MMSE was set at 20/22 and no participant was excluded at screening (M = 21.30, SD = 0.78). Participants from this group were recruited through community-based social centers or word of mouth.
In line with the health and safety regulations for COVID-19 in place at the time of testing, all the participants were assessed through telephone-based interviews, while people with MCI-AD undertook neuropsychological testing during their most recent face to face examination at the hospital clinic.
Ethical approval was obtained from the Department of Neuroscience, Rehabilitation, Ophthalmology, Genetics, Maternal and Child Health (DINOGMI), of the University of Genoa (Italy), in line with the current norms of General Data Protection Regulation (GDPR).
Materials
One prose narrative was selected from previous research [13, 32–34] and translated into Italian (see Table 2). The prose passage described a single episode, and it comprised five sentences for a total of 59 words. Reading it takes approximately 26 s.
The narrative presented to the participants at encoding
This text was assigned a fixed score for central events of the prose passage based on previous research that designed and initially used this prose material [32–34]. Following previous studies, memory items were considered as a precise recall of the passage, as an event (“Juice squirts out”), the people involved (“Woman in grocery store”), interaction among characters (“Cashier is angry”), and actions (“Woman is squeezing a peach”). Although the prose narrative contained only five sentences, the second and fourth sentences of the narrative were quite articulated and therefore contained two central events (see Table 2).
The scores on this task ranged from 0 (the lowest) to 7 (the highest). As in Sacripante et al. [13], the scoring procedure did not include partial credits; therefore, if participants reported “the peach squirts out” they would be assigned the same score of 1 as in “woman squeezes (the fruit) and the juice squirts out”. Also, memory scores were assigned following a lenient criterion (i.e., not strictly verbatim), so a score was given for other analogous versions of the event (e.g., “the fruit bursts” or “the peach explodes”). False memories were also recorded (see Results section).
Before commencing data collection, the prose passage had been piloted for comprehension over the phone on 4 people with MCI-AD (all outpatients at the IRCCS San Martino Hospital). These participants were not included in the MCI-AD group for analysis. Furthermore, inter-rater reliability of the scores was analyzed by comparing a subset of 30 scores from 15 participants included in the MCI-AD group given by the first rater (R.S.) to those of a second rater (N.G.). Krippendorff’s alpha was 0.82, which indicates good agreement between the two raters.
Procedure
All the participants were assessed through telephone-based interviews to avoid non-essential face-to-face contact. Prior to the first testing session, all participants and their respective caregivers were contacted by telephone to verify their interest and willingness in taking part in the study. All sessions were recorded. A flow chart of the experimental procedure is shown in Fig. 1.

Steps of the research paradigm comprising three sessions.
The experimental procedure was the same for both groups, with two exceptions: people with MCI-AD were not administered any cognitive screen over the phone, as they had already been tested in the clinic at a previous face-to-face appointment, and the control group was tested with only one repetition of the story to obtain an average performance at immediate recall that was closer to that of the MCI-AD group.
In the first telephone-based session, the testing protocol was explained to the participants, who were also informed about follow-up calls, without being offered any information regarding the purpose of these phone calls. In this session, the prose passage was read twice (one reading after the other) to people with MCI-AD, so as to decrease the expected difference in performance at encoding between the two groups and to avoid floor performance, whereas HCs were read the story only once. The decision to present the MCI-AD group with two repetitions of the story derived from a pilot study in which we compared the performance of four people with MCI-AD after one presentation versus two repetitions. During this pilot, we observed that, after two repetitions, MCI-AD participants were less likely to perform at floor at immediate recall and their performance was closer to that of the HC group.
Participants were instructed to listen carefully to the story and were told not to write any notes during and after its presentation. Immediately after, participants were asked to freely recall all they could remember about the story, without any specific instruction regarding specific events and without time limit. No interpolated task was used between the story presentation(s) and immediate recall.
In both follow-up sessions (after 1 h and 24 h), participants were contacted again by phone and asked to freely recall the story, without being offered any cues or prompts. At the end of each follow-up testing session, participants were asked whether they had tried to recall or thought about the story between the first and the second session, or between the second and the third session. This procedure was meant to check the potential role of active rehearsal. Overall, 35 participants reported having tried to recall or thought about the story during the intervals at least once, 18 in the control group and 17 in the MCI-AD group. No participant reported writing down the story. At the end of the second follow-up session, participants were also debriefed about the experiment.
The data of 4 participants from the MCI-AD group were excluded due to floor performance at immediate recall.
Statistical analysis
A 2×3 mixed factorial design was considered for the analysis, with group (MCI-AD versus HC) as between-subjects variable and recall time (immediate versus 1 h versus 24 h) as within-subjects variable. Memory score was the outcome dependent variable.
Given the non-normal distribution of the data, analyses were performed with a generalized linear mixed model (glmer) as implemented by the lme4 package [35] in R (version 4.1.1). In this model, the outcome variable represented the number of correct responses out of 7 items, which was the maximum score possible (Score, 7 - Score). As in Sacripante et al. [13], we defined correct responses as the number of units of information that were recalled correctly. Since the data did not follow a normal distribution, they were modelled using the binomial distribution (family = binomial). The fixed effects considered the interaction between group (MCI-AD versus HC) and recall (immediate, 1 h, 24 h) with a random intercept for participants (cbind(Score, 7 - Score) ~ Group * Recall + (1 | ID)). The intercept of this model represented memory scores at immediate recall in the control group.
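The model described above can be written out explicitly. The following is only a sketch: it assumes the default treatment (dummy) coding, with HC and immediate recall as reference levels, which is consistent with the intercept described above. The symbols (G for group, T for the recall dummies, u for the participant intercept) are notation introduced here for illustration and do not appear in the original analysis.

```latex
% Requires amsmath. i indexes participants, j indexes recall sessions.
% G_i = 1 for MCI-AD, 0 for HC; T^{1h}_j and T^{24h}_j are dummies for the delayed sessions.
\begin{align*}
\text{Score}_{ij} &\sim \text{Binomial}(7,\; p_{ij}) \\
\operatorname{logit}(p_{ij}) &= \beta_0 + \beta_G\, G_i
  + \beta_{1h}\, T^{1h}_j + \beta_{24h}\, T^{24h}_j
  + \beta_{G \times 1h}\, G_i T^{1h}_j + \beta_{G \times 24h}\, G_i T^{24h}_j + u_i \\
u_i &\sim \mathcal{N}(0,\; \sigma_u^2)
\end{align*}
```

Under this coding, the group coefficient describes the group difference at immediate recall, the two recall coefficients describe change over time in HC, and the two interaction coefficients describe how much more (or less) the MCI-AD group changed from immediate recall to each delay, all on the log-odds scale.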
RESULTS
As revealed by one-tailed independent t-tests, the MCI-AD and the HC groups were matched for age, t(53) = 1.10, p = 0.13, and for level of education, t(53) = –0.28, p = 0.38.
Results and descriptive data are reported in Fig. 2 and Table 3, respectively. Table 4 also reports the odds ratios and the confidence intervals from the output of the generalized linear mixed model.

Mean scores in Healthy Controls (HC, n = 26) and in the MCI-AD (n = 29) group at immediate recall, 1 h, and 24 h with Standard Errors.
Descriptive statistics of memory scores at the three time intervals (Immediate, 1 h and 24 h) for Healthy Controls and MCI-AD, including 95% confidence intervals (CI) around the mean
Output from the generalized linear mixed model with Odds Ratios, 95% confidence intervals, and p-values.
As illustrated in Fig. 2, at immediate recall the HC and MCI-AD groups were not equated in terms of initial level of performance. Also, while the HC group’s performance remained relatively stable across the three time intervals, the MCI-AD group showed a steeper decrease in performance, especially from immediate to 1 h recall.
The between-subjects factor of group significantly predicted performance at immediate recall, as memory scores of the MCI-AD group were significantly lower when compared to the HC, b = –1.00, SE = 0.38, z = –2.63, p < 0.01, d = 0.55. This means that, despite being presented with two repetitions of the story, the MCI-AD group’s encoding was impaired at immediate recall.
Among the healthy controls, memory performance did not significantly decrease after 1 h, b = 0.03, SE = 0.26, z = 0.13, p = 0.89, d = 0.01, or after 24 h, b = –0.30, SE = 0.25, z = –1.18, p = 0.23, d = 0.16.
When considering the interaction between group and recall, memory performance of the MCI-AD group decreased significantly more than healthy controls from immediate to 1 h, b = –0.78, SE = 0.34, z = –2.26, p = 0.02, d = 0.43, and from immediate to 24 h, b = –0.91, SE = 0.34, z = –2.63, p < 0.01, d = 0.50.
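Because the coefficients are on the log-odds scale, it can help to re-express them as odds ratios. The sketch below does this for the rounded values quoted in the text, using a Wald 95% interval. It also applies the common logit-to-d conversion, d = |b|·√3/π. That conversion is an assumption made here: the text does not state how d was obtained, although it reproduces the reported values of 0.55, 0.43, and 0.50. The function and dictionary names are hypothetical, and results computed from rounded inputs may differ slightly from those in Table 4.

```python
import math


def summarise_log_odds(b, se, z_crit=1.96):
    """Re-express a log-odds coefficient as an odds ratio with a Wald CI and an approximate d."""
    odds_ratio = math.exp(b)
    ci_low = math.exp(b - z_crit * se)
    ci_high = math.exp(b + z_crit * se)
    # Assumed conversion from a log odds ratio to Cohen's d (not stated in the source).
    d = abs(b) * math.sqrt(3) / math.pi
    return {
        "odds_ratio": round(odds_ratio, 2),
        "ci": (round(ci_low, 2), round(ci_high, 2)),
        "d": round(d, 2),
    }


# Rounded coefficients and standard errors as quoted in the Results text.
reported = {
    "group at immediate recall": (-1.00, 0.38),
    "group x 1 h": (-0.78, 0.34),
    "group x 24 h": (-0.91, 0.34),
}

for label, (b, se) in reported.items():
    print(label, summarise_log_odds(b, se))
```

An odds ratio below 1 for an interaction term indicates that the odds of recalling an item fell more in the MCI-AD group than in HC over that interval.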
To better explore these interactions, post hoc comparisons between 1 h and 24 h with a Bonferroni correction were carried out as implemented by the package emmeans [36]. Such comparisons revealed that memory performance of the MCI-AD group did not decrease significantly more than HCs from 1 h to 24 h, b = –0.12, SE = 0.34, z = –0.36, p = 1.00, d = 0.06.
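As a rough check on this post hoc result, the group difference in change from 1 h to 24 h can be approximated from the two interaction coefficients reported earlier. This assumes that the contrast was computed on the log-odds scale and that both interaction terms are referenced to immediate recall; the subscripts are notation introduced here.

```latex
\[
\hat\beta_{G \times 24\,\mathrm{h}} - \hat\beta_{G \times 1\,\mathrm{h}}
  = -0.91 - (-0.78) = -0.13
\]
```

This is close to the reported b = –0.12; the small discrepancy is what would be expected from working with coefficients rounded to two decimals.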
Floor performance (i.e., scores of 0) was prevalent in the MCI-AD group, where it accounted for 13.8% of the scores after 1 h and 27.6% of the scores after 24 h (8 participants in total). After excluding floor performance, the MCI-AD group was still impaired at immediate recall, b = –0.81, SE = 0.36, z = –2.23, p = 0.05, d = 0.44. The overall shape of the forgetting slopes did not change; however, the decreased statistical power resulted in differences that fell short of significance.
As the nature of our memory task might have facilitated the performance of the HC group, the rate of ceiling performance (i.e., scores of 7) in HC was also evaluated. Scores at ceiling were 19.2% at immediate recall and after 1 h (5 participants), and 11.5% after 24 h (3 participants).
In the MCI-AD group, 20 participants reported items not presented in the original narrative (i.e., false memories), for a total of 59 instances: 12 at immediate recall, 24 at 1 h, and 23 after 24 h. Eight participants from the HC group also produced false memories, for a total of 15 instances: 7 at 1 h and 8 after 24 h.
DISCUSSION
In this experiment, we aimed at investigating if people with MCI-AD present with different slopes of forgetting as compared to HC when assessed on prose memory via free recall and repeated testing.
We therefore compared the forgetting rates of a group of people with MCI-AD and a group of HCs matched for age and years of education. The MCI-AD group was presented with two repetitions of the story, whereas the HCs were presented with only one repetition. This procedure was devised to minimize floor scores at encoding in the MCI-AD group and to equate as much as possible their performance to the HC. Participants from both groups were then asked to freely recall the story immediately following encoding, and then 1 h and 24 h later.
People with MCI-AD showed poorer encoding at immediate recall and steeper forgetting of gist memory from immediate recall up to 1 h as assessed by free recall and with repeated testing of the material. However, forgetting between groups did not significantly differ from 1 h to 24 h. Therefore, the different rates of forgetting observed in the MCI-AD group after 1 h could be due to a post-encoding deficit at immediate recall.
Encoding deficits in this population have already been documented in previous research [5, 37–39]. Furthermore, Moulin et al. [40] reported both encoding deficits and faster forgetting in people with amnestic MCI and AD despite showing practice effects during learning, yet these two deficits were not correlated with one another, leading the authors to postulate that they originated from two different mechanisms. In another study, Vallet et al. [7] observed that people with amnestic MCI and AD presented with an encoding deficit combined with a greater reliance on gist memory (see also [41]). Other studies showed normal forgetting on recognition memory but steeper forgetting rates on recall memory for lists of words [42–44] and prose passages [45, 46] in people with amnesia or AD. Green and Kopelman [44] observed that, when tested on free recall of words, amnesic patients presented with a primary encoding deficit and also with a deficit in long-term retention.
To account for these findings, it could be argued that, in our paradigm, people with MCI-AD appeared to benefit less than the HCs from repeated testing. Lack of practice effects after repeated testing has been reported with prose recall in this clinical population [47–52]. Lower practice effects on episodic memory in people with MCI are associated with an increased risk of progression to AD [53] and they also predict cognitive decline at one-year follow-up [54]. Indeed, lower practice effects on episodic memory tasks reliably predicted future cognitive decline [55–58] and they were commonly reported in groups who tested positive for specific biomarkers, such as amyloid-β (Aβ) [59–61] and APOE ɛ4 [62]. Moreover, they have been observed within the same testing session [59], after one week [60], or up to more than a year [63, 64]. In support of the hypothesis of a differential benefit of practice effect in MCI-AD, in a recent review of the literature, Jutten et al. [48] observed greater average practice effects among healthy older adults compared to people with MCI [59, 65–67]. Nevertheless, given the absence of a direct comparison with and without repeated retrieval (i.e., unrepeated testing at immediate recall and at 1-day only, see [9]), this interpretation can only be based on a comparison between our data and previous literature.
However, a differential benefit from practice effects is not the only possible explanation of our findings. If people with MCI-AD benefitted less from retrieval practice than HC, we might have observed further differential forgetting between 1 h and 24 h. Between those time intervals there was more retrieval practice, as participants from both groups recalled the story for the second and the third time respectively. Yet, our results showed that, after an initial difference in the forgetting rates up to 1 h, people with MCI-AD did not show faster forgetting from 1 h to 24 h.
Therefore, the difference observed between the two groups could be due to a working memory impairment of people with MCI-AD. Previous studies included an interpolated (or filler) task between story presentation at learning and immediate recall [5, 69]. Given the absence of an interpolated task in our research paradigm, we are unable to control for working memory differences between groups. However, the use of an interpolated task between the stimuli presentation and immediate recall in a telephone-based paradigm would have likely confused our participants.
The current experiment presents some limitations. Firstly, delayed memory scores could have been confounded by floor performance in some people with MCI-AD, especially after 24 h. However, if anything, a floor effect would act against the steepness of the forgetting slope. To prevent floor effects in the MCI-AD group, these participants could have been offered more repetitions at encoding. Nonetheless, this procedure would have biased our data due to the higher number of repetitions offered to this group as compared to HCs. Moreover, a score of zero was chosen as floor. It should be noted that no participants from either group scored 1 at immediate recall.
Another confound might derive from ceiling scores among HC, especially at immediate recall. However, the number of healthy controls scoring at ceiling (i.e., 7 items out of 7) was small. According to Garin [70], floor and ceiling performance is considered problematic when more than 20% of participants score the minimum or the maximum score possible. In our experiment, the rate of HCs scoring at ceiling was never over 20% (equal to 5 participants).
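The floor and ceiling rates quoted in the Results can be reproduced from participant counts and compared with the 20% threshold mentioned above. The following is an illustrative sketch only: the counts of 4 and 8 MCI-AD participants at floor are inferred here from the reported percentages and the group size of 29, and the function names are hypothetical.

```python
N_HC = 26       # healthy controls included in the analysis
N_MCI_AD = 29   # people with MCI-AD included in the analysis
THRESHOLD = 20.0  # percentage above which floor/ceiling is considered problematic


def percentage(count, n):
    """Percentage of a group of size n, rounded to one decimal as in the text."""
    return round(100 * count / n, 1)


def exceeds_threshold(count, n, threshold=THRESHOLD):
    """True when the unrounded rate is strictly above the threshold."""
    return 100 * count / n > threshold


cells = {
    "HC ceiling, immediate": (5, N_HC),
    "HC ceiling, 1 h": (5, N_HC),
    "HC ceiling, 24 h": (3, N_HC),
    "MCI-AD floor, 1 h": (4, N_MCI_AD),   # count inferred from 13.8% of 29
    "MCI-AD floor, 24 h": (8, N_MCI_AD),  # count inferred from 27.6% of 29
}

for label, (count, n) in cells.items():
    flag = "above" if exceeds_threshold(count, n) else "not above"
    print(f"{label}: {percentage(count, n)}% ({flag} {THRESHOLD}%)")
```

On these counts, ceiling rates in HC stay under the threshold at every session, whereas floor rates in the MCI-AD group pass it at 24 h, in line with the first limitation acknowledged above.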
Another caveat derives from the modality of presentation of the study material, as both the encoding and the retrieval sessions could only take place over the phone. Despite this being a validated methodology [9, 71–73], phone-based assessment might have affected the memory performance of our participants, regardless of their cognitive status. This modality of presentation might have placed additional cognitive constraints on attentional control [74].
Lastly, in our experiment the initial level of performance between the two groups at encoding was not equated. Even though the MCI-AD group was offered a second repetition of the story, this may not have been enough for some participants whose performance was just above floor. In contrast, previous studies in favor of [3] or against [9] faster forgetting in early AD managed to equate the initial level of performance (see also [4, 46]).
In conclusion, when people with MCI-AD were tested on a prose recall paradigm requiring repeated testing, their encoding was found to be impaired at immediate recall and they also forgot the prose material at a steeper rate up to 1 h, but not from 1 h to 24 h. We argued that the differences observed between groups could be due to a post-encoding deficit, either caused by a differential benefit from repeated retrieval or by working memory differences at encoding.
Footnotes
ACKNOWLEDGMENTS
We are very thankful to all the participants who took part in this study and their families.
FUNDING
This work was part of a PhD project funded by Fondazione Majid, Ascona, Switzerland.
CONFLICT OF INTEREST
The authors have no conflict of interest to report.
DATA AVAILABILITY
The data supporting the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy.
