Abstract
Activating a previously consolidated memory trace brings it back into a labile state where it must then undergo a re-stabilisation process known as reconsolidation. During this process memories are susceptible to interference and may be updated with new information. In the studies showing this effect in human episodic memory, the reconsolidation process has been triggered primarily using spatial context or prediction error manipulations to reactivate an established memory. However, these studies have produced conflicting results, showing both that spatial context is necessary and sufficient to trigger reconsolidation and that prediction error is necessary and sufficient to trigger the process. We examined this conflict in two experiments, one investigating the role of context cues and another investigating the role of prediction error. In Experiment 1, spatial context triggered a reconsolidation process and prediction error was irrelevant. In Experiment 2, prediction error triggered reconsolidation, and spatial context cues were irrelevant. These findings replicate prior research but add to the puzzle concerning the roles of these two means of triggering reconsolidation.
It is well established that a consolidated memory can be altered as a result of reactivation and destabilisation, or reconsolidated. Evidence of reconsolidation has been consistently found in the animal literature, human procedural memory, human fear memory, and human episodic memory (Dudai, 2009; Hupbach et al., 2007; Sevenster et al., 2014; Walker et al., 2003). Thus, across species and memory systems it has been found that reactivating a previously consolidated memory can bring it into a labile state in which it may undergo change via updating, strengthening, or weakening. Of course, this lability is not induced every time a consolidated memory is activated and several boundary conditions of reconsolidation have been identified. Among these are the age of the memory (Forcato et al., 2009), the strength of the memory (Pedreira et al., 2004), and a limited time window during which the memory is labile (Forcato et al., 2007; Hupbach et al., 2007). The means by which a memory is reactivated is also an important factor in that not all reminders will successfully trigger reconsolidation. The two most popular means of triggering reconsolidation are spatial context and prediction error (a violation of expectations). However, studies using these two methods produce different results. Studies using spatial context show that it is necessary and sufficient to trigger reconsolidation (e.g., Hupbach et al., 2007), whereas studies using prediction error show that it is necessary and sufficient to trigger reconsolidation (e.g., Forcato et al., 2009). Thus, these paradigms contradict each other in showing which type of reminders can effectively induce reconsolidation.
Studies using spatial context or prediction error are similar in that they use a 3-day procedure with an original learning session, a second session with a reminder of the original learning followed by new learning, and a third session in which a test over the material learned in Session 1 (and sometimes Session 2) is given. However, the two types of studies differ on most other details. In studies using spatial context as a reminder to trigger reconsolidation (e.g., Hupbach et al., 2007), participants have typically memorised a set of objects (balloon, sock, etc.) to a criterion. After a 2-day delay, one group (the reminder group) returned to the same spatial context to learn a second series of objects, and another group went to a different context to learn the second series (the no-reminder group). A third group of participants in a control condition did not learn a second list of objects. Finally, all participants returned to the lab 48 hr later and freely recalled List 1 objects. In these studies, memory for List 1 was updated with items from List 2, but only when participants had learned the second list in the same spatial context as the first. Several other studies replicated this effect, showing that spatial context reliably triggers reconsolidation (Capelo et al., 2018; Hupbach et al., 2008, 2009, 2011). Importantly, the effect in this paradigm cannot be attributed to source confusion, it is limited to a specific time window, and to unfamiliar spatial contexts (Hupbach et al., 2008, 2009, 2011). In addition, context cues other than spatial context, such as the experimenter, have failed to trigger reconsolidation (e.g., Hupbach et al., 2008). Overall, these studies provide strong evidence that spatial context can serve as an effective way of reactivating an established memory and trigger a reconsolidation process that updates List 1 memory with List 2 items (see Scully et al., 2017 for a review). 
Also, because there was no (obvious) source of prediction error in these studies, they suggest that prediction error may not be necessary to trigger reconsolidation.
In direct contradiction to the finding that spatial context is an effective trigger of reconsolidation are studies investigating the role of prediction error in reconsolidation, which have shown prediction error to be necessary and spatial context to be irrelevant. Prediction error studies also use the 3-day procedure but differ from spatial context studies in other methodological details. For example, in Forcato et al. (2009), participants memorised a set of five pairs of nonsense syllables (e.g., mar—cle) on Day 1 within a specific context created with lights, images, and sounds. On Day 2, they returned to the same spatial context as Day 1 and were divided into three experimental groups that varied according to the type of reminder that was given, as well as three control groups. Focusing on the experimental groups, all participants expected to take a test over the material learned on Day 1 and the contextual cues associated with Day 1 learning were reinstated. However, in the context-reminder condition, the computer appeared to crash before the test of List 1 began. In the cue-reminder condition, the first trial of the test started with the presentation of a cue, but the computer appeared to crash before the target could be recalled. In the response–reminder condition, the computer crashed after participants recalled one cue–target pair but before they could finish the test. Following these reminders, participants in the experimental conditions (and some control conditions) learned a second set of cue–target pairs, which were associated with a different light/sound/image context than List 1. On Day 3, participants’ memory for Lists 1 and 2 was tested using cued recall. Importantly, spatial context was held constant across all conditions. Thus, from a spatial-context perspective, reconsolidation should have occurred in all the experimental conditions. 
However, the results indicated that only the cue-reminder group, which experienced the crash after seeing the cue but before being allowed to recall the target, showed evidence of reconsolidation. These results suggest that prediction error, of a very specific type, is necessary to trigger reconsolidation.
Like the studies on spatial context, results with prediction error have been replicated multiple times, and are subject to similar boundary conditions. For instance, there is a specific time window during which reconsolidation will occur (Forcato et al., 2007; Sinclair & Barense, 2018), and it can depend on age and strength of the established memory (Fernandez et al., 2016). Thus, there are some reassuring replications of boundary conditions across these paradigms, but mixed results regarding how reconsolidation may be triggered. Specifically, there is a conflict between the two paradigms, because spatial context is irrelevant in studies using prediction error as a reminder, and prediction error is irrelevant in studies using spatial context as a reminder.
The explanations for why each of these reminders work are grounded in well-established memory theories. Spatial context, a key feature of episodic memory, is thought to serve as an effective reminder because it forms a sort of “scaffolding” upon which memories can be built, thereby making an established memory vulnerable to updating by new information that binds to the same scaffolding (Hupbach et al., 2008; see Sederberg et al., 2011 for a similar idea about temporal context). Likewise, the role of prediction error in learning is well established (see Krawczyk et al., 2017 for a review); specifically, learning is thought to take place when there is novel information present, and novelty can be established via prediction error (Rescorla & Wagner, 1972). That is, when the expectations for a given situation are violated, the established memory must be updated. This may trigger reconsolidation or the creation of a new memory depending on the nature of the prediction error (e.g., Forcato et al., 2009; Krawczyk et al., 2017). These two ideas are not necessarily mutually exclusive, but the results stemming from the different studies would make it appear as though they are.
One reason it is difficult to reconcile these results is that there are consistent and significant differences in the methods across the two types of studies that make them difficult to compare. First, the different sets of results stem from different theories that are associated with different methods. Whereas the investigation of the role of spatial context has been guided by multiple trace theory (Nadel & Moscovitch, 1997; for a review see Nadel et al., 2012), the prediction error framework in humans stems from research in animal learning (e.g., Pedreira et al., 2004; for a review, see Sara, 2000). Methodologically speaking, in experiments studying the role of spatial context, objects are learned and recalled, sessions are separated by 48 hr, and typically, only memory for the first list is tested (but see Hupbach et al., 2009). In experiments studying the role of prediction error in humans (e.g., Forcato et al., 2007, 2009), short lists of unrelated syllable pairs are learned, sessions are separated by 24 hr, spatial context is held constant but non-spatial context cues associated with Lists 1 and 2 differ across sessions, both Lists 1 and 2 are tested by cued recall, and the patterns of errors (commission and omission) on Lists 1 and 2, including a lack of a retrieval-induced forgetting effect of List 1 on List 2, are taken as evidence of reconsolidation. Thus, the studies differ not only in the nature of the reminder, but also in the memoranda, the type of test, the length of the delay, non-spatial context cues, the dependent measure, and pattern of results taken as evidence of reconsolidation. One could argue that they are such different paradigms as to be incomparable. However, if each of the types of triggers is robust, their ability to trigger reconsolidation should generalise across labs and across experimental procedures. 
One of the goals of the current study was to level the playing field such that spatial context and prediction error could be investigated by a single lab, using the same materials, delay, memory tests, and dependent measure.
One other aspect of the findings from these two paradigms warrants attention: within each, it is unclear why some forms of reminders work but others do not. Studies have shown that spatial context is special in its ability to trigger reconsolidation in comparison to other context cues (as long as it is an unfamiliar context, Hupbach et al., 2009), though it is still not clear why that is and whether there are any other context cues that may work (Hupbach et al., 2008). Scully and Hupbach (2019) showed that the degree of activation of List 1 was linked to the intrusion rate of List 2 items into List 1, such that only specific reminders were effective in producing memory updating. Reminders that were too weak or too strong did not produce a memory-updating pattern. Likewise, studies of the prediction error account have shown that only a very specific type of prediction error is successful. Although all experimental conditions in the Forcato et al. (2009) study included a mismatch between expectations and reality, they concluded that Day-1 memory is not always made labile by prediction error; rather, the reminder must have a specific structure to destabilise the memory and render it vulnerable to change. However, it is difficult to ascertain in an a priori manner what degree or structure of the error should be effective and what other manipulations of prediction error might work.
As a means of equating methods to test the two types of reminders, we used the same 3-day procedure as that used by Hupbach and colleagues and Forcato and colleagues. In both experiments, participants learned a list of unassociated words on Day 1. On Day 2 (24 hr later), a reminder of Day 1 consisted of either context cues (Experiment 1) or prediction error (Experiment 2), and participants learned a second list of unrelated words. Twenty-four hours later, participants returned to the lab and were tested by source-specific free recall of both lists. To measure the effect of reconsolidation we used a new dependent variable, the asymmetrical intrusion score (henceforth, the AI score). The intrusions measured in the AI score were List 2 responses on the test of List 1, and List 1 responses on the test of List 2 (extra-list intrusions were not analysed, as they were not of interest here). The AI score is the difference between the number of intrusions from List 2 into List 1 and the number of intrusions from List 1 into List 2 ([List 2 → List 1] − [List 1 → List 2]). Therefore, positive values reflect memory updating, values near zero reflect no updating, and negative values would reflect more List 1 intrusions into List 2 than vice versa (i.e., an effect more likely to be due to source confusion or interference, not reconsolidation). This measure improves upon previous dependent variables, because it includes intrusions on both tests and compares asymmetry in both directions. For example, it is possible to have more intrusions from List 2 in the reminder condition than in the no-reminder condition, but still have an AI score that does not differ from zero, indicating that the intrusions were actually symmetrical. Other studies that have investigated the directionality of intrusions have typically found intrusions from one list into another, but have not tested both lists in a single experiment. For example, Hupbach et al. (2007) tested memory of List 2 on Day 3 in their third experiment, but did not test memory of List 1. They found that there were no differences in intrusions between reminder groups when List 2 was tested. In a separate experiment, they did find differences between reminder groups when List 1 was tested. Thus, although they addressed the directionality of intrusions, they did so across experiments. In the current experiments, we tested both Lists 1 and 2 on Day 3. We used our AI measure to test the robustness of the reconsolidation effects found in the spatial context and prediction error studies.
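The AI score defined above can be illustrated with a minimal sketch. The function name, the argument names, and the choice to count each intruding word only once per test are assumptions made for illustration; they are not taken from the authors' scoring procedure.

```python
def ai_score(list1_test_responses, list2_test_responses, list1_words, list2_words):
    """Asymmetrical intrusion (AI) score for one participant.

    AI = (List 2 words produced on the List 1 test)
         - (List 1 words produced on the List 2 test).
    Extra-list intrusions (words from neither list) are ignored.
    Assumption: a repeated response counts as a single intrusion.
    """
    list1_words = set(list1_words)
    list2_words = set(list2_words)
    # List 2 items that intruded into recall of List 1
    l2_into_l1 = len(set(list1_test_responses) & list2_words)
    # List 1 items that intruded into recall of List 2
    l1_into_l2 = len(set(list2_test_responses) & list1_words)
    return l2_into_l1 - l1_into_l2


# Hypothetical word lists and responses, for illustration only
list1 = ["lamp", "rope", "desk"]
list2 = ["coin", "frog", "tent"]
score = ai_score(
    ["lamp", "coin", "frog", "zebra"],  # List 1 test: two List 2 intrusions, one extra-list word
    ["tent", "rope"],                   # List 2 test: one List 1 intrusion
    list1,
    list2,
)
print(score)  # 2 - 1 = 1
```

A positive value, as in this hypothetical participant, corresponds to the memory-updating pattern; equal intrusion counts in both directions yield zero.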
Experiment 1
The purpose of Experiment 1 was to investigate the role of contextual cues as triggers of reconsolidation. To date, the only contextual cues that have been found to successfully trigger reconsolidation have been spatial cues (e.g., Hupbach et al., 2008). However, non-spatial cues have been directly compared with spatial cues only once and those cues, such as a general reminder question, may not have been particularly salient or well-integrated with the original memory (Hupbach et al., 2008). It may be the case that contextual cues other than space can be effective if they are a salient part of the environmental context. To investigate this question, spatial context and a non-spatial context cue (a neon pink laptop sleeve) were manipulated as between-subjects variables with spatial context as same or different and the non-spatial cue as present or not present on Day 2 (see Figure 1). Thus, there were three reminder conditions (same present, same not present, different present) and one condition with no reminder (different not present). The lab served as the spatial context on Day 1; those in the same condition returned to the lab on Day 2 and those in the different space condition went to the university library on Day 2. All participants returned to the lab for testing on Day 3. The non-spatial context cue, the neon pink laptop sleeve, was visible throughout study and test on Day 1. This cue was either present or absent throughout Day 2 procedures. It was not present during testing on Day 3 in any of the conditions. On Day 3, participants took two source-specific free recall tests, one for List 1 and another for List 2, with the order of tests counterbalanced across participants. Findings from prior studies led to the prediction that spatial context should be an effective trigger of reconsolidation, whereas it was more questionable whether the non-spatial context cue would be. 
Notably, the prediction error literature (e.g., Forcato et al., 2009; Krawczyk et al., 2017) would predict that reconsolidation should not occur in any condition because there was no obvious source of error and instructions were explicit about the procedures participants would go through on the following days so as to ward off the possibility of any inadvertently created prediction error. Specifically, participants in the same conditions were told that they would return to the lab on Day 2 to learn a different list of words, would work with the same experimenter, and would return to the lab again on Day 3 to take a test over both sets of words (again with the same experimenter). Participants in the different conditions were told that Day 2 would take place at the university library with the same experimenter (and how to meet the experimenter in the right place), and that they would return to the lab on Day 3 to take a test over both sets of words, with the same experimenter. Thus, any expectations that the participants had were matched across all groups (i.e., they were all told exactly what to expect).

Figure 1. We manipulated spatial context and the presence of a salient cue as triggers for reconsolidation. On Day 1, subjects studied a list of words to criterion in the lab with the cue present. On Day 2, they came back to the lab or went to the library, the cue was either present or absent, and subjects learned another list of words. On Day 3, subjects took a source-specific recall test over the words from Days 1 and 2.
Method
Participants
Power analyses were conducted in G*Power using the effect size reported by Scully et al. (2017) in a meta-analysis of memory-updating effects for a comparison between reminder and no-reminder conditions, Cohen’s d = 1.03. The power analysis indicated that 13 participants were needed per condition. We included more participants in our sample because 13 was not sufficient for our counterbalance, and the power analysis used Cohen’s d for t-tests rather than
Materials
Fifty words from the MRC psycholinguistic database were used. Words were split equally into two lists of 25 unrelated words each. Words were of middle to high frequency (List 1 M = 103; List 2 M = 102) and between four and five letters each (List 1 M = 4.76; List 2 M = 4.88). Lists were counterbalanced such that each list appeared equally often as List 1 and List 2. The order in which participants were asked to recall the lists was also counterbalanced; half were administered the test of List 1 first and half were administered the test of List 2 first. A laptop was used to administer the materials regardless of location. A questionnaire, which included a question about whether the sleeve caught participants’ attention on the days it was present, was given at the end of Day 3. The experiment was programmed with E-Prime 2 (Psychology Software Tools, 2014).
Procedure
Participants came into the lab for three individual sessions that took place on three consecutive days, 24 hr apart, with the same experimenter on all 3 days. Participants were informed that they would learn different lists on each day of the experiment and would recall them on the third day. On Day 1, participants were told they would take a memory test. They studied words from List 1 that were shown one at a time on the screen for 4,000 ms. A bright pink laptop sleeve was always visible throughout the procedure on Day 1. After studying, participants were asked to recall the words from List 1. If fewer than 20 words (80%) were recalled, participants repeated the learning process and were tested again. Once five learning trials took place, or at least 20 words were recalled, the session ended. At the end of the session, participants were told that they would either come back to the lab (same space condition) or would meet the experimenter at the university library (different space condition), where they would learn a different set of words.
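The stopping rule for the study-test trials can be sketched as follows. This is an illustrative reading of the procedure, not the E-Prime script used in the experiment; the function name and the input format (one recall count per completed trial) are assumptions.

```python
def run_acquisition(recall_counts, criterion=20, max_trials=5):
    """Apply the learning-to-criterion rule to a sequence of recall counts.

    recall_counts: number of words recalled on each successive study-test trial
                   (hypothetical input format).
    The session ends once at least `criterion` words are recalled
    or once `max_trials` study-test trials have taken place.
    Returns (trials_completed, criterion_reached).
    """
    trials = 0
    reached = False
    for recalled in recall_counts:
        trials += 1
        if recalled >= criterion:
            reached = True
            break
        if trials >= max_trials:
            break
    return trials, reached


# Hypothetical participant who reaches 20 of 25 words on the third trial
print(run_acquisition([12, 17, 21]))
```

The number of trials returned by such a rule is the acquisition measure analysed in the Results sections.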
On Day 2, participants were split into the four experimental groups. Following Hupbach et al. (2008), we did not include a question about Day 1 as part of the reminder conditions (see also Jones et al., 2015); thus, space and the external cues served as the reminders (or lack of reminders). Spatial context was manipulated by administering List 2 either in the same lab setting or in a study area of the university library. The bright pink laptop sleeve served as the non-spatial context cue and was either present and visible throughout the procedure or was absent. Thus, there were four between-subjects conditions crossing spatial context with the presence or absence of the non-spatial cue. Other than the presence or absence of the non-spatial context cue and the new items studied, List 2 instructions, study, and test procedures were the same as those used on Day 1. At the end of the session, all participants were informed that they would return to the lab on Day 3 for a memory test. Importantly, participants were informed of what they would be doing in all conditions, and those expectations were not violated; thus, prediction error was not manipulated and all precautions were taken to avoid unplanned sources of prediction error.
On Day 3, all participants returned to the lab and were given source-specific recall tests; they were asked to recall the words from both lists, one list at a time. Participants were asked to recall all 25 words; to keep performance away from floor, they were instructed to guess, if necessary, once they had recalled all the words they could, so as to reach 25 responses. Participants were instructed to recall words specifically from List 1 or List 2, on the appropriate test, and were told that each word they recalled could be used only once. Specifically, they were instructed that if they recalled a word for Test 1 they could not use that word again for Test 2. Finally, a questionnaire was administered to assess whether any specific memory strategies were used, and to determine whether the laptop sleeve was salient and memorable. In all, 59 participants indicated that they noticed the laptop sleeve. Of the 13 people who reported not noticing the sleeve, 5 were in the same space/cue absent condition; 4 were in the different space/cue absent condition; 2 were in the same space/cue present condition, and 1 was in the different space/cue present condition.
Results and discussion
Acquisition performance on Days 1 and 2
On Days 1 and 2, participants were required to go through study-test trials until they could recall 20 out of the 25 words. On Day 1, participants took an average of 3.06 learning trials to reach criterion (SD = 1.00). On Day 2, participants took an average of 3.11 learning trials to reach criterion (SD = 1.16). Acquisition performance on Days 1 and 2 was analysed with a paired-samples t-test and showed no significant difference, t(71) = −.415, p = .680, d = .05. Experimental groups were established on Day 2, and thus a two-way ANOVA, with spatial context and non-spatial cue as between-subjects factors, was conducted to examine potential group differences in acquisition on Day 2. No significant differences were found, spatial context and non-spatial cue Fs < 1.5, interaction F(1, 68) = 2.68, p = .106,
Free recall on Day 3
A three-way mixed ANOVA, with spatial context and non-spatial cue as between-subjects factors and list as a within-subjects factor, was conducted to compare accurate free recall on Day 3 across lists and groups. There was a significant difference between lists, with higher accuracy on List 2 (M = 13.31; SD = 4.30) than List 1 (M = 10.50; SD = 4.65), F(1, 68) = 18.24, p < .001,
Asymmetrical intrusion scores
The AI scores were analysed using a two-way between-subjects ANOVA (see Figure 2). There was a main effect of spatial context, such that AI scores were higher when participants returned to the same space on Day 2 than if they went to a different space on Day 2, F(1, 68) = 9.33, p = .003,

Figure 2. Effects of spatial context and an external reminder on asymmetrical intrusion scores. Error bars reflect standard error of the mean.
List-2 intrusions
The AI score that we present here is novel, and it is worth examining the more typical way of analysing the memory-updating pattern. The typical analysis has compared List-2 intrusions across the reminder and no-reminder conditions (e.g., Hupbach et al., 2007), with the assumption that asymmetrical intrusions would be evidenced by more List-2 intrusions in the reminder than in the no-reminder condition. For comparison with the AI analysis, we also examined List-2 intrusions on the test of List 1 specifically. To do so, we conducted a 2 (space) × 2 (external cue) between-subjects ANOVA with the number of List-2 intrusions on the List 1 test as the dependent variable (see Table 1). Similar to the analysis of AI scores, there was a main effect of space, F
Table 1. Means (and standard deviations) of List-2 intrusions in Experiment 1.
In sum, the presence of a salient non-spatial cue did not affect the AI score (or the List-2 intrusion rate) but spatial context did, indicating that spatial context can trigger a reconsolidation process though other salient cues may not. These results replicate and extend previous findings showing spatial context to be an effective means of triggering reconsolidation and to be superior to other, non-spatial, context cues. Moreover, because both lists were tested on Day 3, the results confirm that the effect is not simply due to source confusion, but in fact demonstrate that List 1 items are more likely to be updated with List 2 items than List 2 items are to be interfered with by List 1 items. Notably, these results also suggest that prediction error is not a necessary component of an effective reminder because it was not included in this experiment. Thus, the argument that a mismatch between expectations and reality is necessary is too strong; it is not always necessary to produce a reconsolidation pattern. Whether spatial context is a necessary ingredient in triggering reconsolidation when prediction error is manipulated was one of the subjects of the second experiment.
Experiment 2
In studies examining prediction error manipulations, spatial context has proven to be irrelevant to reconsolidation effects, in that it is held constant across prediction error conditions. It has also been shown that prediction error must be created in very specific ways to trigger reconsolidation (e.g., Forcato et al., 2009). In Experiment 2, we investigated whether a more obvious manipulation of prediction error, not related to the memory test itself, would trigger reconsolidation and whether spatial context would be effective within an experiment that varied prediction error rather than spatial context. Thus, prediction error was created in a novel way, by creating explicit expectations of what participants would do on Day 2. As illustrated in Figure 3, a salient procedure, playing a 2-min game of Nerf basketball, was used to create a memorable activity on Day 1. At the end of Day 1, participants were told that they would come back to the lab to play basketball again followed by learning new words, or that they would do something different (which was left unspecified; the different task was watching YouTube videos). This manipulation was crossed with what subjects actually did on Day 2; that is, they went through the same procedures or different procedures than they expected. This created two conditions in which expectations were violated, thereby creating prediction error, and two conditions in which expectations were confirmed, and thus no prediction error. Based on prior research, we predicted that reconsolidation would be found when participants’ expectations were violated. However, because the procedure was significantly different than that used in previous research, and because it was a large and obvious manipulation, it was possible that it would simply trigger the formation of new memories rather than a reconsolidation process.
Whether reconsolidation would be found when spatial context was the same on Days 1 and 2, but there was no prediction error, was an open question. Prior research within the prediction error paradigm would suggest that space should not trigger reconsolidation, whereas studies directly examining spatial context, including Experiment 1, suggest that it should.

Figure 3. We manipulated conscious expectations by having subjects complete a novel task (Nerf basketball) before learning the list on Day 1, later telling subjects that they would perform the same task or a different unspecified task on Day 2. The task on Day 2 either matched or violated subjects’ expectations (e.g., a match would be when they were told they would do a different task and they did a different task on Day 2). Before learning the second word list, half of each expectation group completed the basketball task and the other half rated YouTube videos on their entertainment value. On Day 3, subjects took a source-specific recall test over the words from Days 1 and 2.
Method
Participants
A total of 115 healthy adults (51 males, 64 females, age M = 19.46) were recruited from the University of Nevada, Las Vegas subject pool to participate in Experiment 2. As noted for Experiment 1, pilot studies indicated that some participants copied words from the first test onto the second test on Day 3, and thus we planned to exclude and replace participants who had more than five copies, as well as those who did not reach criterion for learning on Day 1 or 2 (see procedure). Nine participants were excluded on the basis of copies made from the first to the second test on Day 3, and two were excluded for failing to recall any words on tests of List 1 or List 2 on Day 3. Thus, the final sample included 104 participants.
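The copy-based exclusion rule can be sketched as follows. How a "copy" was operationalised is not specified in the text; this sketch assumes a copy is any word that appears on both Day 3 tests, and the function names are hypothetical.

```python
def count_copies(first_test, second_test):
    """Number of distinct words produced on both Day 3 tests.

    Assumption: a 'copy' is any response on the second test that was
    also given on the first test.
    """
    return len(set(first_test) & set(second_test))


def should_exclude(first_test, second_test, max_copies=5):
    """Exclude a participant who has more than `max_copies` copies."""
    return count_copies(first_test, second_test) > max_copies


# Hypothetical responses, for illustration only
first = ["lamp", "rope", "desk", "coin", "frog", "tent", "wall"]
second = ["lamp", "rope", "desk", "coin", "frog", "tent", "bird"]
print(should_exclude(first, second))  # six shared words exceed the limit of five
```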
Materials
The materials and counterbalancing were the same as those used in Experiment 1. To create a salient procedure as a means to create distinct expectations, we used a Nerf basketball set consisting of an orange foam ball and a plastic net hung over a door frame, along with a list of six YouTube videos consisting of cats and other small animals, each between 15 and 30 s long. A questionnaire on memory strategies was given at the end of Day 3.
Procedure
Participants came into the lab for three sessions that took place over 3 consecutive days; all three sessions took place in the same spatial context and with the same experimenter. Sessions were held 24 hr apart, and all participants were tested individually.
On Day 1, participants arrived in the lab and were informed that the study investigated different types of learning as a means of providing a cover story for the unique procedures (see Figure 3). They completed a Nerf basketball task in which they attempted to score as many baskets as possible within a 2-min period, standing approximately 10 ft away. Scores were recorded. After the basketball task, the memory test procedures were the same as those used in Experiment 1: participants learned a list of words to a criterion of 80% (or for a total of five study-test trials if 80% was not achieved). To manipulate expectations, at the end of the session on Day 1 participants in the expect-same group were told that they would be repeating the Nerf task when they returned on Day 2. Those in the expect-different group were told that they would be rating the entertainment value of YouTube videos instead of completing the Nerf basketball task on Day 2.
On Day 2, participants either completed the task they were told they would do (no-prediction error groups) or they completed a different task than they were told they would do (the prediction error groups). Specifically, those participants who were told they would play basketball on Day 2 and did so were in the no-prediction error group (n = 28). Likewise, those told that they would rate YouTube videos and did so were also in the no-prediction error group (n = 26). Those told that they would play basketball on Day 2 but instead rated YouTube videos were in the prediction error group (n = 25), and those told they would rate videos but instead played basketball were also in the prediction error group (n = 25).
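The mapping from the 2 (expectation) × 2 (reality) design onto the two prediction error conditions can be made explicit with a short sketch. The group sizes are those reported above; the function and variable names are illustrative assumptions and are not part of the original materials.

```python
from collections import Counter

# Group sizes as reported in the text, keyed by (expected task, actual Day 2 task).
GROUP_SIZES = {
    ("basketball", "basketball"): 28,
    ("video", "video"): 26,
    ("basketball", "video"): 25,
    ("video", "basketball"): 25,
}


def has_prediction_error(expected: str, actual: str) -> bool:
    """Prediction error is present when the Day 2 task differs from the announced task."""
    return expected != actual


def collapse_by_prediction_error(group_sizes: dict) -> dict:
    """Collapse the four design cells into prediction error vs. no prediction error."""
    totals = Counter()
    for (expected, actual), n in group_sizes.items():
        if has_prediction_error(expected, actual):
            totals["prediction error"] += n
        else:
            totals["no prediction error"] += n
    return dict(totals)


collapsed = collapse_by_prediction_error(GROUP_SIZES)
total_n = sum(GROUP_SIZES.values())
print(collapsed)
print(total_n)
```

Under this coding, the two cells in which the announced and actual tasks match form the no-prediction error condition, the two mismatching cells form the prediction error condition, and the cell sizes sum to the final sample reported in the Participants section.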
Participants who completed the video task rated the entertainment value of a series of YouTube videos. Ratings were recorded on a scale from 1 to 5, 5 being the most entertaining. The video task took approximately 2 min to complete, similar to the basketball task. Following the video or basketball task, the study and test procedures of List 2 proceeded in the same manner as on Day 1.
On Day 3, participants recalled words from both lists, one list at a time. Participants were told that each word was to be used only once and could not be submitted for both lists. Participants were asked to recall all 25 words and to guess as needed to produce 25 words. The order in which participants were asked to recall the lists was counterbalanced across participants. After the recall phase, a questionnaire about each participant’s experience in the study was given as a manipulation check to make sure expectations were violated if the condition called for it. For example, participants noted whether the researcher had been mistaken about the procedure they had completed on Day 2.
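As an illustration of how Day 3 recall protocols of this kind might be scored, the sketch below counts correct recalls and cross-list intrusions for one participant. It assumes that the asymmetrical intrusion (AI) score is the number of List-2 intrusions on the List 1 test minus the number of List-1 intrusions on the List 2 test; that definition, the use of raw counts rather than proportions, the function name, and the toy word lists are assumptions made for illustration, not the authors' scoring script.

```python
def score_recall(recalled_as_list1, recalled_as_list2, list1, list2):
    """Score one participant's Day 3 recall (hypothetical scoring sketch).

    recalled_as_list1 / recalled_as_list2: words the participant wrote on each test.
    list1 / list2: the studied word lists.
    """
    list1, list2 = set(list1), set(list2)
    # Sets are used so that a word written twice on one test is counted once.
    r1, r2 = set(recalled_as_list1), set(recalled_as_list2)

    list2_on_list1 = len(r1 & list2)  # List-2 words intruding on the List 1 test
    list1_on_list2 = len(r2 & list1)  # List-1 words intruding on the List 2 test

    return {
        "list1_correct": len(r1 & list1),
        "list2_correct": len(r2 & list2),
        "list2_intrusions_on_list1": list2_on_list1,
        "list1_intrusions_on_list2": list1_on_list2,
        # Assumed definition: positive values mean more List-2 words were
        # attributed to List 1 than the reverse.
        "ai_score": list2_on_list1 - list1_on_list2,
    }


# Toy example with three-word lists (the actual lists contained 25 words each).
list1 = ["apple", "river", "chair"]
list2 = ["cloud", "stone", "piano"]
scores = score_recall(
    recalled_as_list1=["apple", "cloud", "stone", "tiger"],
    recalled_as_list2=["piano", "river"],
    list1=list1,
    list2=list2,
)
print(scores)
```

In the toy example, two List-2 words appear on the List 1 test and one List-1 word appears on the List 2 test, so the assumed AI score is 1. Words from neither list (here, "tiger") count as neither correct recalls nor cross-list intrusions.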
Results and discussion
Acquisition performance on Days 1 and 2
On Days 1 and 2, participants were required to go through study-test trials until they could recall 20 out of the 25 words. The number of trials that participants took to reach that criterion was compared on Days 1 (M = 2.85, SD = 0.83) and 2 (M = 2.63, SD = 0.87). Participants required more trials to reach criterion on Day 1 than on Day 2, t(103) = 2.27, p = .026, d = .22, a potential practice effect. Experimental groups were established after acquisition on Day 1, and therefore another analysis comparing acquisition across experimental groups was conducted for Day 2. A two-way between-subjects ANOVA with reality and expectation as factors revealed no significant effects (expectation F(1, 100) = 2.10, p = .150,
Free recall on Day 3
A three-way mixed ANOVA, with expectation and reality as between-subjects factors and list as a within-subjects factor, was conducted to compare accurate free recall performance across lists and between the groups. There was a significant interaction between list and reality, F(1, 100) = 9.25, p = .003,
Asymmetrical intrusion scores
Means for the AI scores are presented in Figure 4. The left graph in Figure 4 illustrates the four conditions, and the right graph illustrates the results as grouped by those who experienced prediction error and those who did not, regardless of their specific experiences. The AI scores were analysed using a two-way ANOVA with expectations (basketball or video) and reality (basketball or video) as between-subjects factors. Neither main effect was significant, Fs < 1, but the interaction between the factors was, F(1, 100) = 13.63, p < .001,

Figure 4. Left panel: effects of expectations and reality on asymmetrical intrusion scores. Right panel: the expectations and reality conditions from the left panel are collapsed into those conditions with and without prediction error. Error bars reflect the standard error of the mean.
List-2 intrusions
For comparison with the AI analysis, we also examined List-2 intrusions on the test of List 1. To do so, we conducted a 2 (expectation) × 2 (reality) between-subjects ANOVA with the number of List-2 intrusions on the List 1 test (see Table 2). Similar to the analysis of AI scores, there was an interaction, F(1, 100) = 5.08, p = .026,
Table 2. Means (and standard deviations) for List-2 intrusions in Experiment 2 as a function of what participants expected and what actually happened.
In sum, Experiment 2 shows that violating participants’ expectations produced higher AI scores (and List-2 intrusions) than conditions without prediction error. These results are consistent with several studies showing that prediction error is necessary to trigger reconsolidation (e.g., Forcato et al., 2007, 2009). However, Forcato et al. (2009) found that only a very specific sequence of events could create prediction error. In contrast, we found that prediction error can be created in a novel and obvious way. These results also indicate that spatial context was irrelevant; reconsolidation was found only when prediction error was present regardless of the fact that spatial context was the same on Days 1, 2, and 3 across all conditions.
General discussion
The experiments in the present study have verified both spatial context and prediction error as sufficient and effective, though not necessary, triggers of memory updating. Using novel methods and measures, Experiment 1 demonstrated that a spatial context cue increased AI scores, a reflection of updating, whereas a non-spatial context cue did not. As no prediction error was used in Experiment 1, prediction error cannot be said to be a necessary ingredient for memory updating to occur. One may argue that there was some non-obvious source of prediction error, but if so, it is not clear what conditions it would have been present in. Therefore, it is unlikely that prediction error could explain the patterns found in Experiment 1. Experiment 2 used the same novel materials and measures as Experiment 1 to test the role of prediction error when it was created in an obvious fashion. Results showed increased AI scores in conditions with prediction error and scores near zero in conditions with no prediction error. This suggests that large and obvious manipulations of prediction error can trigger reconsolidation, in contrast to prior findings (e.g., Forcato et al., 2009). Space was held constant across prediction error conditions, and therefore a spatial context account predicts that reconsolidation should have appeared in all conditions. It clearly did not. Thus, spatial context is not always sufficient to trigger reconsolidation. Overall, these results replicate and generalise prior findings showing that both spatial context and prediction error can trigger reconsolidation. However, they also extend the conflicting findings, suggesting that neither is necessary or sufficient when the other variable is being directly investigated.
Experiment 1 reinforced prior findings that spatial context is special in terms of its ability to trigger reconsolidation, but why is it special? Space is a critical feature of episodic memories, perhaps more so than other environmental cues, and may be more salient, relevant, or better integrated with the original memory. We suspect that integration is particularly important. For example, different types of stimuli may be integrated with the context to different degrees. Three-dimensional objects are likely more integrated in a spatial context than are words, and thus might show more robust effects. It may still be the case that other contextual cues could be effective insofar as they are well integrated into the original memory. For instance, if the experimenter read word stimuli, the experimenter could serve as a well-integrated context cue and spatial context could be less important (but see Hupbach et al., 2008 for a case where the experimenter did not trigger reconsolidation).
It is also possible that it is the temporal rather than (or in addition to) the spatial context that is critical to the original memory and the effectiveness of a reminder. Sederberg et al. (2011) suggested that the temporal context model could explain the patterns of results found in the spatial context studies, without requiring the idea of reconsolidation. Specifically, they argued that a cue of Session 1, such as the spatial context, would activate the temporal context of List 1 and that List 2 items could bind to that temporal context, which would result in the asymmetrical pattern of intrusions found in the current and previous studies. Thus, it is worth noting that the distinction between a spatial context and a temporal context may be difficult to make. From a temporal context perspective, it would be expected that an environmental cue, such as a noticeable laptop sleeve, would activate the temporal context of Day 1 and trigger reconsolidation, independently of spatial context. However, there was no evidence of that in the current study. If temporal context was at play, it would seem that it was only activated by the spatial context, rendering the two indistinguishable. Thus, as Sederberg et al. noted, while the temporal context model offers an explanation of how List 1 is updated with List 2 items, it does not necessarily account for what types of cues will activate a previous temporal context, other than space. Overall, context cues, in general, will not necessarily trigger reconsolidation; spatial context appears to hold a unique quality that renders it a more effective cue than others.
One may question whether the experiments examining spatial context are free from prediction error. Could it be the case that prediction error is the only trigger of reconsolidation and that it has been present in the spatial context experiments all along? To ward off the possibility of any mismatches of expectations in Experiment 1, we informed participants of what they would be doing on all 3 days; specifically, that they would return to the same lab, or would meet the experimenter at the library, and that they would learn a second list of words, followed on the third day by tests over those lists. Moreover, if prediction error were the trigger in the spatial context experiments, one would expect that it would be more likely when participants were tested in a different space on Day 2. In fact, what has been consistently found, including in the current Experiment 1, is that reconsolidation appears only when participants are in the same space on Days 1 and 2. Thus, it is unlikely that there is a hidden source of prediction error lurking in the spatial context experiments given that the conditions most likely to contain it do not show the asymmetrical intrusion pattern that reflects memory updating.
Nonetheless, the idea that spatial context can trigger the reconsolidation process is still somewhat controversial. For example, some research has failed to replicate Hupbach et al. (2007; Klingmüller et al., 2017), and other animal research has demonstrated that the reconsolidation pattern could be attributed to state effects rather than a unique neural process (Gisquet-Verrier et al., 2015). Given that there are well-known context effects on memory (Smith & Vela, 2001), a general context account could also potentially explain the reconsolidation effect in the spatial context paradigm, making reconsolidation an unnecessary concept. For example, it is possible that manipulations of Day 3 context or state will demonstrate that the findings are better explained by known context and state effects than by a unique reconsolidation process in episodic memory (but see Capelo et al., 2018).
Experiment 2 bolstered prior findings that prediction error triggers reconsolidation and extended them by showing that the error can come in different forms, some quite obvious. The results of this type of blatant manipulation are in some conflict with previous findings that the error must be neither too small nor too large to destabilise a memory and trigger reconsolidation (e.g., Forcato et al., 2009; Scully & Hupbach, 2019). Moreover, most prediction error manipulations occur within the memory task itself, rather than as part of a different task. However, we see this conflict as positive in that it demonstrates that prediction error is not a fragile manipulation and in fact can be quite blunt. As long as expectations are violated, new learning has the potential to update old learning (Krawczyk et al., 2017; but also see Ortiz-Tudela et al., 2018). However, it is unclear why an obvious violation of expectations worked here while only very subtle and specific violations have been effective in other studies (e.g., Forcato et al., 2009). Moreover, given that the prediction error in this study was not directly relevant to the memory test itself, it may be questionable why it had an effect at all. One possibility is that the procedures that take place in the lab are perceived as a holistic event by the participants, and therefore a significant surprise, relevant to the memory test or not, would trigger the need to update the event as a whole, which would affect all the procedures, including updating of List 1 with List 2 items. Thus, while the prediction error generated here may not seem directly relevant to the memory test from a researcher’s perspective, it likely was relevant to the experimental experience as a whole from the participant’s perspective.
It is therefore possible that the prediction error need not always be part of the specific learning task, but may “leak” from closely related events; however, further research will be needed to determine if this is in fact possible in other conditions. Overall, the mix of findings to date still makes it difficult to predict a priori what degree or type of prediction error will trigger the reconsolidation process.
Although each experiment in the current study bolsters the findings within the spatial context and prediction error literatures, they also show that these variables can be completely irrelevant in some settings. There was no obvious source of prediction error in Experiment 1, yet reconsolidation was evident, as a result of a spatial context manipulation. Likewise, spatial context was held constant across conditions in Experiment 2 and thus reconsolidation should have been evident in all conditions, but it was found only in conditions with prediction error, indicating that spatial context will not always trigger reconsolidation. Thus, the puzzle remains: why are spatial context and prediction error irrelevant to triggering reconsolidation unless they are specifically under investigation? It is clear that there are multiple means of triggering reconsolidation, but it is not clear why these triggers are inconsistent across all the experiments conducted over the past several years.
One limitation of the current study is that the prediction error manipulation may have overpowered potential effects of spatial context in Experiment 2. It is possible that having any expectation (correct or not) might render the effects of spatial context null. However, in Experiment 1, expectations were created to ensure that all participants had the same expectation in all conditions (i.e., they were told what they would be doing). If any expectation were to overpower the effects of space, we should not have seen significant effects in that experiment. Nonetheless, it would be useful to compare a group with no obvious predictions with those with predictions (correct or incorrect) in another study examining the roles of prediction error and spatial context, to ascertain this definitively. In addition, this is the first study that we know of that has manipulated prediction error with explicit instructions, and thus it warrants replication to ensure that this is an effective means of creating prediction error.
Although there is no definitive resolution of the conflicting results from spatial context and prediction error experiments, it is clear that there are benefits of using similar materials and measures to test the two types of reminders. The current results showed that the effects generalise to new memoranda (single words), and that effects of both spatial context and prediction error can be found on the same type of free recall test using a within-subjects difference score (AI scores). Therefore, we can rule out the possibility that the conflicting results over the past several years can be attributed to methodological differences. Putting the two types of triggers on an equal footing shows that both are effective in some situations but not others. Thus, although a puzzle remains, the data show that there are multiple means of triggering reconsolidation, and future research must determine the whens and whys of each.
Acknowledgements
The authors thank Andrew Cardenas, Stephen Costello, JoHannah Kalito, and Linda McClarnan for their assistance with data collection, and Laura Werner for comments on initial drafts.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
