Abstract
Decades of memory research demonstrate the importance of temporal organization in recall dynamics, using laboratory stimuli (i.e., word lists) at seconds- to minutes-long delays. Little is known, however, about such organization in recall of richer and more remote real-world experiences, in which the focus is usually on memory content without reference to event order. Here, 119 younger and older adults freely recalled extended real-world experiences, for which the encoding sequence was controlled, after 2 days or 1 week. We paired analytical tools from the list-learning and autobiographical memory literatures to measure spontaneous contextual dynamics and details in these recall narratives. Recall dynamics were organized by temporal context (contiguity and forward asymmetry), and organization was reduced in older age, despite similar serial position effects and recall initiation across age groups. Across participants, organization was positively associated with richness of episodic detail, providing evidence for a link between reexperiencing past events and reinstating their spatiotemporal context.
Time is what keeps everything from happening at once.
We recall past experiences in the form of extended narratives, dwelling on certain details and then jumping in time and space to others. These memories are considered episodic to the degree that they are (a) populated by event-specific details (e.g., perceptions, thoughts, and feelings) and (b) organized by their spatiotemporal relations (Tulving, 1972). Episodic memory is thus defined by both its content and its underlying organization.
Detail recovery and spatiotemporal organization both depend on the hippocampus (Palombo, Di Lascio, Howard, & Verfaellie, 2018; Sheldon et al., 2018), which exhibits structural decline in older age (Raz et al., 2005). Yet they have been explored in largely separate literatures using incompatible methods. Consequently, little is known about how the spatiotemporal structure of real-world experiences transforms into structure in memory, how this structure shapes the details we retrieve, and how these two dimensions jointly change as episodic memory declines with age.
Free-recall dynamics, reflected in spontaneous recall transitions, reveal underlying structure in memory (Polyn, Norman, & Kahana, 2009). In word-list-learning studies, items encoded nearby in time tend to be recalled successively (the temporal contiguity effect) with a forward-going bias (Kahana, 1996); these are considered universal organizational principles of episodic memory (Healey & Kahana, 2014). Retrieved-context models hold that during encoding, items become associated to a gradually changing context representation such that items closer in time share more contextual overlap (e.g., Howard & Kahana, 2002; Polyn et al., 2009). Recalling an item triggers a “jump back in time” to an earlier state of context, cuing items from proximal positions. Computational modeling of recall behavior suggests that aging, and to a greater extent hippocampal amnesia, impairs temporal context reinstatement but does not affect measures that ignore the trajectory of recall through time (i.e., serial position effects and recall initiation; Healey & Kahana, 2016; Howard, Kahana, & Wingfield, 2006; Palombo et al., 2018).
Standard laboratory recall studies operationalize events as chains of discrete stimuli measured in binary fashion (i.e., recalled or not). Real-world episodes, on the other hand, are rich and continuous and are recalled as extemporaneous narrative descriptions varying in detail and specificity (Heusser, Fitzpatrick, & Manning, 2020; Koriat & Goldsmith, 1996) and at delay intervals far exceeding those typical of laboratory experiments. Accordingly, the generalizability of contiguity effects to real-life events has been questioned (Hintzman, 2016). Although there is evidence for macroscale temporal organization across discrete events spanning individuals’ autobiographical timelines (e.g., Friedman, 2004; Linton, 1986; Moreton & Ward, 2010; Nielson, Smith, Sreekumar, Dennis, & Sederberg, 2015; Uitvlugt & Healey, 2019), relatively little is known about microscale temporal organization in recall of extended real-world episodes, the basic units of episodic and autobiographical memory (Anderson & Conway, 1993).
In the autobiographical memory literature, as in word-list learning, free recall (i.e., extemporaneous and self-governed generation of details) is a dominant retrieval task. The Autobiographical Interview (AI) scoring method, which has been used in more than 200 studies (https://levinelab.weebly.com/ai-testing.html), decomposes recall narratives into their constituent details and classifies them as either internal (episodic, event specific) or external (e.g., semantic, metacognitive, or unrelated; Levine, Svoboda, Hay, Winocur, & Moscovitch, 2002). Here, older adults retrieve fewer internal details and more external details, a pattern that is exaggerated with hippocampal dysfunction, reflecting reduced episodic reexperiencing (Levine et al., 2002; Sheldon et al., 2018). But because autobiographical events are typically sampled retrospectively and are therefore heterogeneous and unverifiable, their recall dynamics typically cannot be mapped to the encoded event dynamics. One cannot, therefore, differentiate between two memories with similar detail counts but different contextual organization (but see, e.g., Dede, Frascino, Wixted, & Squire, 2016; St-Laurent, Moscovitch, Tau, & McAndrews, 2011).
Thus, the relationship between temporal context reinstatement and episodic-detail generation remains unknown, despite their theoretical and presumed neurophysiological connection (Howard & Eichenbaum, 2013; Tulving, 1972). If they share underlying mechanisms, spontaneous temporal organization should be positively associated with the quantity or density of episodic details in recall narratives. To our knowledge, there is no direct evidence for this prediction (but for related evidence, see Anderson & Conway, 1993; Sadeh, Moran, & Goshen-Gottstein, 2014). On the other hand, the content and organization of episodic memories need not be related. For example, vivid or detailed memories can be fragmented and temporally disorganized, as has been observed in posttraumatic stress disorder and hippocampal amnesia (Brewin, 2014; Dede et al., 2016).
Statement of Relevance
With respect to H. G. Wells, people do not need a machine to travel in time. We do it whenever we remember a past event. An interesting feature of our mental time travel is that we reinstate not only what happened but also the order in which it occurred. Although this phenomenon has been investigated using word-list recall, for example, it is unclear whether it extends to complex real-world events. Here, we created immersive audio-guided walking tours featuring a sequence of artworks and other items. Days later, younger and older adults recalled the tours. Our participants spontaneously recovered the temporal structure of the original event—“jumping” between nearby items with a forward-going bias—though older individuals had more temporally disorganized memory. Moreover, the more recall organization followed the temporal structure of the original event, the more vivid and detailed the recollection. These findings suggest that mental reconstruction of the order of events lends vividness to our travels in time.
Note: A companion article by Diamond, Armson, and Levine (2020) appears online at https://doi.org/10.1177/0956797620954812 and on pages 1544–1556 of this issue. The two articles overlap in several ways, including research participants, but the theoretical issues explored in the two articles are sufficiently distinct to warrant their publication as separate works.
We paired methods from the list-learning and autobiographical memory literatures to measure temporal organization and detail in single extended recall narratives. In Study 1, we assessed free recall of an immersive yet controlled real-world walking tour (Baycrest Tour 1.0) in younger and older adults after a 2-day delay. Control over the encoding sequence allowed us to bring to bear analytical tools for measuring recall dynamics from the list-learning literature on naturalistic recall. We reasoned that if patterns of temporal organization in word-list-recall dynamics reflect core mechanisms underlying episodic memory retrieval, then similar patterns should be observed in remote naturalistic event descriptions. In addition to the established effect of aging on the balance of internal (episodic) and external (nonepisodic) details, we predicted that older adults would also show reduced temporal organization. In Study 2, we replicated our findings on recall dynamics in a larger independent younger adult sample recalling a different tour event (Baycrest Tour 2.0) at a 1-week delay. Having derived hallmark measures of episodic detail and temporal context reinstatement within memories, we tested the hypothesis that they would be positively related.
Study 1
Method
Participants
There were 22 participants in both the younger group (age: M = 23.81 years, SD = 3.92; education: M = 15.64 years, SD = 1.09) and the older group (age: M = 69.00 years, SD = 3.07; education: M = 16.56 years, SD = 3.73). Two older participants were excluded for poor neuropsychological test performance, and one was excluded for revisiting the tour area between encoding and recall sessions. An additional older participant was excluded from analyses because of insufficient recall quantity for measuring recall dynamics (see below), leaving 18 older participants in the final analyzed sample. All participants were recruited via the Rotman Research Institute Participant Database at Baycrest Health Sciences and from advertisements in the Toronto community. Participants were screened for history of neurological or psychiatric illness, active significant medical illness, and substance abuse. Participants were fluent English speakers, had normal or corrected-to-normal vision and hearing, were not color-blind, and gave informed consent in accordance with institutional guidelines. We excluded participants who had been to Baycrest within the past month or more than five times in total. Data from Baycrest Tour 1.0 participants are included in two other reports (Diamond, Abdi, & Levine, 2020; Diamond, Armson, & Levine, 2020). The relationship of the present data to those reports is described in the Supplemental Material available online.
Materials and procedure
The study consisted of two phases: a controlled real-world encoding phase and a retrieval phase, during which participants freely recalled their tour experience. Prior to encoding, participants completed a neuropsychological assessment battery, including the Rey Auditory Verbal Learning Test (Schmidt, 1996), Brief Visuospatial Memory Test (Benedict, Schretlen, Groninger, Dobraski, & Shpritz, 1996), face-name associative memory (Troyer et al., 2012), Symbol Digit Modalities Test (Smith, 1978), Trail Making Test (Spreen & Strauss, 1998), verbal fluency (Spreen & Strauss, 1998), and the Shipley Vocabulary Test (Shipley, 1946). Expected age effects were observed on tests of memory and executive functioning (see Table S1 in the Supplemental Material).
Encoding phase
Participants underwent an audio-guided walking tour of artwork and assorted items on the first floor of Baycrest Hospital (Baycrest Tour 1.0; Fig. 1a). The route formed a generally unidirectional loop. Thus, the physical structure of the tour route, along with the audio guide, controlled the sequence structure of the experience. Participants were instructed to approach and examine different target items (e.g., paintings, portraits, and exhibits) and occasionally identify specific items or features (e.g., locate a particular individual in a frame of portraits, locate a particular item in the gift shop). In the middle of the tour, participants had a scripted interaction with a research confederate, during which the confederate asked a series of questions. Baycrest Tour 1.0 took an average of 23.0 min (SD = 3.0) for younger adults and 27.15 min (SD = 4.32) for older adults, likely reflecting differences in walking speed, as the audio guide controlled item-viewing time.

Fig. 1. Schematics of the two tour events and example recall narrative for Baycrest Tour 1.0. In the schematics (a, b), red circles indicate locations of the target items, and black dashed lines depict the tour routes. In Baycrest Tour 1.0 (Study 1; a), there were 27 target items with defined and universal ordinal positions. Participants were instructed, for example, to find the chef cookie jar in the gift shop (eighth item; right photograph), examine a large curved painting called “Let There Be Light” (15th item; left photograph), and stand between the trees in the atrium (23rd item; top photograph). Baycrest Tour 2.0 (Study 2; b) was split into two sections (13 target items in Section 1 and 18 in Section 2); only Section 1 is depicted here. Participants were instructed, for example, to examine a wall sculpture called “Head With Armstrong” (second item), stand in the red Number 8 in the shuffleboard game (fifth item), and examine a wall-mounted wood sculpture (eighth item). An example recall narrative (c) is shown for Baycrest Tour 1.0 (younger participant). The first field in each code identifies the detail as internal or external. The second field is the detail type (“event,” “perceptual,” “place,” “thought,” or “semantic”). In this segment, the participant recalled nine internal details (orange text) and one external detail (blue text). The participant’s recall vector is [4, 6, 7, 8, 9]. Note that the participant skipped the fifth item in the tour.
The audio guide is publicly available on OSF (https://osf.io/yhm7n/). It was recorded and edited by N. B. Diamond using Audacity (Version 2.02; https://www.audacityteam.org/download/). It was narrated by four different speakers (two female and two male), each of whom narrated two segments of the tour. The order of speakers was different for the first four and the last four segments of the tour, so that speaker order was decoupled from the sequence structure of the tour. The guide was broken down into multiple tracks, each associated with a target item. Each track was initiated by the participant by pressing a button on the MP3 player. After arrival at a target item, the guide instructed participants to examine the item, followed by a silent period in the recording, and then directed participants to the next item. This silent period controlled viewing time. For some items, the guide provided information (e.g., about the artist) or cued participants’ attention to certain features. Participants were given extensive instructions before the tour began, and they were given an opportunity to practice using the MP3 player to control the audio guide. Specifically, participants were informed that their memory of the tour would be tested and that they should pay close attention to what they see and hear. The experimenter unobtrusively observed participants to verify that they followed the audio guide’s instructions. Participants exited (after the encoding session) and reentered (for the retrieval session) the building through specific doors to minimize reexposure to tour content.
Retrieval phase
Participants returned to the lab after 2 days. They first completed a true/false memory test consisting of 40 true and 40 false statements about details from the tour (e.g., “The piano was black”); false statements were created by altering features of tour items. They were then asked to freely recall their experience of the tour (instruction: “Tell me everything you can remember about the tour”). The true/false memory test was conducted for the purposes of another experiment (Diamond, Abdi, & Levine, 2020); in Study 2, we report recall data from participants who performed free recall for a different tour with no intervening true/false test. Participants in Study 1 were also asked to recall a time-matched personal event, with the order of recall (tour vs. personal) counterbalanced across participants. Data from the time-matched personal events are not included in this article. Participants received the standard AI administration (Levine et al., 2002), including a general probe (“Is there anything else you can tell me about this event?”) following free recall for participants with limited output during free recall, as well as a subsequent specific probe for particular kinds of details. Given that our present interest is in spontaneous memory organization, only data from free recall and general probe are analyzed here. For participants who received the general probe, the transition from the last item in free recall to the first item in general probe was masked from all analyses of recall dynamics (see below). Participants’ recall sessions were audio recorded and transcribed.
Analysis of details
To measure memory detail, we used the standard AI scoring method (Levine et al., 2002). Transcribed recall narratives were broken down into discrete informational units/clauses and categorized as either internal or external. Internal details are episodic and specific in space and time to the event being described. External details are not specific to the event being described; they may be semantic details (describing general features of oneself or the world that are not specific in space and time), repetitions, metacognitive statements (e.g., “I’m not sure”), or episodic details about different events. Internal details are further categorized as “event,” “place,” “time,” “perceptual,” or “emotion/thought.” Memories were scored by research assistants, all of whom were trained on the method and achieved intraclass correlation coefficients (ICCs) of .90 or higher for the internal- and external-detail composite scores. Each scorer completed 20 transcripts from a reliability set, and ICCs were computed with reference to trained scorers’ data. We measured the density of episodic details in recall narratives as the ratio of internal details to total details (i.e., the internal-detail proportion), which measured the proportion of all details in a memory that refer to specific episodic information, unbiased by group differences in event content (Study 1 vs. Study 2) and individual differences in verbosity. This measure reliably differentiates age groups as well as memory-impaired groups from healthy comparison participants (Levine et al., 2002; Sheldon et al., 2018).
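As an illustration only, the internal-detail proportion can be sketched in a few lines of Python. This is not the scoring pipeline used in the study: the tuple encoding of scored details and the function name `internal_detail_proportion` are assumptions made for the example, and the input mirrors the nine internal details and one external detail of the example narrative segment described in the Figure 1 caption.

```python
from collections import Counter


def internal_detail_proportion(detail_codes):
    """Return internal details / (internal + external details) for one narrative.

    detail_codes: list of (category, detail_type) tuples, where category is
    "internal" or "external". This encoding is hypothetical, not the AI format.
    """
    counts = Counter(category for category, _ in detail_codes)
    total = counts["internal"] + counts["external"]
    if total == 0:
        raise ValueError("narrative contains no scored details")
    return counts["internal"] / total


# Nine internal details and one external detail (hypothetical detail types).
example = [("internal", "event")] * 9 + [("external", "semantic")]
proportion = internal_detail_proportion(example)
print(proportion)
```

Because the score is a ratio, a terse narrative and a verbose narrative with the same balance of episodic and nonepisodic details receive the same value.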
Analysis of temporal organization
We augmented the AI scoring method to investigate temporal organization. In recall of word lists or other standard laboratory stimuli, the items are discrete and occupy clearly defined serial positions (i.e., first, second, third, . . .). In recall of naturalistic episodes, particularly real-life experiences, perceptual experience is continuous rather than discrete, and the stimuli include any attended feature of experience. We defined items as the elements in each tour to which the audio guide explicitly cued participants’ attention, ensuring that they were encoded by all participants in the first place and in a specified order. These order-tagged items have clearly defined and homogeneous ordinal positions. There were 27 such items in Baycrest Tour 1.0. When participants mentioned any of these items, we coded the item’s ordinal position in the tour. Because a given naturalistic event can be described in myriad ways, mentioning was defined as any reference to an item that would be unambiguous to a listener who was familiar with the tour. Thus, for each memory, we derived a vector of order tags, or ordinal positions, representing the items that person recalled and the order in which they were recalled. We note that this method can in principle be applied to any naturalistic memory for which the encoded sequence is known, making the data amenable to analytic techniques developed for word-list-learning experiments.
To quantify temporal organization, we submitted the recall vectors to analysis of lag-conditional response probability (CRP; Kahana, 1996), which is the canonical behavioral measure of temporal context reinstatement in recall. Lag CRP measures the probability of transitioning from a recalled item i to the next-recalled item i + 1 as a function of their distance (lag) and order (positive = forward, negative = backward) at encoding. For instance, a lag of +1 represents a forward transition from a given item to the item that occurred next in the tour (e.g., from the fifth encoded item to the sixth encoded item), whereas a lag +2 represents a forward transition that skipped an intervening item (e.g., from the fifth encoded item to the seventh encoded item). Negative lags indicate recall transitions made in the backward direction, opposite the encoded order.
The CRP is calculated, for each lag, as the number of transitions made divided by the number of transitions that were available. Repetitions are excluded and thus unavailable. The longest possible lag is N − 1, where N is the total number of order-tagged items, that is, transitioning from the first item to the last, or −(N − 1) if transitioning from the last item to the first. For each participant, values of zero for a given lag indicate that the participant had an opportunity to make that transition but did not. Transition lags that were never possible, given the preceding recall train and the total number of items, were excluded. For example, if a participant recalls every encoded item in perfect order, beginning at the first item, there was never an opportunity to make a transition in the backward direction, and all negative lags would thus be unavailable and omitted for that participant. The analysis therefore considers at every transition which items have or have not already been recalled. This accounts for individual differences in the overall number of items recalled as well as differences in the number of items in each event (i.e., the tours in Study 1 vs. Study 2).
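The counting rule described above can be sketched in Python as follows. This is a minimal illustration, not the Behavioral Toolbox implementation used for the reported analyses. The function name `lag_crp` is hypothetical, and the handling of repetitions (skipping the repeated item and not counting the transition out of it) is an assumption of this sketch. Masking of the transition from free recall to general probe is not shown.

```python
from collections import defaultdict


def lag_crp(recall_vector, n_items):
    """Return {lag: transitions made / transitions available} for one narrative.

    recall_vector: ordinal positions (1..n_items) in the order they were recalled.
    Lags that were never available are omitted rather than scored as zero.
    """
    made = defaultdict(int)
    available = defaultdict(int)
    recalled = set()
    previous = None
    for item in recall_vector:
        if item in recalled:
            # Assumption: a repetition is skipped and also breaks the chain,
            # so the transition out of it is not counted either.
            previous = None
            continue
        if previous is not None:
            # Every not-yet-recalled item defines one available lag.
            for candidate in range(1, n_items + 1):
                if candidate not in recalled:
                    available[candidate - previous] += 1
            made[item - previous] += 1
        recalled.add(item)
        previous = item
    return {lag: made[lag] / available[lag] for lag in sorted(available)}


# Recall vector from the example narrative; Baycrest Tour 1.0 had 27 items.
crp = lag_crp([4, 6, 7, 8, 9], n_items=27)
print(crp[1], crp[2], crp[-1])
```

For this vector, lag +1 was available at all four transitions and taken three times (.75), lag +2 was available four times and taken once (.25), and lag −1 was available twice and never taken (0). A participant who recalls every item in perfect forward order from the first item receives no negative-lag entries at all, matching the exclusion rule described above.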
We also derived overall measures of temporal clustering (tendency to successively recall nearby items) and forward asymmetry (tendency to make forward recall transitions) for each participant. We measured temporal clustering using the temporal factor (Polyn et al., 2009), which quantifies, for each transition, the proportion of possible transition distances (in absolute lag) that are greater than the observed transition distance. Averaging over all transitions in a memory, it outputs a single proportion score representing the tendency to successively recall items that were nearer in time (and space, in our case). It is blind to transition direction (i.e., forward or backward). A score of 1 indicates that the participant always made the shortest transition available, and a score of .5 indicates chance-level temporal clustering. Forward asymmetry was defined as the proportion of all transitions that moved forward in time with respect to the encoded order, excluding repetitions. It is blind to transition distance.
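Both summary scores can be sketched in Python under stated assumptions. The functions below are illustrative and are not the toolbox code used in the study. In `temporal_factor`, ties in absolute lag are given half credit so that random transitions average .5; this tie rule, the decision to skip transitions with only one remaining candidate, and the handling of repetitions (skipped, with the following transition also dropped) are assumptions of this sketch.

```python
def _clean_transitions(recall_vector):
    """Yield (previous, item, recalled_so_far) for each non-repeated transition."""
    recalled = set()
    previous = None
    for item in recall_vector:
        if item in recalled:
            previous = None  # assumption: a repetition breaks the chain
            continue
        if previous is not None:
            yield previous, item, frozenset(recalled)
        recalled.add(item)
        previous = item


def temporal_factor(recall_vector, n_items):
    """Mean percentile rank of the observed absolute lag among available lags."""
    scores = []
    for previous, item, recalled in _clean_transitions(recall_vector):
        observed = abs(item - previous)
        distances = [abs(c - previous) for c in range(1, n_items + 1)
                     if c not in recalled]
        if len(distances) < 2:
            continue  # only one candidate left, so no ranking is possible
        greater = sum(d > observed for d in distances)
        ties = sum(d == observed for d in distances) - 1  # exclude the chosen item
        scores.append((greater + 0.5 * ties) / (len(distances) - 1))
    return sum(scores) / len(scores) if scores else None


def forward_asymmetry(recall_vector):
    """Proportion of non-repeated transitions that move forward in encoded order."""
    steps = [item > previous for previous, item, _ in _clean_transitions(recall_vector)]
    return sum(steps) / len(steps) if steps else None


vector = [4, 6, 7, 8, 9]  # example recall vector, 27-item tour
print(temporal_factor(vector, 27), forward_asymmetry(vector))
```

The two measures are complementary by construction: `temporal_factor` uses only absolute lag, and `forward_asymmetry` uses only the sign of the lag.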
To derive stable measures of recall organization at the level of individual participants, we decided a priori to exclude recall trials with fewer than five unique order-tagged items (i.e., four transitions). This resulted in the exclusion of one older adult, leaving 18 older adults in the final analyzed sample. Lag CRPs and temporal clustering were calculated using publicly available MATLAB (The MathWorks, Natick, MA) scripts from the Behavioral Toolbox (Version 1.01) from the Computational Memory Lab (http://memory.psych.upenn.edu/Behavioral_toolbox). For post hoc group comparisons, we used Welch’s t tests to account for unequal variances. We report 95% confidence intervals (CIs) around effect sizes (Cohen’s d and Pearson’s r).
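For readers unfamiliar with Welch's test, the following Python sketch shows how the statistic and its fractional degrees of freedom (the source of values such as 22.03 in the reported tests) are obtained. The data are made up for illustration, and the function name `welch_t` is hypothetical. Obtaining a p value additionally requires the t distribution, which is available, for example, through `scipy.stats.ttest_ind` with `equal_var=False`.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t test on two samples."""
    va = variance(a) / len(a)  # squared standard error of the mean, group a
    vb = variance(b) / len(b)  # squared standard error of the mean, group b
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up scores, not data from the study.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10, 12]
t, df = welch_t(group_a, group_b)
print(round(t, 3), round(df, 2))
```

The degrees of freedom fall between the smaller group's n − 1 and the pooled n1 + n2 − 2, and shrink toward the former as the group variances diverge.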
Results
Table 1 presents an overview of summary statistics and age effects on measures of detail and temporal organization.
Table 1. Measures of Detail and Temporal Organization in Study 1
Note: Values in parentheses are standard deviations. Values in brackets are 95% confidence intervals.
p < .05 (significant age effect).
Details
Analyses of internal- and external-detail counts (Table 1) were performed on log-transformed data to correct for significant positive skewness. One younger participant recalled no external details, so we artificially added 0.5 external details to each participant’s data before log transformation. We replicated the interaction between age and detail type (internal vs. external) on detail counts, F(1, 38) = 17.29, p < .001, previously established in autobiographical memory studies with self-selected events (e.g., Levine et al., 2002): Older adults reported fewer internal details and more external details than younger adults (Table 1; data also reported by Diamond, Armson, & Levine, 2020). As expected, younger adults had considerably higher internal-detail proportions than older adults, t(22.03) = 3.97, p < .001, d = 1.35, 95% CI = [0.64, 2.07], replicating the established negative effect of age on richness of episodic detail (Levine et al., 2002).
Recall dynamics
As described above, CRP as a function of lag (lag CRP) measures the probability of transitioning from an item i to the next-recalled item i + 1 as a function of their distance (lag) and order at encoding. Negative lags indicate recall transitions moving backward in time, opposite the encoded order (see “Analysis of Temporal Organization”). Figure 2 presents the average lag-CRP curves for younger and older groups. Both groups exhibit the two canonical features of the lag-CRP curve indicating retrieval of temporal context: (a) the contiguity effect, whereby the transition probability peaks for neighboring items and declines as a function of distance in both directions, and (b) forward asymmetry, whereby forward transitions are more common than backward transitions.

Fig. 2. Conditional response probability as a function of lag, separately for older and younger participants (Study 1). Dots represent group means, and error bars and shaded areas represent bootstrap-derived standard errors (1,000 resamples).
Age differences in the lag-CRP curves are visually apparent, particularly at lag +1, t(28.70) = 2.82, p = .009, d = 0.93, 95% CI = [0.26, 1.61] (age differences were not significant at other lags; ps > .09 uncorrected). To test the effect of age on the shape of the lag CRP, we ran participant-wise linear regressions predicting CRP from lag. We did this separately for positive (1 to 5) and negative (−1 to −5) lags (Sadeh et al., 2014). This measure produced a coefficient for each participant, in each direction, representing the steepness of their lag-CRP curve (the change in recall probability as a function of increasing lag). Participants with no above-zero CRP values in a given direction were excluded from analysis of that direction. In the positive direction, one older participant was excluded; in the negative direction, three younger and four older participants were excluded. We tested group-level differences in coefficients with t tests. For positive lags, younger adults (M = −0.113, SD = 0.04) had steeper curves than older adults (M = −0.061, SD = 0.07), t(25.31) = 2.62, p = .015, d = 0.90, 95% CI = [0.21, 1.58]. There was not a statistically significant age difference in the negative direction (younger: M = −0.040, SD = 0.03; older: M = −0.028, SD = 0.03), t(28.55) = 1.19, p = .245, d = 0.42, 95% CI = [−0.31, 1.14]. We note that a similar approach was used in previous studies analyzing differences in lag-CRP curves, although they fitted power functions to the curves rather than linear models (Sadeh et al., 2014). We opted for a more parsimonious approach, given the novelty of our encoding conditions and encoding-recall delay.
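The participant-wise slope can be sketched in Python as an ordinary least-squares fit. This is an illustration under assumptions, not the analysis code used in the study. The function name `crp_slope` and the example values are hypothetical. The sketch regresses CRP on absolute lag, which is inferred from the negative coefficients reported in both directions rather than stated explicitly in the text.

```python
def crp_slope(crp, lags):
    """Least-squares slope of CRP on absolute lag for one participant.

    crp: {lag: probability}; lags: the lags to include, e.g. range(1, 6).
    Returns None when fewer than two lags are available or when no CRP value
    is above zero (the exclusion rule described in the text).
    """
    points = [(abs(lag), crp[lag]) for lag in lags if lag in crp]
    if len(points) < 2 or all(y == 0 for _, y in points):
        return None
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        return None  # all lags share one absolute value; slope is undefined
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# Made-up lag-CRP values for one participant, not data from the study.
example_crp = {1: 0.5, 2: 0.3, 3: 0.2, 4: 0.1, 5: 0.0, -1: 0.0, -2: 0.0}
positive_slope = crp_slope(example_crp, range(1, 6))
negative_slope = crp_slope(example_crp, range(-5, 0))
print(positive_slope, negative_slope)
```

A more negative slope indicates a steeper decline in transition probability with distance; `negative_slope` is `None` here because this hypothetical participant made no backward transitions.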
We next derived overall temporal-clustering and forward-asymmetry scores for each participant. We measured temporal clustering using the temporal factor (Polyn et al., 2009), which quantifies the tendency to successively recall items that were encoded nearby (in absolute lag; see “Analysis of Temporal Organization”). Consistent with the peaked lag-CRP curves, results showed that both groups exhibited greater-than-chance temporal clustering (ps < .001), but there was a group difference in the degree of temporal clustering (Fig. 3a). Younger adults exhibited greater temporal clustering than older adults, t(29.38) = 2.75, p = .010, d = 0.91, 95% CI = [0.23, 1.58], indicating a large negative effect of age on temporal context reinstatement.

Fig. 3. Temporal clustering and forward asymmetry, separately for younger and older participants (Study 1). For temporal clustering (a), a score of .5 (dashed line) indicates chance-level temporal clustering, and a score of 1.0 indicates that every observed recall transition was the shortest available one. Forward asymmetry (b) was measured as the proportion of all transitions that moved forward in time with respect to the encoded order. A score of .5 (dashed line) indicates that recall transitions were made in backward and forward directions with equal probability, and a score of 1.0 indicates that only forward transitions were made. Black circles with white fill represent group means; colored dots represent individual participant means and are slightly vertically jittered to reduce overlap. Error bars represent standard errors. The shaded regions are smoothed distributions.
Forward asymmetry was calculated as the proportion of all transitions that moved forward in time with respect to the encoded order. Younger and older adults both exhibited greater-than-chance forward asymmetry (ps < .001), but groups differed here, too (Fig. 3b). Younger adults exhibited significantly higher forward asymmetry than older adults, t(31.81) = 2.59, p = .014, d = 0.84, 95% CI = [0.17, 1.51], indicating that age was associated with a large reduction in the tendency to recall in chronological order.
We supplemented the analysis of forward asymmetry, which measures spontaneous recovery of the encoded temporal order in recall transitions, with analysis of explicit sequence errors, which were defined as any explicit reference to temporal order (using clauses such as “and then,” “after,” and “before”) that was verifiably incorrect. These were rare (younger: M = 0.73, SD = 1.12, maximum = 4, 0 errors in 64% of participants; older: M = 0.94, SD = 1.51, maximum = 4, 0 errors in 61% of participants). In contrast to the deleterious effect of age on spontaneous forward asymmetry, results showed that there was not a significant difference between age groups in explicit sequence errors (Mann-Whitney U test: U = 191, p = .838, d = 0.17, 95% CI = [−0.48, 0.81]). To account for the fact that younger adults recalled more order-tagged items overall, and thus had more opportunities to make explicit sequence errors, we repeated this analysis, dividing each participant’s number of sequence errors by the number of transitions that he or she made (younger: M = 0.04, SD = 0.06; older: M = 0.06, SD = 0.08). There was not a significant group difference on this measure either (U = 187, p = .745, d = 0.19, 95% CI = [−0.46, 0.83]).
Serial position effects
We have focused on group differences in the contextual dynamics of recall, as evidenced by the lag-CRP curves, temporal clustering, and forward asymmetry. The observed age effects, however, could be influenced by features of recall aside from its dynamics (for instance, which items are recalled overall and from which ordinal position one initiates recall). Figure 4a shows serial position curves for each age group. One item (Serial Position 13) was removed because it was recalled by only two participants overall (.05), which is more than 2 standard deviations below the mean across groups (M = .59, SD = .22). Consistent with established findings, results showed that younger and older adults exhibit visually similar primacy and recency effects (Healey & Kahana, 2016; Howard et al., 2006), although older adults recalled fewer items overall (younger: M = 18.00, SD = 3.82; older: M = 13.61, SD = 6.39), t(26.54) = 2.56, p = .016, d = 0.86, 95% CI = [0.18, 1.53]. We note that the shape of these serial position curves should be interpreted with caution because, unlike laboratory stimuli, the items in the tour were not matched in their perceptual features or in their visual or environmental saliency, nor were they equidistant from each other. However, given that younger and older adults were exposed to the same event, the similarity in their serial position curves is noteworthy.

Fig. 4. Serial position and probability-of-first-recall curves for younger and older participants (Study 1). The proportion of participants who recalled items encoded at each serial position (a) and the proportion of participants in each group who began recall at each serial position (b) are shown as a function of serial position and age group. The dashed lines in (a) represent the quadratic fit for each group. Dots in (b) are slightly horizontally offset for each group to reduce overlap.
Figure 4b shows the proportion of participants in each group who began recall at each serial position (probability of first recall). Consistent with established findings, results showed that younger and older adults’ probability-of-first-recall curves were nearly identical (Healey & Kahana, 2016; Howard et al., 2006). Nearly all participants exhibited a strong primacy effect, initiating recall at the first or second tour item.
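The two curves described above can be illustrated with a minimal sketch. It assumes that each participant's recall is stored as a list of 1-based serial positions in output order; the function names and the example data are hypothetical and are not taken from the study's analysis code.

```python
from collections import Counter

def serial_position_curve(recalls, n_items):
    """Proportion of participants who recalled each serial position."""
    counts = Counter()
    for seq in recalls:
        counts.update(set(seq))  # count each item at most once per participant
    return [counts[pos] / len(recalls) for pos in range(1, n_items + 1)]

def probability_of_first_recall(recalls, n_items):
    """Proportion of participants who began recall at each serial position."""
    firsts = Counter(seq[0] for seq in recalls if seq)
    n_starts = sum(firsts.values())
    return [firsts[pos] / n_starts for pos in range(1, n_items + 1)]

# Hypothetical recall orders for three participants on a five-item event.
recalls = [[1, 2, 4], [2, 3, 5, 4], [1, 3]]
spc = serial_position_curve(recalls, 5)
pfr = probability_of_first_recall(recalls, 5)
```

In this toy example, two of the three participants begin recall at the first item, so the probability-of-first-recall curve peaks at Serial Position 1.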
Interim summary
As predicted, age effects were observed for richness of episodic detail and measures of spontaneous temporal organization (contiguity and forward asymmetry). Although caution is warranted because of the small sample sizes, the pattern of observed age effects—shallower and less asymmetric lag-CRP functions in older relative to younger adults despite similar serial position and recall-initiation curves—mirrors that observed in previous word-list-recall studies (Healey & Kahana, 2016; Howard et al., 2006).
Furthermore, we acknowledge that the true/false memory test may have contaminated subsequent free recall. In Study 2, we analyzed data from a larger younger adult sample who recalled a different controlled real-world event with no intervening testing. Leveraging this larger sample to investigate individual differences, we tested the hypothesis that richness of episodic detail would correlate with temporal-contextual organization.
Study 2
Method
Participants
The Study 2 sample consisted of 90 younger participants (age: M = 24.90 years, SD = 4.51; education: M = 16.46 years, SD = 2.87; 65 female). Eleven of these participants were excluded for insufficient recall quantity according to the threshold described in Study 1 (minimum five order-tagged items recalled). Participants were screened for the same exclusion criteria as in Study 1, except that participants with any prior exposure to the tour area were excluded (rather than only those exposed within the past month or more than five times in total, as in Study 1).
Materials and procedure
Encoding phase
Study 2 participants underwent a different audio-guided tour, this one on the second floor of Baycrest Hospital (Baycrest Tour 2.0; Fig. 1b). Like the Baycrest Tour 1.0 audio guide, the Baycrest Tour 2.0 audio guide is publicly available on OSF (https://osf.io/j25y4/). There was no confederate interaction, and the guide narrator was one man, but it was otherwise similar to Baycrest Tour 1.0. For the purposes of a different experiment, Baycrest Tour 2.0 was split into two sections. Participants completed one section of the tour, were taken to a testing room to complete a battery of tests and questionnaires for approximately 45 min, and then completed the other section. The order of the sections was counterbalanced across participants. There were 13 order-tagged items in Section 1 and 18 order-tagged items in Section 2. On average, Section 1 took 9.12 min (SD = 1.27), and Section 2 took 10.17 min (SD = 1.95).
Retrieval phase
Participants performed free recall (they did not undergo general probe or specific probe) after a 1-week delay. They completed neither an intervening true/false test nor a time-matched personal event recall as in Study 1. For the purposes of a separate experiment, these participants underwent eye tracking during recall, using a head-mounted EyeLink system (SR Research, Ottawa, Ontario, Canada). They recalled each section of the tour separately; the order of recall (Section 1 vs. Section 2) was counterbalanced across participants. During the first-recalled section, participants could move their eyes freely (free viewing). During the second-recalled section, participants were instructed to restrict their viewing patterns (fixed viewing). Eye movements did occur in the fixed-viewing condition, albeit within a restricted range. As a result, viewing condition had little effect on recall measures (Armson, Diamond, Levesque, Ryan, & Levine, 2021). Specifically, there was no detectable effect of viewing condition on measures of memory detail and temporal organization (see the Supplemental Material), so we included data from both recalls in the present study.
Analysis
We used the same measures of detail and temporal organization from Study 1. As in Study 1, we excluded recall trials with fewer than five unique order-tagged items (i.e., four recall transitions). Here, this resulted in the exclusion of both recall trials for 11 participants and one recall trial for 17 participants. To clarify, after exclusions, there were 79 participants in Study 2: 62 participants with two recall trials (analyzed separately and then averaged within participants) and 17 participants with one recall trial. Participants and trials excluded for this reason were also excluded from the analysis of details.
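The exclusion and averaging rule described above can be sketched as follows. The threshold of five unique order-tagged items comes from the text; the data layout, field names, and values are hypothetical and serve only to illustrate the three cases (two usable trials, one usable trial, none).

```python
from statistics import mean

MIN_UNIQUE_ITEMS = 5  # threshold stated in the text (i.e., four transitions)

def usable_trials(trials):
    """Keep recall trials with at least five unique order-tagged items."""
    return [t for t in trials if len(set(t["items"])) >= MIN_UNIQUE_ITEMS]

def participant_score(trials, key):
    """Average a per-trial measure over a participant's usable trials.

    Returns None when no trial survives, i.e., the participant is excluded.
    """
    kept = usable_trials(trials)
    return mean(t[key] for t in kept) if kept else None

# Hypothetical data: field names and values are illustrative only.
participants = {
    "p01": [{"items": [1, 2, 3, 5, 6], "clustering": 0.80},
            {"items": [14, 15, 17, 18, 20, 21], "clustering": 0.90}],
    "p02": [{"items": [1, 2, 2, 3], "clustering": 0.70},
            {"items": [14, 16, 17, 19, 22], "clustering": 0.75}],
    "p03": [{"items": [1, 4], "clustering": 1.00},
            {"items": [15, 16, 18], "clustering": 0.60}],
}
scores = {pid: participant_score(trials, "clustering")
          for pid, trials in participants.items()}
```

Here the first participant contributes the average of two trials, the second contributes a single trial, and the third is dropped entirely.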
To investigate the relationship between recall organization and episodic detail across participants in Studies 1 and 2, we ran mixed-effects models and follow-up bivariate correlations predicting detail counts (internal and external) from temporal clustering and forward asymmetry (mean centered) while controlling for group. The total analyzed sample size of 119 exceeds Cohen’s (1992) recommendation of 85 participants to reliably detect a medium-size population correlation at a power of .8 and an alpha of .05.
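As a rough check on the sample-size benchmark, the required N for a correlation can be approximated with the Fisher z transformation. This is a sketch under stated assumptions: it takes Cohen's conventional medium correlation of r = .30 and a two-tailed test, and the approximation is a common textbook formula rather than necessarily the procedure behind Cohen's published tables. The function name is hypothetical.

```python
from math import atanh, ceil
from statistics import NormalDist

def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate N needed to detect a population correlation r (two-tailed)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for target power
    # Fisher z has standard error 1 / sqrt(N - 3), hence the "+ 3".
    return ceil(((z_alpha + z_power) / atanh(r)) ** 2 + 3)

n_medium = n_for_correlation(0.30)  # 85
```

Under these assumptions the approximation yields 85, in line with the benchmark cited above.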
Results
Detail
On average, Study 2 participants recalled 40.28 (SD = 19.63) internal details and 8.44 (SD = 7.07) external details (averaged over tour section within subjects, as described above). Their average internal-detail proportion was .84 (SD = .10).
Temporal organization
The lag-CRP curve (Fig. 5a) exhibited the canonical peaks at lags ±1 with graded drop-offs as a function of lag, indicating that participants transitioned between items according to their spatiotemporal proximity and with a forward bias. In spite of the differences in events and testing methods, the lag-CRP curve obtained in this experiment was similar to that obtained for younger (but not older) participants in Study 1. As in Study 1, temporal clustering (M = .82, SD = .09; Fig. 5b) and forward asymmetry (M = .85, SD = .11; Fig. 5c) were greater than chance (ps < .001), and explicit sequence errors were rare (raw count: M = 0.32, SD = 0.71, maximum = 3, 0 errors in 85% of participants; as a proportion of transitions: M = 0.02, SD = 0.05). Post hoc comparisons between the two tour events (restricted to younger participants from Study 1) indicated that forward asymmetry was not significantly different across the two tours, t(32.11) = 0.35, p = .728, d = 0.09, 95% CI = [−0.39, 0.57], nor was temporal clustering, t(29.24) = 1.97, p = .059, d = 0.53, 95% CI = [0.05, 1.02], although Study 2 participants had numerically lower clustering, possibly reflecting a longer retention interval or other cross-experiment differences.

Fig. 5. Study 2 recall dynamics. Conditional response probability (a) is shown as a function of lag. Error bars and shaded areas represent bootstrap-derived standard errors (1,000 resamples). For temporal clustering (b), a score of .5 (dashed line) indicates chance-level temporal clustering, and a score of 1.0 indicates that every observed recall transition was the shortest available one. Forward asymmetry (c) was measured as the proportion of all transitions that moved forward in time with respect to the encoded order. A score of .5 (dashed line) indicates that recall transitions were made in backward and forward directions with equal probability, and a score of 1.0 indicates that only forward transitions were made. Black circles with white fill represent group means; colored dots represent individual participant means and are slightly vertically jittered to reduce overlap. Error bars (standard errors) are not visible. The shaded regions are smoothed distributions.
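The three measures described in this caption can be sketched in a few lines of Python. The code is an illustrative reconstruction, not the authors' analysis code. It assumes that each recall is a list of 1-based serial positions with repeats removed, pools transitions across sequences for the lag-CRP (rather than averaging per-participant curves), and gives half credit to ties when ranking a transition for the temporal clustering score. All function names are hypothetical.

```python
from collections import Counter

def transitions(recall, n_items):
    """Yield (actual_lag, available_lags) for each recall transition."""
    recalled = set()
    for current, nxt in zip(recall, recall[1:]):
        recalled.add(current)
        # Lags to every item that has not yet been recalled.
        available = [pos - current for pos in range(1, n_items + 1)
                     if pos not in recalled]
        yield nxt - current, available

def lag_crp(recalls, n_items):
    """Conditional response probability per lag, pooled over sequences."""
    actual, possible = Counter(), Counter()
    for recall in recalls:
        for lag, available in transitions(recall, n_items):
            actual[lag] += 1
            possible.update(available)
    return {lag: actual[lag] / possible[lag] for lag in sorted(possible)}

def temporal_clustering(recall, n_items):
    """Mean percentile rank of each transition's |lag| among available |lag|s."""
    ranks = []
    for lag, available in transitions(recall, n_items):
        dists = [abs(a) for a in available]
        if len(dists) < 2:
            continue  # no alternative transition to compare against
        farther = sum(d > abs(lag) for d in dists)
        tied = sum(d == abs(lag) for d in dists) - 1  # exclude the actual one
        ranks.append((farther + 0.5 * tied) / (len(dists) - 1))
    return sum(ranks) / len(ranks)

def forward_asymmetry(recall):
    """Proportion of transitions that move forward in the encoded order."""
    steps = list(zip(recall, recall[1:]))
    return sum(b > a for a, b in steps) / len(steps)

# Hypothetical recall of a six-item event.
recall = [1, 2, 3, 5, 4]
crp = lag_crp([recall], 6)
clustering = temporal_clustering(recall, 6)  # 0.75
asymmetry = forward_asymmetry(recall)        # 0.75
```

In this example, the two adjacent forward transitions score 1.0, the skip from item 3 to item 5 scores .5, and the final backward step ties with an equally near forward option and also scores .5. A strictly forward, gap-free recall would score 1.0 on both summary measures.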
Relating detail and temporal organization (combining Studies 1 and 2)
We hypothesized that greater temporal context reinstatement, as measured by temporal clustering and forward asymmetry, would be associated with recall of more episodic details. We tested this hypothesis by combining all three groups of participants across both studies. We ran separate linear mixed-effects models predicting detail count from detail type (internal and external) and temporal clustering (Fig. 6a) or forward asymmetry (Fig. 7a), with each model also including group as a factor (Study 1 older, Study 1 younger, and Study 2), and all interactions. As above, internal and external-detail counts were log-transformed for analysis (after artificially adding 0.5 external details to each participant’s data) to correct for positive skew.
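The transformation described above can be written out briefly. The 0.5 offset for external details is taken from the text. The use of the natural logarithm is an assumption, because the base is not stated, and the function name is hypothetical. The offset matters because a participant with zero external details would otherwise have an undefined log count.

```python
from math import log

EXTERNAL_OFFSET = 0.5  # constant added to external-detail counts, per the text

def log_detail_counts(internal, external):
    """Return (log internal, log external) detail counts for one participant.

    Assumes natural log and at least one internal detail.
    """
    return log(internal), log(external + EXTERNAL_OFFSET)

# Hypothetical participant with 40 internal and 0 external details.
log_internal, log_external = log_detail_counts(40, 0)
```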

Fig. 6. Scatterplots showing the relation between temporal clustering and (a) detail count (internal vs. external) and (b) ratio of internal to total details, collapsed across groups from Studies 1 and 2. Raw detail counts are shown in (a), but log-transformed detail counts were analyzed. Solid lines show linear-regression fits, and the shaded region in (b) represents the 95% confidence interval.

Fig. 7. Scatterplots showing the relationship between forward asymmetry and (a) detail count (internal vs. external) and (b) ratio of internal to total details, collapsed across groups from Studies 1 and 2. Raw detail counts are shown in (a), but log-transformed detail counts were analyzed. Solid lines show linear-regression fits, and the shaded region in (b) represents the 95% confidence interval.
There was a significant interaction between temporal clustering and detail type, F(1, 113) = 13.12, p < .001, whereby temporal clustering was positively associated with internal details, r(117) = .27, 95% CI = [.10, .43], p = .003, and exhibited a weak but statistically significant negative association with external details, r(117) = −.19, 95% CI = [−.36, −.01], p = .036 (Fig. 6a). These coefficients were significantly different from each other (z = 5.30, p < .001; Steiger z test). Critically, the absence of a significant interaction between group and temporal clustering, F(2, 113) = 0.47, p = .625, and of a three-way interaction including detail type, F(2, 113) = 0.25, p = .781, suggests that the relationship between temporal clustering and details did not significantly vary across groups. To further clarify this finding, we tested the association between temporal clustering and the ratio of internal to total details, again including group in the model (Fig. 6b). As expected, temporal clustering was significantly associated with the proportion of internal to total details with group included in the model, F(1, 113) = 12.53, p < .001; r(117) = .45, 95% CI = [.30, .59], p < .001. There was no interaction between temporal clustering and group, F(2, 113) = 1.38, p = .256. For completeness, the correlation between temporal clustering and internal-detail proportion is shown within each group in Figure S1 in the Supplemental Material.
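For readers who wish to approximate confidence intervals of this kind, one common approach is the Fisher z transformation, sketched below. The text does not state how the reported intervals were computed, so this is an assumption, and the result need not match the reported bounds exactly. The sample size of 119 follows from the reported degrees of freedom (N − 2 = 117); the function name is hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist

def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson correlation."""
    z = atanh(r)  # Fisher z transform of r
    half_width = NormalDist().inv_cdf(0.5 + level / 2) / sqrt(n - 3)
    return tanh(z - half_width), tanh(z + half_width)

# Correlation between temporal clustering and internal details (r = .27).
low, high = fisher_ci(0.27, 119)
```

With these inputs the approximate interval runs from just under .10 to about .43, close to the interval reported above.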
We followed the same logic with forward asymmetry. As with temporal clustering, there was a significant interaction between forward asymmetry and detail type, F(1, 113) = 10.58, p = .002, and no interaction between forward asymmetry and group, F(2, 113) = 1.76, p = .177, nor among forward asymmetry, group, and detail type, F(2, 113) = 0.74, p = .481. However, unlike temporal clustering, forward asymmetry was not correlated with internal details, r(117) = .09, 95% CI = [−.10, .26], p = .355, but rather was significantly negatively correlated with external details, r(117) = −.29, 95% CI = [−.45, −.11], p = .001 (Fig. 7a). These coefficients were significantly different from each other (z = 4.27, p < .001). Comparing temporal clustering and forward asymmetry directly, we found that temporal clustering had a significantly greater association with internal details than did forward asymmetry (z = 2.39, p = .017), although they did not significantly differ in their relationship to external details (z = 1.25, p = .212). Forward asymmetry, like temporal clustering, was significantly associated with proportion of internal to total details with group included in the model, F(1, 113) = 10.44, p = .002; r(117) = .46, 95% CI = [.30, .59], p < .001 (Fig. 7b), and the interaction with group was not significant, F(2, 113) = 1.69, p = .188. For completeness, the correlation between forward asymmetry and internal-detail proportion is shown within each group in Figure S2 in the Supplemental Material.
Although temporal clustering and forward asymmetry were significantly correlated with each other, r(117) = .63, 95% CI = [.51, .73], p < .001, they each explained unique variance in internal-detail proportion when modeled together in a multiple regression (mean centered)—temporal clustering: β = 0.27, t(115) = 2.57, p = .011; forward asymmetry: β = 0.28, t(115) = 2.62, p = .010. This suggests that they capture different aspects of temporal structure, within single extended memories, that predict detail richness, which is consistent with their different relationships to internal and external details. They did not significantly interact, β = −0.01, t(115) = 0.07, p = .942.
One might be concerned that the internal-detail proportion would necessarily be higher for participants who recalled more order-tagged items and thus made more recall transitions. However, our measures of temporal organization analytically controlled for the number of items recalled: Lag CRPs and temporal clustering were both computed on a transition-by-transition basis, adjusting for the number of available transitions at every point, and forward asymmetry was computed as a proportion of the transitions made. Nonetheless, we regressed internal-detail proportion on each measure while including the number of unique order-tagged items recalled in the model. In separate models, both temporal clustering, β = 0.39, t(116) = 4.47, p < .001, and forward asymmetry, β = 0.41, t(116) = 5.08, p < .001, significantly predicted internal-detail proportion. As expected, the number of unique items recalled was also positively associated with internal-detail proportion when modeled with temporal clustering, β = 0.16, t(116) = 1.88, p = .063 (although not significantly), or forward asymmetry, β = 0.27, t(116) = 3.82, p < .001. Thus, recall dynamics predicted the density of episodic detail in a memory even when analyses controlled for the quantity of order-tagged items retrieved (i.e., the number of transitions made).
Lastly, to visualize the relationship between richness of episodic detail and temporal context reinstatement, we replotted lag-CRP curves collapsing across groups, binned by internal-detail proportion quintiles (Fig. 8a). Quintiles line up in gradientlike fashion at lag +1, further highlighting the relationship between richness of episodic detail and sequentially organized recall dynamics. This pattern may be driven in part by group differences, but the same general pattern was observed within the Study 2 sample (Fig. 8b).

Fig. 8. Conditional-response-probability curves as a function of lag, binned by internal-detail proportion quintiles. Results are shown (a) collapsed across all three groups (N = 119) and (b) for Study 2 participants only (n = 79). Error bars and shaded areas represent bootstrap-derived standard errors (1,000 resamples).
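The quintile binning used for this figure can be illustrated with a short sketch. It assumes rank-based bins of equal size, which is one of several reasonable ways to form quintiles and may differ from the procedure actually used. The participant identifiers, proportions, and function name are hypothetical.

```python
def quintile_bins(proportions):
    """Map each participant to a quintile (1 = lowest, 5 = highest) by rank."""
    ordered = sorted(proportions, key=proportions.get)
    n = len(ordered)
    # Integer arithmetic splits the ranked participants into five equal bins.
    return {pid: (rank * 5) // n + 1 for rank, pid in enumerate(ordered)}

# Hypothetical internal-detail proportions for ten participants.
values = [0.62, 0.91, 0.75, 0.88, 0.70, 0.95, 0.81, 0.66, 0.84, 0.79]
props = {f"p{i:02d}": p for i, p in enumerate(values, start=1)}
bins = quintile_bins(props)
```

A lag-CRP curve would then be computed separately from the recall sequences of the participants in each bin.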
General Discussion
We measured detail and temporal organization in free recall of extended real-world episodes. Across events, participants’ recall dynamics revealed strong contiguity and forward-asymmetry effects, extending established principles from laboratory paradigms to naturalistic recall at multiday delays. Relative to younger adults, older adults exhibited temporally disorganized memory as expressed in spontaneous recall transitions but not in explicit sequence errors, with no age-related change in either serial position effects or recall initiation. Finally, contiguity and forward asymmetry were positively correlated with the density of internal details in recall narratives, suggesting that memories unfolding with more sequentially organized recall dynamics are richer in episodic detail. These findings support the notion that episodic memory entails recovering not only snapshots of past experiences but also spatiotemporal trajectories through them (Hasselmo, 2012). They further suggest that the trajectory one takes to, or through, a memory is related to the quantity and kind of information retrieved.
The magnitudes of temporal clustering and forward asymmetry observed here are striking given the long delays (by laboratory standards) between encoding and recall (see also Diamond, Romero, Jeyakumar, & Levine, 2018). Furthermore, whereas organizational strategies develop over multiple encoding and recall trials in laboratory experiments (Tulving, 1962), we used one-shot events with no explicit instruction to encode sequence information (although participants were aware that their memory would be tested). These results add to existing evidence that temporal organization is not driven by relative differences in recency or trace strength among items, which should be rendered negligible by multiday delays. Nor can such temporal organization be driven by schematic or scriptlike knowledge, as has been hypothesized for autobiographical memories (Anderson & Conway, 1993), given that the tour items were idiosyncratic and arbitrarily ordered. Studies examining naturalistic events separated by months or years (Moreton & Ward, 2010; Uitvlugt & Healey, 2019) found temporal clustering only at the shortest lags, which may partly reflect direct event-to-event associations or reference to temporal landmarks (Friedman, 2004). Our finding of a temporally graded and forward-asymmetric contiguity effect extends principles from retrieved-context models unambiguously to recall of naturalistic events.
The positive relationship between temporal organization and detail richness builds on previous work suggesting that autobiographical memory organization shapes access to details (Anderson & Conway, 1993; Linton, 1986). More recent evidence from recognition-memory tasks constrains contiguity effects to items remembered with high confidence (Folkerts, Rutishauser, & Howard, 2018; Schwartz, Howard, Jing, & Kahana, 2005) or self-reported recollection (Sadeh et al., 2014), although participants may make these subjective judgments on the basis of retrieving temporal context information itself. Conversely, we objectively quantified episodic detail using a method that is blind to recall dynamics.
Temporal contiguity and forward asymmetry each explained unique variance in episodic detail density. Contiguity correlated positively with quantity of internal (episodic) details and to a greater extent than forward asymmetry. Jumping back in time to an earlier moment from the encoding episode—an earlier “now”—may bring into higher resolution moments that were nearby (Buzsáki & Moser, 2013; Howard & Kahana, 2002; Trope & Liberman, 2010). That contiguity related to richness of episodic detail over and above the number of order-tagged items recalled (see Sederberg, Miller, Howard, & Kahana, 2010) suggests that it facilitated access to featural-level details, ambient environmental details, or journeys between items. Reduced temporal-context binding or reinstatement in older adults may explain, in part, the established age-related reduction in internal details: Despite initiating recall like younger adults, older adults may benefit less from iterative cuing through contextually linked items (see also Howard et al., 2006). This would increase demands for multiple deliberate recall-initiation attempts, with which older adults struggle (Craik, 1986), with initiation attempts jumping around more in space and time. The large effects of age on detail richness and temporal organization observed here are not accompanied by effects on recall accuracy, which we explored in a related study using real-world events (Diamond, Armson, & Levine, 2020). Mechanisms underlying detail recovery and temporal context reinstatement may be particularly susceptible to age-related decline.
Forward asymmetry was also positively associated with richness of episodic detail, but this association was driven by reduced external details. Excessive external details can reflect overreliance on semantic information or poor cognitive control (Sheldon et al., 2018). It is possible that external-detail interjections interrupt the forward flow of temporal context, truncating memory search. Alternatively, participants with temporally disorganized memories may terminate memory search earlier and compensate with nonepisodic information. In any case, we can only speculate about the causal direction, if there is one, between temporal organization and detail. It may be that temporal clustering is relatively automatic and dependent on hippocampal dynamics, whereas forward-biased search is more strategic and dependent on cognitive control processes coordinated by the prefrontal cortex (Vriezen & Moscovitch, 1990). This interpretation is consistent with the associations between temporal clustering and internal-detail generation, on the one hand, and between forward asymmetry and external-detail suppression, on the other. It is also consistent with the decline of both contiguity and forward bias in older age, with accompanying hippocampal and prefrontal atrophy (Raz et al., 2005), versus the decline in contiguity but preservation of forward bias in medial temporal lobe amnesia (Palombo et al., 2018).
We acknowledge that the unidirectional structure of the tour events confounds spatial and temporal distance, which, when separated, can make dissociable contributions to recall organization (Miller, Lazarus, Polyn, & Kahana, 2012). As noted above, we use temporal organization to refer to sequence, or ordinal structure, rather than absolute time. Such structure may shape the representation of episodes in memory more than space or time per se (Buzsáki & Tingley, 2018; Friedman, 2004), particularly for recall of real-life experiences that flow continuously through physical and semantic space (unlike artificially constructed stimuli that jump abruptly in these dimensions). Future work may determine how multiple dimensions (e.g., semantic associations, narrative structure, and boundaries) compete for influence on memory for complex episodes.
In conclusion, spatiotemporal proximity and chronological order shape the dynamics of naturalistic-event recall. Both of these aspects of spontaneous temporal organization are compromised in older age, suggesting that aging is associated with a decline in the spatiotemporal organization of episodes in memory. Finally, recall organization relates to the content and specificity of retrieved details. These findings bridge a gap between detail richness and temporal context reinstatement, two influential yet disconnected behavioral hallmarks of episodic memory function, providing a more fully articulated description of naturalistic memory retrieval and how it declines in older age.
Supplemental Material
Supplemental material, sj-pdf-1-pss-10.1177_0956797620958651 for Linking Detail to Temporal Structure in Naturalistic-Event Recall by Nicholas B. Diamond and Brian Levine in Psychological Science
Footnotes
Acknowledgements
We thank Laryssa Levesque, Alissa Papadopolous, and Yarden Levy for assistance with memory scoring.
Transparency
Action Editor: D. Stephen Lindsay
Editor: D. Stephen Lindsay
Author Contributions
N. B. Diamond developed the analysis concept. Both authors contributed to the study designs. N. B. Diamond analyzed and interpreted the data under the supervision of B. Levine. N. B. Diamond drafted the manuscript, and B. Levine provided critical revisions. Both authors approved the final manuscript for submission.