Abstract
Reliance on remembered facts or events requires memory for their sources, that is, the contexts in which those facts or events were embedded. Understanding of source retrieval has been stymied by the fact that uncontrolled fluctuations of attention during encoding can cloud results of key importance to theoretical development. To address this issue, we combined electrophysiology (high-density electroencephalogram, EEG, recordings) with computational modeling of behavioral results. We manipulated subjects’ attention to an auditory attribute, whether the source of individual study words was a male or female speaker. Posterior alpha-band (8–14 Hz) power in subjects’ EEG increased after a cue to ignore the voice of the person who was about to speak. Receiver-operating-characteristic analysis validated our interpretation of oscillatory dynamics as a marker of attention to source information. With attention under experimental control, computational modeling showed unequivocally that memory for source (male or female speaker) reflected a continuous signal detection process rather than a threshold recollection process.
Imagine the following scenario. You recently learned about a genetically engineered tomato-tobacco hybrid plant, “tomacco.” Unfortunately, you cannot remember whether you learned about tomacco from The New York Times or from an episode of “The Simpsons.” This failure to recall the source of your memory leaves you unsure of the veracity of the information. Clearly, knowing a memory’s source is very important. Source memory can prevent serious faux pas (e.g., relaying the development of tomacco as fact) and can aid recognition of stimuli that are encountered in new contexts (Mandler, 1980).
Source memory is commonly assessed using a paradigm in which subjects study a list of words varying in some contextual detail, such as spatial location if the words are visually presented or gender of a recorded voice if the words are spoken (Johnson, Hashtroudi, & Lindsay, 1993). Next, a recognition test is given. Previously studied and new words are presented in random order. Subjects must decide whether each word is “old” or “new” (item recognition) and make a decision as to any remembered word’s source (e.g., male or female voice; source retrieval). One important aspect of the paradigm is its combination of a traditional recognition measure (item recognition) with a cued recall measure (source retrieval).
Item recognition can be bolstered by recall of source (or other encoding-specific) details. This process, termed recollection (Parks & Yonelinas, 2009), may complement a graded feeling of familiarity. Although most accounts agree that item recognition depends on a graded familiarity signal, there is disagreement about the nature of the recollection process. The influential dual-process signal detection theory (DPSDT; Yonelinas, 1994, 1999) describes recollection as a threshold process: Attempts at recollection either do or do not retrieve a detail. When recollection succeeds, it is always accurate and always produces a highly-confident “old” response. A competing model, unequal-variance signal detection theory (UVSDT; Mickes, Wais, & Wixted, 2009), assumes that all responses reflect a single, continuously distributed memory-strength variable. That is, UVSDT assumes that recollection is a continuous process.
Note that both models focus on retrieval operations and have no parameters describing encoding processes, although recent UVSDT extensions acknowledge the importance of encoding-related factors (Hautus, Macmillan, & Rotello, 2008). Additionally, assessments of the competing views of recollection have often focused on either neuroimaging or cognitive modeling (see Malmberg, 2008, for a review). In the study reported here, we combined both methods and incorporated an encoding-specific design. This hybrid methodology allowed us to characterize attention’s role in recognition memory and to provide a clear assessment of the nature of recollection.
DPSDT was motivated by a distinctive regularity in ratings-based receiver operating characteristics (ROCs). These functions plot the hit rate (the probability of an “old” response given an old item, or p(“old”|old)) against the false alarm rate (the probability of an “old” response given a new item, or p(“old”|new)) at varying levels of response confidence for fixed accuracy. Typically, ROCs are asymmetric for item recognition: Accuracy at the left-most ROC point (the false alarm rate and hit rate conditional on a “sure old” response) is higher than simpler models predict. DPSDT attributes this result to threshold recollection’s selective generation of highly-confident “old” responses (Yonelinas, 1994).
An alternative explanation is that recent exposure increases old items’ memory strength, and as this increase varies in magnitude across items, the variance of the strength of old items increases (Starns, Rotello, & Ratcliff, 2012). This explanation forms a central assumption of the UVSDT model illustrated in Figure 1. In the model, subjects respond “old” whenever a test item’s strength exceeds a response criterion, and respond “new” otherwise. Memory strength is continuously distributed, with higher average strength assigned to old than to new items. Plotting the response proportions predicted at each of several confidence criteria generates a curved, asymmetric ROC.

The unequal-variance signal detection model of recognition memory (Mickes, Wais, & Wixted, 2009). The figure shows the distributions of memory strength for old and new items. The vertical line represents a response criterion: Test items falling to the right of the criterion are called “old,” and those falling to the left are called “new.” A key aspect of the model is its assumption that both familiarity and recollection can be described with a single, aggregate memory-strength variable.
A key point about UVSDT is its conception of recollection and familiarity as aggregate strength. Regardless of the number of processes recognition involves, they are assumed to be adequately described by a single, continuously distributed memory-strength variable. That is, UVSDT assumes that recollection is a continuous process (Mickes et al., 2009; Rotello, Macmillan, & Reeder, 2004).
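As a minimal illustration of how such a model generates a ratings ROC, the sketch below computes the predicted false alarm and hit rates at several confidence criteria. It is not the authors' code: the function name and all parameter values are hypothetical, and new-item strength is assumed to be standard normal.

```python
from statistics import NormalDist

def uvsdt_roc(mu_old, sigma_old, criteria):
    """Return (false-alarm rate, hit rate) pairs, one per criterion.

    Assumption: new-item strength ~ N(0, 1) and old-item strength
    ~ N(mu_old, sigma_old). An item is called "old" when its strength
    exceeds the criterion.
    """
    new = NormalDist(0.0, 1.0)
    old = NormalDist(mu_old, sigma_old)
    # Strictest criterion first, so points run from left to right on the ROC.
    return [(1.0 - new.cdf(c), 1.0 - old.cdf(c))
            for c in sorted(criteria, reverse=True)]

# Hypothetical parameter values, chosen only for illustration.
points = uvsdt_roc(mu_old=1.0, sigma_old=1.25,
                   criteria=[1.5, 1.0, 0.5, 0.0, -0.5])
for fa, hit in points:
    print(f"FA = {fa:.3f}, hit = {hit:.3f}")
```

When the old-item standard deviation exceeds 1, the ROC is asymmetric; in z coordinates the same points fall on a line with slope 1/sigma_old.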
To evaluate the competing views of recollection, one can assess ROC data for tasks varying in the degree to which recollection is required (e.g., item and source ROCs). For example, for source recall, one can plot p(“male”|male) against p(“male”|female) at different levels of source confidence for items correctly identified as old. According to DPSDT, the fact that source responses are collected via cued recall (i.e., only recollection can be used) means the resulting source ROCs should be linear, as predicted if threshold recollection were operating alone. Yonelinas (1999) reported the first source ROCs and concluded that the functions were indeed linear, which was consistent with DPSDT. However, subsequent work has suggested a simpler explanation for this linearity. Specifically, fluctuations in attention to the source dimension during the study phase could produce test trials on which little or no source information is available. The resulting low-accuracy, low-confidence source responses would “flatten” the source ROCs (Slotnick & Dodson, 2005). UVSDT extensions incorporating this assumption fit existing source ROCs well, which supports the attention-fluctuation hypothesis (Hautus et al., 2008).
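The attention-fluctuation idea can be sketched as a simple mixture. This is a deliberately simplified illustration, not the model of Hautus et al. (2008): it assumes equal-variance source distributions on attended trials, no source information at all on unattended trials, and hypothetical parameter values.

```python
from statistics import NormalDist

def mixture_source_roc(d, attended_prop, criteria):
    """Source ROC points when only some trials carry source information.

    Assumptions (illustrative only):
      attended trials:   male ~ N(+d/2, 1), female ~ N(-d/2, 1)
      unattended trials: both sources ~ N(0, 1), i.e., no source information
    A "male" response is given when strength exceeds the criterion.
    Returns (p("male"|female), p("male"|male)) pairs.
    """
    male = NormalDist(d / 2, 1)
    female = NormalDist(-d / 2, 1)
    none = NormalDist(0, 1)
    points = []
    for c in sorted(criteria, reverse=True):
        guess = 1 - none.cdf(c)  # same for both sources on unattended trials
        x = attended_prop * (1 - female.cdf(c)) + (1 - attended_prop) * guess
        y = attended_prop * (1 - male.cdf(c)) + (1 - attended_prop) * guess
        points.append((x, y))
    return points

criteria = [1.0, 0.5, 0.0, -0.5, -1.0]  # hypothetical rating criteria
full = mixture_source_roc(d=1.5, attended_prop=1.0, criteria=criteria)
half = mixture_source_roc(d=1.5, attended_prop=0.5, criteria=criteria)
for (xf, yf), (xh, yh) in zip(full, half):
    print(f"all attended: ({xf:.3f}, {yf:.3f})   half attended: ({xh:.3f}, {yh:.3f})")
```

Under these assumptions, each point's height above the chance diagonal shrinks in proportion to the fraction of attended trials, which pulls the whole function toward chance.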
Despite this success, direct tests of the attention-fluctuation hypothesis are lacking: Support for it has been largely inferred through goodness of fit. Fortunately, recent advances in the electrophysiology of attention suggest novel ways to both (a) test this hypothesis directly and (b) track and control attention. To understand how, consider electroencephalogram (EEG) oscillations in the alpha (8–14 Hz) frequency band. Recent studies have exploited prestimulus alpha-band oscillations as a marker of selective attention for an upcoming stimulus. Visual attention entails both a decrease in prestimulus alpha activity over cortical regions responsible for actively encoding a stimulus (Thut, Nietzel, Brandt, & Pascual-Leone, 2006) and an increase in such activity in regions whose function must be reduced to suppress task-irrelevant processing (Freunberger, Fellinger, Sauseng, Gruber, & Klimesch, 2009; Payne, Guillory, & Sekuler, 2013). Recent work has documented analogous (but not identical) effects in audition. For example, Banerjee, Snyder, Molholm, and Foxe (2011) reported dynamic changes in prestimulus auditory alpha, with alpha being higher when attention was not focused on auditory stimuli and lower when it was so focused. This pattern is similar to what has been found in EEG studies of vision (Freunberger et al., 2009; Payne et al., 2013). However, the auditory effect had a right parieto-occipital electrode topography. As Banerjee et al. demonstrated, this pattern differs from the patterns reported in studies of vision, which tend to be most pronounced in centro-occipital electrodes.
Alpha power is an attractive candidate for linking attention at encoding to theories of recognition. However, the generality of the auditory effect remains to be established. If the results Banerjee et al. (2011) reported reflect some general signature of auditory attention, then a similar effect and topography should hold for more complex tasks, such as source monitoring. Furthermore, such a signature would provide a unique opportunity to test the attention-fluctuation hypothesis. Specifically, compared with trials on which low alpha power precedes the auditory stimulus, trials with high alpha power should be associated with poorer recognition performance and with source ROCs nearer to chance. Moreover, trials on which prestimulus alpha power is low should produce high-accuracy source ROCs. If the attention-fluctuation hypothesis is correct, its implications for memory models are clear: The source ROCs for trials with high attention should be clearly curvilinear, contrary to source ROCs reported for previous studies that neither measured nor controlled attention at encoding (e.g., Yonelinas, 1999). Finally, changes in item-recognition ROCs that accompany increases in the contribution of recollection have been central in motivating DPSDT. Thus, it is important to know how the addition of attended source information actually affects the corresponding item ROCs (Wixted, 2007).
In this article, we report an experiment that examined the role of alpha power in the encoding of auditory source information and the consequent effects on recognition memory performance. We show how analyses of test data that take attention at encoding into account, via alpha power, can reveal patterns that greatly refine understanding of both recognition and source memory.
Method
Subjects
Eleven subjects (8 female, 3 male) participated. Subjects’ mean age was 21 years (SD = 1.95). All had normal or corrected-to-normal vision (measured with Snellen targets), reported no hearing deficits, and denied having psychological or neurological disorders. Subjects were naive to the purpose of the experiment and were paid for participation.
Stimuli
Singular nouns were presented visually and auditorily. Each had four to eight letters and one to three syllables. The words’ average written frequency was 123 per million (SD = 117; Kučera & Francis, 1967). Written words were displayed in lowercase on a 21-in. CRT monitor (99.8-Hz refresh rate; screen resolution = 1280 × 960 pixels). Displays were viewed binocularly from a distance of 57 cm. Audio files of the spoken words were presented at 48 dB SPL and had an average length of 490 ms (SD = 102 ms). There were two speakers, one male and one female. The average frequency of the male speaker’s audio files (computed over full duration) was 111 Hz (SD = 15.27 Hz); the female speaker’s files averaged 235 Hz (SD = 20.33 Hz).
Procedure
Encoding
Figure 2 illustrates the encoding procedure. Trials were of two types: attend voice (AV) and ignore voice (IV); 40 trials of each type were randomly intermixed. AV trials began with a fixation screen (300 ms; white cross on a black background), which oriented subjects to the region of the display within which the visual stimuli would be presented. A red box (500 ms) followed, cuing subjects to ignore the upcoming visual stimulus. This box was centrally presented, subtending 0.70° of visual angle. Next, a study word was presented briefly (100 ms), in either an italic or a nonitalic font. A pattern mask (100 ms; a row of Xs) was presented before and after the word. A green box (500 ms) then cued subjects to attend the upcoming (male or female) voice, which spoke the study word. The monitor went blank at the onset of the recording and remained blank for 700 ms (i.e., a variable duration after the end of the recording). A fixation screen (300 ms) followed. Subjects were then asked to indicate the voice’s gender via key press. Response times greater than 3,000 ms were followed by visual display of the words “Too Slow” (500 ms). Trials ended with the visual presentation of the study word (3,250 ms; nonitalic font). This interval gave subjects an opportunity to fully attend to and more deeply encode the word. Specifically, they were instructed to use this time to study the word for an upcoming memory test.

Illustration of the event structure of encoding trials. Each trial began with a fixation cross that oriented the subject to the region of the computer display that would contain the trial’s visual stimuli. The fixation point was replaced by a green square or a red square. A red square cued the subject to ignore the upcoming written word and attend the auditory presentation (attend-voice trials); a green square cued the subject to attend the font (italic or nonitalic) of the written word and ignore the auditory presentation (ignore-voice trials). Following a brief cue-stimulus interval, the word was presented on-screen in either an italic or a nonitalic font. After another brief interstimulus interval (ISI), a second cue was presented: a green square on attend-voice trials and a red square on ignore-voice trials. Following the next ISI, an audio clip of the word spoken in either a male (M) or a female (F) voice was played. The auditory presentation was followed, after a brief interval, by the trial’s probe. On attend-voice trials, the probe prompted subjects to indicate via a key press whether the voice that had spoken the auditory word belonged to a male or to a female. On ignore-voice trials, the probe prompted subjects to indicate whether the stimulus had been in an italic font (“It”) or a nonitalic font. Every trial ended with a study period during which that trial’s word was visually presented for extra study. The dashed outline highlights the key interval for the analyses.
IV trials resembled AV trials except that the cues traded positions: Subjects were to attend the font of the words and ignore the voices. Their task was to indicate whether the font used was italic or nonitalic. Our main goal was to examine the effects of changes in attention to male and female voices (the source variable explored in key studies of source recall), so we fixed the order of the intervals in which the visual and auditory stimuli were presented, relying on the font task to draw attention away from the voice during IV trials. Thus, for our analyses, the key interval in the trials contained the second attention cue and the auditory source stimulus.
Test
Test instructions began immediately after encoding. All 80 old words and 40 new words were presented in random order on the memory test. Below each was a 6-point confidence rating scale ranging from 1 (sure new) to 6 (sure old). After subjects rated their confidence that a given word was old or new, they rated their confidence that the word had originally been spoken in a male or female voice. The source scale ranged from 1 (sure female) to 6 (sure male). Subjects were instructed that source ratings would follow even their “new” responses, and that in such cases they should provide their best guess for the voice, assuming that their “new” decision had been wrong. This was in keeping with previous experiments in which such responses were collected for theoretical reasons; we do not consider these ratings further.
Electroencephalographic recording
Electroencephalographic (EEG) signals were recorded throughout the experiment using a 129-electrode array (Electrical Geodesics Inc., Eugene, OR) and high-impedance amplifiers. All channels were adjusted for scalp impedance less than 50 kΩ. Sensor signals were sampled at 250 Hz with a 0- to 125-Hz analogue band-pass filter. Periocular electrodes recorded from above and below each eye, and near each outer canthus.
Analyses
Ratings ROCs were constructed for (a) source recall at each level of source attention (AV and IV) and (b) item recognition when the old items’ source dimension was attended and when it was not attended. These functions allowed us to assess whether source recollection for attended items is best described as a continuous or a threshold process and to characterize old/new performance in conditions varying in the extent to which recollection contributes (i.e., whether or not the source dimension was attended at encoding).
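The construction of a ratings ROC can be sketched as follows. This is an illustrative example with made-up counts, not the study's data or code; the function name is hypothetical.

```python
def ratings_roc(target_counts, lure_counts):
    """Cumulate rating counts into ROC points.

    Each list holds response counts ordered from the highest-confidence
    target response (e.g., 6 = "sure old") down to the lowest
    (1 = "sure new"). Returns (false-alarm rate, hit rate) pairs; the
    final pair is always (1, 1) and carries no information.
    """
    n_targets, n_lures = sum(target_counts), sum(lure_counts)
    points, cum_t, cum_l = [], 0, 0
    for t, l in zip(target_counts, lure_counts):
        cum_t += t
        cum_l += l
        points.append((cum_l / n_lures, cum_t / n_targets))
    return points

# Made-up counts for ratings 6, 5, 4, 3, 2, 1 (illustration only).
old_counts = [30, 15, 10, 10, 8, 7]
new_counts = [3, 5, 7, 8, 9, 8]
roc = ratings_roc(old_counts, new_counts)
for fa, hit in roc:
    print(f"FA = {fa:.3f}, hit = {hit:.3f}")
```

With a 6-point scale, five of the six points vary freely per item class, which is why each ROC contributes five independent operating points.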
We assessed source ROC linearity by fitting UVSDT to each function and comparing its fit with that of a threshold model consistent with the assumptions of DPSDT as applied to source decisions (essentially, DPSDT without the signal detection component; Yonelinas, 1999). UVSDT predicts curvilinear ROCs for above-chance performance, whereas the threshold model predicts linear ROCs. By assessing the relative fit of these two models, we could draw conclusions regarding ROC form and the likelihood that threshold recollection drove source responses. We used standard methods for comparing and fitting models. The UVSDT model required a mean and variance for the “male” source distribution and five ratings criteria. The threshold model required “male” and “female” recollection probabilities and five bias parameters. Thus, both models used seven parameters to fit the 10 independently varying points of a given attention condition. Models were fit using the optim procedure in R (R Development Core Team, 2006) to minimize the log-likelihood statistic G² (df = 3).
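For readers unfamiliar with the fit statistic, the sketch below computes G² for one set of predicted rating probabilities and derives the degrees of freedom. It is written in Python rather than R, uses hypothetical counts and predictions, and omits the optimization step, so it illustrates only the quantity being minimized.

```python
import math

def g_squared(observed_counts, predicted_probs):
    """G^2 = 2 * sum(O * ln(O / E)), with E = N * predicted probability.

    Both arguments cover the rating categories of ONE item class
    (e.g., the six source ratings given to male-voice items).
    Cells with an observed count of zero contribute nothing.
    """
    n = sum(observed_counts)
    total = 0.0
    for o, p in zip(observed_counts, predicted_probs):
        if o > 0:
            total += o * math.log(o / (n * p))
    return 2.0 * total

def degrees_of_freedom(n_item_classes, n_ratings, n_free_params):
    # Each item class contributes (n_ratings - 1) independent proportions.
    return n_item_classes * (n_ratings - 1) - n_free_params

# Hypothetical counts and model predictions (illustration only).
male_obs, female_obs = [20, 8, 5, 3, 2, 2], [3, 3, 4, 6, 9, 15]
male_pred = [0.45, 0.20, 0.15, 0.10, 0.05, 0.05]
female_pred = [0.08, 0.07, 0.10, 0.15, 0.22, 0.38]

fit = g_squared(male_obs, male_pred) + g_squared(female_obs, female_pred)
df = degrees_of_freedom(n_item_classes=2, n_ratings=6, n_free_params=7)
print(f"G^2 = {fit:.2f}, df = {df}")
```

An optimizer would adjust the model parameters that generate `male_pred` and `female_pred` until the summed G² is as small as possible.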
To measure variation in recollection across the item ROCs, we conducted conventional tests of changes in the hit rate at a given false alarm rate and also tested fits of DPSDT models. For the latter, we first fit a baseline DPSDT model to the two item ROCs. This model required nine parameters: two recollection probabilities (for attended and ignored old items), two distribution means (for attended and ignored old items), and one set of five ratings criteria. This model fit 15 independent data points (df = 6). We found no difference in familiarity estimates between the two conditions and therefore adopted an eight-parameter model with one distribution mean for our analyses. We compared its fit with that of a restricted model incorporating one value of the probability of recollection. The difference in fit, ΔG², is χ²-distributed with degrees of freedom equal to the difference in the number of free parameters (i.e., 1).
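The nested-model comparison can be sketched in a few lines. The example uses hypothetical G² values and relies on the fact that, for 1 degree of freedom, the upper tail of the χ² distribution equals erfc(√(x/2)); the function names are illustrative.

```python
import math

def chi2_upper_tail_df1(x):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

def nested_test(g2_restricted, g2_full):
    """Return (delta G^2, p) for two nested models differing by one parameter."""
    delta = g2_restricted - g2_full
    return delta, chi2_upper_tail_df1(delta)

# Hypothetical fit statistics (illustration only).
delta_g2, p_value = nested_test(g2_restricted=12.0, g2_full=8.0)
print(f"delta G^2 = {delta_g2:.2f}, p = {p_value:.4f}")
```

A small p value indicates that forcing the two conditions to share one recollection probability worsens the fit reliably.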
EEG signals were preprocessed using MATLAB’s EEGLAB toolbox (Delorme & Makeig, 2004). Recordings were rereferenced to the grand average. The 0.5-Hz Butterworth high-pass and 60-Hz Parks-McClellan notch filters were applied. Epochs containing muscle artifacts, blinks, or saccades were rejected via independent components analysis and visual inspection. Wavelet analysis, cluster permutations, and visual display were performed using MATLAB’s FieldTrip toolbox (Oostenveld, Fries, Maris, & Schoffelen, 2011). Time-frequency representations were computed using Morlet wavelets (width = 4 cycles/wavelet; 1- to 70-Hz centers), in 1-Hz steps.
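To make the wavelet step concrete, the following sketch estimates alpha-band power for a single channel with complex Morlet wavelets. It is a plain-Python illustration rather than the FieldTrip implementation: it assumes that a wavelet of a given width has a Gaussian envelope with standard deviation width / (2πf) seconds, truncates the wavelet at ±3 standard deviations, and uses synthetic sinusoids sampled at 250 Hz.

```python
import cmath
import math

def morlet_power(signal, fs, freq, width=4):
    """Mean power of `signal` at `freq` (Hz) using a complex Morlet wavelet.

    Assumption: Gaussian envelope with sd = width / (2 * pi * freq) seconds,
    truncated at +/- 3 sd and normalized to unit energy.
    """
    sigma_t = width / (2.0 * math.pi * freq)
    half = int(round(3 * sigma_t * fs))
    wavelet = []
    for k in range(-half, half + 1):
        t = k / fs
        wavelet.append(math.exp(-t * t / (2 * sigma_t ** 2))
                       * cmath.exp(2j * math.pi * freq * t))
    norm = math.sqrt(sum(abs(w) ** 2 for w in wavelet))
    wavelet = [w / norm for w in wavelet]
    powers = []
    # Slide the wavelet over the signal, keeping only fully overlapping positions.
    for start in range(len(signal) - len(wavelet) + 1):
        acc = sum(signal[start + k] * wavelet[k].conjugate()
                  for k in range(len(wavelet)))
        powers.append(abs(acc) ** 2)
    return sum(powers) / len(powers)

def alpha_power(signal, fs, band=(8, 14)):
    """Average wavelet power over integer center frequencies in `band`."""
    freqs = range(band[0], band[1] + 1)
    return sum(morlet_power(signal, fs, f) for f in freqs) / len(freqs)

fs = 250  # sampling rate in Hz
alpha_wave = [math.sin(2 * math.pi * 10 * n / fs) for n in range(fs)]  # 10 Hz
beta_wave = [math.sin(2 * math.pi * 30 * n / fs) for n in range(fs)]   # 30 Hz
print(alpha_power(alpha_wave, fs), alpha_power(beta_wave, fs))
```

The 10-Hz signal yields far more power in the 8- to 14-Hz band than the 30-Hz signal, as expected.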
Alpha amplitude was defined by mean oscillatory power in the 8- to 14-Hz band. Wavelet alpha power for all electrodes was calculated for the epoch from the onset of the second cue through the offset of the auditory stimulus. A cluster-based, nonparametric randomization test (Maris & Oostenveld, 2007) compared AV and IV trials. The cluster-based test statistic was calculated by comparing alpha power for the two conditions at every sensor, leaving the time window open from cue offset to stimulus offset. All sensors for which the t value exceeded the .05 level were clustered via spatial adjacency. The sum of t values from the cluster with the maximum sum was then used as the test statistic, so as to avoid the problem of multiple comparisons. A reference distribution of test statistics was generated by randomly permuting the data across the two conditions 1,000 times. A cluster was characterized as exhibiting a statistically significant difference in alpha power if the proportion of randomized values larger than the observed test statistic fell below the .025 level.
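The logic of the cluster-based test can be sketched as follows. This simplified example is not the FieldTrip routine: it works on one value per sensor (no time dimension), considers only positive clusters (IV > AV), implements the within-subjects permutation as a random sign flip of each subject's IV minus AV difference, takes the critical t value as an input, and uses synthetic data with a hypothetical four-sensor layout.

```python
import math
import random

def t_values(diffs):
    """One-sample t statistic per sensor; diffs[subject][sensor]."""
    n = len(diffs)
    out = []
    for s in range(len(diffs[0])):
        col = [row[s] for row in diffs]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / (n - 1)
        out.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return out

def max_cluster_sum(tvals, neighbors, t_crit):
    """Largest summed t over spatially connected supra-threshold sensors."""
    keep = {i for i, t in enumerate(tvals) if t > t_crit}
    best, seen = 0.0, set()
    for start in keep:
        if start in seen:
            continue
        seen.add(start)
        stack, total = [start], 0.0
        while stack:  # depth-first walk over adjacent supra-threshold sensors
            i = stack.pop()
            total += tvals[i]
            for j in neighbors[i]:
                if j in keep and j not in seen:
                    seen.add(j)
                    stack.append(j)
        best = max(best, total)
    return best

def cluster_permutation_p(diffs, neighbors, t_crit, n_perm=1000, seed=0):
    rng = random.Random(seed)
    observed = max_cluster_sum(t_values(diffs), neighbors, t_crit)
    exceed = 0
    for _ in range(n_perm):
        flipped = []
        for row in diffs:
            sign = rng.choice((-1.0, 1.0))  # swap condition labels for this subject
            flipped.append([sign * x for x in row])
        if max_cluster_sum(t_values(flipped), neighbors, t_crit) >= observed:
            exceed += 1
    return observed, exceed / n_perm

# Synthetic IV-minus-AV alpha differences: 11 subjects, 4 sensors in a chain.
noise = [(k - 5) * 0.1 for k in range(11)]
diffs = [[1.0 + e, 1.0 - e, e, -e] for e in noise]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
# 2.228 is the two-tailed .05 critical value of t with 10 df.
observed, p_value = cluster_permutation_p(diffs, neighbors, t_crit=2.228)
print(observed, p_value)
```

Because only the single largest cluster statistic is compared with the permutation distribution, one test covers all sensors at once.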
Results
Behavioral results
Results for source recall, conditional on a recognition rating of 6 (sure old), are plotted in Figure 3. Discrimination performance (in d_a units) varied with attention: IV performance (Figs. 3a and 3c) fell below AV performance (Figs. 3b and 3d), t(10) = 3.76, p < .01, d = 1.13, 95% confidence interval (CI) = [0.40, 1.54]. IV performance was not above chance (i.e., d_a = 0), t(10) = 0.27, p = .79, d = 0.08, 95% CI = [−0.38, 0.48]. Unsurprisingly, both models fit the IV ROC well, G²(3) = 2.92 for UVSDT and G²(3) = 0.58 for the threshold model. As performance approaches chance, ROCs necessarily become more linear. The key data for assessing the nature of the source variable in our study are in the AV condition. These data, plotted in Figures 3b and 3d, are clearly curvilinear and were well described by UVSDT, G²(3) = 1.18. The threshold model departed from the data, G²(3) = 9.34, p < .05. These results are inconsistent with the idea that these source-recall responses were driven to a significant extent by threshold recollection. They were, on the contrary, well described by a continuous model.
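The accuracy measure used here (d with subscript a) is not defined in this section. The sketch below assumes the standard signal detection definition, in which the distance between the two distribution means is scaled by the root mean square of their standard deviations; the parameter values are hypothetical.

```python
import math
from statistics import NormalDist

def d_a(mu_target, sigma_target, mu_lure=0.0, sigma_lure=1.0):
    """Standard definition (assumed): mean difference divided by the
    root mean square of the two standard deviations."""
    rms_sd = math.sqrt((sigma_target ** 2 + sigma_lure ** 2) / 2.0)
    return (mu_target - mu_lure) / rms_sd

def roc_area_from_d_a(value):
    """Area under the Gaussian ROC implied by a given d_a."""
    return NormalDist().cdf(value / math.sqrt(2.0))

# Hypothetical parameters: "male" distribution N(1.2, 1.3), "female" N(0, 1).
example = d_a(mu_target=1.2, sigma_target=1.3)
print(f"d_a = {example:.2f}, area = {roc_area_from_d_a(example):.3f}")
```

With equal variances, this measure reduces to the familiar d′, and a value of 0 corresponds to an ROC area of .5 (chance).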

Observed and predicted source-recall receiver operating characteristics (ROCs) in the (a, c) ignore-voice and (b, d) attend-voice conditions, conditional on a high-confidence “old” response. Each ROC plots the probability of a “male” response given a male voice against the probability of a “male” response given a female voice. Individual data points plot these probabilities at a particular level of response bias, obtained by cumulating from high-confidence “male” responses (left-most point) to high-confidence “female” responses (right-most point, where x = y = 1). Model predictions were derived from (a, b) the unequal-variance signal detection model and (c, d) the threshold model.
It is important to know how item recognition varies with the addition of this seemingly continuous source information. The item ROCs for the two conditions, plotted in Figure 4, are informative. When source recall was “added” to item responses (AV condition), accuracy appeared to increase primarily at the left-most, high-confidence-“old” operating point. As the false alarm rates of these two functions were necessarily equal, we could assess the increase in accuracy by comparing hit rates. These comparisons showed that the hit rate was higher for AV than for IV trials at the left-most (sure old) point, t(10) = 2.85, p < .05, d = 0.86, 95% CI = [0.02, 0.12], but the difference failed to reach significance at any subsequent operating points—for example, t(10) = 1.42, p = .19, d = 0.43, 95% CI = [−0.03, 0.12], for the immediately rightward AV-IV pair. In other words, a seemingly continuous source variable produced an effect in the item ROCs that was similar to that which originally motivated the incorporation of threshold recollection into DPSDT. At the very least, these results demonstrate that a continuous variable can produce effects that will be misinterpreted as reflecting threshold recollection, as long as that continuous variable tends to produce high-confidence “old” responses.

Item-recognition receiver operating characteristics in the attend-voice and ignore-voice conditions. Note that the contributing old items’ source dimension was attended at encoding in the former condition but ignored at encoding in the latter condition. The asterisk indicates a significant difference between conditions (p < .05).
To assess this implication directly, we fit DPSDT to the item ROCs in Figure 4. We started with the nine-parameter full model, which fit well, G²(6) = 8.05, p = .24. Its estimates for recollection in the two conditions differed by .086 (AV: p(recollection) = .454; IV: p(recollection) = .368). That is, DPSDT indicated an increase of nearly 9 percentage points in recollection probability when source recall was added in. Familiarity estimates, however, did not differ across the conditions (d′ = .67 and .64 for AV and IV trials, respectively). Thus, we evaluated the increase in recollection by comparing two nested DPSDT models: Each used five criteria and one value of d′, but they differed in whether p(recollection) varied across conditions or whether a single value was estimated. This test indicated a reduction in fit in the restricted model, ΔG²(1) = 4.18, p < .05. In other words, DPSDT indicated a significant increase in “threshold recollection” when source recall was added in.
EEG results
Results of the cluster-based permutation test revealed a right-lateralized cluster of 13 posterior electrodes at which alpha power in the IV condition exceeded that in the AV condition, p < .01 (Fig. 5). This effect extended across a 200-ms epoch following the offset of the cue but preceding the onset of the auditory stimulus. Figure 6 shows the time-frequency transforms averaged across these 13 electrodes. Together, the figures reveal that posterior alpha-band activity following a cue to ignore the auditory stimulus began immediately upon cue offset and continued throughout the duration of the stimulus. The brief alpha activity at the presentation of both cues and at the onset of the to-be-attended stimulus (see Fig. 6) likely reflects the well-documented alpha-band phase locking that occurs in response to a stimulus onset (Freunberger et al., 2009).

Topographic display of the between-condition comparison of alpha-band power within the window preceding the onset of the voice stimulus. The color coding reflects t scores from tests of whether power in the ignore-voice condition differed from that in the attend-voice condition. Asterisks indicate electrodes showing a significant difference between the two conditions (ignore voice > attend voice) at the .05 level.

Grand-averaged time-frequency wavelets averaged across the cluster of 13 posterior electrodes that showed a significant difference (ignore voice > attend voice) in alpha-band power between the ignore-voice condition (right panel) and the attend-voice condition (left panel). The solid white lines indicate the onsets of the cue and the voice; the dashed lines mark the offsets of these stimuli.
To further assess alpha power’s diagnosticity as an attentional index, we conducted a within-subjects median split for the AV condition. We focused on this condition because performance was naturally at floor in the IV condition. This allowed us to examine within-subjects EEG-defined high- and low-attention source ROCs, conditional on high-confidence-“old” decisions. The results, plotted in Figure 7, show higher accuracy in the low-power than in the high-power trials (d_a = 1.31 vs. 0.47, respectively). A Wilcoxon test of ROC areas (Hanley & McNeil, 1983) confirmed this difference, z = 2.97, p < .01, 95% CI = [0.07, 0.31]. As predicted by the attention-fluctuation hypothesis, the difference appeared to be primarily in the middle of the ROCs. This can be seen by comparing the fits of the UVSDT and threshold models to the functions in Figure 7. The UVSDT model fit both ROCs well (low power: G²(3) = 5.81; high power: G²(3) = 4.82; Figs. 7a and 7b). In contrast, the threshold model fit the high-power ROC quite well, G²(3) = 1.71, but deviated significantly from the data in the low-power trials, G²(3) = 13.00, p < .01 (Figs. 7c and 7d).
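A within-subjects median split of this kind can be sketched as below. The data are made up, the function name is hypothetical, and assigning trials that fall exactly at the median to the low-power half is an assumption made for the example.

```python
from statistics import median

def median_split(trials):
    """Split one subject's trials by prestimulus alpha power.

    trials: list of (alpha_power, source_rating) tuples for a single subject.
    Returns (low_power_ratings, high_power_ratings). Trials exactly at the
    median go to the low-power half (an arbitrary choice for this sketch).
    """
    cut = median(power for power, _ in trials)
    low = [rating for power, rating in trials if power <= cut]
    high = [rating for power, rating in trials if power > cut]
    return low, high

# Made-up trials for one subject: (alpha power, source rating on the 1-6 scale).
subject_trials = [(0.2, 6), (0.9, 3), (0.4, 5), (1.1, 4)]
low_power, high_power = median_split(subject_trials)
print(low_power, high_power)
```

Splitting within each subject, rather than across the pooled data, keeps individual differences in overall alpha power from determining which trials count as high or low.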

Source-recall receiver operating characteristics, conditional on a high-confidence “old” response, for trials on which prestimulus alpha power was (a, c) high and (b, d) low at encoding. All data are from the attend-voice condition, and trials were categorized on the basis of a within-subjects median split. The graphs show observed data plus fits of (a, b) unequal-variance signal detection theory and (c, d) the linear threshold model.
To ensure that our results were not due to distortions from averaging over subjects, we compared groups of subjects with similar performance levels, sorting them into subgroups with low (n = 4), medium (n = 3), and high (n = 4) accuracy. In all three subgroups, d_a was higher in the low-power than in the high-power trials (low-accuracy group: d_a = 0.51 for low-power trials and 0.04 for high-power trials; medium-accuracy group: d_a = 1.07 for low-power trials and 0.57 for high-power trials; high-accuracy group: d_a = 1.54 for low-power trials and 1.39 for high-power trials).
Discussion
Our behavioral results show that when attention to the source dimension is controlled at encoding, the resulting source ROCs are clearly curvilinear and consistent with the underlying strength distributions of continuous recollection models (Mickes et al., 2009). Further, the results demonstrate that continuous recollection may produce the same increase in high-confidence hits that originally motivated the inclusion of threshold recollection in DPSDT. Finally, the results illustrate that DPSDT will falsely interpret a continuous variable as a threshold variable, as long as its contribution largely results in high-confidence “old” responses. This suggests that DPSDT’s threshold recollection measure not only is unreliable, as other researchers have already suggested (Rotello, Macmillan, Reeder, & Wong, 2005), but also may be invalid.
Curvature of source ROCs may result if subjects are instructed to “unitize” item and source information (Bastin et al., 2013). However, our design included no such instructions. Our recognition results also are contrary to this interpretation: If unitized information contributed to decisions, DPSDT should not have indicated increased threshold recollection when source memory was added to item decisions. In fact, the DPSDT model not only indicated an increase in threshold recollection in the AV condition, but also indicated no change in familiarity. This means that according to DPSDT, source recall was based entirely on threshold recollection, and thus the source ROC in the AV condition should have been clearly linear. Our results contradict this prediction.
Our study revealed an increase in right-lateralized posterior alpha power following cues to ignore an upcoming voice stimulus. Topographically, this finding resembles effects of visual attention (Freunberger et al., 2009; Payne et al., 2013). However, the focus of the auditory effect was more lateral than is found for vision. The right parieto-occipital attention-mediated cluster identified in our analyses matches previous findings (Banerjee et al., 2011), though we used a considerably more complex task. Within the frontoparietal network, which has been implicated in top-down control of spatial attention, the subsets of regions that are activated by attention differ between audition and vision (Krumbholz, Nobis, Weatheritt, & Fink, 2009). Those results indicate a modality-sensitive attentional-control region. Recent work with nonhuman primates suggests that this control region may be subserved by cells in the intraparietal sulcus (Grefkes & Fink, 2005).
Increases in posterior alpha power have been demonstrated for to-be-ignored auditory stimuli including noise bursts (Banerjee et al., 2011) and speech streams (Kerlin, Shahin, & Miller, 2010). To those we add auditory source stimuli. Our results, which demonstrate the generality of alpha power as a dynamic marker of attention, constitute a first step toward a powerful new analytic strategy for recognition memory. Using the alpha signal, one can track attentional engagement across conditions, subjects, and even individual trials (Macdonald, Mathan, & Yeung, 2011). One can then verify the interpretation of such EEG oscillations by applying appropriate cognitive models whose fits are conditional on the trials of interest (Ratcliff, Philiastides, & Sajda, 2009). Trials on which attention is low can then be safely removed or parameterized. Alpha power could also be treated as a covariate or used to inform the parameters of hierarchical models (Pratte, Rouder, & Morey, 2010). Our work also suggests that in conjunction with accurate retrieval indices, alpha power can be used to constrain or validate the attention parameters of existing and future models (Hautus et al., 2008).
In sum, we have shown that attentional fluctuations affect behavioral measures of item and source memory, and that such fluctuations can distort memory data and result in misinterpretations within the framework of models, like DPSDT, that fail to account for attentional variation. Fortunately, such problems can be avoided by dual consideration of EEG oscillations at encoding and cognitive modeling of test performance. Our results show that models of complex cognitive processes such as recognition and cued recall should follow the lead of recent extensions of signal detection theory (Hautus et al., 2008) by accounting for variations in subjects’ attention during encoding. The combination of electrophysiology with cognitive modeling constitutes a powerful tool for improving understanding of recognition and attention’s role therein.
Footnotes
Declaration of Conflicting Interests
The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.
Funding
This work was supported in part by the Center of Excellence for Learning in Education, Science, and Technology, a National Science Foundation (NSF) Science of Learning Center (funded by NSF Grant SMA-0835976); by the National Institutes of Health (Grant T32-NS07292); and by the Air Force Office of Scientific Research (Grant FA9550-10-1-0420).
