Abstract
We sought to provide evidence for a combined effect of two attentional mechanisms during associative learning. Participants’ eye movements were recorded as they predicted the outcomes following different pairs of cues. Across the trials of an initial stage, a relevant cue in each pair was consistently followed by one of two outcomes, while an irrelevant cue was equally followed by either of them. Thus, the relevant cue should have been associated with small relative prediction errors, compared to the irrelevant cue. In a later stage, each pair came to be followed by one outcome on a random half of the trials and by the other outcome on the remaining half, and thus there should have been a rise in the overall prediction error. Consistent with an attentional mechanism based on relative prediction error, an attentional advantage for the relevant cue was evident in the first stage. However, in accordance with a mechanism linked to overall prediction error, the attention paid to both types of cues increased at the beginning of the second stage. These results showed up in both dwell times and within-trial patterns of fixations, and they were predicted by a hybrid model of attention.
Humans and other animals use environmental cues to predict the occurrence of significant outcomes and to engage in appropriate responses, yet only a small subset of cues is relevant to functioning in most natural and social environments. Identifying and selecting those cues for further processing is the defining role of attention. Experimental research has shown a number of factors determining attentional priority, one of them being stimulus salience. Specifically, cues that stand out from their background tend to capture attention (Theeuwes, 1992). Other factors depend on the relationship between cues and significant events. For instance, people focus their attention on cues that are relevant for performing a given task (McColeman et al., 2014; Rehder & Hoffman, 2005; Tatler et al., 2011). In addition, cues that come to be followed by rewarding—or aversive—outcomes receive increased attention (Anderson, 2016; Wang et al., 2013). Notably, this effect continues after rewards cease and paying attention to those cues is detrimental to the demands of a second task (Anderson & Yantis, 2012; Theeuwes & Belopolsky, 2012). Furthermore, attention is also driven by uncertainty about the events that may take place in a situation (Dayan et al., 2000; Gottlieb et al., 2020).
Because learning allows animals to associate cues with contingent outcomes, attention operates together with learning to enable outcome prediction (Pearce & Bouton, 2001). Thus, not surprisingly, research in the area of associative learning has also contributed to elucidating the conditions that drive attentional priority (for reviews, see Le Pelley, 2004; Pearce & Mackintosh, 2010). In models of associative learning, the features of a cue influencing attention are captured by an “associability” parameter, α, modulating the speed at which cue-outcome associations are formed. Faster learning is taken to indicate increased attention to a cue. Research using eye-tracking has shown that fixation patterns usually match the predictions made for associability; for example, cues fixated for longer times subsequently showed high associability (see Le Pelley et al., 2016). Such converging evidence is important, because eye-tracking is an established measure of overt visual attention (e.g., Deubel & Schneider, 1996).
Attentional theories of associative learning postulate that associability is not a fixed property of cues, or merely based on their physical features. Instead, associability changes in accordance with prediction error, which is the discrepancy between the magnitude of an outcome and the extent to which it is predicted by one or more cues. The overall prediction error experienced on a trial is given by the term:
$$\lambda - \Sigma V$$
where λ indicates the magnitude of the outcome, for example, taking the values of 1 or 0 to code for its presence or absence, and ΣV represents the strength of the association between the cues and the outcome (e.g., Pearce & Hall, 1980; Rescorla & Wagner, 1972). Typically, ΣV is 0 at the outset of learning, indicating that the cues have never been experienced together with the outcome. But through repeated pairings, ΣV is driven towards the maximum value, usually set to 1. The specific way in which prediction error is assumed to change associability depends on the theory under consideration.
In his attentional theory, Mackintosh (1975) suggested that the associability of a cue, for example, A, increases when it signals an outcome more accurately than the other present stimuli, for example, X. Otherwise, when A is equally or less accurate than X, its associability decreases. Mackintosh expressed this idea with the following rules:
$$\Delta\alpha_A > 0 \quad \text{if} \quad |\lambda - V_A| < |\lambda - V_X|$$
$$\Delta\alpha_A < 0 \quad \text{if} \quad |\lambda - V_A| \geq |\lambda - V_X|$$
Thus, the change in associability, Δα, is based on the comparison of the prediction error of each cue, for example, A, with that of all the other cues, X. Strong support for Mackintosh’s theory comes from discrimination paradigms where some cues are relevant for predicting the outcomes, while others are irrelevant (e.g., George & Pearce, 1999; Le Pelley et al., 2011, 2013; Le Pelley & McLaren, 2003; Lochmann & Wills, 2003). Take for instance the study conducted by Le Pelley et al. (2011), who asked participants to solve a discrimination where the occurrence of different sounds could be predicted on the basis of nonsense words. On each trial, participants saw a pair of words before hearing one of two possible sounds. The pairs were arranged so that, across trials, one word was consistently followed by the same sound, whereas the other word was followed by each of the two sounds on an equal number of occasions. These contingencies rendered the words in the former class relevant for the prediction of the sounds—eventually becoming strongly associated with a specific sound and generating small prediction errors, as inferred from the percentage of correct predictions, while those words in the latter class were irrelevant—presumably acquiring weak associations with both sounds and generating large prediction errors throughout the task. Eye-tracking data showed that participants paid more attention to the relevant words than to the irrelevant ones. Moreover, the attentional bias persisted during a subsequent discrimination where the same words were paired with two new sounds. Importantly, in the second discrimination the words in each pair were equally relevant for predicting the sounds. A final test of associability also indicated that the previously relevant words acquired stronger associations with the sounds of the second discrimination, compared to the previously irrelevant words. 
The attentional advantage for the (previously) relevant words, together with the learning bias inferred from the final test, lent support to Mackintosh’s theory.
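Mackintosh's comparison between a cue's own prediction error and that of the accompanying cues can be illustrated with a minimal Python sketch. The function name and the fixed step size are assumptions made here for illustration; the theory specifies only the direction of the change, not its magnitude.

```python
def mackintosh_step(lam, v_cue, v_others, step=0.1):
    """Direction of the associability change for one cue on one trial.

    lam      -- outcome magnitude (1 = present, 0 = absent)
    v_cue    -- associative strength of the cue being updated
    v_others -- associative strength of the other cues present
    step     -- hypothetical fixed step size (not part of the theory)
    """
    own_error = abs(lam - v_cue)
    others_error = abs(lam - v_others)
    # Associability rises only if the cue is the better predictor;
    # ties and worse predictions both lead to a decrease.
    return step if own_error < others_error else -step


# A relevant cue (V = 0.9) paired with an irrelevant cue (V = 0.5).
change_relevant = mackintosh_step(1.0, 0.9, 0.5)
change_irrelevant = mackintosh_step(1.0, 0.5, 0.9)
```

Under these illustrative values, associability rises for the relevant cue and falls for the irrelevant one, which is the pattern taken to underlie the attentional bias described above.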
In contrast to Mackintosh’s theory, the Pearce and Hall (1980) model proposed that decreasing prediction error leads to a reduction in associability. Thus, the associability of a cue is directly related to the prediction error generated by all the present cues on the most recent trial:
$$\alpha_A^{n} = |\lambda^{n-1} - \Sigma V^{n-1}|$$
Evidence in favour of the Pearce–Hall model typically comes from research using partial reinforcement schedules (e.g., Hogarth et al., 2008; Kaye & Pearce, 1984; Koenig et al., 2018; Koenig, Kadel, et al., 2017; Koenig, Uengoer, & Lachnit, 2017; Swan & Pearce, 1988). For instance, Hogarth et al. (2008) asked participants to predict the occurrence of a tone based on different visual patterns. One of those patterns was followed by the tone on every trial, that is, continuously reinforced cue. Another visual pattern was followed by the tone on a random half of the trials, that is, partially reinforced cue. And, finally, a third pattern was never paired with the tone, that is, non-reinforced cue. As indicated by trial-by-trial ratings, most participants acquired differential expectancies of the tone that matched the actual contingencies. Specifically, participants came to expect the tone during the continuously reinforced cue and to expect its omission during the non-reinforced cue. Thus, prediction error should have been at a minimum in both of these conditions. However, participants indicated intermediate tone expectancies during the partially reinforced cue. Yet, since the outcome of a particular trial was either the occurrence of the tone or its absence, the intermediate expectancies always bore some degree of discrepancy with the outcome. Therefore, this condition should have generated a relatively large prediction error. The time that participants spent looking at each type of cue (dwell time) was compared to the dwell time on a control cue that was present on every trial. Consistent with the Pearce–Hall model, the relative dwell times showed that the partially reinforced cue received more attention than both the continuously reinforced cue and the non-reinforced cue.
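The contrast between continuously, partially, and non-reinforced cues can be sketched numerically. The following Python example is a simplification under stated assumptions: expectancy is updated with a plain delta rule and a fixed learning rate of 0.2 (not the learning rule of the Pearce–Hall model itself), and partial reinforcement is approximated by strictly alternating outcomes instead of a random schedule.

```python
def pearce_hall_alpha(outcomes, rate=0.2):
    """Absolute prediction error on each trial for a single cue.

    Each value is the associability carried over to the next trial.
    The delta-rule update and the rate are illustrative assumptions.
    """
    v = 0.0
    alphas = []
    for lam in outcomes:
        error = lam - v
        alphas.append(abs(error))
        v += rate * error
    return alphas


n_trials = 40
continuous = pearce_hall_alpha([1.0] * n_trials)
partial = pearce_hall_alpha([1.0, 0.0] * (n_trials // 2))
never = pearce_hall_alpha([0.0] * n_trials)
```

After training, the absolute error (and hence associability) approaches zero for the continuously reinforced and the non-reinforced cue, but stays at an intermediate value for the partially reinforced cue, in line with the dwell-time pattern reported by Hogarth et al. (2008).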
Despite the difference in the mechanisms of associability change suggested by Mackintosh (1975) and Pearce and Hall (1980), ample evidence exists for each of them (for a review, see Pearce & Mackintosh, 2010). Thus, Le Pelley (2004, 2010) developed a hybrid model in which both mechanisms interact to determine changes in associability. On a given trial, the individual prediction error of each cue is compared to the combined prediction error of all the other stimuli, so that the “Mackintosh associability” increases for the cue with the smallest relative prediction error, while it decreases for the rest. In addition, the “Pearce–Hall associability” is obtained from the prediction error produced by the summed associative strengths of all the stimuli present. For each cue, these two values are multiplied to determine the overall associability. The attentional pattern observed in a given study would be consistent with Mackintosh’s mechanism if the differences in prediction error concern cues that are presented simultaneously, as in a discrimination with relevant and irrelevant cues. Conversely, the Pearce–Hall mechanism would prevail if those differences are associated with stimuli presented on separate trials, as with continuously versus partially reinforced cues. Beesley et al. (2015) termed the attentional patterns corresponding to each mechanism “exploitation” and “exploration,” respectively. Attentional exploitation operates when the learning agent can use one of the cues to predict the outcome accurately, in which case that cue is selected by attention. Attentional exploration is engaged in situations where it is uncertain whether the present cues are followed by the outcome, and thus the amount of attention paid to potentially informative cues is kept high. 
Notwithstanding the wider scope of the hybrid model, only a few empirical studies have focused on it (Haselgrove et al., 2010; Kattner, 2015; Le Pelley et al., 2010; Luque et al., 2017), and, among those, even fewer have measured overt attention (Beesley et al., 2015; Easdale et al., 2019). Thus, the aim of the present study was to obtain further evidence for changes in overt attention consistent with a combination of the two associability mechanisms.
Our design included a discrimination stage followed by a partial reinforcement stage. Cues A, B, X, and Y were presented in pairs throughout the experiment, and the participants’ task was to predict which of two outcomes, o1 or o2, would follow each pair. During discrimination training, pairs AX and AY were followed by o1, while BX and BY were followed by o2. Given that A and B were consistently paired with o1 and o2, respectively, they were relevant for predicting the outcomes. Therefore, they should be associated with a small prediction error. Conversely, X and Y were paired with both outcomes, each on half of the trials, and were thus irrelevant for making the predictions. Those latter cues should be associated with a relatively large prediction error. During partial reinforcement, each of the four pairs was followed by either o1 or o2, on an equal number of randomly distributed trials. Thus, in this stage participants should not be able to make consistently accurate predictions, and prediction error should become large for the pairs and the individual cues.
Figure 1 shows simulations of the predicted changes in associability based on the above-mentioned theories (see the online Supplementary Material for details). Mackintosh’s (1975) theory (upper panel) predicts that, during discrimination, the relevant cues should receive more attention than the irrelevant ones, and that this differential allocation should decrease during partial reinforcement. The Pearce–Hall model (Pearce et al., 1981; middle panel) predicts that both types of cues should receive a similar amount of attention throughout the experiment. Moreover, attention should decrease during discrimination, and it should be reinstated by partial reinforcement. Finally, according to Le Pelley’s (2004, 2010) hybrid model (lower panel), attention should decrease for both types of cues during discrimination, yet the relevant cues receive more attention than the irrelevant ones. During partial reinforcement, the amount of attention paid to both types of cues should increase at the outset, and become similar in the course of training. We designed our experiment to test the predictions derived from the three theoretical accounts. Simulations using a wide range of parameters left the ordinal pattern of the predictions from each theory unchanged. Changes in overt attention were measured by recording the dwell times on the pairs of cues as participants solved the task.
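To make the hybrid prediction concrete, the following Python sketch runs a deliberately simplified two-mechanism simulation of the present design. It is not the simulation reported in the Supplementary Material: the parameter values (`beta`, `theta`, `gamma`), the associability bounds, the fixed trial order within blocks, and the deterministic alternation standing in for the random Stage 2 schedule are all assumptions made for illustration.

```python
CUES = "ABXY"
OUTCOMES = ("o1", "o2")
PAIRS = ("AX", "AY", "BX", "BY")
STAGE1 = {"AX": "o1", "AY": "o1", "BX": "o2", "BY": "o2"}


def simulate(n_blocks1=24, n_blocks2=8, beta=0.2, theta=0.1, gamma=0.5):
    """Return one record per block with the mean alpha * sigma per cue type."""
    v = {c: {o: 0.0 for o in OUTCOMES} for c in CUES}  # associative strengths
    alpha = {c: 0.5 for c in CUES}  # Mackintosh-style associability
    sigma = {c: 1.0 for c in CUES}  # Pearce-Hall-style associability
    history = []
    for block in range(n_blocks1 + n_blocks2):
        for i, pair in enumerate(PAIRS):
            if block < n_blocks1:
                outcome = STAGE1[pair]
            else:
                # Assumption: alternation replaces the random 50/50 schedule,
                # so each Stage 2 block has two changed and two unchanged pairs.
                outcome = OUTCOMES[(block + i) % 2]
            lam = {o: 1.0 if o == outcome else 0.0 for o in OUTCOMES}
            # Overall error (all cues present) and each cue's own error,
            # both computed before any value is updated on this trial.
            overall = {o: lam[o] - sum(v[c][o] for c in pair) for o in OUTCOMES}
            own = {c: sum(abs(lam[o] - v[c][o]) for o in OUTCOMES) for c in pair}
            rate = {c: beta * alpha[c] * sigma[c] for c in pair}
            mean_abs = sum(abs(e) for e in overall.values()) / len(OUTCOMES)
            for c, other in ((pair[0], pair[1]), (pair[1], pair[0])):
                for o in OUTCOMES:
                    v[c][o] += rate[c] * overall[o]
                # Mackintosh component: reward the better predictor.
                step = theta if own[c] < own[other] else -theta
                alpha[c] = min(1.0, max(0.1, alpha[c] + step))
                # Pearce-Hall component: track recent overall error.
                sigma[c] = min(1.0, gamma * mean_abs + (1 - gamma) * sigma[c])
        attention = {c: alpha[c] * sigma[c] for c in CUES}
        history.append({
            "block": block + 1,
            "relevant": (attention["A"] + attention["B"]) / 2,
            "irrelevant": (attention["X"] + attention["Y"]) / 2,
        })
    return history


history = simulate()
end_stage1, start_stage2 = history[23], history[24]
```

Comparing `end_stage1` with `start_stage2` shows the two signatures of interest in this toy version: an advantage for the relevant cues at the end of discrimination, and a rise in the combined associability of both cue types once partial reinforcement begins.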

Figure 1. Simulation results of attentional changes for the present experiment according to (a) Mackintosh’s (1975) theory, (b) the Pearce–Hall model (Pearce et al., 1981), and (c) Le Pelley’s (2004, 2010) hybrid model. In Stage 1, discrimination training comprised stimulus pairs containing one cue that was relevant for predicting the outcome (o1 or o2) and another cue that was irrelevant. In Stage 2, each pair of cues was randomly followed by either o1 or o2 (partial reinforcement). Note that the values obtained in the simulation based on the hybrid model are generally lower than those obtained in simulations based on the other models. However, our predictions were concerned with the relative amount of attention paid to each type of cue and the attentional changes across stages. For additional information, see the online Supplementary Material.
Method
Participants
Our study was approved by the ethics committee of the Faculty of Psychology at Philipps-Universität Marburg (AZ: 2018-25k). Twenty-six university students participated in exchange for 8.00 € or course credits. Depending on their performance, they also earned a monetary reward ranging from 3.92 to 9.20 €. Participants either reported absence of visual impairment (<1.5 diopter) or used soft contact lenses. Given that our hypotheses relied on the condition that prediction error would be small at the end of discrimination training, we excluded two participants with an accuracy in outcome prediction below 0.60 in the last two blocks of discrimination. Thus, the final sample consisted of 24 participants (MAge = 22.04 years, SD = 3.32, 6 men).
Stimuli and procedure
The stimuli, as well as the data and the analysis codes, are available at https://osf.io/wzyv8/. Four pairs of cues were used throughout the experiment. Participants’ task was to predict the outcome of each trial based on the cues presented on the screen, earning 8 cents for each correct prediction. Table 1 shows the experimental design. During an initial discrimination stage (Stage 1), the pairs AX and AY were followed by o1, while BX and BY were followed by o2. Thus, A and B were consistently paired with one of the outcomes, thereby becoming relevant for making the prediction, whereas X and Y were paired with both outcomes, each on half of the trials, and therefore were irrelevant. During a subsequent partial reinforcement stage (Stage 2), each of the pairs was followed by o1 on a random half of the trials, and by o2 on the other half. The experiment was divided into blocks (24 during Stage 1, and 8 during Stage 2), each presenting each of the four pairs once in random order. There was no break between the two stages. We presented an additional trial at the beginning of Stage 2, and excluded from the analyses the attentional data obtained from it. The reason for this was that measurement of attention occurred before the presentation of the outcome within each trial (see below). Therefore, attention allocation on the first trial of Stage 2 should still reflect the contingencies trained during Stage 1, that is, before the introduction of partial reinforcement. The pair presented on this trial was counterbalanced across participants. Throughout partial reinforcement, the trials where the outcomes had changed relative to the discrimination stage and those where the outcomes remained the same were evenly distributed, so that each block contained two instances of each.
Table 1. Experimental design.
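The block structure described above can be sketched as a trial-schedule generator. This Python example is one hypothetical way to satisfy the stated constraints, not the authors' procedure (their materials are at the OSF link): Stage 2 blocks are built in complementary couples so that every block contains two changed and two unchanged pairs and every pair receives each outcome on half of its Stage 2 trials. The additional trial at the start of Stage 2 is omitted for brevity.

```python
import random

PAIRS = ["AX", "AY", "BX", "BY"]
STAGE1_OUTCOME = {"AX": "o1", "AY": "o1", "BX": "o2", "BY": "o2"}
OTHER = {"o1": "o2", "o2": "o1"}


def build_schedule(seed=0, n_blocks1=24, n_blocks2=8):
    """Return a list of (stage, block, pair, outcome) tuples."""
    rng = random.Random(seed)
    trials = []
    # Stage 1: every pair once per block, in random order, fixed outcomes.
    for block in range(n_blocks1):
        for pair in rng.sample(PAIRS, len(PAIRS)):
            trials.append((1, block, pair, STAGE1_OUTCOME[pair]))
    # Stage 2: two pairs change outcome in one block, and the
    # complementary two pairs change in the following block.
    for block in range(0, n_blocks2, 2):
        changed = set(rng.sample(PAIRS, 2))
        for offset, flip in enumerate((changed, set(PAIRS) - changed)):
            for pair in rng.sample(PAIRS, len(PAIRS)):
                outcome = STAGE1_OUTCOME[pair]
                if pair in flip:
                    outcome = OTHER[outcome]
                trials.append((2, block + offset, pair, outcome))
    return trials


schedule = build_schedule(seed=1)
```

The complementary-block construction is a design choice made here only because it guarantees both balancing constraints at once; other randomisation schemes would serve equally well.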
Figure 2 shows an example of the trials used in the experiment. Each trial started with a fixation cross that lasted for 2 s. Then, three grey circles (34 mm in diameter, 2.6°) appeared around the centre of the screen, distributed at 180 mm (13.7°) from each other (centre-to-centre). Two of the circles constituted the cues, each enclosing a specific pattern of grey dots. The other circle was empty. After 4 s, a response trigger was presented inside the empty circle for 0.2 s, together with a brief tone. The response trigger consisted of two small coloured dots representing each of the two possible outcomes (green or red) next to each other. Once the trigger disappeared, participants had 1.8 s to indicate the expected outcome by pressing the mouse button (left or right) corresponding to the position where the dot representing the outcome had been shown. Thus, the mouse buttons mapped the positions of the coloured dots within the trigger, which changed randomly from trial to trial. The response trigger competed for attention with the cues and prevented participants from continuing to look at them throughout the whole attention measurement interval, thus avoiding a potential ceiling effect. Finally, the correct outcome (a big dot that was either green or red) was presented for 2 s at the centre of the empty circle. In addition, auditory feedback indicated whether the response was correct (“correct: 8 cents”) or not (“incorrect: 0 cents”). If participants made the response after the appropriate interval, a warning message was displayed on the screen following the outcome (“response too late!”) and the auditory feedback was omitted. A blank screen with a random duration between 1.5 and 4.5 s followed every trial. The position of the grey circles varied randomly across trials, taking any of the following polar angles: 0°, 60°, 120°, 180°, 240°, and 300°. All participants saw the same four grey dot patterns. 
Assignment of the patterns to the cues (A, B, X, and Y) was fully counterbalanced across participants. The position of the coloured dots within the response trigger (left or right) was determined randomly across trials, with four trials of each alternative every two consecutive blocks.

Figure 2. Trial outline. The stimuli appear closer to each other than on the participants’ screen. See the “Method” section for the actual sizes.
Participants read the instructions of the task after signing the informed consent. The instructions asked participants to pay attention to the dot patterns and then to look at the empty circle, in order to make sure that they would not miss the response trigger. In addition, the instructions stated that over the course of the experiment participants could learn to predict the outcomes on the basis of the grey patterns. Then, a minimum of eight practice trials were run, using stimuli that were similar to those used later in the experiment. The eye movements were recorded with an infrared video-based eye tracker (Eyelink 2000, SR-Research; Mississauga, Canada), sampling the gaze position at a frequency of 1000 Hz. Before the experiment started, the eye tracker was calibrated by means of a nine-point grid, keeping the maximal error below 0.5°. Sampling of the left or the right eye was counterbalanced across participants. The visual stimuli were presented on a 22″ CRT screen (Vision Master Pro 514, Iiyama; Tokyo, Japan). Forehead and chin rests kept participants’ head in a fixed position, with an eye-to-screen distance of 78 cm. Stimulus timing and response recording were controlled by Presentation software (version 16.1; Neurobehavioral Systems, Inc.).
Measures and data analysis
We used the proportion of correct outcome predictions (accuracy) to measure the amount of prediction error generated by the cues, with a low accuracy corresponding to a large prediction error. The data from those trials in which no response occurred during the appropriate interval were excluded from the accuracy analyses (<0.01%). We used custom MATLAB software (Koenig, 2010) for signal conditioning and the detection and parametrisation of saccades and fixations. A heuristic filter was applied to gaze position traces to eliminate impulse and ringing noise (Stampe, 1993; cf. SR-Research, 2010, 2.3.2.2 Option “File sample filter”). Absolute gaze velocity was computed as the Euclidean sum of x and y velocity components and was filtered using a 5-point moving average (cf. SR-Research, 2010, 4.5.3.4 Saccades). Saccades were detected if gaze velocity exceeded 30°/s, and for each saccade limits were set at 15% of peak velocity (Koenig, 2010; cf. SR-Research, 2010, 4.3.5 Saccadic Thresholds). Non-saccade intervals were defined as fixations. A fixation was scored as on-stimulus if it deviated less than 60 mm from the centre of that stimulus. The analysis interval was limited to the 4 s between the onset of the stimuli and the presentation of the response trigger. We excluded attentional data (dwell time and fixation probability, see below) from those trials in which (a) participants did not fixate on the cross before the onset of the cues (0.02%), and (b) an artefact had a duration above the 90th percentile (>380 ms, 0.01%). The 90th percentile was computed from the distribution of the durations of all the artefacts observed during the measurement interval (including data from all trials and participants). The durations in this distribution ranged from 1 to 1024 ms, with a median of 255 ms. Our primary attention measure was the dwell time on each type of cue. The dwell times were log-transformed to normalise their distributions. 
In addition, in order to track attention allocation within trials, we measured the probability of fixating on each stimulus—relevant cue, irrelevant cue, and trigger position—within the measurement interval, across 20 bins of 200 ms each. For each time bin, the dwell time on each stimulus was divided by 200, and the resulting relative times were averaged across trials.
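The velocity-based saccade detection described above can be sketched in Python. This is an illustrative reading of the stated criteria (5-point moving average, 30°/s detection threshold, saccade limits at 15% of peak velocity), not the custom MATLAB software used in the study; the function names and the synthetic gaze trace are assumptions, and the heuristic noise filter is omitted.

```python
import math


def gaze_speed(x, y, hz=1000.0):
    """Absolute gaze velocity (deg/s) from positions sampled in degrees."""
    return [math.hypot(x[i + 1] - x[i], y[i + 1] - y[i]) * hz
            for i in range(len(x) - 1)]


def moving_average(values, width=5):
    """Centred moving average; the window shrinks at the edges."""
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half):i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def detect_saccades(speed, threshold=30.0, limit_fraction=0.15):
    """Return (start, end) sample indices of each detected saccade."""
    saccades = []
    i, floor = 0, 0
    while i < len(speed):
        if speed[i] <= threshold:
            i += 1
            continue
        # Find the run of samples above the detection threshold.
        j = i
        while j + 1 < len(speed) and speed[j + 1] > threshold:
            j += 1
        peak = max(range(i, j + 1), key=speed.__getitem__)
        limit = limit_fraction * speed[peak]
        # Walk outwards from the peak until velocity falls to the limit.
        start = end = peak
        while start > floor and speed[start - 1] > limit:
            start -= 1
        while end + 1 < len(speed) and speed[end + 1] > limit:
            end += 1
        saccades.append((start, end))
        i = floor = max(end, j) + 1
    return saccades


# Synthetic trace: fixation, a 5-degree saccade at 200 deg/s, fixation.
x = [0.0] * 100 + [0.2 * k for k in range(1, 26)] + [5.0] * 100
y = [0.0] * len(x)
speed = moving_average(gaze_speed(x, y))
saccades = detect_saccades(speed)
```

Samples outside the returned intervals would then be treated as fixations and assigned to a stimulus according to their distance from its centre.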
We averaged the data from every two trials showing the same pair of cues, resulting in 12 analysis epochs during discrimination and four during partial reinforcement. For each measure, we conducted three ANOVAs, focusing on the discrimination stage, the partial-reinforcement stage, or the transition between them (the factors are detailed in the results section). Where appropriate, the Greenhouse and Geisser (1959) correction was applied to the degrees of freedom. The significance level was set at .05. Partial η2 and Cohen’s d were used as measures of effect size.
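For readers who wish to check the reported effect sizes, partial η2 for a single effect can be recovered from its F value and degrees of freedom through the standard identity partial η2 = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies it to two F values reported in the Results; the function name is an assumption.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Effect of epoch on accuracy during discrimination: F(11, 253) = 23.08.
eta_accuracy = partial_eta_squared(23.08, 11, 253)
# Effect of epoch on dwell time across the stage transition: F(1, 23) = 24.65.
eta_transition = partial_eta_squared(24.65, 1, 23)
```

Rounded to two decimals, these reproduce the reported values of .50 and .52.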
Results
Accuracy
Each model under consideration predicted changes in attention to the cues as a function of prediction error, with the latter being inversely related to the accuracy of outcome predictions. As shown in Figure 3, during discrimination, participants’ accuracy increased following an asymptotic pattern, indicating that prediction error was reduced to a minimum. This was confirmed by a one-way ANOVA showing a significant effect of epoch, F(11, 253) = 23.08, p < .001, partial η2 = .50, with significant linear and quadratic trends, Fs(1, 23) > 45.58, ps < .001, partial η2s > .66. From the last epoch of discrimination to the first epoch of partial reinforcement, a drop in accuracy pointed to a rise in prediction error, t(23) = 12.33, p < .001, d = 2.57. Then, accuracy showed an increase throughout partial reinforcement. Accordingly, a one-way ANOVA showed a significant effect of epoch, F(3, 69) = 5.08, p < .01, partial η2 = .18, with a significant linear trend, F(1, 23) = 5.90, p = .02, partial η2 = .20. However, the mean accuracy values remained close to the chance level (0.5), suggesting that the cues continued to generate large prediction errors.

Figure 3. Mean values of the accuracy of outcome prediction across epochs. Error bars represent SEM.
Fixation dwell time
Having established that the trained contingencies led to the expected changes in accuracy, we focused on dwell time, our primary attention measure (see Figure 4). Consistent with the presumed reduction in prediction error, the dwell times decreased during discrimination, particularly in the early epochs. Moreover, the decrease was less pronounced for the relevant than for the irrelevant cues. This pattern was confirmed by means of a Cue (relevant vs. irrelevant) × Epoch (1–12) ANOVA, which revealed a significant interaction, F(11, 253) = 3.62, p < .01, partial η2 = .14. Thus, although the dwell times on each type of cue were similar in Epochs 1 and 2, ts(23) < 1.78, ps > .09, ds < 0.37, throughout the remaining epochs the dwell times were longer on the relevant cues than on the irrelevant ones, ts(23) > 2.16, ps < .05, ds > 0.45. Across epochs, the decrease was significant for each type of cue, Fs(11, 253) > 14.15, ps < .001, partial η2s > .38, with significant linear and quadratic trends, Fs(1, 23) > 15.00, ps < .001, partial η2s > .39. In addition, the main effects of cue and epoch were also significant, Fs > 12.42, ps < .01, partial η2s > .35.

Figure 4. Mean dwell times on the cues across epochs. Error bars represent SEM.
Upon the presumed rise in prediction error following the introduction of partial reinforcement, the dwell times showed a similar increase for each type of cue. The dwell times continued to be longer on the relevant cues than on the irrelevant ones. Accordingly, a Cue (relevant vs. irrelevant) × Epoch (Discrimination, Epoch 12 vs. Partial reinforcement, Epoch 1) ANOVA showed significant main effects of both cue, F(1, 23) = 9.99, p < .01, partial η2 = .30, and epoch, F(1, 23) = 24.65, p < .001, partial η2 = .52. In addition, the interaction effect was non-significant, F(1, 23) = 3.02, p = .10, partial η2 = .12.
The dwell times on each type of cue became similar after the first epoch of partial reinforcement, and remained so until the end. This was confirmed by a Cue (relevant vs. irrelevant) × Epoch (1–4) ANOVA, which revealed a significant interaction, F(3, 69) = 3.19, p = .03, partial η2 = .12. Thus, although in the first epoch the dwell times were longer on the relevant than on the irrelevant cues, t(23) = 2.36, p = .03, d = 0.48, they were similar during the remaining epochs. However, the dwell times on each type of cue remained relatively constant throughout partial reinforcement, Fs(3, 69) < 1.79, ps > .18, partial η2s < .08. None of the main effects were significant, Fs(3, 69) < 1.11, ps > .32, partial η2s < .05.
Fixation probability
Finally, we examined the pattern of attentional deployment within trials. Figure 5 shows the probability of fixating on each stimulus across the 20 time bins into which the measurement interval was divided. We focused on the data of the first and last epochs of discrimination (Figure 5a and b) and the first and last epochs of partial reinforcement (Figure 5c and d). Overall, participants started fixating on the discrimination cues and then proceeded to look at the position of the response trigger.

Figure 5. Probabilities of fixating on each stimulus (i.e., relevant cue, irrelevant cue, and trigger position) within trials. (a) and (b) Data from the first and last epochs of discrimination, respectively. (c) and (d) Data from the first and last epochs of partial reinforcement, respectively.
We conducted three separate ANOVAs on the fixation probabilities concerning the cues. In the follow-up analyses, the significance values were adjusted through the Benjamini–Hochberg procedure for the multiple comparisons between time bins (Thissen et al., 2002). The first epoch of discrimination was compared to the last one by means of a Cue (relevant vs. irrelevant) × Epoch (Discrimination, Epoch 1 vs. Discrimination, Epoch 12) × Bin (1–20) ANOVA. At the outset of discrimination, when participants had not yet learned to predict the outcomes, the fixation probabilities were similar for both types of cues. However, by the end of the stage, when prediction error was presumably at a minimum, the probabilities were higher for the relevant cues than for the irrelevant ones. This was confirmed by a significant Cue × Epoch interaction, F(1, 23) = 10.88, p < .01, partial η2 = .32, with similar fixation probabilities in the first epoch, t < 1, but not in the last one, t(23) = 2.64, p = .02, d = 0.55. Across epochs, the probabilities decreased for each type of cue, ts(23) > 7.78, ps < .001, ds > 1.71. In addition, the probability of fixating on the cues at the end of the trial diminished across epochs. Accordingly, a significant Epoch × Bin interaction, F(19, 437) = 15.03, p < .001, partial η2 = .40, pointed to a decrease in the last 3 s, ts(23) > 3.50, ps < .01, ds > 0.68, and also to an increase in the interval between 200 and 400 ms, t(23) = 2.29, p = .04, d = 0.51. The main effects of epoch and bin were significant, Fs > 143.88, ps < .001, partial η2s > .86, but none of the remaining effects were, Fs < 3.13, ps > .08, partial η2s < .13.
From the last epoch of discrimination to the first epoch of partial reinforcement, and consistent with the rise in prediction error, the fixation probabilities increased for both types of cues. A Cue (relevant vs. irrelevant) × Epoch (Discrimination, Epoch 12 vs. Partial reinforcement, Epoch 1) × Bin (1–20) ANOVA revealed a significant Epoch × Bin interaction, F(19, 437) = 7.90, p < .001, partial η2 = .26. Thus, the increase occurred in the first 200 ms, then in the interval between 600 and 800 ms, and finally in the last 3 s, ts(23) > 2.26, ps < .04, ds > 0.43. In addition, the probabilities continued to be higher for the relevant cues than for the irrelevant ones, as indicated by a main effect of cue, F(1, 23) = 7.36, p = .01, partial η2 = .24. The main effects of epoch and bin were also significant, Fs > 24.96, ps < .001, partial η2s > .51, but none of the other effects were, Fs < 1.81, ps > .13, partial η2s < .08.
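The Benjamini–Hochberg adjustment applied to the bin-wise follow-up tests can be sketched as follows. This Python example computes adjusted p values with the standard step-up rule; whether the original analysis reported adjusted p values or compared raw p values against rank-dependent thresholds is not stated, so this is an assumption, and the p values in the example are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Work from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p values from four bin-wise comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
```

Each adjusted value is then compared with the significance level (.05 in the present study), which controls the false discovery rate across the set of comparisons.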
In the last epoch of partial reinforcement, when prediction error should still be large, the difference in the fixation probabilities between the two types of cues appeared to reverse. A Cue (relevant vs. irrelevant) × Epoch (Partial reinforcement, Epoch 1 vs. Partial reinforcement, Epoch 4) × Bin (1–20) ANOVA showed a significant Cue × Epoch interaction, F(1, 23) = 5.25, p = .03, partial η2 = .19. However, the follow-up tests revealed no difference between cues in either epoch, and no change across epochs for either cue, ts(23) < 2.01, ps > .05, ds < 0.28. Among the remaining effects, only the main effect of bin was significant, F(19, 437) = 192.43, p < .001, partial η2 = .89 (all the other Fs < 1.81, ps > .12, partial η2s < .08).
Discussion
In the present study, participants’ eye movements were recorded as they predicted the outcomes following different pairs of cues. During an initial discrimination, one of the cues in each pair was relevant for making correct predictions, because it signalled the same outcome on every trial. The other cue was irrelevant because it was equally associated with each of the two possible outcomes. In a subsequent partial reinforcement stage, each pair was followed by one outcome on a random half of the trials, and by the other outcome on the remaining half. Thus, it was no longer possible to make accurate predictions. Our results revealed that outcome predictions became fully accurate during the initial discrimination, suggesting that they were based on the relevant cues, whose prediction errors should have been reduced to a minimum. Accordingly, an attentional preference emerged for the relevant cues in both the dwell times and the fixation probabilities. For both types of cues, there was a decrease in the dwell time, and also in the fixation probabilities concerning the last 3 s of the measurement interval. Accuracy dropped to chance level immediately following the introduction of partial reinforcement, and it stayed at that level throughout the stage, thus suggesting sustained large prediction errors. Consistently, the dwell times and the fixation probabilities showed an increase in attention for the two types of cues. In addition, the attentional preference for the relevant cues was no longer apparent beyond the first epoch of partial reinforcement.
The present findings are in full agreement with the predictions of the hybrid model by Le Pelley (2004, 2010), which assumes a combined effect of two associability mechanisms. One of them, based on Mackintosh’s (1975) theory, should give attentional preference to the relevant cues during discrimination, as they should be associated with smaller relative prediction errors than the irrelevant cues. The other mechanism, put forward in the Pearce and Hall (1980) model, should determine the amount of attention paid to each cue as a direct function of overall prediction error (i.e., the error produced by all the present cues together). It therefore predicts a decrease in attention during discrimination and an increase following partial reinforcement. The reduction in the dwell times during discrimination could also be accounted for by increased familiarity with the stimuli. Given that participants would need less and less time to identify the cues, they could move their eyes earlier towards the position of the response trigger, where action was needed next. However, this suggestion cannot explain the increase in the dwell times following the onset of partial reinforcement, when participants should be equally or more acquainted with the stimuli than during discrimination. In contrast, the changes in absolute dwell time observed throughout the task fitted the pattern anticipated by the Pearce–Hall component of Le Pelley’s hybrid model. Neither Mackintosh’s (1975) theory nor the Pearce and Hall (1980) model by itself is able to fully explain the observed pattern. Regarding the amount of attention paid to the relevant cues, Mackintosh’s theory did not anticipate a decrease during discrimination, or an increase following partial reinforcement. Conversely, the Pearce–Hall model did not predict an attentional preference for the relevant cues during discrimination.
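The interplay of the two mechanisms can be illustrated with a toy simulation. The sketch below is not Le Pelley's (2004) model and not the simulation reported for this study. It is a deliberately simplified stand-in with two associabilities per cue: an alpha that grows when a cue's own prediction error is smaller than its partner's (Mackintosh-style), and a sigma that tracks recent overall prediction error (Pearce–Hall-style). The cue labels (A, B, X, Y), the update rules, the parameter values (lr, theta, gamma), and the floor of 0.05 are all assumptions made for illustration.

```python
import random

RELEVANT = ("A", "B")    # hypothetical labels: A always -> outcome 0, B always -> outcome 1
IRRELEVANT = ("X", "Y")  # hypothetical labels: paired equally often with both outcomes
PAIRS = [("A", "X", 0), ("A", "Y", 0), ("B", "X", 1), ("B", "Y", 1)]


def clip(value, low, high):
    return max(low, min(high, value))


def simulate(disc_blocks=50, partial_blocks=5, lr=0.3, theta=0.1, gamma=0.3, seed=1):
    """Run discrimination, then partial reinforcement; return one record per trial."""
    rng = random.Random(seed)
    V = {c: [0.0, 0.0] for c in RELEVANT + IRRELEVANT}  # strength towards each outcome
    alpha = {c: 0.5 for c in V}  # Mackintosh-style associability (relative error)
    sigma = {c: 1.0 for c in V}  # Pearce-Hall-style associability (overall error)

    schedule = []
    for _ in range(disc_blocks):  # 4 trials per block, one per pair
        block = [("discrimination", r, i, o) for r, i, o in PAIRS]
        rng.shuffle(block)
        schedule += block
    for _ in range(partial_blocks):  # 8 trials per block: every pair with every outcome
        block = [("partial", r, i, o) for r, i, _old in PAIRS for o in (0, 1)]
        rng.shuffle(block)
        schedule += block

    records = []
    for stage, rel, irr, outcome in schedule:
        lam = [1.0 if o == outcome else 0.0 for o in (0, 1)]
        # Overall error: outcome minus the summed prediction of both present cues
        overall = [lam[o] - (V[rel][o] + V[irr][o]) for o in (0, 1)]
        # Own error: how far each cue alone is from predicting the outcome
        own = {c: sum(abs(lam[o] - V[c][o]) for o in (0, 1)) for c in (rel, irr)}
        mean_abs_overall = sum(abs(e) for e in overall) / 2

        for c in (rel, irr):  # learning is gated by both associabilities
            for o in (0, 1):
                V[c][o] += lr * alpha[c] * sigma[c] * overall[o]
        # Mackintosh-style rule: the cue with the smaller own error gains alpha
        for c, other in ((rel, irr), (irr, rel)):
            if own[c] < own[other]:
                alpha[c] = clip(alpha[c] + theta, 0.05, 1.0)
            elif own[c] > own[other]:
                alpha[c] = clip(alpha[c] - theta, 0.05, 1.0)
        # Pearce-Hall-style rule: sigma is a running average of overall error
        for c in (rel, irr):
            sigma[c] = clip(gamma * mean_abs_overall + (1 - gamma) * sigma[c], 0.05, 1.0)

        records.append({"stage": stage, "alpha_rel": alpha[rel], "alpha_irr": alpha[irr],
                        "sigma": (sigma[rel] + sigma[irr]) / 2})
    return records


records = simulate()
last_disc = [r for r in records if r["stage"] == "discrimination"][-1]
partial = [r for r in records if r["stage"] == "partial"]
print(last_disc["alpha_rel"], last_disc["alpha_irr"])
print(last_disc["sigma"], sum(r["sigma"] for r in partial) / len(partial))
```

Under these assumptions, alpha should end discrimination higher for the relevant than for the irrelevant cue, while sigma should fall as predictions become accurate and rise again once outcomes become unpredictable. This is the qualitative pattern the paragraph above attributes to the two components.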
The first evidence for the hybrid model in humans came from a study conducted by Beesley et al. (2015). Participants were trained with different pairs of cues, which were associated with their respective outcomes in either a deterministic or a probabilistic way. The probabilistic pairs were followed by a primary outcome on 67% of the trials and by another outcome on the remaining 33%, and thus should have produced larger overall prediction errors than the deterministic pairs, which were consistently followed by just one outcome. In addition, each pair contained a relevant cue that was (mostly) associated with one outcome, and an irrelevant cue that was equally associated with two outcomes. Thus, within pairs, the former cue should have produced smaller relative prediction errors than the latter. The authors found that the probabilistic pairs were fixated for longer times than the deterministic pairs and, within the latter, the same was true for the relevant cues compared to the irrelevant ones. In agreement with Beesley et al., our results suggest that changes in overt attention consistent with a combination of two associability mechanisms will be apparent in situations characterised by variations in both relative and overall prediction error, that is, situations where concurrent stimuli differ in their predictive validities and the overall predictive accuracy changes across instances (see also Easdale et al., 2019; Walker et al., 2019, in press).
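To make the contrast between deterministic and probabilistic schedules concrete, the following sketch computes the expected absolute prediction error for a learner whose prediction has converged on the true probability p of the primary outcome. That convergence is an assumption made here for illustration (it is the asymptote of simple error-correcting rules). The function name is hypothetical, and the computation is not taken from Beesley et al. (2015).

```python
def expected_abs_error(p_primary: float) -> float:
    """Expected |outcome - prediction| when the prediction equals p_primary.

    The primary outcome is coded 1 and occurs with probability p_primary;
    otherwise the outcome is coded 0.
    """
    if not 0.0 <= p_primary <= 1.0:
        raise ValueError("p_primary must lie between 0 and 1")
    prediction = p_primary  # assumption: the learner has converged on p
    error_if_primary = abs(1.0 - prediction)
    error_if_other = abs(0.0 - prediction)
    return p_primary * error_if_primary + (1.0 - p_primary) * error_if_other


schedules = [
    ("deterministic (100%)", 1.0),
    ("probabilistic (67%)", 0.67),
    ("partial reinforcement (50%)", 0.5),
]
for label, p in schedules:
    print(f"{label}: {expected_abs_error(p):.2f}")
```

Under this assumption the expected error is 0 for deterministic pairs, about 0.44 for 67% pairs, and reaches its maximum of 0.5 when each outcome follows on half of the trials, as in the partial reinforcement stage of the present study.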
Other studies in humans have provided evidence at variance with the hybrid model. Le Pelley et al. (2010) aimed to test the prediction that a procedure emphasising differences in overall prediction error should produce an attentional advantage for cues generating large prediction error. In four related experiments, participants learned a discrimination in which all the cues were presented separately. Some of them were consistently paired with one of two possible outcomes, whereas others were paired with both, each on a random half of the trials. Thus, the latter cues should have produced larger overall prediction errors than the former. The changes in associability resulting from the initial training were tested in a subsequent discrimination in which the same cues were paired with two new outcomes. Contrary to the prediction, learning of the second discrimination was faster for the cues that had been previously associated with small overall prediction errors. Therefore, the authors concluded that the hybrid model might not apply to humans (for similar findings, see Kattner, 2015; Livesey et al., 2011, Experiments 3 and 4).
The evidence reviewed in the previous paragraphs indicates that the hybrid model has received support from studies using eye-tracking, but not from those relying on speed of subsequent learning. Recently, it has been suggested that participants may acquire knowledge about the relational structure of learning tasks in a cue-specific way (Livesey et al., 2019). This type of knowledge could explain the discrepancy between the two measures of attention. For instance, in the first stage of the study conducted by Le Pelley et al. (2010), participants might have learned whether each particular cue was followed by only one outcome or two. Associability was then measured in a discrimination that involved pairing each of the cues with just one outcome. If participants applied the relational knowledge acquired in the first stage, learning would have been relatively easy for the cues previously followed by one outcome (i.e., those producing small overall prediction error), because for those cues the relational structure was congruent across stages. This effect would be unrelated to changes in associability. However, Beesley et al. (2015) suggested that the associability changes based on the Pearce and Hall (1980) mechanism might not generalise to new learning situations. This suggestion may also reconcile the above-mentioned discrepancy. In fact, the only human study showing facilitated learning for inconsistently reinforced cues (Griffiths et al., 2011) tested associability changes by increasing the magnitude of the original outcome, rather than using new outcomes in a subsequent learning task. Griffiths et al. showed that presenting non-reinforced trials before increasing the magnitude accelerated the prediction update.
Studies on the influence of associative learning on overt attention typically rely on self-paced designs, in which dwell times are measured on each trial from cue onset until participants indicate the expected outcome. The dwell time on each cue is then expressed as a proportion of the response time (e.g., Easdale et al., 2019; Kruschke et al., 2005). In contrast, we used a fixed presentation interval and based the analysis on absolute dwell times. This choice was motivated by two reasons. First, although proportional times capture differences in attention between cues, they may miss relevant changes in absolute levels of attention (e.g., if attention increases equally for each cue, as predicted by the Pearce–Hall model). Second, different processes may be operating at different points within the cue-outcome interval. For instance, Lachnit et al. (2013) evaluated several associative learning theories assuming either elemental or configural stimulus processing. They conducted four experiments differing in the time available for solving discriminations. Configural theories successfully predicted all the data reflecting early cue processing, whereas elemental theories turned out to be correct in predicting data reflecting later stages of processing. Fixed cue durations may be better suited to study temporal aspects of cue processing.
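The first of these reasons can be shown with a small numerical example. The dwell and response times below are hypothetical values chosen for illustration, not data from the study, and the helper name is an assumption. Both cues receive 50% more dwell time on the later trial, yet the proportional measure is unchanged because the response time grows by the same factor.

```python
def proportional_dwell(dwell_ms: dict, response_time_ms: float) -> dict:
    """Express each cue's dwell time as a proportion of the response time."""
    return {cue: t / response_time_ms for cue, t in dwell_ms.items()}


# Hypothetical self-paced trials (all times in milliseconds)
early = {"dwell": {"relevant": 600, "irrelevant": 400}, "rt": 2000}
late = {"dwell": {"relevant": 900, "irrelevant": 600}, "rt": 3000}

early_prop = proportional_dwell(early["dwell"], early["rt"])
late_prop = proportional_dwell(late["dwell"], late["rt"])

# Absolute change in dwell time per cue between the two trials
absolute_change = {cue: late["dwell"][cue] - early["dwell"][cue] for cue in early["dwell"]}

print(early_prop == late_prop)  # True: the proportions are identical
print(absolute_change)          # {'relevant': 300, 'irrelevant': 200}
```

An analysis based only on proportions would register no change here, whereas absolute dwell times reveal the uniform increase that a Pearce–Hall mechanism would predict.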
Despite the potential advantages of using a fixed measurement interval, when participants have no reason to look elsewhere after fixating on the relevant cues, a ceiling effect may occur on the dwell time (e.g., Torrents-Rodas et al., 2021). In order to avoid this, we presented the response trigger very briefly at the end of the measurement interval, on a known screen location. Indeed, the fixation probabilities showed that the trigger competed for attention with the cues towards the end of the interval. Participants initially looked at the cues and then moved their eyes towards the position where the response trigger was set to appear. Importantly, this within-trial shift was modulated by learning, and it was in accordance with the associability mechanism suggested by Pearce and Hall (1980). Thus, by the end of discrimination, when prediction accuracy was high, participants fixated on the cues for a short time and then directed their eye gaze towards the location of the response trigger. The introduction of partial reinforcement, which was associated with an increase in prediction error, significantly extended the interval in which participants fixated on the cues, to the detriment of the amount of attention paid to the trigger location. To the best of our knowledge, this measurement method has been used for the first time in the present study. We believe that it provides a standardised way to measure dwell times across trials and participants and to track fixation patterns within trials (for an example of within-trial fixation patterns in intervals of variable duration in the field of categorisation, see Blair et al., 2009). As pointed out above, monitoring within-trial time might well be important in the development and evaluation of theories of associative learning (Lachnit et al., 2013).
In the present study, we tested the predictions of Le Pelley’s hybrid model against those of Mackintosh’s theory and the Pearce–Hall model. However, our results may also be explained by other implementations of the hybrid model (e.g., Pearce & Mackintosh, 2010). Moreover, instead of the associability mechanism suggested by Mackintosh (1975), the attentional preference for the relevant cues could be accounted for by the different associative strengths of the relevant and the irrelevant cues. Thus, the pattern of overt attention observed throughout the experiment would result from a function combining the influence of the differences in associative strength with the influence of overall prediction error (i.e., the Pearce–Hall mechanism; see Koenig, Kadel, et al., 2017; Koenig, Uengoer, & Lachnit, 2017). Finally, Esber and Haselgrove (2011) put forward a model that can explain attentional patterns suggestive of two separate associability mechanisms by means of a single process. According to their model, the amount of attention paid to cues is an additive function of the separate associative strengths that they have developed with different outcomes, including associations with the absence of an expected outcome. To our understanding, this model is also equipped to explain most of our results, with the exception of the complete loss of differential allocation of attention between the relevant and the irrelevant cues observed during partial reinforcement.
By tracking participants’ gaze, the present study provided evidence for a combined effect of two associability mechanisms in determining attentional changes during learning. One of these mechanisms is consistent with a comparison of the individual prediction errors associated with each of the present cues, thus selecting the cue with the best predictive accuracy. The other mechanism is related to the prediction error produced by all the cues, thus increasing attention, or keeping it at a high level, if the outcome cannot be accurately predicted. Our results add to recent findings in human participants that are consistent with the predictions of hybrid models of attention. Moreover, the effects of relative and overall prediction error may be considered to be analogous to those of relevance and uncertainty, respectively, described in other areas of experimental research (e.g., Gottlieb et al., 2020; Rehder & Hoffman, 2005). Future research should further elucidate to what extent the attentional changes accounted for by each mechanism are transient or generalise to new learning situations.
Supplemental Material
Supplemental material, sj-docx-1-qjp-10.1177_17470218211019308 for Evidence for two attentional mechanisms during learning by David Torrents-Rodas, Stephan Koenig, Metin Uengoer and Harald Lachnit in Quarterly Journal of Experimental Psychology
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The study was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), under project number 290878970-GRK 2271 project 5 and project number 222641018–SFB/TRR 135 project B4.