Prediction and Prioritisation: How Reward Learning Shapes Attentional Capture

Abstract

A traditional view of selective attention distinguishes between goal-directed and stimulus-driven mechanisms of attentional control. More recently, a large (and growing) body of research has identified a third class of control system—termed selection history—wherein attentional prioritisation is shaped by our prior experience with stimuli, independently of our goals and the physical salience of those stimuli. This article reviews work within this selection history literature demonstrating that prioritisation is rapidly and automatically modulated by learning about the rewards associated with stimuli, and argues for a framework that distinguishes between history-driven processes implementing attentional exploitation (the drive to leverage reliable information) and attentional exploration (the drive to resolve uncertainty, with the aim of validating potential new sources of information). Findings such as these highlight a fundamental and intricate interaction between learning and attention, wherein our prior experience shapes the way in which we extract information from our environment—with potential consequences for understanding the subsequent decisions that we make and choices that we take.

Keywords

attention capture selection history reward learning conditioning sign tracking

Introduction

It has been argued that we are living in an ‘attention economy’ (Davenport & Beck, 2001; Simon, 1971; J. Williams, 2018). In the digital age, the cost of producing and communicating information is lower than it has ever been, to the point that only a tiny fraction of the digital data that is generated receives any analysis at all (estimated at just 0.5% in 2015: Carpentier, 2023). Consequently, human attention—the cognitive process of filtering down the sources of information that enter our awareness—becomes the bottleneck for consumption. As such, attention has come to be seen as a scarce, and monetisable resource—because what we pay attention to shapes every decision that we make (Krajbich, 2019; Nelson-Field, 2024; Pearson et al., 2022).

At its heart, attention is about selection, or prioritisation: we are limited in the number of sensory stimuli that we can analyse at any one time, so we must prioritise some sources of information for further analysis (and potentially action), and de-prioritise others. In the context of vision—where most of the research on attention has been conducted—an influential framework sees visual attention as being implemented at the level of an attentional priority map, which provides a real-time representation of the priority of all of the stimuli currently present in the visual field, with selection occurring at the region of highest priority (Fecteau & Munoz, 2006). As originally conceptualised, priority in this map was held to reflect the combined influence of goal-directed and stimulus-driven inputs (Serences et al., 2005; Yantis, 2000). Here, goal-directed attentional control reflects salience that is based on the intentions of the observer: for example, if my goal is to find broccoli in the supermarket, I might assign a high salience to green items. By contrast, stimulus-driven attentional control reflects salience that derives from the physical distinctiveness of stimuli (in terms of their colour, motion, luminance, etc.) independently of the observer’s goals: for example, while looking for broccoli, my attention may be involuntarily captured by a brightly coloured advertisement for sponges, even though I have no current desire to buy sponges (e.g. Theeuwes, 1994; for a recent overview of the attention capture debate, see Luck et al., 2021).

More recently, however, this dichotomy has been challenged by findings suggesting that our attention is influenced by our prior experience in ways that fall outside the traditional distinction between goal-directed and stimulus-driven control (for reviews, see Awh et al., 2012; Le Pelley et al., 2016; Theeuwes, 2019). The underlying principle here is that our world is (in general) highly repetitive and therefore predictable, and our attention system can learn to make use of this predictability in order to learn settings for attentional control. For example, through substantial experience I have learned that a particular shade of purple (Pantone 2685C, to be precise) is associated with Cadbury’s chocolate bars, which I enjoy eating. As a result, my attention system might prioritise detection of this reward-associated colour in the future—to the point that, while shopping, a flash of purple in the confectionary aisle may capture my attention even though it is unrelated to my current goal (to buy broccoli), and is not particularly physically distinctive against the heterogeneously coloured background of the other items on the shelves.

Awh et al. (2012) introduced the term ‘selection history’ to refer to influences of prior experience on attentional priority that cannot be explained by either the voluntary selection goals of the observer or the physical characteristics of the selected items. The literature in this area has seen an explosion in recent years, with Google Scholar identifying over 6,000 articles referencing ‘selection history’ since Awh et al.’s paper. Researchers have now identified a range of ways in which experience influences attention, including (but not limited to) effects of prior selection as a target or rejection as a distractor (Kristjánsson & Campana, 2010), and statistical learning about likely target and/or distractor features (Sha et al., 2017; Stilwell et al., 2019) or locations (Geng & Behrmann, 2005; Wang & Theeuwes, 2018). Indeed, the concept of selection history is in danger of becoming something of a grab-bag for a range of phenomena that sit only loosely together and may not be mediated by a common, unitary mechanism. Indeed, it has been argued that selection history itself may be fractionated into a number of dissociable sub-processes (Anderson et al., 2021), raising the possibility that selection history should be viewed more as a general conceptual framework than as a specific mechanism. But I shall not pursue that debate further here; instead, in the remainder of this article I will focus on a particular (and particularly well-studied) facet of selection history, relating to how attention is influenced by learning about the predictive relationship between stimuli and rewards—and I will argue that research in this area can be usefully understood within a framework that distinguishes between attentional processes implementing exploitation versus exploration of reward information.

Reward Learning and Attentional Exploitation

Central to the idea of learning, and prediction more generally, are the closely (and inversely) related concepts of information and uncertainty. Stimuli are predictive to the extent that they provide information about—and hence reduce uncertainty about—what will happen next. In our stochastic and dynamic environment, there is clearly value in being able to anticipate the future, and in line with this idea, humans are voracious seekers of predictive information: weather forecasts, restaurant reviews, blood tests, stock market reports, and so on, all provide (perhaps imperfect) information that can be exploited to guide the decisions we make and actions we take.

A central tenet in the field of conditioning is that associative learning operates in the service of maximising reward through changing behaviour.¹ And this idea applies not only to overt behaviour, but also to attention (the ‘behaviour’ of prioritising stimuli for further analysis). That is, our attention system incorporates a mechanism that acts to prioritise stimuli that provide predictive information about upcoming rewards.

This impact of reward on attention has been shown in a wide range of procedures, using monetary rewards, food rewards, and social rewards, and investigating impacts on both spatial and non-spatial attention (for reviews, see Pearson et al., 2022; Rusz et al., 2020). Consider, for example, a study by Le Pelley et al. (2015; see also Pearson et al., 2016) in which, on each trial, participants were tasked with making a rapid eye movement (saccade) to a diamond-shaped target in an array of circles (see Figure 1a). Eye tracking provides an excellent index of attention since gaze and attention are typically tightly coupled: every saccadic eye movement is closely preceded by a shift of attention. Moreover, eye movements constitute an ‘online’ measure: we can infer shifts of attention directly, on a millisecond-by-millisecond basis, as opposed to making indirect inferences about attention on the basis of (say) differences in mean response time across sets of trials conducted under different conditions. Le Pelley et al.’s task was based on the additional singleton procedure (Theeuwes, 1992); one of the circles in the display was a colour-singleton distractor, coloured blue or red, while all other items (including the target) were grey. Critically, the colour of this distractor signalled the magnitude of reward that was available for making a saccade to the target, with blue and red assigned to the roles of high-value and low-value colours in a counterbalanced fashion across participants (Figure 1b). If the distractor was rendered in the high-value colour, there was a relatively large reward available for looking at the target (10 cents), whereas if the distractor was rendered in the low-value colour, there was a smaller reward available (1 cent).

Figure 1.

Le Pelley et al.’s (2015) value-modulated attentional capture task: (a) Participants’ task on each trial was to look at a diamond-shaped target. The colour of a colour-singleton circle in the display—termed the distractor—signalled the magnitude of available reward, (b) In this example, a red distractor in the search display signals a high-value reward (10c), and a blue distractor signals a low-value reward (1c): in the actual experiment, colour–reward relationships were counterbalanced across participants, (c) Over the course of the task, participants were more likely to look at the high-value distractor than the low-value distractor, even though this was counterproductive since looking at the distractor resulted in cancellation of the reward, (d) Proportion of first saccades directed towards the distractor as a function of saccade latency (i.e. time from onset of the search display to initiation of the saccade). The pattern of greater gaze capture by the high-value (vs. low-value) distractor was present even among participants’ lowest-latency (i.e. fastest) saccades. Data are redrawn from Le Pelley et al. (2015, Experiment 3).

Notably, participants were never required to attend to the reward-signalling distractor: their task was to look at the target to earn a reward. In fact, the task was arranged so that if participants ever did look at the coloured distractor prior to looking at the target, the reward that would have been delivered on that trial was cancelled. Consequently, the best strategy in this task was to ignore coloured items entirely and look directly at the target on every trial. Nevertheless, participants’ gaze was sometimes captured by the distractor, and the key finding is that their attention was more likely to be captured by the high-value distractor than the low-value distractor—even though this pattern was particularly counterproductive since it meant that participants missed out on more of the high-value rewards than low-value rewards (Figure 1c). The implication is that learning about the predictive relationship between distractors and rewards led to attentional prioritisation of the high-value distractor. This influence of reward value on attention cannot have been due to stimulus-driven control, since the physical salience of the high- and low-value distractors was equated across participants by counterbalancing. Moreover, it seems clear that the effect does not reflect goal-directed control of attention, since participants’ task was to look at the target, not the distractor, to earn points, and so attending to distractors was contrary to the task-goal. Even stronger evidence that this influence of reward value on attention is not a consequence of goal-directed attention comes from studies showing that the same influence is observed even if participants are explicitly instructed (at the start of the experiment) that looking at the distractors leads to cancellation of reward (Pearson et al., 2015; Watson et al., 2019, 2020), and hence are fully aware that attending to the distractors is directly contrary to their goal of earning points. As such, this value-modulated attentional capture (VMAC) effect constitutes an example of selection history automatically shaping attentional priority. Indeed, the fact that the VMAC effect persists (and if anything increases) over the course of hundreds or even thousands of trials (Le Pelley et al., 2015)—even though participants know that attending to distractors is counterproductive—is a testament to the power of automatic, experience-driven processes in shaping attention. Further consistent with the idea that VMAC reflects the operation of an early and automatic attentional process, this influence of reward on attentional capture has been shown to be present even among participants’ fastest saccades, within 180 ms of stimulus onset (Pearson et al., 2016): see Figure 1d.

A notable feature of the VMAC procedure described above is that the critical distractor stimuli constitute Pavlovian signals of reward. That is, these stimuli provide predictive information about the magnitude of the upcoming reward, but are not the stimuli that participants must respond to in order to earn that reward. Consequently, the observed pattern of VMAC must reflect prioritisation modulated purely by information value, rather than by instrumental significance for current decision-making (cf. Doyle et al., 2025). In this regard, VMAC has notable similarities to the phenomenon of sign-tracking observed in animal research, wherein animals are sometimes observed to approach and interact with a cue (the ‘sign’) that predicts a reward, rather than carrying out the response that obtains the reward itself (Colaizzi et al., 2020; Heck et al., 2025; Le Pelley et al., 2024).² Like VMAC, this sign-tracking behaviour is observed even under an omission schedule in which interacting with the sign results in omission of reward (Atnip, 1977; D. R. Williams & Williams, 1969).

VMAC and sign-tracking are consistent with the idea that the attention system acts rapidly and automatically to prioritise processing of signals of reward (indeed, other research suggests that the influence of reward may be even more deep-seated, modulating the initial perception of stimuli, i.e. whether and how they are perceptually encoded in the first instance: Cheng et al., 2021; Serences, 2008). This interpretation makes a connection with the concept of incentive salience in the behavioural neuroscience literature: the idea that signals of desirable outcomes become salient (and hence attention-grabbing) in their own right: ‘motivational magnets’ that come to elicit approach behaviour (Berridge & Robinson, 2003).

Intuitively, it makes sense that our attentional system should be geared towards prioritising information about the availability of reward. Rewards are (by definition) of value to us, and hence signals of reward are also likely to be of value since they can be used to guide ongoing decisions and behaviour in a way that makes obtaining reward more likely (though not in the unusual situation of the VMAC procedure, as discussed further in the next section). For example, if a forager has previously experienced that red berries are delicious, then it makes sense to prioritise detection of red items while searching for food in future, since red has become a signal for reward. Hence, we can characterise this influence of reward-signalling as a mechanism of ‘attentional exploitation’: our cognitive system acts to prioritise processing of items that provide reliable information about valued events, since the information provided by these stimuli can be used to guide (and optimise) ongoing behaviour.

It is clearly adaptive for our attention system to exploit prior (learned) knowledge when implementing attentional control settings. But why have a distinct, selection-history-based mechanism that implements this effect of reward learning on attentional priority automatically? Why not instead implement the influence of reward learning—and potentially other types of learning—via goal-directed attentional control (with the forager applying a top-down control setting that prioritises red items)? Anderson (2021) has argued that a key benefit afforded by involuntary attentional control is that it allows for offloading of attentional demands. Goal-directed search relies on creating and maintaining an attentional target template in active memory, making it cognitively effortful and slow (Theeuwes, 2018). By contrast, automating patterns of attentional control negates the need for this metabolically costly, volitional process; instead, repeated experience (e.g. of a stimulus predicting reward) causes settings to become wired in and implemented automatically and rapidly via a dedicated ‘hands free’ circuit, obviating the need for an active search template (Shiffrin & Schneider, 1977).

Offloading attentional demands to a dedicated selection-history mechanism for executing involuntary, experience-based attentional control comes at a potential cost, however. Through this approach, the attention system is essentially implementing a general policy that past experience is a useful guide for future behaviour. In the context of reward learning, this policy takes two forms: (1) The existence of the system implements the policy that it is (in general) beneficial to attend to reward-signalling stimuli; and (2) The influence of learning implements the policy wherein stimuli that signalled reward in the past will continue to do so in the future. I will consider each of these ideas in turn.

The Value of Attentional Exploitation

With regard to the first of these assumptions, in the regular environment, signals of reward are usually ‘tied to’ the reward itself, as in the case of our forager, where the signal (red colour) is closely integrated with the reward (the sweet taste of the berry). Under these conditions, responding to the signal will typically be beneficial for accessing the reward itself. As such, it does not seem unreasonable to assume that, over our evolutionary history, a policy of prioritising reward-signalling stimuli would be generally adaptive. In this sense, the VMAC procedure—wherein attending to reward-signalling distractors is directly counterproductive to obtaining reward—creates an artificial situation that is unlikely to arise often in the real world. In some sense, that is the point of the procedure: by putting an evolved system in an unusual situation we can learn something about the underlying nature of that system, much as visual illusions such as the Kanisza triangle or the hollow-face illusion (Gregory, 1970) are not stimuli that would arise naturally, but tell us about the nature of the processes underlying visual perception; a similar logic applies to the ‘cognitive illusions’ used by Tversky and Kahneman (1974) to probe the heuristics and biases underlying judgement and decision-making.

That said, the policy of prioritising reward-signalling stimuli can have drawbacks, particularly in the modern world where we are surrounded by signals of reward: logos of fast food restaurants, advertisements for luxury products, social media notifications on our phones, and so on. This reward-laden environment may increase the incidence of situations in which involuntary prioritisation of reward signals leads to distraction from ongoing tasks, rather than being helpful—increasingly so as advertisers’ reach into our daily lives becomes more all-pervading. For example, Ford recently patented a system allowing in-vehicle screens to display advertisements based on what the car’s cameras see passing outside (Kierstein, 2024), raising concerns about distracting attention from the road, particularly if drivers are tired or engaged in other tasks, which may limit their ability to use goal-directed control to overcome distraction by reward-associated stimuli.

The idea that reward-modulated attention can have negative consequences is particularly germane in the context of addiction. Many addictive drugs produce potent neural reward signals (Hyman, 2005), and so cues that are repeatedly paired with these drugs (packaging, locations, smells, etc.) may come to be automatically prioritised as reward signals, becoming ‘motivational magnets’ in their own right that elicit craving and consequently drug-seeking (Le Pelley et al., 2024; Tomie & Morrow, 2018; Tomie et al., 2008). Consistent with this idea, evidence suggests that people who are more prone to automatically prioritising reward-signalling stimuli (as measured in laboratory tasks using small monetary rewards) are also more prone to addictive and addiction-related behaviours (for a review, see Le Pelley et al., 2024). For example, in a recent preregistered study, Watson et al. (2024) found that the magnitude of the VMAC effect (that is, the tendency to prioritise signals of reward) was associated with severity of problematic alcohol use in a treatment-seeking clinical sample of alcohol users, and predicted the likelihood of participants having returned to use at a three-month follow-up. Further relevant evidence comes from animal studies of sign-tracking (which, as noted earlier, provides an analogue to VMAC as a measure of incentive salience). Some strains of rats show evidence of individual differences in their tendency towards sign-tracking, and rats exhibiting stronger sign-tracking tendency also show more evidence of ‘impulsive’ behaviour in tests of decision-making (Tomie et al., 1998), are more motivated to earn cocaine (Saunders & Robinson, 2010), are more sensitive to cocaine cues, and are more likely to relapse in response to drug-associated cues (Saunders & Robinson, 2010) compared to rats who do not show evidence of sign-tracking (for a review, see Robinson et al., 2018).

Thus attentional exploitation may be a double-edged sword: while there is clearly value to a process implementing automatic prioritisation of reward signals, this system may not be ideally evolved to suit our current reward-saturated environment.

Attentional Exploitation in a Dynamic Environment

The second policy implemented by a ‘hands free’ mechanism subserving automatic, learned prioritisation of reward signals is that stimuli that have signalled reward in the past will continue to do so in future. That is, previous experience that red berries consistently signal reward leads to a ‘prioritise red’ pattern being wired into a history-based salience process. Following on from the arguments made in the previous section, this pattern will likely be adaptive as long as red berries continue to signal reward. Of course, the world is not perfectly predictable. Red berries may usually be tasty, but perhaps sometimes a red berry is rotten, or comes from a different species that is bland. For any form of learning to be adaptive, the benefit of exploiting prior knowledge must be sufficient to outweigh the cost of potentially making occasional errors.

Of more importance in the current context is the issue of what happens if the predictive relationships in the environment change in a systematic way. Perhaps the season changes, or our forager moves to a new area where red berries are no longer desirable. It has been argued that hands-free attentional control—as implemented by history-based mechanisms—will be relatively inflexible and sluggish in the face of a changing environment: control settings that have been established and wired-in on the basis of prior experience will run on, unless and until the agent engages goal-directed processes to update and modify these control settings (Anderson, 2021). And empirical data suggest that people are often somewhat resistant to expending the cognitive effort required to recalibrate attentional control settings to maximise task performance, persisting with prior patterns of prioritisation even when doing so leads to suboptimal performance (Irons & Leber, 2016).

In the context of reward, it has been shown that people can deploy top-down processes to update learned attentional control settings rapidly in response to a change in the world, if they are given clear motivation to do so. For example, in a study by Stankevich and Geng (2015), participants initially performed a task in which (say) green items signalled high-reward feedback, and red items signalled no-reward feedback, which led to attentional prioritisation of the high-reward colour (as indexed via response times). Participants were then instructed that rewards would no longer be available before carrying on with the task in the absence of any reward feedback. This instruction led to an immediate decrease in the magnitude of the attentional bias to the (former) high-reward colour, suggesting that the top-down knowledge of the change in reward structure was sufficient to prompt participants to update their attentional control settings. That said, the reward-related attentional bias was not entirely eliminated; even despite the explicit instruction, and experience of the task under the new reward structure, a (smaller) bias towards the former high-reward colour persisted throughout the latter phase of the task.

Further evidence of persistence of reward-related attentional bias is seen in studies where, rather than changing the relationship between cues and rewards, the value of the rewards themselves is changed (De Tommaso & Turatto, 2021; De Tommaso et al., 2017; Le et al., 2024; Watson et al., 2022; but see also Pool et al., 2014). For example, Watson et al. (2022) used a version of the VMAC search task described earlier (cf. Figure 1) in which different distractor colours signalled that a rapid saccade to the target would earn either water or potato chips. Participants were initially trained on this task when they were thirsty, but not hungry, such that water rewards would be more valuable than chips rewards. Under these conditions, gaze data showed that participants came to automatically prioritise the water-signalling distractor (even though—as in previous versions of the task—this was a counterproductive pattern since looking at this distractor led to cancellation of the reward). Then, once participants had acquired this bias, Watson et al. decreased the value of the water reward for half of the participants by having them drink until they were no longer thirsty, before all participants carried on with the task. Critically, this change in reward value had no impact on the size of the previously-established attentional bias towards the water-signalling distractor (relative to the chips-signalling distractor), as compared to participants for whom water had not been devalued (i.e. participants who were thirsty throughout the task).

The implication of these findings is that reward-related attentional prioritisation can become ‘habit-like’, resulting in attentional control settings that are (at least in part) resistant to updating in the face of subsequent changes in knowledge and/or experience (Le et al., 2024). That is, patterns of learned prioritisation can persist in influencing behaviour even in the face of a change in the environment that renders those stimuli no longer desired. It remains to be established what aspects of environmental change and task features (e.g. relating to participants’ explicit knowledge, or their incidental experience of reward feedback) promote or hinder flexible updating of reward-related attentional biases. This question is an important target for future research—particularly in light of the role of reward-related attention in addictive behaviours that was noted earlier. Consider a person who, after years of heavy drinking, wants to quit—that is, their explicitly stated desire for alcohol has decreased. However, if a ‘habit-like’ attentional bias—which is resistant to a change in reward value—continues to automatically prioritise attention to alcohol-related stimuli (e.g. pub signs, liquor stores) then these stimuli may continue to exert an influence on behaviour, interfering with the goal of abstinence (Le Pelley et al., 2024). Consistent with this idea, Albertella et al. (2021) tested a large (non-clinical) sample of participants who had signed up for a one-month voluntary alcohol abstinence challenge (as part of a ‘dry January’ campaign), and found that VMAC magnitude assessed at baseline predicted the likelihood of successfully completing the challenge: participants who were more susceptible to having their attention captured by reward signals were more likely to fail the abstinence goal (assessed via self-report).

More generally, it has been argued that learned patterns of attentional prioritisation can shape motivated behaviour within a ‘biased competition’ framework, by influencing which stimuli are chosen to act as targets of subsequent action (for more detail, see Le Pelley et al., 2024). According to this account, so-called slips of action (e.g. drinking alcohol while one is intending to abstain) result from prioritisation of task-inappropriate cues as a consequence of repeated and rewarded prior selection of those cues: in effect, ‘habitual control of goal selection’ (Cushman & Morris, 2015).

Attentional Exploration

Up to this point, we have considered a selection-history-based mechanism of attentional control that is based on predicted reward magnitude, wherein stimuli that signal the availability of large reward become more likely to capture attention than signals of smaller reward. I have characterised this as an ‘exploitative’ process, since it invokes a mechanism that acts to prioritise those stimuli that are likely to be of most use to us as targets for ongoing behaviour: our forager benefits from prioritising red items since this helps him to quickly detect tasty berries, which he can then eat. Hence, this value-modulated system involves exploiting current knowledge in order to optimise current behaviour (cf. Sutton & Barto, 2018). This idea of an exploitative attentional mechanism is broadly in line with findings of animal studies suggesting that attention operates to maximise current information (and minimise current uncertainty), wherein stimuli that provide immediate, diagnostic information about subsequent events receive attentional priority over those that do not (Gottlieb et al., 2020).

However, a growing body of recent research suggests this may not be the whole story, indicating that selection-history mechanisms may not always act in the service of exploiting reliable information (Cho & Cho, 2021; Chow, Garner, Pearson, Heber, & Le Pelley, 2025; Ju & Cho, 2023; Le Pelley, Pearson, et al., 2019; Pearson et al., 2024). For example, Chow, Garner, Pearson, Heber, and Le Pelley (2025; see also Le Pelley, Pearson, et al., 2019) had participants perform a search task similar to that shown in Figure 1, but rather than varying the reward magnitude associated with each distractor colour, we instead varied the reward uncertainty associated with each distractor. One distractor colour provided ‘diagnostic’ reward information: whenever this colour appeared in the search display, a rapid saccade to the target earned 20 points (with points later converted into money).³ If the search display contained a distractor in the other, ‘uncertain’ colour, a rapid saccade to the target earned 0 points on a random 50% of trials, and 40 points on the other 50% of trials (Figure 2a). Thus, both distractor colours were associated with the same expected (mean) reward value, but differed in their predictiveness—that is, the amount of information they provided about the specific reward available on each trial. Under these conditions, attentional exploitation would be reflected by greater prioritisation of—and hence greater capture by—the diagnostic distractor (which provided immediate, accurate information about available reward) than the uncertain distractor (which did not). Contrary to this hypothesis, Chow, Garner, Pearson, Heber, and Le Pelley (2025; see also Le Pelley, Pearson, et al., 2019) found that participants’ gaze was more likely to be captured by the uncertain distractor than the diagnostic distractor (see Figure 2b, ‘uninstructed’ group). This uncertainty-modulated attentional capture (UMAC) effect was pronounced even among participants’ fastest eye movements (Figure 2c), consistent with an early and automatic influence of uncertainty on attentional prioritisation.

Figure 2.

Study of uncertainty-modulated attentional capture (UMAC) by Chow, Garner, Pearson, Heber, & Le Pelley (2025): (a) A ‘diagnostic’ (diag) distractor signalled availability of 20 points on every trial; an ‘uncertain’ (uncert) distractor signalled 40 points on half of trials, and 0 points on the other half of trials (at random), (b) Across the course of the search task, the uncertain distractor was more likely than the diagnostic distractor to capture participants’ attention (higher proportion of distraction trials, that is, trials on which participants looked at the distractor prior to looking at the target during the search task). This pattern was observed in a group of participants who learned the colour–reward relationships only via trial-by-trial experience of reward feedback (group Uninstructed), and in participants who were explicitly told the colour–reward relationships at the outset, in addition to subsequently experiencing feedback (group Instructed), (c) Across both groups, the pattern of greater prioritisation of the uncertain (vs. diagnostic) distractor was present among participants’ fastest saccades. Redrawn from Chow, Garner, Pearson, Heber, & Le Pelley (2025).

Uncertainty is not an all-or-none factor: it varies along multiple dimensions. One such dimension, entropy, reflects the number and probability of different outcome events that may occur, which influences how surprising each observed outcome is (Shannon, 1948). Other things being equal, entropy increases as a function of the number of distinct event outcomes. For example, rolling a six-sided die has greater entropy (2.58 bits) than flipping a coin (1 bit). Another dimension of uncertainty, variance, reflects the spread of the distribution of outcomes and indicates how far each possible value is from the expected value. Studies of UMAC suggest that both of these properties play a role in promoting attentional prioritisation of stimuli associated with uncertain outcomes. Ju and Cho (2023) used a design in which the critical distractors were matched in both expected value and outcome variance, but differed in outcome entropy—and found evidence of greater attention to high-entropy items than low-entropy items. By contrast, Pearson et al. (2024) used a design in which distractors were matched in expected value and entropy, but differed in variance—and found evidence of greater attention to high-variance items than low-variance items. The implication is that UMAC is driven by the experience of outcome uncertainty as a broad concept, rather than being tied to one specific element of uncertainty.

Of course, one could argue that—even though the two distractors were matched in their objective expected value in the studies noted above—participants nevertheless (for some reason) perceived the more uncertain distractor as having higher subjective value than the more diagnostic distractor, so that the observed difference in prioritisation could be reconciled with a mechanism based on reward magnitude. However, this interpretation was ruled out in a further experiment by Pearson et al. (2024), which put influences of reward magnitude and reward uncertainty in opposition. Here, a low-value/high-variance distractor was paired with 0 points on 50% of trials, and 100 points on the other 50% of trials; by contrast, a high-value/low-variance distractor was paired with 70 points on 50% of trials, and 100 points on the other 50% of trials (Figure 3a). Consequently: (1) the high-value/low-variance distractor was associated with higher reward magnitude (expected value); and (2) the low-value/high-variance distractor was associated with greater reward uncertainty: both distractors had equal outcome entropy (since each could be followed by two distinct outcomes), but the variance of possible reward outcomes was larger for the low-value/high-variance distractor. Notably, Pearson et al. found that the low-value/high-variance distractor was more likely to capture participants’ attention (Figure 3b), even though participants could accurately report—when asked after the search task—that it had lower expected value (Figure 3c). These findings are consistent with the idea that the impact of uncertainty on prioritisation reflects a separate process from that driven by reward magnitude (as implicated in earlier studies of VMAC); indeed, in Pearson et al.’s study, uncertainty had a larger influence on attention than did magnitude.

Figure 3.

The study of attentional priority by Pearson et al. (2024, Experiment 3) pitted influences of reward magnitude (expected value, EV) and reward variance (σ²) in opposition: (a) A high-value, low variance distractor signalled availability of 100 points on 50% of trials, and 70 points on the remaining 50%. A low-value, high-variance distractor signalled 100 points on 50% of trials, and 0 points on the remaining 50%, (b) The low-value, high-variance distractor was more likely to capture participants’ attention (higher proportion of distraction trials, i.e. trials on which participants looked at the distractor prior to looking at the target during the search task), (c) When asked to estimate the mean (expected) reward value associated with each distractor, participants correctly reported a higher value for the high-value, low-variance distractor. * indicates p < .05. Redrawn from Pearson et al. (2024).

In sum, recent studies point to the existence of a selection-history process that acts to automatically prioritise stimuli on the basis of their predictive uncertainty, distinct from the value-based mechanism highlighted by earlier research. Whereas the value-based mechanism can be seen as a process of attentional exploitation of current reward information, the uncertainty-based mechanism can instead be seen as reflecting a different ‘mode’ of information-seeking: one that prioritises exploring current sources of uncertainty with the aim of uncovering new information (Chow, Garner, Pearson, Heber, & Le Pelley, 2025; Chow, Garner, Pearson, Theeuwes, & Le Pelley, 2025; Ju & Cho, 2023; Pearson et al., 2024). This view implicitly assumes that our cognitive system seeks to build an accurate mental model of the predictive relationships in the world; that is, to reduce prediction error to zero (cf. Dayan et al., 2000; Pearce & Hall, 1980). In the UMAC task used by Le Pelley, Pearson, et al. (2019), prediction error created by the variable reward paired with the uncertain distractor may drive attention to automatically prioritise this item, with the idea that exploring its features will support new learning that allows for more accurate predictions to be made in future (though the stochastic nature of the rewards in this task means this goal will never be attained, since outcomes are ultimately unpredictable). By contrast, once it is established that the diagnostic distractor signals 50 points, prediction error falls to zero and there is no further drive to explore this stimulus in order to learn more about it. Thus, attentional exploration acts in the service of learning, highlighting items whose consequences are (currently) imperfectly understood in an attempt to improve this understanding so that we might be better-placed to exploit the information provided by a more accurate mental model in the future.

Notably, the attentional exploration underlying UMAC seems to operate relatively automatically. Chow, Garner, Pearson, Heber, and Le Pelley (2025) demonstrated that explicitly informing participants of the true predictive status of each distractor at the outset of the task—which should negate any top-down drive to strategically prioritise distractors in order to discover more about them—had no impact on the UMAC effect that was observed, relative to a group of participants who were not informed of the distractor–reward relationships (and could learn only via trial-by-trial experience): see Figure 2b. Based on these findings, we suggested that UMAC may reflect the operation of an automatic learning-based mechanism—a selection history mechanism—that implements attentional exploration driven by experience of reward uncertainty, in turn via experience of prediction error—much as suggested by Pearce and Hall (1980) in their algorithmic model of associative learning (see also Dayan et al., 2000).

Two Mechanisms of Reward-Driven Attention

To summarise, the framework proposed here points to two distinct mechanisms that operate—relatively automatically—to shape attentional prioritisation based on our previous experience of reward.⁴ A value-based, exploitation mechanism acts to prioritise stimuli that have previously been associated with larger rewards, in the service of optimising current behaviour: since reward-associated items are typically valuable targets of motivated behaviour (e.g. approaching or choosing them will usually be beneficial), it makes sense to have an attentional system that is geared towards rapid, ‘hands-free’ processing of such reward-related items. In addition, an uncertainty-based, exploration mechanism acts to prioritise stimuli that have previously been associated with a greater degree of uncertainty in experienced reward, in the service of optimising future knowledge by directing cognitive resources towards learning more about the true significance of these items.

This idea of an exploit/explore framework for attention has theoretical antecedents in the literature on conditioning and associative learning (for reviews, see Le Pelley, 2004; Le Pelley et al., 2016). For example, Mackintosh (1975) proposed a model of conditioning wherein attention was modulated by the extent to which stimuli signalled outcome events, such as rewards—effectively implementing a process of attentional exploitation. Likewise, Pearce and Hall (1980) put forward a model in which attention was based on experience of prediction error, implementing a form of attentional exploration. It could be argued (and not unreasonably) that much of the more recent work on selection history—such as that reviewed here—has involved researchers from the attention literature ‘rediscovering’ principles from the field of conditioning. In doing so, however, this research has brought a more refined conceptualisation of what attention actually is. Early work in the conditioning literature treated attention as a unitary construct, and so failed to address the level at which effects were operating. By contrast, more recent tasks and measures derived from the visual attention literature allow for a much richer and more precise view of the way in which learning shapes attention—we can distinguish between effects reflecting top-down versus selection-history-driven control, for example, and can examine dynamic changes in patterns of prioritisation with millisecond resolution in ways that were not previously possible.

While important foundational steps have been taken in this research area, this is an active field and a number of critical research questions remain open. I will end by highlighting some potential directions for future work.

The Nature of the Outcome

All of the studies discussed here used rewards as outcomes. Indeed, this has been the case for much of the existing work on selection history, though the specific nature of the reward has varied, including money, symbolic points, game-like feedback (sound effects, the chance to unlock digital ‘medals’ etc.), food and drink, or social rewards (for reviews, see Pearson et al., 2022; Rusz et al., 2020). But rewards are not the only motivationally significant stimuli: aversive events (punishments, threats and losses) will also motivate behaviour. Less research has examined the impact of learning about aversive stimuli on attentional capture, though what research there is suggests that an exploitation-type process operates to prioritise signals of aversive events in much the same way as signals of reward (Grégoire et al., 2022; Mikhael et al., 2021; Nissens et al., 2017; Wentura et al., 2014)—the implication being that attentional exploitation is based on the motivational significance of the outcome, rather than its valence (appetitive or aversive; positive or negative). It makes sense that our attentional system should be geared towards prioritising signals of both reward and threat, given that both are likely to be important for ongoing behaviour. However, to my knowledge, no research has examined attentional exploration in the context of aversive outcomes (e.g. by pairing stimuli with different levels of monetary loss, or a mixture of rewards and losses), and this remains a task for the future.

That said, designing experiments to examine the impact of aversive events presents methodological challenges. For example, while it is straightforward to motivate participants to respond to earn a reward, it is harder to motivate responding that will result in a loss. One approach has been to use negative reinforcement, wherein responding prevents an aversive event that would otherwise have occurred (Le Pelley, Watson, et al., 2019; Nissens et al., 2017; Wentura et al., 2014), but interpretation of findings from studies using this approach is complicated by the possibility that successful avoidance of an otherwise-expected aversive event is viewed by participants as a positive outcome, that is, a form of reward (Anderson & Britton, 2020; Mikhael et al., 2021). Moreover, studies that have attempted to compare the degree to which prioritisation is influenced by reward versus loss (Becker et al., 2020; Yan et al., 2022) are complicated by difficulty in matching the frequency with which rewards versus losses occur, and of equating the psychological salience of reward versus loss values (Pearson et al., 2022).

The Nature of the Task

In this article, I have focused primarily on studies in which the critical stimuli are Pavlovian signals of reward. An alternative approach is to use a more instrumental procedure, in which the critical stimuli are themselves the targets that participants must attend (and respond) to in order to earn reward. Studies using this approach have demonstrated evidence of both attentional exploitation (Anderson & Halpern, 2017) and attentional exploration (Cho & Cho, 2021). That said, interpretation is complicated here since, even in these latter studies, the reward-related stimuli also have a Pavlovian relationship with reward: the stimulus signals the magnitude of reward that is available (Pavlovian), and attending to this stimulus leads to reward delivery (instrumental). This makes it difficult to separate out the influence of each component. It remains for future research to investigate the similarities and differences in selection history processes mediated by Pavlovian versus instrumental conditioning more selectively.

Integrating Attentional Exploitation and Attentional Exploration

The framework proposed here distinguishes between two ‘modes’ of automatic attentional prioritisation: attentional exploitation (prioritising stimuli on the basis of current knowledge about the outcome that they signal), and attentional exploration (prioritising stimuli on the basis of current uncertainty, driven by potential future knowledge that they might provide). A critical next step will be to determine how these mechanisms interact, and potentially compete, to determine overall priority: how does our attentional system strike a balance between exploring—to build a more accurate mental model of the world—and exploiting—to make use of existing knowledge? This will involve establishing the features of tasks and individuals that lead to seeking reliable information versus prioritising sources of uncertainty: that is, how ‘task architecture’ can be manipulated to alter the explore/exploit balance point. For example, exploration requires holding beliefs in mind and updating them based on incoming sensory evidence; this is more cognitively demanding than exploitation, which merely involves acting on existing beliefs. This raises the (testable) possibility that the balance of attentional modes is mediated by availability of cognitive resources, which could be manipulated by imposing a working memory load or secondary task (cf. Watson et al., 2019).

Selection History Versus Top-Down Control

Here, I have reviewed evidence consistent with a history-driven influence of uncertainty on attentional capture, labelled as attentional exploration. These findings stand in contrast to a parallel body of existing work demonstrating information-seeking behaviour, illustrating conditions under which observers preferentially attend to stimuli that provide immediate, diagnostic information about subsequent events over those that do not (Foley et al., 2017; Horan et al., 2019; for a review, see Gottlieb et al., 2020). A notable point of difference between work on UMAC—consistent with attentional exploration—and research highlighting information-seeking for cues that resolve current uncertainty lies in the task objectives. In studies of UMAC, attending to the reward-signalling distractors is counterproductive to the goals of the task—and this is the basis of the claim that UMAC reflects an automatic, history-driven impact of uncertainty on attention. By contrast, in studies demonstrating information-seeking behaviour, observers were typically required to make a saccade to the cue in order for the task to proceed (Blanchard et al., 2015; Bromberg-Martin & Hikosaka, 2009), and in many cases, attending to the critical cue was directly beneficial because doing so provided instrumental information that could be used to improve the accuracy of subsequent responding (Foley et al., 2017; Horan et al., 2019). This raises the possibility that these previous findings reflect top-down (volitional), rather than selection-history-based (automatic) influences on attentional selection (though see Doyle et al., 2025). In other words, when we are given the choice to intentionally gather information about an upcoming event, we may prefer to sample from a source that provides more reliable information and so resolves our uncertainty about what will come next. However, selection history may create a competing drive toward attentional exploration that acts to prioritise—and potentially drive capture by—signals of more variable outcomes. Future research could test this speculation.

The Impact of Reward-Based Attentional Prioritisation on Decision-Making

The idea of an explore/exploit distinction is clearly not new: it has been a particular focus in the context of reinforcement learning and decision-making, examining how we use the information we have gathered to guide our choices (Mehlhorn et al., 2015; Sutton & Barto, 2018). But this neglects a key question: where did that information come from in the first place? Consider a supermarket shopper, facing shelves full of thousands of items. It would be impossible to evaluate the pros and cons of every item before making each selection; instead, attention acts to select a subset of items for further consideration, and filters out others (Krajbich, 2019). And the research reviewed here suggests that the process of information-gathering is itself shaped by our previous experience of reward: the implication is that, by implementing patterns of reward-based prioritisation, our attention system may be automatically ‘nudging’ us towards making certain choices rather than others (Pearson et al., 2022): Figure 4.

Figure 4.

History-driven attention and choice: (a) A shopper wants to buy a snack, (b) The chip packet is a reliable signal of reward: in the past, chips have always been quite tasty. Bananas have greater variance: some are delicious, but others are unripe or rotten, (c) If attention acts to prioritise sources of diagnostic information (exploitation), the shopper will be biased towards buying chips; if attention prioritises sources of uncertainty (exploration), the shopper will be more likely to buy bananas.

This raises the question of how learned patterns of reward-based prioritisation act to shape ongoing decision-making behaviour. This is a critical issue, since it is our choices that ultimately determine our interaction with the world. Earlier I noted findings suggesting a relationship between reward-modulated attention and real-world addictive behaviours (Le Pelley et al., 2024), which provides prima facie evidence that these attentional processes can indeed bear on people’s overt decisions—an idea that is central to the popular concept of the ‘attention economy’ (Davenport & Beck, 2001). Recent studies have begun to examine the impact of reward-modulated attention on decision-making at a mechanistic level, within the framework of evidence-accumulation models such as the attentional drift diffusion model (Gluth et al., 2018, 2020) and this represents an exciting avenue for future investigation.

Conclusion

The idea that what we learn influences what we pay attention to has a venerable history: James (1890/1983) described ‘derived attention’ as a form of attention that ‘owes its interest to association with some other immediately interesting thing’ (p. 393). When making this point, however, James may not have appreciated how fundamental, and how intricate, the interaction between learning and attention is. Here, I have focused on just one facet of this interaction, wherein learning shapes prioritisation of reward signals, and I have argued that evidence is consistent with the operation of distinct mechanisms implementing attentional exploitation and exploration at an automatic level. But this is just one element of a complex and multifaceted relationship between learning and attention: for example, other research has shown that prioritisation is also influenced by prior learning about stimulus locations, or features, and the previous responses that we have made (Anderson et al., 2021). A critical next step will be to establish how these different influences are integrated into the attentional priority map to determine overall prioritisation (do they reflect different processes, or different facets of a unitary system?—see Anderson, 2024), and how they influence ongoing behaviour. More generally, this body of research demonstrates the extent to which plasticity permeates through our cognitive system: at seemingly every level of the information-processing pathway, we are a product of our own experience.

Footnotes

Acknowledgements

I would like to thank Ben Newell for valuable feedback on a draft of this article.

ORCID iD

Mike E. Le Pelley

Funding

The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by Australian Research Council grants DP200101314, DP240102605 and DP260101915, and an AUSMURI grant from the Australian Department of Defence.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Notes

References

Albertella

Hooven

J. v.

Bovens

Wiers

R. W.

(2021). Reward-related attentional capture predicts non-abstinence during a one-month abstinence challenge. Addictive Behaviors, 114, Article 106745. https://doi.org/10.1016/j.addbeh.2020.106745

Anderson

B. A.

(2021). An adaptive view of attentional control. American Psychologist, 76, 1410–1422. https://doi.org/10.1037/amp0000917

Anderson

B. A.

(2024). Trichotomy revisited: A monolithic theory of attentional control. Vision Research, 217, Article 108366. https://doi.org/10.1016/j.visres.2024.108366

Anderson

B. A.

Britton

M. K.

(2020). On the automaticity of attentional orienting to threatening stimuli. Emotion, 20, 1109–1112. https://doi.org/10.1037/emo0000596

Anderson

B. A.

Halpern

(2017). On the value-dependence of value-driven attentional capture. Attention, Perception, & Psychophysics, 79, 1001–1011. https://doi.org/10.3758/s13414-017-1289-6

Anderson

B. A.

Kim

A. J.

Liao

M. R.

Mrkonja

Clement

Gregoire

(2021). The past, present, and future of selection history. Neuroscience & Biobehavioral Reviews, 130, 326–350. https://doi.org/10.1016/j.neubiorev.2021.09.004

Atnip

G. W.

(1977). Stimulus-reinforcer and response-reinforcer contingencies in autoshaping, operant, classical, and omission training procedures in rats. Journal of the Experimental Analysis of Behavior, 28, 59–69. https://doi.org/10.1901/jeab.1977.28-59

Awh

Belopolsky

A. V.

Theeuwes

(2012). Top-down versus bottom-up attentional control: A failed theoretical dichotomy. Trends in Cognitive Sciences, 16, 437–443. https://doi.org/10.1016/j.tics.2012.06.010

Becker

M. W.

Hemsteger

S. H.

Chantland

Liu

T. S.

(2020). Value-based attention capture: Differential effects of loss and gain contingencies. Journal of Vision, 20, 1–13. https://doi.org/10.1167/jov.20.5.4

10.

Berridge

K. C.

Robinson

T. E.

(2003). Parsing reward. Trends in Neurosciences, 26, 507–513. https://doi.org/10.1016/S0166-2236(03)00233-9

11.

Blanchard

T. C.

Hayden

B. Y.

Bromberg-Martin

E. S.

(2015). Orbitofrontal cortex uses distinct codes for different choice attributes in decisions motivated by curiosity. Neuron, 85, 602–614. https://doi.org/10.1016/j.neuron.2014.12.050

12.

Bromberg-Martin

E. S.

Hikosaka

(2009). Midbrain dopamine neurons signal preference for advance information about upcoming rewards. Neuron, 63, 119–126. https://doi.org/10.1016/j.neuron.2009.06.009

13.

Carpentier

C. L.

(2023). New economics for sustainable development: Attention economy. United Nations Economist Network. https://www.un.org/sites/un2.un.org/files/attention_economy_feb.pdf

14.

Cheng

Rich

A. N.

Le Pelley

M. E.

(2021). Reward rapidly enhances visual perception. Psychological Science, 32, 1994–2004. https://doi.org/10.1177/09567976211021843

15.

Cho

S. A.

Cho

Y. S.

(2021). Uncertainty modulates value-driven attentional capture. Attention, Perception, & Psychophysics, 83, 142–155. https://doi.org/10.3758/s13414-020-02171-3

16.

Chow

J. Y. L.

Garner

K. G.

Pearson

Heber

Le Pelley

M. E.

(2025). Effects of instructed and experienced uncertainty on attentional priority. Journal of Experimental Psychology: Learning, Memory, and Cognition, 51, 869–880. https://doi.org/10.1037/xlm0001427

17.

Chow

J. Y. L.

Garner

K. G.

Pearson

Theeuwes

Le Pelley

M. E.

(2025). Delaying reward feedback does not increase the influence of information on attentional priority in visual search. PsyArXiv, 271, Article 106447. https://doi.org/10.31219/osf.io/n7qzv_v2

18.

Colaizzi

J. M.

Flagel

S. B.

Joyner

M. A.

Gearhardt

A. N.

Stewart

J. L.

Paulus

M. P.

(2020). Mapping sign-tracking and goal-tracking onto human behaviors. Neuroscience & Biobehavioral Reviews, 111, 84–94. https://doi.org/10.1016/j.neubiorev.2020.01.018

19.

Cushman

Morris

(2015). Habitual control of goal selection in humans. Proceedings of the National Academy of Sciences of the United States of America, 112, 13817–13822. https://doi.org/10.1073/pnas.1506367112

20.

Davenport

T. H.

Beck

J. C.

(2001). The attention economy: Understanding the new currency of business. Harvard Business Press.

21.

Dayan

Kakade

Montague

P. R.

(2000). Learning and selective attention. Nature Neuroscience, 3, 1218–1223. https://doi.org/10.1038/81504

22.

De Tommaso

Mastropasqua

Turatto

. (2017). The salience of a reward cue can outlast reward devaluation. Behavioral Neuroscience, 131, 226–234. https://doi.org/10.1037/bne0000193

23.

De Tommaso

Turatto

. (2021). On the resilience of reward cues attentional salience to reward devaluation, time, incentive learning, and contingency remapping. Behavioral Neuroscience, 135(3), 389–401. https://doi.org/10.1037/bne0000423

24.

Doyle

Volkova

Crotty

Massa

Grubb

M. A.

(2025). Information-driven attentional capture. Attention, Perception, & Psychophysics, 87, 721–727. https://doi.org/10.3758/s13414-024-03008-z

25.

Fecteau

J. H.

Munoz

D. P.

(2006). Salience, relevance, and firing: A priority map for target selection. Trends in Cognitive Sciences, 10, 382–390. https://doi.org/10.1016/j.tics.2006.06.011

26.

Foley

N. C.

Kelly

S. P.

Mhatre

Lopes

Gottlieb

(2017). Parietal neurons encode expected gains in instrumental information. Proceedings of the National Academy of Sciences of the United States of America, 114, E3315–E3323. https://doi.org/10.1073/pnas.1613844114

27.

Geng

J. J.

Behrmann

(2005). Spatial probability as an attentional cue in visual search. Perception & Psychophysics, 67, 1252–1268. https://doi.org/10.3758/Bf03193557

28.

Gluth

Kern

Kortmann

Vitali

C. L.

(2020). Value-based attention but not divisive normalization influences decisions with multiple alternatives. Nature Human Behaviour, 4, 634–645. https://doi.org/10.1038/s41562-020-0822-0

29.

Gluth

Spektor

M. S.

Rieskamp

(2018). Value-based attentional capture affects multi-alternative decision making. eLife, 7, Article e39659. https://doi.org/10.7554/eLife.39659

30.

Gottlieb

Cohanpour

Singletary

Zabeh

(2020). Curiosity, information demand and attentional priority. Current Opinion in Behavioral Sciences, 35, 83–91. https://doi.org/10.1016/j.cobeha.2020.07.016

31.

Grégoire

Britton

M. K.

Anderson

B. A.

(2022). Motivated suppression of value- and threat-modulated attentional capture. Emotion, 22, 780–794. https://doi.org/10.1037/emo0000777

32.

Gregory

(1970). The intelligent eye. Weidenfeld and Nicolson.

33.

Heck

Durieux

Anselme

Quertemont

(2025). Implementations of sign- and goal-tracking behavior in humans: A scoping review. Cognitive Affective & Behavioral Neuroscience, 25, 570. https://doi.org/10.3758/s13415-024-01249-x

34.

Horan

Daddaoua

Gottlieb

(2019). Parietal neurons encode information sampling based on decision uncertainty. Nature Neuroscience, 22, 1327–1335. https://doi.org/10.1038/s41593-019-0440-1

35.

Hyman

S. E.

(2005). Addiction: A disease of learning and memory. American Journal of Psychiatry, 162, 1414–1422. https://doi.org/10.1176/appi.ajp.162.8.1414

36.

Irons

J. L.

Leber

A. B.

(2016). Choosing attentional control settings in a dynamically changing environment. Attention, Perception, & Psychophysics, 78, 2031–2048. https://doi.org/10.3758/s13414-016-1125-4

37.

James

(1890/1983). The principles of psychology. Harvard University Press.

38.

Jeong

J. H.

Kim

Choi

J. S.

Cho

Y. S.

(2023). Value-driven attention and associative learning models: A computational simulation analysis. Psychonomic Bulletin & Review, 30, 1689–1706. https://doi.org/10.3758/s13423-023-02296-0

39.

Cho

Y. S.

(2023). The modulation of value-driven attentional capture by exploration for reward information. Journal of Experimental Psychology: Learning, Memory, and Cognition, 49, 181–197. https://doi.org/10.1037/xlm0001189

40.

Kierstein

(2024). Ford patents in-car system that eavesdrops so it can play you ads. MotorTrend. https://www.motortrend.com/news/ford-billboard-ad-patent-system

41.

Krajbich

(2019). Accounting for attention in sequential sampling models of decision making. Current Opinion in Psychology, 29, 6–11. https://doi.org/10.1016/j.copsyc.2018.10.008

42.

Kristjánsson

Campana

(2010). Where perception meets memory: A review of repetition priming in visual search tasks. Attention, Perception, & Psychophysics, 72, 5–18. https://doi.org/10.3758/App.72.1.5

43.

J. T.

Watson

Le Pelley

M. E.

(2024). Effects of outcome revaluation on attentional prioritisation of reward-related stimuli. Quarterly Journal of Experimental Psychology, 78, 142–162. https://doi.org/10.1177/17470218241236711

44.

Le Pelley

M. E

. (2004). The role of associative history in models of associative learning: A selective review and a hybrid model. Quarterly Journal of Experimental Psychology, 57B, 193–243. https://doi.org/10.1080/02724990344000141

45.

Le Pelley

M. E.

Mitchell

C. J.

Beesley

George

D. N.

Wills

A. J

. (2016). Attention and associative learning in humans: An integrative review. Psychological Bulletin, 142, 1111–1140. https://doi.org/10.1037/bul0000064

46.

Le Pelley

M. E.

Pearson

Griffiths

Beesley

. (2015). When goals conflict with values: Counterproductive attentional and oculomotor capture by reward-related stimuli. Journal of Experimental Psychology: General, 144, 158–171. https://doi.org/10.1037/xge0000037

47.

Le Pelley

M. E.

Pearson

Porter

Yee

Luque

. (2019a). Oculomotor capture is influenced by expected reward value but (maybe) not predictiveness. Quarterly Journal of Experimental Psychology, 72, 168–181. https://doi.org/10.1080/17470218.2017.1313874

48.

Le Pelley

M. E.

Watson

Pearson

Abeywickrama

R. S.

Most

S. B

. (2019b). Winners and losers: Reward and punishment produce biases in temporal selection. Journal of Experimental Psychology: Learning, Memory, and Cognition, 45, 822–833. https://doi.org/10.1037/xlm0000612

49.

Le Pelley

M. E.

Watson

Wiers

R. W

. (2024). Biased choice and incentive salience: Implications for addiction. Behavioral Neuroscience, 138, 235–243. https://doi.org/10.1037/bne0000583

50.

Luck

S. J.

Gaspelin

Folk

C. L.

Remington

R. W.

Theeuwes

(2021). Progress toward resolving the attentional capture debate. Visual Cognition, 29, 1–21. https://doi.org/10.1080/13506285.2020.1848949

51.

Mackintosh

N. J.

(1975). A theory of attention: Variations in the associability of stimuli with reinforcement. Psychological Review, 82, 276–298. https://doi.org/10.1037/h0076778

52.

Mehlhorn

Newell

B. R.

Todd

P. M.

Lee

M. D.

Morgan

Braithwaite

V. A.

Hausmann

Fiedler

Gonzalez

(2015). Unpacking the exploration–exploitation tradeoff: A synthesis of human and animal literatures. Decision, 2, 191–215. https://doi.org/10.1037/dec0000033

53.

Mikhael

Watson

Anderson

B. A.

Le Pelley

M. E.

(2021). You do it to yourself: Attentional capture by threat-signaling stimuli persists even when entirely counterproductive. Emotion, 21, 1691–1698. https://doi.org/10.1037/emo0001003

54.

Nelson-Field

(2024). The attention economy: A category blueprint. Springer Nature.

55.

Nissens

Failing

Theeuwes

(2017). People look at the object they fear: Oculomotor capture by stimuli that signal threat. Cognition and Emotion, 31, 1707–1714. https://doi.org/10.1080/02699931.2016.1248905

56.

Pearce

J. M.

Hall

(1980). A model for Pavlovian conditioning: Variations in the effectiveness of conditioned but not of unconditioned stimuli. Psychological Review, 87, 532–552. https://doi.org/10.1037/0033-295X.87.6.532

57.

Pearson

Chong

Chow

J. Y. L.

Garner

K. G.

Theeuwes

Le Pelley

M. E.

(2024). Uncertainty-modulated attentional capture: Outcome variance increases attentional priority. Journal of Experimental Psychology: General, 153, 1628–1643. https://doi.org/10.1037/xge0001586

58.

Pearson

Donkin

Tran

S. C.

Most

S. B.

Le Pelley

M. E.

(2015). Cognitive control and counterproductive oculomotor capture by reward-related stimuli. Visual Cognition, 23, 41–66. https://doi.org/10.1080/13506285.2014.994252

59.

Pearson

Le Pelley

M. E.

(2020). Learning to avoid looking: Competing influences of reward on overt attentional selection. Psychonomic Bulletin & Review, 27, 998–1005. https://doi.org/10.3758/s13423-020-01770-3

60.

Pearson

Osborn

Whitford

T. J.

Failing

Theeuwes

Le Pelley

M. E.

(2016). Value-modulated oculomotor capture by task-irrelevant stimuli is a consequence of early competition on the saccade map. Attention, Perception, & Psychophysics, 78, 2226–2240. https://doi.org/10.3758/s13414-016-1135-2

61.

Pearson

Watson

Albertella

Le Pelley

M. E.

(2022). Attentional economics links value-modulated attentional capture and decision-making. Nature Reviews Psychology, 1, 320–333. https://doi.org/10.1038/s44159-022-00053-z

62.

Pool

Brosch

Delplanque

Sander

(2014). Where is the chocolate? Rapid spatial orienting toward stimuli associated with primary rewards. Cognition, 130, 348–359. https://doi.org/10.1016/j.cognition.2013.12.002

63.

Robinson

T. E.

Carr

Kawa

A. B.

(2018). The propensity to attribute incentive salience to drug cues and poor cognitive control combine to render sign-trackers susceptible to addiction. In Tomie

Morrow

J. D.

(Eds.), Sign-tracking and drug addiction. Maize Books, Michigan Publishing.

64.

Rusz

Le Pelley

M. E.

Kompier

M. A. J.

Mait

Bijleveld

(2020). Reward-driven distraction: A meta-analysis. Psychological Bulletin, 146, 872–899. https://doi.org/10.1037/bul0000296

65.

Saunders

B. T.

Robinson

T. E.

(2010). A cocaine cue acts as an incentive stimulus in some but not others: Implications for addiction. Biological Psychiatry, 67, 730–736. https://doi.org/10.1016/j.biopsych.2009.11.015

66.

Serences

J. T.

(2008). Value-based modulations in human visual cortex. Neuron, 60, 1169–1181. https://doi.org/10.1016/j.neuron.2008.10.051

67.

Serences

J. T.

Shomstein

Leber

A. B.

Golay

Egeth

H. E.

Yantis

(2005). Coordination of voluntary and stimulus-driven attentional control in human cortex. Psychological Science, 16, 114–122. https://doi.org/10.1111/j.0956-7976.2005.00791.x

68.

Sha

L. Z.

Remington

R. W.

Jiang

Y. H. V.

(2017). Short-term and long-term attentional biases to frequently encountered target features. Attention, Perception, & Psychophysics, 79, 1311–1322. https://doi.org/10.3758/s13414-017-1317-6

69.

Shannon

C. E.

(1948). A mathematical theory of communication. Bell System Technical Journal, 27, 623–656. https://doi.org/10.1002/j.1538-7305.1948.tb00917.x

70.

Shiffrin

R. M.

Schneider

(1977). Controlled and automatic human information processing: II. Perceptual learning, automatic attending and a general theory. Psychological Review, 84, 127–190. https://doi.org/10.1037/0033-295X.84.2.127

71.

Simon

H. A.

(1971). Designing organizations for an information-rich world. In Greenberger

(Ed.), Computers, communication, and the public interest. Johns Hopkins University Press.

72.

Stankevich

B. A.

Geng

J. J.

(2015). The modulation of reward priority by top-down knowledge. Visual Cognition, 23, 206–228. https://doi.org/10.1080/13506285.2014.981626

73.

Stilwell

B. T.

Bahle

Vecera

S. P.

(2019). Feature-based statistical regularities of distractors modulate attentional capture. Journal of Experimental Psychology: Human Perception and Performance, 45, 419–433. https://doi.org/10.1037/xhp0000613

74.

Sutton

R. S.

Barto

A. G.

(2018). Reinforcement learning: An introduction (2nd ed.). MIT Press.

75.

Theeuwes

(1992). Perceptual selectivity for color and form. Perception & Psychophysics, 51, 599–606. https://doi.org/10.3758/Bf03211656

76.

Theeuwes

(1994). Endogenous and exogenous control of visual selection. Perception, 23, 429–440. https://doi.org/10.1068/p230429

77.

Theeuwes

(2018). Visual selection: Usually fast and automatic; seldom slow and volitional. Journal of Cognition, 1, 29. https://doi.org/10.5334/joc.13

78.

Theeuwes

(2019). Goal-driven, stimulus-driven, and history-driven selection. Current Opinion in Psychology, 29, 97–101. https://doi.org/10.1016/j.copsyc.2018.12.024

79.

Tomie

Aguado

A. S.

Pohorecky

L. A.

Benjamin

(1998). Ethanol induces impulsive-like responding in a delay-of-reward operant choice procedure: Impulsivity predicts autoshaping. Psychopharmacology, 139, 376–382. https://doi.org/10.1007/s002130050728

80.

Tomie

Grimes

K. L.

Pohorecky

L. A.

(2008). Behavioral characteristics and neurobiological substrates shared by Pavlovian sign-tracking and drug abuse. Brain Research Reviews, 58, 121–135. https://doi.org/10.1016/j.brainresrev.2007.12.003

81.

Tomie

Morrow

J. D.

(Eds.). (2018). Sign-tracking and drug addiction. Michigan Publishing.

82.

Tversky

Kahneman

(1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124–1131. https://doi.org/10.1126/science.185.4157.1124

83.

Wang

B. C.

Theeuwes

(2018). Statistical regularities modulate attentional capture independent of search strategy. Attention, Perception, & Psychophysics, 80, 1763–1774. https://doi.org/10.3758/s13414-018-1562-3

84.

Watson

Pavri

Pearson

Le Pelley

M. E.

(2022). Attentional capture by signals of reward persists following outcome devaluation. Learning & Memory, 29, 181–191. https://doi.org/10.1101/lm.053569.122

85.

Watson

Pearson

Chow

Theeuwes

Wiers

R. W.

Most

S. B.

Le Pelley

M. E.

(2019). Capture and control: Working memory modulates attentional capture by reward-related stimuli. Psychological Science, 30, 1174–1185. https://doi.org/10.1177/0956797619855964

86.

Watson

Pearson

Le Pelley

M. E.

(2020). Reduced attentional capture by reward following an acute dose of alcohol. Psychopharmacology, 237, 3625–3639. https://doi.org/10.1007/s00213-020-05641-6

87.

Watson

Prior

Ridley

Monds

Manning

Wiers

R. W.

Le Pelley

M. E.

(2024). Sign-tracking to non-drug reward is related to severity of alcohol-use problems in a sample of individuals seeking treatment. Addictive Behaviors, 154, Article 108010. https://doi.org/10.1016/j.addbeh.2024.108010

88.

Wentura

Müller

Rothermund

(2014). Attentional capture by evaluative stimuli: Gain- and loss-connoting colors boost the additional-singleton effect. Psychonomic Bulletin & Review, 21, 701–707. https://doi.org/10.3758/s13423-013-0531-z

89.

Williams

D. R.

Williams

(1969). Automaintenance in the pigeon: Sustained pecking despite contingent non-reinforcement. Journal of the Experimental Analysis of Behavior, 12, 511–520. https://doi.org/10.1901/jeab.1969.12-511

90.

Williams

(2018). Stand out of our light: Freedom and resistance in the attention economy. Cambridge University Press.

91.

Yan

M. M.

Long

Q. S.

Chen

A. T.

(2022). Evaluative distractors modulate attentional disengagement: People would rather stay longer on rewards. Journal of Vision, 22, 1–9. https://doi.org/10.1167/jov.22.8.12

92.

Yantis

(2000). Goal-directed and stimulus-driven determinants of attentional control. In Monsell

Driver

(Eds.), Attention and performance XVIII (pp. 73–103). MIT Press.