Abstract
Reward-associated cues are known to influence motivation to approach both natural and man-made rewards, such as food and drugs. However, the mechanisms underlying these effects are not well understood. To model these processes in the laboratory with humans, we developed an appetitive Pavlovian-instrumental transfer procedure with a chocolate reward. We used a single unconstrained response that led to an actual rather than symbolic reward to assess the strength of reward motivation. Presentation of a chocolate-paired cue, but not an unpaired cue, markedly enhanced instrumental responding over a 30-s period. The same pattern was observed with 10-s and 30-s cues, showing that close cue-reward contiguity is not necessary for facilitation of reward-directed action. The results confirm that reward-related cues can instigate voluntary action to obtain that reward. The effectiveness of long-duration cues suggests that in clinical settings, attention should be directed to both proximal and distal cues for reward.
Reward-related cues in the environment are thought to play a critical role in regulating behavior directed toward natural rewards, such as food, water, and sex, as well as man-made rewards, such as drugs. For example, a television advertisement for pizza might encourage a person to go to the freezer or telephone for home delivery. In the case of appetitive disorders such as drug addiction and binge eating, reward-related cues represent a threat to self-control and often serve as a trigger for relapse after treatment (Niaura et al., 1988). The majority of reward-related cues are learned. For example, food packaging and even the sight of food gain their potency through their prior association with the taste and postingestional consequences of food (Booth, 1993). Thanks to a century of laboratory and applied research, we now know a great deal about such associative learning. However, we know rather less about the mechanisms by which learned environmental cues facilitate reward-directed behavior. Given the prevalence of overeating and addictive behavior, it is important to better understand these basic mechanisms.
One feature of reward-related cues we do know about is that they appear to exert their effects on reward-directed behavior by increasing desire or appetitive motivation. A good example of this was provided by Dar, Rosen-Korakin, Shapira, Gottlieb, and Frenk (2010), who asked airline flight attendants who were smokers to rate their craving for cigarettes throughout short and long nonsmoking flights. They found that craving was governed by proximity to the opportunity to smoke, not by time since the last cigarette. The flight attendants reported that their craving was highest during preparation for landing, regardless of flight duration. Laboratory research involving smokers has confirmed that cues associated with nicotine or information about nicotine availability increase nicotine craving, whereas information about nonavailability suppresses craving (e.g., Carter & Tiffany, 2001; Dols, van den Hout, Kindt, & Willems, 2002). Research with natural rewards, such as chocolate, has demonstrated that stimuli paired with chocolate acquire the ability to increase desire for chocolate (Van Gucht, Vansteenwegen, Beckers, & Van den Bergh, 2008).
The research just described assessed the effect of reward-associated cues on motivation or desire for reward. In order to go further and model the effect on actual behavior, it is necessary to establish an action that is directed toward obtaining the target reward. In the laboratory, this can be accomplished by training an arbitrary instrumental response that is performed because it leads to the reward in question. The impact of reward-associated cues on this response can then be assessed in a design known as Pavlovian-instrumental transfer (PIT). The PIT design was initially developed in animal research, and it has provided clear evidence of the ability of reward-associated stimuli to facilitate instrumental responding for that reward (see Holmes, Marchand, & Coutureau, 2010). Selectivity of the PIT effect has also been demonstrated in research in which a stimulus paired with one reward preferentially enhances an instrumental response that leads to that reward, relative to a response that leads to an alternative reward (Colwill & Rescorla, 1988; Delamater, 1996). There has also been substantial progress in identifying the brain mechanisms responsible for the PIT effect (e.g., Corbit & Balleine, 2011).
However, there are some limitations of the existing PIT literature from the perspective of understanding the impact of reward-related cues in appetitive disorders. First, there have been relatively few studies with humans using the PIT design. Second, the studies that have been conducted have typically involved abstract rewards, such as points or tokens that participants are informed can be exchanged for actual rewards after the experiment (e.g., Nadler, Delgado, & Delamater, 2011; Talmi, Seymour, Dayan, & Dolan, 2008). The difficulty here, in terms of modeling motivational effects, is that participants are not engaged in the real-time pursuit and consumption of actual rewards during the experiment. One exception is a study by Bray, Rangel, Shimojo, Balleine, and O’Doherty (2008) in which participants received liquid rewards. However, in this study, participants were required to make an instrumental choice on each discrete trial, so they were not free to respond or not respond according to their degree of motivation. Accordingly, we set out to test whether Pavlovian stimuli paired with a natural reward can modulate voluntary instrumental responding for that reward. We used a single unconstrained response to assess absolute motivation for the reward. Following Van Gucht et al. (2008), we chose chocolate as a high-value reward that can induce substantial levels of motivation.
Experiment 1
We modeled the Pavlovian component of our procedure on previous animal PIT research, as well as the chocolate-craving procedure of Van Gucht et al. (2008). We used a within-subjects differential-conditioning design in which two colored lights served as conditioned stimuli (CSs). One light was consistently followed by delivery of an M&M-brand chocolate, whereas the other light was followed by no outcome. A simple button press was used as the instrumental response, rewarded by delivery of chocolates on a partial-reinforcement schedule. In the transfer-test phase, we presented the two Pavlovian stimuli while participants had the opportunity to make the instrumental response. Responding was not followed by chocolate in this phase, to avoid any direct effect of the reward on responding. The dependent variable of interest was the number of button-press responses made before, during, and after the presentation of each CS. The experiment was approved by the Human Research Ethics Advisory Panel D at the University of New South Wales.
Method
Participants
Participants were 55 students (39 female, 16 male) from the University of New South Wales who responded to an advertisement and received 15 Australian dollars (AU$15) in return for their time. Their mean age was 22.2 years (SD = 3.21, range = 18–31). To be eligible for the study, participants were required to have a score of 2 or higher on each of two items in a brief screening questionnaire on chocolate liking and consumption. They also had to answer no to a question on whether they were currently dieting and answer no to all questions on a chocolate-allergy screen. Participants were asked to refrain from eating for 2 hr prior to the experiment and to refrain from eating chocolate for 24 hr prior to the experiment.
Materials
Participants were seated at a desk in a 2-m × 3-m testing room, facing a 43-cm computer monitor. On the desk to the left of the monitor was a Med Associates M&M dispenser Model ENV-702 on a pedestal mount, inside a 210-mm × 170-mm × 330-mm sound-attenuating plywood box. A clear 20-mm-diameter plastic tube delivered individual M&M chocolates through a hole in the plywood box and into a dish within easy reach of the participant’s left hand. In front of the monitor was a metal box with six colored lights (green, red, blue, yellow, orange, and white; only the red and blue lights were illuminated during the experiment) mounted on top. To the right of the monitor was another metal box with four response buttons mounted on top, within easy reach of the participant’s right hand. For this experiment, all of the buttons except for the leftmost button were taped over. To help mask external sounds, we had the participants wear headphones emitting constant 72-dB white noise throughout the experiment. In the adjoining control room, a desktop computer with Med Associates interface and software was used to present all stimuli and to record button-press responses.
A postexperimental questionnaire was used to assess participants’ knowledge of the Pavlovian contingency. Participants rated how often each of the two colored lights was followed immediately by chocolate during the middle part of the experiment, using 100-mm visual analogue scales from 0% (never) to 100% (always). Participants also rated their desire for chocolate and provided demographic information and their height and weight.
Procedure
After providing consent, participants rated their desire for chocolate on a 100-mm visual analogue scale from 0 (not at all) to 100 (I crave it). They were then seated in the test room. Their attention was drawn to the response button, the colored lights, the chocolate dispenser, and the computer monitor. They were told that at different points in the experiment, they would be able to press the button to earn chocolate and that the colored lights would come on. They were also told that from time to time, instructions would be presented on the monitor.
The experiment followed a standard three-phase PIT design. The first was the instrumental-acquisition phase. Instructions on the monitor informed participants that they could press the button to obtain chocolate and that they could press the button as often and as quickly as they liked. Button pressing was rewarded on a variable-ratio (VR) 10 schedule in which, on average, 10 presses (range = 5–15) were required before the chocolate was delivered. However, to help shape participants’ button pressing, we faded in the VR schedule over the first 3 rewards (fixed ratios of 2, 4, and 6). Each button press, whether rewarded or not, led to the presentation of a small black square (2.5 mm × 2.5 mm) in the center of the monitor for 0.1 s. Pilot testing had revealed that without such feedback, many participants stopped pressing and told the experimenter that the button was not working. In addition, every time a chocolate reward was presented, the word “chocolate” appeared in size 28 font in the center of the monitor for 1 s. If participants had not earned 5 rewards within the first 5 min, the experimenter entered the room and informed them that they might need to press the button several times to receive chocolate. The instrumental-acquisition phase finished when participants had earned 12 rewards. Any participant who did not complete the phase within 10 min was excluded from the experiment.
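The reinforcement schedule described above can be sketched in a few lines. This is an illustration only, not the Med Associates program used in the study: the function names are hypothetical, and drawing each ratio uniformly from the integers 5 to 15 is an assumption, because the text reports only the mean (10) and the range.

```python
import random

def make_requirements(n_rewards=12, fade_in=(2, 4, 6), vr_range=(5, 15), rng=None):
    """Presses required for each successive reward.

    The first rewards use the fixed ratios in `fade_in`; the rest are
    drawn from the VR range (uniform integer draw is an assumption).
    """
    rng = rng or random.Random()
    reqs = list(fade_in[:n_rewards])
    while len(reqs) < n_rewards:
        reqs.append(rng.randint(*vr_range))  # inclusive bounds, mean 10
    return reqs

def rewarded_presses(requirements):
    """Cumulative press counts at which a chocolate would be delivered."""
    total, out = 0, []
    for r in requirements:
        total += r
        out.append(total)
    return out

if __name__ == "__main__":
    reqs = make_requirements(rng=random.Random(0))
    print(reqs)
    print(rewarded_presses(reqs))
```

The phase would end at the last value returned by `rewarded_presses`, that is, when the 12th reward has been earned.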
The second phase was the Pavlovian-acquisition phase. The instructions informed participants that they would see some colored lights and asked them not to press the button during this part of the experiment. A differential-conditioning procedure was employed in which the red and blue lights (counterbalanced) served as the two CSs, Stimulus A and Stimulus B. Each CS was presented for 10 s. Stimulus A was paired with the delivery of one chocolate, whereas Stimulus B was presented with no outcome. On Stimulus A trials, the chocolate dispenser was activated 8 s after CS onset. Pilot recording indicated that there was on average a 1-s delay between dispenser activation and participants’ placing the chocolates in their mouth. Hence, the chocolate was usually consumed in the final second of the presentation of the CS. The intertrial interval (offset to onset) was varied randomly from 15 to 35 s. Trials were randomly intermixed, with the restriction that no more than two trials of the same type could be in a row.
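One simple way to generate a trial sequence that satisfies the run-length restriction is to reshuffle until no trial type occurs more than twice in a row. The sketch below is a hypothetical reconstruction: the number of trials per stimulus (`n_per_type`) is not stated in the text and is left as a parameter, and the uniform draw of intertrial intervals between 15 and 35 s is an assumption about how "varied randomly" was implemented.

```python
import random

def max_run(seq):
    """Length of the longest run of identical consecutive items."""
    longest = run = 1
    for a, b in zip(seq, seq[1:]):
        run = run + 1 if a == b else 1
        longest = max(longest, run)
    return longest

def make_trial_order(n_per_type, rng, max_same=2):
    """Shuffle A and B trials until no type repeats more than `max_same` times."""
    trials = ["A"] * n_per_type + ["B"] * n_per_type
    while True:
        rng.shuffle(trials)
        if max_run(trials) <= max_same:
            return list(trials)

def make_itis(n_trials, rng, low=15.0, high=35.0):
    """Intertrial intervals (CS offset to next CS onset), in seconds."""
    return [rng.uniform(low, high) for _ in range(n_trials)]

if __name__ == "__main__":
    rng = random.Random(1)
    order = make_trial_order(6, rng)  # 6 per type is an arbitrary example value
    print(order)
    print([round(x, 1) for x in make_itis(len(order), rng)])
```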
The final phase was the transfer-test phase. The instructions informed participants that they could press the button again. Testing was conducted under both instrumental and Pavlovian extinction. Button presses continued to be recorded, and the CSs were presented, but no chocolates were delivered. The first 2 min comprised instrumental extinction. If participants made any button-press responses during the final 30 s of this 2-min period, extinction was continued for a further 30 s; this process was repeated until participants had refrained from responding for the full 30-s period or 10 min had elapsed, at which point Pavlovian testing commenced. Both Stimulus A and Stimulus B were presented once for 10 s in random order. Button-press responding was recorded in 5-s bins from 30 s before CS onset to 60 s after CS onset. The two stimuli were then presented a second time, again in random order. The intertrial interval was varied randomly from 90 to 110 s. After the transfer-test phase, participants were asked to complete the postexperimental questionnaire. They were then debriefed, offered additional chocolates, and thanked for their participation.
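The extinction criterion and the response-recording window can be expressed as two small functions. This is a minimal sketch under stated assumptions: press times are taken to be seconds from the start of the transfer-test phase, the 10-min limit is assumed to be measured from the start of that phase, and the function names are hypothetical.

```python
def extinction_end(press_times, initial=120.0, window=30.0, cap=600.0):
    """Time (s) at which Pavlovian testing would begin.

    Extinction is extended in 30-s steps while any press falls in the
    most recent 30-s window, up to the 10-min cap.
    """
    end = initial
    while end < cap and any(end - window <= t < end for t in press_times):
        end += window
    return min(end, cap)

def bin_responses(press_times, cs_onset, pre=30.0, post=60.0, width=5.0):
    """Count presses in 5-s bins from 30 s before to 60 s after CS onset."""
    n_bins = int((pre + post) / width)  # 18 bins with the default values
    counts = [0] * n_bins
    start = cs_onset - pre
    for t in press_times:
        idx = int((t - start) // width)
        if 0 <= idx < n_bins:  # presses outside the window are ignored
            counts[idx] += 1
    return counts

if __name__ == "__main__":
    presses = [70.0, 74.9, 75.0, 100.0, 109.9]
    print(extinction_end([100.0]))
    print(bin_responses(presses, cs_onset=100.0))
```

With these defaults, bins 0 to 5 cover the 30-s pre-CS period, and bins 6 and 7 cover a 10-s CS.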
Results and discussion
In the instrumental-acquisition phase, 11 participants received a verbal prompt. Despite this procedure, 4 participants had to be excluded for failing to earn 12 rewards. A further participant was excluded for failing to identify which light was consistently followed by chocolate in the postexperimental questionnaire. For the remaining 50 participants, the mean pretest level of desire for chocolate was 71.6% and the mean posttest level of desire for chocolate was 58.9%, which suggests that participants still had substantial desire for chocolate at the end of the experiment. The primary data of interest, button-press responses during the transfer-test phase, were analyzed with a set of planned contrasts using a multivariate, repeated measures model (R. G. O’Brien & Kaiser, 1985). The main contrast of interest, pre-CS versus CS, compared the response rate averaged across the 30-s pre-CS period with the response rate averaged across the 10-s CS period.
Figure 1 shows button-press responses in 5-s bins for each CS during the transfer-test phase, averaged across the two test trials and across participants. The mean number of pre-CS responses was above 0 because, after having met the instrumental-extinction criterion, many participants resumed pressing at some point during the transfer-test phase. Stimulus B had little effect on response rate, whereas Stimulus A appeared to increase responding for a period, commencing in the second half of the presentation of the CS, followed by recovery to the pre-CS level. The main effect for the A-versus-B contrast was statistically significant, reflecting higher mean responding to Stimulus A than to Stimulus B averaged over the full recording period, F(1, 49) = 5.12, p < .05. The main effect for the pre-CS-versus-CS contrast was not significant, F(1, 49) = 1.44, p > .05. However, the critical comparison, that is, the interaction between these two contrasts, was significant, F(1, 49) = 5.16, p < .05, which confirmed that the presentation of Stimulus A increased button-press responding to a greater degree than did the presentation of Stimulus B. When the pre-CS-versus-CS contrast was applied to each stimulus separately (as in a simple-effects analysis), it approached but did not reach significance for Stimulus A, F(1, 49) = 3.33, p = .074, and was not significant for Stimulus B, F < 1.
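The interaction between the two contrasts can be read as a difference of differences: for each participant, the change in response rate from the pre-CS period to the CS period under Stimulus A, minus the same change under Stimulus B. The sketch below assumes that each single-degree-of-freedom contrast is tested as a one-sample test on these per-participant scores, in which case F(1, n − 1) equals the squared one-sample t statistic. The function names and bin indices are illustrative assumptions, not the authors' analysis code.

```python
from statistics import mean, variance

def rate(bins, lo, hi):
    """Mean responses per 5-s bin over bins[lo:hi]."""
    return mean(bins[lo:hi])

def interaction_score(bins_a, bins_b, pre=(0, 6), cs=(6, 8)):
    """(CS - pre-CS) for Stimulus A minus (CS - pre-CS) for Stimulus B.

    Default indices assume 18 bins: 0-5 are the 30-s pre-CS period and
    6-7 are a 10-s CS.
    """
    d_a = rate(bins_a, *cs) - rate(bins_a, *pre)
    d_b = rate(bins_b, *cs) - rate(bins_b, *pre)
    return d_a - d_b

def contrast_F(scores):
    """F statistic and degrees of freedom for a 1-df within-subject contrast.

    Equivalent to the square of a one-sample t test against zero.
    Raises ZeroDivisionError if all scores are identical.
    """
    n = len(scores)
    return n * mean(scores) ** 2 / variance(scores), (1, n - 1)

if __name__ == "__main__":
    # Made-up scores for three hypothetical participants.
    print(contrast_F([1, 2, 3]))
```

Passing `cs=(6, 12)` instead would compare the pre-CS period with the full 30 s after CS onset.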

Figure 1. Mean number of instrumental responses in each of the 5-s bins before, during, and after presentation of the conditioned stimuli (CSs) in the transfer-test phase of Experiment 1. Stimulus A was a CS that had previously been paired with chocolate; Stimulus B was a CS that had not been paired with chocolate. The dotted lines indicate the onset (0 s) and offset (10 s) of the CS. Error bars represent standard errors of the mean.
Because the CS for chocolate, Stimulus A, appeared to facilitate button-press responding for some time after stimulus offset, we tested an additional post hoc contrast that compared response rate during the 30-s period before CS onset with the 30-s period after CS onset (i.e., the CS plus the subsequent 20 s). The interaction between this new contrast and the A-versus-B contrast was significant, F(1, 49) = 11.45, p < .05, supporting the idea that Stimulus A produced a sustained elevation of responding relative to Stimulus B. Applied to each stimulus separately, the new contrast was nonsignificant for Stimulus B, F < 1, but significant for Stimulus A, F(1, 49) = 6.85, p < .05.
Overall, the data indicated that a CS for a chocolate reward can facilitate instrumental responding to obtain that reward, relative to a differential control CS. The planned analysis showed that this differential facilitation (the interaction between the A-versus-B and pre-CS-versus-CS contrasts) was significant for the period when the CS was physically present, although the simple effect for Stimulus A alone fell just short of significance. The data also suggested that the facilitation effect lasted for some time after CS offset, and this pattern was supported by a post hoc analysis.
Experiment 2
In Experiment 1, the CS for chocolate facilitated instrumental responding for approximately 30 s, even though the CS was only 10 s long. This pattern suggests that the effect produced by the CS persisted and outlasted the duration of the CS itself. In other words, facilitation may not depend directly on a current expectancy of reward, at least as signaled by the CS. Alternatively, the post-CS component of facilitation could have been due to frustration arising from omission of the chocolate signaled by Stimulus A. To further explore these possibilities, we increased the CS duration in Experiment 2 from 10 to 30 s. If the period of facilitation seen in Experiment 1 was triggered by CS onset, then a longer CS should produce a similar pattern. If frustration due to the absence of a reward plays a role, then facilitation should be seen after offset of the 30-s CS, even if the longer CS fails to produce facilitation for its full duration because of weaker CS–unconditioned stimulus contiguity. In fact, there is evidence from the animal-conditioning literature that long-duration appetitive CSs are surprisingly effective in producing general activity and enhancement of goal-directed responding (Lovibond, 1980; Sheffield & Campbell, 1954). Experiment 2 tested whether longer-duration stimuli are similarly effective in humans.
Method
Participants
Participants were 46 students (34 female, 12 male) from the University of New South Wales. Of these participants, 7 received AU$15 as in Experiment 1, and the remaining 39 participated in return for partial fulfillment of a course requirement. Their mean age was 20.6 years (SD = 4.05, range = 17–38). The same eligibility criteria used in Experiment 1 were applied.
Procedure
Experiment 2 used the same equipment and followed exactly the same procedure as Experiment 1 except that the CS duration was increased from 10 to 30 s in both the Pavlovian-acquisition phase and the transfer-test phase. The response-recording period in the transfer-test phase remained the same as in Experiment 1, that is, 30 s before CS onset and 60 s after CS onset. In the planned contrast analysis, the pre-CS-versus-CS contrast compared response rate averaged across the 30-s pre-CS period with response rate averaged across the 30-s CS period.
Results and discussion
In this experiment, 9 participants required a verbal prompt, but only 1 participant was excluded for failing to earn 12 rewards in the instrumental-acquisition phase. For the remaining 45 participants, the mean pretest level of desire for chocolate was 63.0% and the mean posttest level of desire for chocolate was 57.4%, which again confirmed that desire was maintained at a moderate level throughout the experiment.
Figure 2 shows button-press responses in 5-s bins for each CS during the transfer-test phase, averaged across the two test trials and across participants. The pattern of data was broadly similar to that in Experiment 1. Stimulus B again appeared to have little effect on responding. By contrast, Stimulus A produced an elevation of responding that started in the first 5-s block of the CS and peaked over the next two blocks. Responding then declined, although it remained above that produced by Stimulus B for the remainder of the CS and then immediately declined to a level similar to that produced by Stimulus B. There was no indication of a frustration effect after the nonoccurrence of the predicted reward at the end of the presentation of the CS. In the statistical analysis, the main effect for the A-versus-B contrast was not statistically significant, F(1, 44) = 2.51, p > .05, but the main effect for the pre-CS-versus-CS contrast was significant, F(1, 44) = 6.09, p < .05, reflecting higher responding averaged across the two stimuli during the CS period than during the preceding period. Critically, the interaction between the two contrasts was significant, F(1, 44) = 8.18, p < .05, which confirmed that Stimulus A produced a greater increase in responding during the presentation of the CS than did Stimulus B. Applied to each stimulus separately, the pre-CS-versus-CS contrast was significant for Stimulus A, F(1, 44) = 10.84, p < .05, but not for Stimulus B, F < 1.

Figure 2. Mean number of instrumental responses in each of the 5-s bins before, during, and after presentation of the conditioned stimuli (CSs) in the transfer-test phase of Experiment 2. Stimulus A was a CS that had previously been paired with chocolate; Stimulus B was a CS that had not been paired with chocolate. The dotted lines indicate the onset (0 s) and offset (30 s) of the CS. Error bars represent standard errors of the mean.
General Discussion
In both experiments, a stimulus that had been paired with chocolate facilitated performance of an action to obtain chocolate. The time course of facilitation was similar in the two experiments, approximately 30 s. The effect appeared to be a direct consequence of the CS rather than an indirect frustration effect arising from the absence of a reward. We explored the demographic and individual-difference data (e.g., postexperiment desire ratings and body mass index) for possible associations with the facilitation effect but observed no consistent patterns. Thus, within the present samples, the effect appears to be general rather than restricted to an identifiable subset of participants. Previous research had established that stimuli paired with chocolate in the laboratory can elicit craving (Van Gucht et al., 2008). In this study, we extended that finding to show that stimuli paired with chocolate can facilitate voluntary goal-directed action to obtain chocolate. To our knowledge, this is the first demonstration of appetitive PIT in humans in which participants were free to respond for a natural reward. We suggest that compared with designs used in previous research, this design provides a closer analogue of the modulation of goal-directed behavior by reward-related cues in natural settings.
An important finding in Experiment 2 was that the facilitation effect could be obtained with a relatively long (30-s) stimulus and that it was present from the beginning of the presentation of the CS. Thus, close contiguity between the stimulus and the reward is not necessary for the stimulus to exert control over behavior. This pattern is similar to that obtained in animal studies, in which long- and variable-duration cues, including contextual cues, can produce elevations in general activity as well as facilitate instrumental responding (Lovibond, 1980). If the present effects hold at even longer intervals (of minutes or hours), they could have important clinical implications for therapeutic interventions that attempt to reduce the impact of reward-associated stimuli by cue exposure (Conklin & Tiffany, 2002). To date, most cue-exposure and cue-reactivity research has focused on cues that are highly proximal to consumption of the reward, such as the sight of intravenous-injection apparatus or the sight and smell of an alcoholic beverage (e.g., Drummond & Glautier, 1994; C. P. O’Brien, Greenstein, Ternes, McLellan, & Grabowski, 1979). Our results provide preliminary evidence that more distal cues may also play an important role. In fact, distal cues represent promising targets for intervention for two reasons. First, they may be critical in initiating reward-seeking behavior. Second, it may be easier to divert a person to an alternative course of action after exposure to distal cues because at that point, he or she should be less committed to the goal of obtaining the target reward.
Several questions raised by the present experiments would be useful to pursue further. First, it would be interesting to test whether the facilitation effect is observed in more complex scenarios, for example, with competing tasks or a cost to responding, which often occur in nonlaboratory settings. Second, it would be valuable to assess desire or craving for the reward during the experiment to test whether the impact of reward-associated cues on instrumental responding is mediated by increased desire. We did not assess desire in this study because we did not want to interfere with ongoing instrumental responding crucial for demonstrating the initial effect.
Finally, it is important to investigate the role of cognitive variables, such as reward expectancy. There is evidence that cognitive variables play an important role in associative learning (see Mitchell, De Houwer, & Lovibond, 2009). However, there are some puzzling findings regarding the role of expectancy in associative modulation of appetitive behavior that need to be better understood. For example, extinction of a cue for reward does not necessarily prevent it from facilitating instrumental behavior in a PIT design (Delamater, 1996; Rosas, Paredes-Olay, García-Gutiérrez, Espinosa, & Abad, 2010). Similarly, Van Gucht et al. (2008) found that extinction of their Pavlovian cues eliminated chocolate expectancy but not chocolate craving. There are also paradoxical effects of reward devaluation on instrumental choice (Hogarth & Chase, 2011). It is possible that reward-associated cues have an impact on motivation and behavioral intention through quite extended cognitive sequences, as proposed, for example, in Kavanagh, Andrade, and May’s (2005) “elaborated intrusion” model of desire. In this context, the present procedure offers a potential laboratory model for further explicating the relationships among reward expectancy, motivation, and goal-directed behavior.
In summary, the current study tested the extent to which a reward-associated cue could increase humans’ responding to obtain a natural reward. The pattern of results was clear across both experiments. Participants pressed a button that had previously given them access to chocolate substantially more often when presented with a chocolate-paired cue than when presented with a control cue, even though chocolate was not physically present. These results show that reward-associated cues can trigger actions to obtain that reward under conditions in which participants are free to respond or not respond on a voluntary basis. The results are relevant to understanding environmental modulation of both normal and pathological reward-directed behavior. The effectiveness of relatively long-duration (30-s) cues suggests that distal cues may serve as a useful point of intervention in the clinic because they occur further away from consumption and before actions to obtain or consume the reward have been initiated—that is, before it is too late.
Acknowledgements
The authors thank Jasmine Fardouly, Jessica Lee, and Georgia McClure for carrying out the data collection.
Declaration of Conflicting Interests
The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.
Funding
This research was supported by Grant DP130103570 from the Australian Research Council.
