Abstract
That reward inhibits pain is a well-supported observation in both humans and animals; this inhibition allows pain reflexes to be suppressed so that rewarding stimuli can be acquired. However, a blanket inhibition of pain by reward would also impair pain discrimination. In contrast, early counterconditioning experiments implied that reward might actually spare pain discrimination. To test this hypothesis, we investigated whether discriminative performance was enhanced or inhibited by reward. We found in adult human volunteers (N = 25) that pain-based discriminative ability is actually enhanced by reward, especially when reward is directly contingent on discriminative performance. Drift-diffusion modeling shows that this relates to an augmentation of the underlying sensory signal strength and is not merely an effect of decision bias. This enhancement of sensory-discriminative pain-information processing suggests that whereas reward can promote reward-acquiring behavior by inhibition of pain in some circumstances, it can also facilitate important discriminative information of the sensory input when necessary.
The traditional logic of giving children sweets after an injury or to endure vaccinations has a sound basis in theories of motivational opponency. Human and animal studies have shown that reward can inhibit pain (Becker, Gandhi, Elfassy, & Schweinhardt, 2013; Dum & Herz, 1984; Leknes & Tracey, 2008), allowing suppression of pain reflexes to promote reward-acquiring behavior (Fields, 2018). Conversely, in situations in which pain is judged to be more important than reward consumption, pain can also be facilitated, prioritizing pain avoidance or escape (Becker et al., 2013; Becker, Gandhi, Chen, & Schweinhardt, 2017; Fields & Margolis, 2015). Both cases illustrate how pain motivation and pain affect can be dissociated from perceptual discrimination—a concept initially proposed in Melzack’s tripartite model of pain, which considered pain as constructed of distinct sensory-discriminative, affective-motivational, and cognitive subcomponents (Melzack & Casey, 1968).
A dissociation between discrimination and pain affect and motivation was famously shown in a classic counterconditioning experiment (Dickinson & Pearce, 1977; Erofeeva, 1921). When painful shock was associated with a subsequent reward, appetitive responses gradually and completely replaced pain responses, indicating that the pain had lost its aversiveness. However, in this situation, discriminative pain responses must still be present to allow the reward to continue to be predictable. This leads to the prediction that reward can spare pain discrimination, although this has not previously been directly tested. This prediction creates competing hypotheses: Either reward provides a blanket inhibition of pain in line with the conventional notion of motivational opponency or reward is able to specifically enhance pain discrimination. In the current investigation, we tested whether discriminative performance was enhanced or inhibited by reward.
Method
Participants
To avoid overestimating experimental power, we based our a priori sample-size calculation on traditional analysis strategies (repeated measures analysis of variance, ANOVA) rather than on linear mixed models (LMMs) or drift-diffusion models. This calculation indicated that 23 participants would be needed to detect a medium effect size (f = .25) with α = .05 and power (1 − β) = .80 using a repeated measures within-subjects design (the analysis was conducted using G*Power 3.1; Faul, Erdfelder, Lang, & Buchner, 2007). Anticipating an attrition rate of 10%, we recruited 25 adults (21 women; age: M = 22.87 years, SD = 7.17), who participated in the study after giving written informed consent. Data collection was stopped when the prespecified target of 25 participants was reached. Participants consisted of a convenience sample recruited via online advertisements at the local university. Exclusion criteria for participation were chronic pain and a history of psychiatric or neurological diseases. The study was approved by the Ethics Committee II of the Medical Faculty Mannheim of Heidelberg University.
General procedure
Each participant completed three testing sessions on separate days. Each testing session began with a baseline assessment of the participant’s pain sensitivity, which was used to individually adjust stimulus intensity in the subsequent pain-discrimination task. This task was performed in three conditions, one per session, in counterbalanced order across participants; the conditions operationalized different reward contingencies.
Thermal stimulation
Participants received thermal stimuli on the ball of the thumb (thenar eminence) of their nondominant hand. Stimuli were applied using a thermode (MSA Thermotest; Somedic SenseLab AB, Sösdala, Sweden), which was embedded in a Styrofoam hemisphere to allow comfortable placement of the hand. The thermode size was 2.5 cm × 5 cm. Baseline temperature was kept constant at 40° C to avoid long temperature rise times (rate of temperature change = 5.5° C per second) to the target temperatures in the pain-discrimination task. For safety reasons, the maximal temperature was 50° C.
Statement of Relevance
The behavior of humans and other animals is determined by sensory inputs and motivational drives both positive and negative. We typically think of positive and negative motivations as inhibiting one another. This is the logic behind giving children sweets after an injury or in anticipation of a vaccination—they will feel less pain if they are experiencing pleasure. But mutual inhibition may not always be beneficial, such as when the ability to discriminate pain helps you to obtain reward. In such cases, reward may actually enhance pain, at least in terms of discriminatory ability. In this research, we tested whether monetary rewards would boost discrimination of an increase of a painful stimulus (heat). We found that indeed it could, especially when the reward was directly contingent on discrimination performance. These results show that reward and pain do not have a blanket mutual inhibitory relation. Instead, pain is selectively tuned according to the broader goals of the individual.
Stimulus-intensity calibration
To adjust stimulus intensity during the discrimination task to participants’ individual pain sensitivity, we assessed participants’ heat-pain threshold and heat-pain tolerance using a method-of-limits procedure. In this procedure, the temperature at the thermode slowly increased by 1° C per second. Participants were instructed to press a button twice—first, when the pain threshold was reached (i.e., when they perceived the first painful sensation), and second, when the pain tolerance was reached (i.e., when they could not tolerate any further increase of the temperature), after which the temperature decreased immediately. This procedure was repeated four times, and the average of the last three repetitions was used as an estimate of pain threshold and pain tolerance.
In a subsequent calibration step implemented using a simple staircase procedure, participants received pain stimuli for 20 s. Using a horizontally oriented numerical rating/visual analog scale (VAS) ranging from 0 (no sensation) to 200 (most intense pain tolerable), with 100 being the pain threshold (Becker, Gandhi, Pomares, Wager, & Schweinhardt, 2017), participants constantly rated the perceived intensity of this stimulation, which could vary because of habituation and sensitization. The first trial of this calibration started with a stimulus intensity of the pain threshold plus 50% of the difference between the pain threshold and pain tolerance, as assessed before. If the rating of this stimulus intensity at the end of the stimulation was lower than 130 or higher than 150 (140 ± 10) on the VAS, the stimulus intensity of the next trial was increased or decreased. For every 10 points on the VAS below 130 or above 150 (140 ± 10), the temperature was increased or decreased by 0.1° C. The calibration procedure was stopped if the rating fell in the target range of 130 to 150, and the resulting stimulus intensity was then used for the stimulation temperature of the discrimination task, in which we aimed for a moderately painful stimulation intensity.
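The staircase rule described above can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation: the function names are hypothetical, and the text does not say how partial 10-point steps were handled, so rounding up to the next full step is an assumption.

```python
import math

TARGET_LOW, TARGET_HIGH = 130, 150  # target VAS range (140 +/- 10), from the text
STEP_C = 0.1                        # temperature change per 10 VAS points, from the text
MAX_TEMP_C = 50.0                   # safety limit of the thermode, from the text


def starting_temperature(threshold_c, tolerance_c):
    """First calibration trial: threshold plus 50% of the threshold-tolerance range."""
    return threshold_c + 0.5 * (tolerance_c - threshold_c)


def next_temperature(current_c, rating):
    """Return the temperature for the next calibration trial.

    Assumption (not stated in the text): a partial 10-point step is
    rounded up, so a rating of 112 counts as two steps below 130.
    """
    if TARGET_LOW <= rating <= TARGET_HIGH:
        return current_c  # rating is in range, calibration stops here
    if rating < TARGET_LOW:
        steps = math.ceil((TARGET_LOW - rating) / 10)
        new_c = current_c + steps * STEP_C
    else:
        steps = math.ceil((rating - TARGET_HIGH) / 10)
        new_c = current_c - steps * STEP_C
    return round(min(new_c, MAX_TEMP_C), 1)
```

For example, with a threshold of 44° C and a tolerance of 48° C, the first trial would start at 46° C, and a rating of 112 would raise the next trial to 46.2° C under the rounding assumption above.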
Perceptual-discrimination task
Participants performed a two-alternative forced-choice one-interval discrimination task (see Fig. 1c), in which a thermal stimulus was applied and held within the noxious range for several seconds. During this period, a short additional thermal pulse of varying magnitude (0.2° C, 0.4° C, 0.6° C, 0.8° C) could sometimes occur, and the participant was required to detect it.

Fig. 1. Overview of the drift-diffusion model and its parameters (a), expected effects of reward on model parameters (b), and design of the discrimination task (c). Drift-diffusion models (a) are based on the assumption that a decision is made when the noisy input as a stochastic process reaches one of two decision thresholds; the distance between them is described as the “boundary separation a,” characterizing the conservativeness of a decision and how much evidence is needed. Sensory input provides the evidence accumulated from a certain “starting point z,” which can be shifted by a priori biases. Speed of evidence accumulation is described as the “drift rate v” and reflects the strength of the sensory signal. The total reaction time also includes processes not related to the decision process, such as stimulus encoding and response execution, described as the “nondecision time t.” Drift-diffusion models incorporate separate distributions of reaction times for correct and incorrect responses, simultaneously using information on mean reaction times and their variance as well as response accuracy. As illustrated in (b), it is hypothesized that monetary reward given contingently on correct discrimination of a thermal nociceptive pulse increases drift rates v for correct responses with a simultaneous decrease in drift rate for incorrect responses (reaction time distribution illustrated in dark gray) compared with noncontingent and nonreward conditions (reaction time distribution illustrated in light gray). At the beginning of each trial of the discrimination task (c), a thermal probe is heated and held to an individually adjusted painful target temperature. After a variable delay, a pulse of varying magnitude may be presented (0.2° C, 0.4° C, 0.6° C, 0.8° C) or not, and the participant is required to identify the presence or absence of a pulse with a button press for each. In the reward conditions, feedback was provided immediately afterward.
In each reward condition, participants performed 90 trials (18 of each stimulus magnitude) in blocks of 30 trials with breaks of 1 min between each block.
At the beginning of each trial, the stimulation intensity increased to the individually adjusted temperature determined before. After an interval of 4 s to allow perception to reach a steady state, a visual cue indicated the start of the detection interval. This cue was followed by a variable interval of 500 ms to 1,500 ms, after which the thermal pulse could be delivered (test trials) or the temperature stayed constant (control trials). Immediately afterward, participants indicated whether or not they had felt a pulse by pressing a corresponding button on a response unit. Depending on the experimental reward condition, participants then received a monetary reward of €0.10 or nothing. At the end of each trial, the stimulation intensity decreased to baseline, and after a break of 5 s, the next trial started.
The task was performed in three reward conditions. In the contingent-reward condition, a monetary reward was given immediately for each correct answer, which implemented a continuous reinforcement schedule. In the noncontingent-reward condition, reward was provided on a comparable number of trials but was independent of participants’ performance. In the nonreward condition, the participant received no reward and no feedback on whether responses were correct during the task but was awarded a comparable amount of money at the end of the experiment for taking part. The noncontingent-reward condition was implemented via a yoked procedure in which monetary rewards were delivered in the same trial order as the rewards earned by another participant in the contingent-reward condition. Participants were not informed of any reward contingencies.
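The three reward contingencies, including the yoked schedule, can be illustrated with a minimal Python sketch. The function name, the condition labels, and the use of integer cents are hypothetical choices made for the example; only the €0.10 reward per trial and the yoking logic come from the text.

```python
REWARD_CENTS = 10  # EUR 0.10 per rewarded trial, as stated in the text


def trial_rewards(condition, own_correct, donor_correct=None):
    """Return the reward (in cents) delivered on each trial.

    own_correct:   per-trial correctness of the current participant
    donor_correct: per-trial correctness of another participant's
                   contingent-reward session (needed only for yoking)
    """
    if condition == "contingent":
        flags = own_correct                  # reward follows own performance
    elif condition == "noncontingent":
        if donor_correct is None or len(donor_correct) != len(own_correct):
            raise ValueError("yoking needs a donor session of equal length")
        flags = donor_correct                # reward follows the donor's order
    elif condition == "nonreward":
        flags = [False] * len(own_correct)   # no reward during the task
    else:
        raise ValueError(f"unknown condition: {condition}")
    return [REWARD_CENTS if flag else 0 for flag in flags]
```

In the noncontingent case, the current participant's own responses have no influence on the schedule, which is what makes the rewards comparable in number but independent of performance.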
The discrimination task comprised 90 trials in total, with 18 trials of each of the five temperature-pulse conditions (0.2° C, 0.4° C, 0.6° C, 0.8° C, and no change). The trials were presented in pseudorandom order and separated into blocks of 30 trials with breaks of 1 min between each block. Before and after each of these blocks, participants rated the perceived intensity of constant thermal stimuli of 20-s duration at the target temperature of the discrimination task. These intensity ratings were used to readjust the target temperature of the discrimination task to fall within the target range of 130 to 150 on the VAS for the following block of trials to account for habituation and sensitization. If the rating at the end of the constant stimulation was below 130 or above 150 on the VAS, the stimulation intensity was adjusted as in the calibration procedure, aiming at a rating of between 130 and 150. The readjusted temperature was used as the stimulation intensity for the subsequent block of 30 trials.
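The trial structure described above (five pulse conditions, 18 trials each, three blocks of 30) can be sketched as follows. The text does not specify which constraints the pseudorandom order obeyed or whether pulse conditions were balanced within blocks, so this sketch assumes a plain seeded shuffle across all 90 trials; the function name is hypothetical.

```python
import random

PULSES_C = [0.0, 0.2, 0.4, 0.6, 0.8]  # 0.0 stands for "no change" control trials


def make_trial_sequence(seed=None, reps=18, block_size=30):
    """Return a list of blocks, each a list of pulse magnitudes in deg C.

    Assumption: an unconstrained shuffle over all trials; the original
    pseudorandomization scheme is not described in the text.
    """
    rng = random.Random(seed)
    trials = [pulse for pulse in PULSES_C for _ in range(reps)]
    rng.shuffle(trials)
    return [trials[i:i + block_size] for i in range(0, len(trials), block_size)]
```

Calling `make_trial_sequence(seed=42)` yields three blocks of 30 trials in which every pulse condition appears 18 times overall.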
Data analysis
None of the participants or observations were excluded from the analyses.
Behavioral data
For the analysis of response accuracy and reaction times, separate LMMs were calculated with reward condition (nonreward, contingent reward, noncontingent reward) and pulse (0.2° C, 0.4° C, 0.6° C, 0.8° C, no change) as within-subjects fixed factors and the interaction of reward condition and pulse; either response accuracy or reaction time was the dependent variable. For the analysis of VAS ratings and stimulation intensities, separate LMMs were calculated with reward condition (nonreward, contingent reward, noncontingent reward) and time (Blocks 1, 2, and 3) as within-subjects fixed factors and the interaction of reward condition and time; either VAS rating or stimulus intensity was the dependent variable. For the analysis of the VAS ratings, only the ratings after each block of 30 trials were used. All LMM analyses included the participant as a random intercept. Significant main effects and interactions of the LMMs were followed by post hoc pairwise comparisons. The significance level (α) was set to .05. Where appropriate, correction for multiple testing was applied using a false-discovery rate (Benjamini & Hochberg, 1995) to avoid alpha inflation. Exact p values are reported with significances corrected for multiple comparisons. Because of the way variance is partitioned in LMMs (e.g., Rights & Sterba, 2019), there is no agreed-on method to calculate standard effect sizes for individual model terms such as main effects or interactions; therefore, we do not report effect sizes for main effects and interactions related to LMM analyses. Nevertheless, LMMs were used here because mixed models are superior to alternative approaches in controlling for Type I errors; consequently, results from mixed models are more likely to generalize to new observations (e.g., Barr, Levy, Scheepers, & Tily, 2013).
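As an illustration of the false-discovery-rate correction mentioned above, the following is a minimal pure-Python sketch of the Benjamini-Hochberg step-up adjustment. It is a generic textbook version with a hypothetical function name, not the software routine the authors used, and the p values in the usage note are made up for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

With the illustrative inputs `[0.01, 0.04, 0.03, 0.005]`, the adjusted values are approximately 0.02, 0.04, 0.04, and 0.02, so all four comparisons would remain below α = .05.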
For post hoc pairwise comparisons, effect sizes (Cohen’s ds) of the means and standard deviations of the respective comparison were calculated by dividing the difference of the means by the pooled standard deviation.
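The effect-size computation can be written out as a short Python sketch. The text does not give the pooling formula; the version below assumes the common equal-sample-size form, in which the pooled standard deviation is the root mean square of the two standard deviations. The function name and the numbers in the usage note are illustrative only.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d: mean difference divided by the pooled standard deviation.

    Assumption: equal group sizes, so the pooled SD is
    sqrt((sd_a**2 + sd_b**2) / 2).
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd
```

For example, `cohens_d(10, 2, 8, 2)` gives 1.0, that is, a difference of one pooled standard deviation.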
Hierarchical drift-diffusion model
Drift-diffusion models were used to differentiate changes in perceptual signal strength and response biases (see Fig. 1a). Computational drift-diffusion models (Ratcliff & McKoon, 2008) are biologically realistic models that have proved successful in models of vision and other sensory domains, explaining behavior and neurophysiological responses (Heekeren, Marrett, & Ungerleider, 2008). These models consider decisions as a process of accumulating sensory input. If a certain threshold is reached, a decision will be made (Ratcliff & McKoon, 2008). Parameters of the decision-making process are estimated using response accuracy (incorporating correct and incorrect responses) and reaction time distributions. The following parameters are estimated in drift-diffusion models: (a) the starting point (z), the point between the two boundaries at which the sampling process as accumulation of evidence starts, with a shift of z toward one of the boundaries resulting in less information being needed to reach the boundary and thus indicating an a priori response bias; (b) the drift rate (v), the speed at which the evidence is accumulated, with higher values indicating a faster accumulation speed; (c) the boundary separation (a), which describes the distance between the two decision boundaries, with a smaller distance indicating that less information is needed for a decision; and (d) the nondecision time (t), the time required for stimulus encoding and response execution (i.e., the time not required for decision-making but reflected in the total reaction time; see Fig. 1a).
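To make the roles of the four parameters concrete, the following Python sketch simulates single trials of a basic drift-diffusion process by stepping the accumulated evidence forward in small time increments. It is a forward simulation for illustration, not the estimation procedure used in this study. The function name, the step size, the unit noise scaling, the time limit, and the convention that z is expressed as a proportion of a are all assumptions, and intertrial variability is omitted.

```python
import math
import random


def simulate_trial(v, a, z, t, rng, dt=0.001, noise=1.0, max_time=10.0):
    """Simulate one drift-diffusion trial.

    v: drift rate, a: boundary separation, z: relative starting point
    in (0, 1), t: nondecision time in seconds.
    Returns (choice, reaction_time); choice is 1 for the upper boundary,
    0 for the lower boundary, and None if no boundary is reached in time.
    """
    evidence = z * a
    elapsed = 0.0
    step_sd = noise * math.sqrt(dt)  # standard deviation of one noise step
    while elapsed < max_time:
        evidence += v * dt + rng.gauss(0.0, step_sd)
        elapsed += dt
        if evidence >= a:
            return 1, t + elapsed
        if evidence <= 0.0:
            return 0, t + elapsed
    return None, None
```

Running many trials with a larger v produces a higher proportion of upper-boundary responses and shorter reaction times, whereas shifting z changes the proportion of responses without changing signal strength. This is the distinction the model is used to draw between sensory signal and decision bias.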
Here, the parameters of the model were estimated with the hierarchical drift-diffusion model as a hierarchical Bayesian estimation of the drift-diffusion model using a Python-based algorithm (Wiecki, Sofer, & Frank, 2013; http://ski.clps.brown.edu/hddm_docs). Compared with traditional drift-diffusion models, the hierarchical model requires fewer data points per participant because parameter estimation is based on a group distribution (Vandekerckhove, Tuerlinckx, & Lee, 2011). The best-fitting model was chosen on the basis of the deviance information criterion. On the basis of this criterion, separate models for each reward condition (no reward, contingent reward, and noncontingent reward) were calculated, and for each reward condition, the model was set up to allow the drift rate to differ between pulse conditions, while starting point, boundary separation, and nondecision time were kept fixed, on the basis that these variables should not depend on pulse magnitude. Intertrial variability was allowed for the drift rate. The probability distributions of the parameters (i.e., the posterior distributions) were calculated using Markov chain Monte Carlo (MCMC) sampling methods. Noninformative priors were used, that is, uniform distributions with equal probabilities across a range of parameter values. We generated 20,000 samples for each model, discarding the first 15,000 samples because early samples are regarded as unreliable because of random selection of initial values. Samples were further thinned by a factor of 5. MCMC convergence was reached for all models as indicated by the diagnostics of the hierarchical drift-diffusion models (all
Differences in the drift rate v between thermal pulses within each reward condition and between reward conditions, as well as differences in the starting point z, the boundary separation a, and the nondecision time t between the reward conditions, were assessed by testing the overlap of the posterior distributions of the drift rates for no pulses (0.0° C) compared with the other pulse conditions (0.2° C, 0.4° C, 0.6° C, 0.8° C). Statistical inference was based on computing tail areas of contrast posteriors, with a difference being assumed if the probability of an overlap between distributions was smaller than 2.5%. Here, Pr(X) denotes the posterior probability that a proposition X is true, given the data at hand. If this probability is very low, the complement of X is inferred. For example, to test whether the drift rate v is higher (i.e., closer to the upper boundary, the correct response) with 0.4° C pulses than with 0.2° C pulses, we would evaluate the proposition v0.4°C ≤ v0.2°C by computing the posterior probability Pr(v0.4°C ≤ v0.2°C), that is, the probability that the difference v0.4°C − v0.2°C is at most 0. If this probability is greater than 97.5%, v0.4°C ≤ v0.2°C is inferred, that is, the drift rate for 0.4° C pulses is lower than or the same as the drift rate for 0.2° C pulses. If the probability is less than 2.5%, v0.4°C > v0.2°C is inferred, that is, the drift rate for 0.4° C pulses is higher than the drift rate for 0.2° C pulses. For the specific hypothesis that operant reinforcement increases signal strength, we tested Pr(vnoncontingent ≤ vcontingent) and Pr(vno ≤ vcontingent). If these probabilities were greater than 95%, vnoncontingent ≤ vcontingent and vno ≤ vcontingent were inferred. Tail areas of contrast posteriors are a Bayesian analog to classical p values associated with one- or two-tailed tests.
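The tail-area logic described above can be sketched in Python as a comparison of posterior samples. This is an illustration under stated assumptions: the function names are hypothetical, the sample values are made up, and pairing the draws of two posteriors by index is an assumption about how the contrast is formed.

```python
def prob_leq(samples_a, samples_b):
    """Estimate Pr(a <= b) as the share of paired draws with a <= b."""
    if not samples_a or len(samples_a) != len(samples_b):
        raise ValueError("need two non-empty sample lists of equal length")
    hits = sum(1 for a, b in zip(samples_a, samples_b) if a <= b)
    return hits / len(samples_a)


def infer(probability, tail=0.025):
    """Apply the decision rule from the text to Pr(a <= b)."""
    if probability < tail:
        return "a > b"
    if probability > 1 - tail:
        return "a <= b"
    return "no difference inferred"
```

For instance, `prob_leq([1.0, 1.2, 0.9, 1.1], [0.5, 0.6, 1.0, 0.4])` returns 0.25, which falls between the two tails, so no difference would be inferred. Passing `tail=0.05` corresponds to the 95% criterion used for the directional hypothesis about contingent reward.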
Results
Detection accuracy increased with pulse magnitude in all three reward conditions (see Fig. 2a), which is mirrored in a main effect of pulse magnitude for the analysis of accuracy (percentage correct), F(4, 24) = 97.94, p < .001. Accuracy also differed between reward conditions (main effect of reward condition: F(2, 24) = 3.66, p = .040): The highest accuracy was in the contingent-reward condition, and there was no difference between the noncontingent and no-reward conditions (across pulse magnitudes, contingent reward: M = 78%, SD = 21%; noncontingent reward: M = 74%, SD = 20%; no reward: M = 74%, SD = 20%; post hoc comparisons: mean difference between contingent and noncontingent reward = 0.04, df = 24, p = .031, d = 0.16; mean difference between contingent and no reward = −0.04, df = 24, p = .054, d = 0.19; mean difference between noncontingent and no reward = 0.003, df = 24, p = .886, d = 0.01). In addition, we found a significant interaction between pulse magnitude and condition, F(8, 24) = 5.92, p < .001 (for post hoc comparisons, see Table S1 in the Supplemental Material available online).

Fig. 2. Results. Detection accuracy (a) is shown as a function of reward condition and stimulus magnitude. Reaction time for correct and incorrect responses (b) is shown as a function of stimulus magnitude. Posterior probability for drift rates (c) is shown for each reward condition across pulse magnitudes. The intensity rating of baseline stimulation temperature after each block of 30 trials of the discrimination task pooled across reward conditions is shown in (d), and the baseline stimulation intensity of each block of 30 trials of the discrimination task pooled across reward conditions is shown in (e). In the box-and-whisker plots (a, b, d, e), the horizontal lines within the boxes indicate the medians, the top and the bottom edges of the boxes mark the upper and lower limits of the interquartile ranges, and the whiskers extend to 1.5 times the interquartile ranges. Small circles indicate outliers. Symbols in (a) indicate marginally significant and significant differences between reward conditions (†p < .10, *p < .05), and the asterisk in (b) indicates a significant difference between response types (p < .05). In (c), the asterisks indicate significant differences between reward conditions (p < .05).
Reaction times were shorter for correct responses than for incorrect responses (see Fig. 2b; main effect of response: F(1, 374) = 100.98, p < .001) and shorter with increasing pulse magnitude (see Fig. 2b; main effect of pulse magnitude: F(4, 498) = 3.37, p = .01), but there was no effect of reward condition (main effect of reward condition: F(2, 142) = 1.15, p = .318).
The VAS pain ratings of the painful ramped baseline temperature did not differ between reward conditions (see Fig. 2d), F(2, 192) = 2.42, p = .091. The pain ratings also did not differ between blocks (assessed after each block of 30 trials)—main effect of block: F(2, 192) = 0.69, p = .502. Similarly, temperature of the baseline (which was adjusted on the basis of variations of ratings) did not differ between reward conditions (see Fig. 2e)—main effect of reward condition: F(2, 195) = 0.27, p = .767—and blocks—main effect of block: F(2, 196) = 0.23, p = .797.
These findings provide an initial suggestion that contingent reward might enhance pain-discrimination ability. However, the overall percentage of correct responses cannot capture the complex nature of the discrimination process, which involves a number of underlying processes. Therefore, we investigated which component of the perceptual decision-making process was most sensitive to reward by applying a drift-diffusion-model analysis (see Fig. 1). As shown in Figure 2c, the drift rate (i.e., signal strength) increases with increasing pulse magnitude; this is expected because the magnitude of peripheral stimulation is the main determinant of drift-rate strength. But critically, reward condition also had an effect on the drift rate (see Fig. 2c). Specifically, pairwise comparisons of posterior probabilities showed a higher drift rate with contingent reward than with no reward for all pulse magnitudes (0.2° C: Pr(vcontingent ≤ vno) = 0.048, 0.4° C: Pr(vcontingent ≤ vno) = 0.017, 0.6° C: Pr(vcontingent ≤ vno) = 0.008, 0.8° C: Pr(vcontingent ≤ vno) = 0.003) and a higher drift rate with contingent than with noncontingent reward for pulse magnitudes of 0.4° C, Pr(vcontingent ≤ vnoncontingent) < 0.001. Noncontingent reward increased the drift rate in comparison with no reward only with pulses of 0.4° C: Pr(vnoncontingent ≤ vno) = 0.012. For a full list of pairwise comparisons, see Table S2 in the Supplemental Material. We found no effect of reward condition on the other model parameters (i.e., decision bias, decision conservativeness, and nondecision time; see Fig. S1 in the Supplemental Material).
Discussion
The results indicate that reward can specifically augment sensory nociceptive information in healthy adults. This means that, in addition to the inhibitory effect of reward on pain motivation and pain affect (Becker et al., 2013; Dum & Herz, 1984; Leknes & Tracey, 2008), reward can have excitatory effects on sensory-discriminative pain processing in some circumstances.
A key piece of evidence that underlies the tripartite model of pain initially proposed by Melzack (Melzack & Casey, 1968) is the counterconditioning experiment from Erofeeva and Pavlov (Dickinson & Pearce, 1977; Erofeeva, 1921). The emergence of appetitive responses gradually replacing pain responses showed that some aspect of the nociceptive stimulus was still able to predict reward. However, there have been few objective demonstrations using behavioral measures that reward might have facilitatory effects on pain discrimination. Here, we clearly showed that reward can enhance discriminative processing in healthy adults, directly refuting the notion that reward inhibits pain as a universal phenomenon. No impact of reward on pain ratings was found here, likely because participants rated pain in extra trials without the discrimination task and without reward.
Mechanistically, there are at least three possible ways in which reward might have differential co-occurring effects on different components of pain. First, pain could still be inhibited by reward overall, while discrimination is selectively enhanced by increasing the precision of information processing in the brain (i.e., sharpening the signal-to-noise ratio). Second, reward-based descending inhibition of pain might have a selective effect on different ascending nociceptive pathways. Some evidence indicates that descending inhibition can have co-occurring differential effects on populations of C-fiber-responsive and A-delta-fiber-responsive dorsal horn neurons (Heinricher, Tavares, Leith, & Lumb, 2009). Third, it may be that in some behavioral contexts, reward globally enhances pain, for instance by attentional effects arising from specific goal-directed behavioral and task requirements. This remains possible because although our data show enhancement of discrimination, we found no effect on affective behavior and ratings, and the absence of such an effect is not by itself sufficient to infer a functional dissociation.
We also found that discrimination could be enhanced, although to a lesser extent, simply by the presence of reward not contingent on performance (i.e., increase of the drift rate for noncontingent compared with no reward for a pulse magnitude of 0.4° C). Thus, the mere presence of a rewarding context might have a nonspecific enhancing effect on pain processing (Clark, Lawrence, Astley-Jones, & Gray, 2009), leading to a higher drift rate. This was noted at temperature differences of 0.4° C, which is close to the just-noticeable difference in heat pain, suggesting that the effect might interact with judgment uncertainty.
In summary, the results suggest that reward can specifically enhance pain discrimination in healthy adults. In principle, this provides a way for organisms to suppress unnecessary pain responding in the pursuit of greater reward without corrupting useful discriminative information. However, whether these results can be generalized to other populations, for example, patients with chronic pain, must be tested in future investigations.
Supplemental Material
Supplemental material, Becker_Supplemental_Material_rev for Reward Enhances Pain Discrimination in Humans by Susanne Becker, Martin Löffler and Ben Seymour in Psychological Science
Footnotes
Acknowledgements
We thank Flavia Mancini and Nobuhiro Hagura for providing comments on the manuscript.
Transparency
Action Editor: Daniela Schiller
Editor: Patricia J. Bauer
Author Contributions
S. Becker and B. Seymour conceptualized the study. Testing and data collection were performed by M. Löffler, and S. Becker analyzed the data. S. Becker and B. Seymour drafted the manuscript, and M. Löffler provided critical revisions. All the authors approved the final manuscript for submission.
