Abstract
BACKGROUND:
Although recent economics literature suggests a link between performance-related pay (PRP) and ill health, this finding is contested on the grounds that this link is plagued by endogeneity between the two variables of interest.
OBJECTIVE:
This study investigates the adverse effects of performance-related pay on stress, which is an important determinant of physical health.
METHODS:
Forty subjects were randomly assigned to two equal groups: either being paid by performance or being paid a flat fee. Both objective (saliva samples to measure cortisol elevation) and subjective (self-reported stress level) measures of stress were obtained before and after participation in the experiment. This experimental methodology purges the effects of self-selection into performance pay and identifies the direction of causation from performance-related pay to stress which is measured by cortisol levels.
RESULTS:
Those who were paid for their performance experienced higher levels of stress, both in terms of perceived stress and in terms of objectively measured cortisol levels, compared to those who were paid a flat fee for minimum performance.
CONCLUSIONS:
Performance-related pay induces objectively measurable stress. Self-reported stress levels and the objective stress measure obtained by measuring cortisol move in a similar direction for the PRP and non-PRP groups, but only the cortisol measure shows statistically significant differences between the PRP and non-PRP groups. This also suggests that individuals underestimate the stress caused by performance pay.
Introduction
Economists have long argued that pay according to performance is the most efficient of the payment schemes [1]. The incidence of performance-related pay (PRP) varies across the labour markets of developed countries, depending on the industry or occupation and on how one defines performance pay. Estimates suggest that it covers more than 10–15% of European workers and can be as high as 40% in Scandinavia and the United States (US) [2]. The literature on performance pay has mainly focused on the effects of such payment schemes on worker productivity. Yet, there are some studies that investigate the interrelationships between working conditions and wellbeing measures such as job satisfaction. Job satisfaction is found to be higher among higher paid workers paid by performance [3, 4] while performance pay is correlated with lower job satisfaction for lower paid workers [4]. One would expect that if performance pay has effects on workers’ well-being then it should also have repercussions on workers’ physical health. Indeed, as early as 1776, Adam Smith observed in the Wealth of Nations, Book I, Chapter VIII that “Workmen ... when they are liberally paid by the piece, are very apt to overwork themselves, and to ruin their health and constitution in a few years”.
The focus of this paper is to examine this link between PRP and health (measured by stress) in an experimental setting. As discussed in the following section, the paper offers two advances on the literature. First, the experimental setting controls for potential endogeneity and so identifies the direction of causality between PRP and health. Second, it uses an objective measure of stress, the hormone cortisol, to capture the physiological response to PRP more accurately. Below, the study discusses the potential pathways through which PRP may affect low grade stress and hence health, the problem of endogeneity in previous studies, and the use of cortisol in measuring stress.
Background
There are several multidisciplinary strands of literature that give the context for this study. The discussion below examines the pathways through which PRP could impact low grade stress and health, how previous studies using survey data can be affected by endogeneity bias, making it hard to identify the direction of causation, and how an objective measure of stress can be utilised to investigate the link.
Pathways
Many studies in the field of economics and human resource management have examined the effects of PRP, although the focus is typically on the effects of PRP on worker productivity and resulting pay (e.g. [2] for a general discussion and [5–7] for examples in different occupations often showing an increase in pay due to PRP). The key aspect of the standard economic theory of PRP is that it gives stronger incentives for workers to work hard since pay is directly tied to performance.
It is this direct link between the payment method, effort and pay that generates several potential pathways through which performance pay may affect the worker’s health. First, there may be an incentive to ‘work harder, not smarter’ and take more risks at work. Hence, there will be an increase in injuries at work as (particularly manual) workers attempt to increase productivity [8–11].
A second pathway suggests a direct relationship between performance pay and physical health. Economic theory suggests that PRP explicitly changes the trade-off between work and leisure, giving a relatively larger return to time spent in work. Thus, workers are induced by PRP to shift hours to work and away from non-work activities [12]. Since non-work activities include time spent on healthy behaviours (such as exercising, sleep, leisure or shopping and cooking healthy meals), an increase in time spent at work should be expected to reduce these activities. The effects of this change would not manifest themselves in an immediate deterioration of health (as opposed to having an injury), but their adverse repercussions on health would build up over time. In line with this, the longer the time spent in jobs with performance pay, the higher the odds of having worse overall health, heart problems, stomach problems, and anxiety/depression [13].
A third pathway links performance-related pay directly to increases in the individual’s exposure to stress and his or her health. For instance, data from the General Health Questionnaire (GHQ) show a strong correlation between time spent in PRP jobs and reported stress level. In addition, the literature suggests that higher job stress is associated with lower job satisfaction and higher injury-related absenteeism [5, 15].
The medical literature (e.g. [16, 17]) provides a well understood link between the exposure to stress and the state of the individual’s physical and psychological health. In summary, the physiological processes involved are as follows: In response to stress the immune system redirects white blood cells to areas where injury or infection is most likely, the skin becomes cool and sweaty as blood is drawn away from it toward the heart and muscles which are essential for survival, the mouth becomes dry, and the digestive system slows down. However, when the cause of stress passes, the levels of stress hormones drop and the body’s various organ systems return to normal, a state called ‘Allostasis’. An absent or incomplete relaxation response, by contrast, may cause damage such as chronic stress or low grade stress, which prevents physiological arousal from fully returning to normal. A review of the literature on the effects of acute and chronic stress [18] suggests that although the body’s natural defences adapt and thus people can overcome episodes of stress, excessive chronic stress which is constant and persists over an extended period of time can be psychologically and physically damaging. Hence, there is a relationship between inflammatory responses to chronic psychosocial stress and long-term development of disease. The physical effects of chronic stress or low grade stress burden the cardiovascular system, the immune system (increasing a person’s risk of getting an infectious illness), the brain (interfering with memory and learning), the musculoskeletal system (intensifying the chronic pain of arthritis and producing tension-type headaches) and the reproductive system (potentially causing impotence in men and affecting fertility) [18].
Endogeneity
In view of these repercussions of chronic stress on health, the aim of this study is to investigate if performance-related pay systems significantly increase low grade stress. If so, since most of the life of a working person is spent at work, long exposure to PRP should be expected to generate chronic or low-grade stress which can potentially induce severe deterioration of the working person’s health. However, the link between PRP systems and health is not straightforward. It is not clear if the causation runs from the payment method to increases in stress (and subsequent deterioration in health) or if poor health tends to drive workers into certain kinds of payment methods such as PRP. If a correlation is found between working on a PRP contract and low health status, this may be an outcome of the propensity of individuals with low health status who, being unable to perform and hold a job in regular contracts, are displaced to inferior PRP contracts. Alternatively, those who choose PRP contracts might be more adept in stressful situations and so subjectively care less about their exposure to stress. Thus, one of the biggest problems in identifying econometrically any link between work contracts (payment methods) and health is that there may be endogeneity between the two variables of interest [19]. To disentangle these effects, econometric research relies either on Heckman-type corrections [20] for endogenous selection or on instrumental variables procedures. In both cases, the unbiasedness of the estimates depends crucially on the statistical properties of the identifying restrictions. Typically, the case for any selection of identifying restrictions to control statistically for this endogeneity can be challenged on statistical or theoretical grounds. The issue could be resolved if workers were randomly distributed across the different payment contracts, but this is typically not an option in real labour markets.
Thus, the present study uses experimental methodology to investigate the possible link between PRP and stress by circumventing endogeneity through initial randomisation of the subjects.
PRP and low grade stress
There are only two papers to date that use an experimental design to look at PRP and stress. The first reports an experiment examining the sorting of more productive players into situations where there are performance-related payments as a result of the game [21]. However, the linkage between stress and PRP is not central to their study: the experimental subjects are simply asked about their stress and exhaustion at the end of the experiment, and those who are in PRP express higher levels of stress and exhaustion at the completion of the experiment.
The second is an experimental study in which the authors examine whether there is increased productivity due to PRP [22]. They also ask the experimental subjects about levels of stress and find that the increase in productivity induced by PRP is fully offset by the increased stress of such a payment scheme, such that 25 percent of subjects had lower productivity when paid for their performance, particularly among the risk averse.
Both papers suggest that there is a link between PRP and stress, but both allow selection into PRP and so may be biased by endogeneity. In addition, both use self-reported Likert scale responses to a question inquiring about the stress felt by the subjects during the PRP experience. While self-reported stress may be suggestive of the true underlying, physiological changes in stress, it is not an objective measure of stress as subjective responses are contaminated by beliefs, cognitive dissonance and adaptation effects.
In psychology and more broadly in the medical literature, stress can be measured objectively through assessing the individual’s heart rate, blood pressure, or cortisol levels in blood or saliva. Cortisol is a hormone secreted when people are in stressful situations. In surveys of the literature on the use of cortisol as a physiological indicator of stress by Kirschbaum et al. and Nicolson et al. [23, 24], it is found that cortisol is quickly released into the body when a person experiences stress. In addition, the cortisol appears in saliva, thus negating the need for invasive blood tests (which may themselves potentially cause stress in subjects). Furthermore, the saliva test remains accurate even at room temperature for at least a week and hence facilitates repeated experiments.
In addition to being an important signal of stress to the body, the release of cortisol has real effects on the body. Under normal circumstances, cortisol (as part of the hypothalamic-pituitary-adrenal or HPA axis) helps to regulate the body’s response to stress by generally suppressing reactions to stress through allostasis, helping the body return to a normal equilibrium [25]. However, repeated or chronic stress can cause allostatic load [26] which dampens the ability of the body to return to ‘normal’ either by causing stress reactions such as increased blood pressure to continue beyond the direct impact of stress or by suppressing normal (e.g. immune system) responses to stress [26]. While these are simple examples and medical research [27] finds that the mechanisms can be quite complex, current medical research suggests a strong link between cortisol (and other HPA-axis hormones) and adverse health outcomes.
Summary
The literature above suggests a link between PRP and stress or health. However, problems with endogeneity and subjectivity may be influencing these links. Thus, the present study, first, randomly allocates subjects to two groups to circumvent endogeneity arising from selection bias. In one group, subjects are paid a flat rate while subjects in the other group are paid for their individual performance. Second, it uses an objective marker of stress (salivary cortisol level). The next section details how the design of the experiment accomplishes these two innovations in this literature.
The design of the experiment
The performance task used in this study required subjects to solve a variety of mathematics problems by hand and enter the results in a computer. This is similar to the methodology utilised by Dohmen and Falk [21]. These calculations last for ten minutes, which is sufficient time for the rise of cortisol levels in the presence of possible stress. Subjects are randomly assigned to either being paid by the number of questions answered correctly (the PRP group) or being paid a flat fee for answering ten questions correctly (the non-PRP group). During the experiment the computer program z-Tree [28] is used in order to record the correct answers and calculate payoffs. The study protocol has been reviewed and approved by the University of Aberdeen, College of Life Sciences & Medicine Ethics Review Board (CERB/2015/5/1198).
Forty subjects (average age 22.7 years; 57.5% female) were invited to the Scottish Experimental Economics Lab (SEEL) at the University of Aberdeen over two sessions (20 subjects per session) to generate the data. The students were recruited using the Online Recruitment System for Economic Experiments (ORSEE) database of potential subjects which is maintained by SEEL. Students were given details about the broad parameters of the experiment and the procedure of the cortisol sample testing. The subjects were also advised, in line with standard cortisol testing protocols, to abstain from eating, drinking caffeine, smoking or taking exercise two hours before the commencement of the experiment. To this effect reminders were sent via email 24 hours before the experiment was scheduled to take place. The two experimental sessions took place at 1500 hours on the Monday and the Wednesday of the same week to control for the known diurnal patterning of cortisol production and to standardise the experience of participants. Saliva samples and subjective stress reports were obtained before and after participation in the experiment. These provide the objective and subjective stress measures for the individual subject, respectively.
The computer-generated randomisation allocated 57.5% (23 out of 40) to the PRP group and the remaining 17 to the flat fee group. The random assignment of PRP to subjects addresses the issues of endogeneity and non-random selection discussed above. Therefore, the experiment tests whether there is a direct causal relationship between PRP and stress level by examining differences in cortisol (and the subjective measures of stress) across the two groups. Thus, one should expect that any comparisons are representative of the direction and the strength of the PRP-stress relationship, without interference from self-selection effects.
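The text does not describe the assignment algorithm beyond its being computer generated. One simple scheme that can produce unequal group sizes such as 23 and 17 is an independent draw for each subject. The sketch below assumes that scheme; the function name, the `p_prp` parameter and the seed are hypothetical and are not taken from the z-Tree program used in the study.

```python
import random


def assign_groups(n_subjects, p_prp=0.5, seed=None):
    """Assign each subject independently to 'PRP' or 'non-PRP'.

    Assumption: simple (Bernoulli) randomisation with probability p_prp.
    Group sizes are therefore random and need not be equal.
    """
    rng = random.Random(seed)
    return ["PRP" if rng.random() < p_prp else "non-PRP"
            for _ in range(n_subjects)]


if __name__ == "__main__":
    groups = assign_groups(40, seed=2016)  # seed chosen arbitrarily
    print("PRP:", groups.count("PRP"), "non-PRP:", groups.count("non-PRP"))
```

Under a scheme of this kind, assignment is independent of any subject characteristic, which is the property the identification argument relies on. Forcing equal group sizes would require a different procedure, such as shuffling a fixed list of labels.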
The experiment
Upon arrival at SEEL, subjects were given a consent sheet to sign and were randomly allocated a seat at a computer terminal. Screens between terminals prevented subjects from seeing other subjects and their terminals. When all 20 subjects were registered, an outline of the experiment was read to the group and any questions were answered. All subjects were informed that for their participation they would earn £5, with the opportunity to earn more money during the experiment.
In the next stage a baseline of stress measurement for the subjects was obtained. First, instructions were given for the cortisol test, which involved chewing a cotton swab (SalivaBio Oral Swab, Salimetrics Europe) for 60 seconds and then placing the swab in a test-tube. Subsequently, a computer-based questionnaire was given to each subject, including the item ‘how stressed do you feel?’ (1 = not stressed at all, 5 = very stressed) (as in [21]).
After the completion of the questionnaires about stress, all subjects were given a practice round of answering three mathematical questions: one addition (in thousands), one multiplication (hundreds by tens) and one division (thousands by tens). Subjects were allowed to use a scratch piece of paper, but no calculator. This part of the activity was not timed. After everyone completed this practice round, the computer program randomly assigned subjects into the PRP or non-PRP group. Subjects were individually told by the computer how they would be paid. The payment schemes were as follows: 20p for each correct answer for those randomly allocated to the PRP group and a £5 flat payment for the non-PRP group if at least ten questions were answered correctly. Subjects worked independently and were not aware of what the alternative payment schedule was or of how other subjects were being paid. Subjects were told that they had ten minutes to do the mathematics questions and that there were a maximum of 50 questions. When all were ready, a clock appeared on all screens counting down the time in seconds. During the task, the number of questions answered correctly was also displayed at the top of the screen. The mathematical questions (either multiplication, addition, or division) were shown in the middle of the screen. When those in the non-PRP group accomplished the minimum performance target of ten correct answers, this was indicated at the top of the screen. Although they were not eligible to receive any additional pay and were told this on their screen, subjects in this group could still continue to answer questions if they wished.
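The two payment schemes can be written out as a short calculation. The sketch below works in pence to avoid rounding issues and assumes that the £5 participation payment is added to the task earnings, and that a non-PRP subject who misses the ten-answer target receives no task payment. The constant and function names are hypothetical.

```python
SHOW_UP_FEE_PENCE = 500   # £5 participation payment for every subject
PIECE_RATE_PENCE = 20     # 20p per correct answer (PRP group)
FLAT_FEE_PENCE = 500      # £5 flat payment (non-PRP group)
FLAT_FEE_TARGET = 10      # minimum number of correct answers for the flat fee


def total_payout_pence(group, correct):
    """Return the total pay-out in pence for one subject.

    Assumption: the show-up fee is paid on top of the task earnings, and
    a non-PRP subject below the target earns no task payment.
    """
    if group == "PRP":
        task_pay = PIECE_RATE_PENCE * correct
    elif group == "non-PRP":
        task_pay = FLAT_FEE_PENCE if correct >= FLAT_FEE_TARGET else 0
    else:
        raise ValueError(f"unknown group: {group!r}")
    return SHOW_UP_FEE_PENCE + task_pay


if __name__ == "__main__":
    print(total_payout_pence("PRP", 19) / 100)      # 8.8
    print(total_payout_pence("non-PRP", 32) / 100)  # 10.0
```

Under these assumptions, a PRP subject with 19 correct answers would receive £8.80, while any non-PRP subject who meets the target receives £10 regardless of how many further questions are answered.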
At the end of ten minutes, the task was stopped. Because the cortisol response to an acute stressor peaks around 20 minutes after stressor onset [29, 30], the subjects were asked to leisurely complete several tasks for the 10 minutes after the (10 minute) experiment was complete. First, they again gave a subjective rating of their stress level and provided some demographic information (e.g. gender, year at university, broad discipline of studies and age) before completing two non-stressful filler tasks (rating vignettes of potential jobs with different characteristics and colouring patterns). At the end of the ten minutes, subjects took a second cortisol test by chewing on another swab for one minute and putting the swab in a different, labelled test-tube after the minute was over.
Finally, subjects were called to the control room by seat number to get their payment and were thanked for their participation. Test-tubes were transferred into a freezer and when both sessions were complete, the frozen samples were sent to a laboratory (Salimetrics, Europe) for analysis of cortisol levels for each subject before and after the task.
Results
On average, PRP subjects answered 34.7 questions correctly (SD 9.6) whereas the non-PRP answered 32.1 questions correctly (SD 11.4). The standard deviation regarding the correctly answered questions is greater for the non-PRP mainly due to the fact that some subjects stopped after meeting the minimum requirement of correct questions. The average pay-out was £11.90 for the PRP group ranging from £8.80 to £14.60 and £10 for the non-PRP group. It is noteworthy that all non-PRP subjects obtained the minimum required performance.
An initial analysis of the cortisol measures of stress identified one subject whose cortisol measurement in the first assay was assessed to have a hormone level more than four standard deviations from the mean level of cortisol for the other 39 subjects. Given the likelihood that this measure reflected contamination and the potential for this clear outlier to affect comparisons, the information for this particular subject was excluded from the subsequent analysis, leaving 39 subjects in the analysis.
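The exclusion rule described above compares each subject with the mean and standard deviation of the remaining subjects. A minimal sketch of such a leave-one-out check follows; the cortisol values are invented for illustration and the function name is hypothetical.

```python
from statistics import mean, stdev


def flag_outliers(values, threshold=4.0):
    """Return indices of values lying more than `threshold` standard
    deviations from the mean of all the other values (leave-one-out)."""
    flagged = []
    for i, value in enumerate(values):
        others = values[:i] + values[i + 1:]
        centre, spread = mean(others), stdev(others)
        if spread > 0 and abs(value - centre) > threshold * spread:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    baseline = [0.20, 0.25, 0.18, 0.22, 0.30, 0.21, 0.19, 2.50]
    print(flag_outliers(baseline))
```

Computing the mean and standard deviation without the candidate value keeps an extreme observation from inflating the spread against which it is judged.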
Stress results by group
Table 1 and Fig. 1 show the means of the stress levels for the cortisol and subjective measurements at the start (before PRP assignment) and the end of the experiment. A clear pattern emerges from these results with the objective and subjective measures of stress increasing for the PRP group and falling for the non-PRP group. Figure 2 shows the differences between the start and the end of the experiment for the stress of the two groups for the two measures, again showing that for the PRP group there is some positive elevation in stress reflected in both cortisol and subjective measures of stress but there are noticeable decreases for the two measures in the non-PRP group.
Descriptive results of stress by payment type

Cortisol and subjective stress measurements; before and after treatment.

Cortisol and subjective measurements; before and after treatment.
While suggestive, these figures do not indicate whether these changes are statistically different. In order to test this, there is first a need to establish that the PRP and the non-PRP groups do not differ significantly in their mean levels of stress before the treatment. To assess the validity of this requirement, a t-test comparing the level of stress for the PRP and non-PRP groups before treatment is employed.
Statistical tests applied
Snedecor and Cochran [31] suggest that unequal variances can influence the analysis. Hence in the first step, the Folded F-test for unequal variances is used to evaluate if equality of variances can be assumed. If there is support for inequality of variances, the Satterthwaite test for unequal variances is utilised, otherwise the Pooled test for equal variances is used. Research suggests that when in doubt it is safer to assume unequal variances.
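The quantities involved in this two-step procedure can be sketched with the standard library alone. The code below computes the folded F statistic, the pooled t statistic and the Satterthwaite (Welch) t statistic with its approximate degrees of freedom. It does not compute p-values, which would come from F and t distribution tables or a statistics package. The function names and the sample data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def folded_f(a, b):
    """Folded F statistic: larger sample variance over the smaller one,
    with numerator and denominator degrees of freedom."""
    va, vb = variance(a), variance(b)
    if va >= vb:
        return va / vb, len(a) - 1, len(b) - 1
    return vb / va, len(b) - 1, len(a) - 1


def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances, with its df."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


def satterthwaite_t(a, b):
    """Two-sample t statistic without assuming equal variances, with the
    Satterthwaite approximation to the degrees of freedom."""
    na, nb = len(a), len(b)
    qa, qb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(qa + qb)
    df = (qa + qb) ** 2 / (qa ** 2 / (na - 1) + qb ** 2 / (nb - 1))
    return t, df


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
    print(folded_f(group_a, group_b))
    print(pooled_t(group_a, group_b))
    print(satterthwaite_t(group_a, group_b))
```

When the two groups have equal sizes, the pooled and Satterthwaite t statistics coincide and only the degrees of freedom differ, which is one reason the unequal-variance version is often treated as the safer default.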
In view of the above, the homogeneity of variances between the two groups is first assessed using the Folded F-test (see Table 1), which indicates that the two groups may have unequal variances (p-value of 0.0343). The Satterthwaite Unequal Variances test is therefore employed, giving a test statistic of 1.90, which shows that the null hypothesis of equality of means cannot be rejected at the 0.05 level of statistical significance.
Therefore, a Two-Sample t-test for the difference in the changes of stress between the start and the end of the experiment for the two groups is employed, under the null that there is no difference in the elevation of stress between the PRP and the non-PRP group. A Folded F-test for homogeneity of variances post-treatment shows that the null of equal variances cannot be rejected (p-value of 0.1867). Hence the Pooled for Equal Variances t-test is used, and it reveals that the difference in the changes of stress between PRP and non-PRP is statistically significant with a p-value of 0.0476. Given that the stress level of the PRP group went up, this lends support to the conjecture that the PRP payment scheme increases stress in subjects.
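The quantity being tested here is a difference in changes: each subject's stress change over the experiment, averaged within each group and then compared across groups. A minimal sketch of that calculation follows, with invented numbers and hypothetical function names.

```python
from statistics import mean


def change_scores(pre, post):
    """Per-subject change in a stress measure (post minus pre)."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have one value per subject")
    return [after - before for before, after in zip(pre, post)]


def difference_in_mean_changes(prp_pre, prp_post, flat_pre, flat_post):
    """Mean change in the PRP group minus mean change in the non-PRP group."""
    return (mean(change_scores(prp_pre, prp_post))
            - mean(change_scores(flat_pre, flat_post)))


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    diff = difference_in_mean_changes(
        prp_pre=[1.0, 2.0], prp_post=[2.0, 4.0],
        flat_pre=[2.0, 2.0], flat_post=[1.0, 2.0],
    )
    print(diff)
```

Working with change scores removes each subject's own baseline level, so the comparison across groups is not driven by pre-treatment differences in stress.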
The above statistical procedure is also used for the subjective stress measure. The Folded F-test for equal variances before assignment suggests that there is no difference in variances in the subjective stress measure (p-value of 0.3327). Thus, a standard t-test of equal means across the two groups is used, and it cannot reject the null of equal means of the subjectively measured stress at the start of the experiment (p-value of 0.2068).
For the post-treatment differences in variances, a Folded F-test suggests no difference in variance (p-value of 0.3567). Interestingly, however, the Pooled for Equal Variances test reveals that the differences in the elevation of subjective stress between PRP and non-PRP are statistically insignificant (p-value of 0.2897).
Figure 3 shows the differences in productivity between the PRP and the non-PRP group as reflected in the group mean of correct answers during the experiment. The figure suggests no great difference in productivity between the two groups. This is confirmed by the formal statistical test. The Folded F-test cannot reject (p-value of 0.224) the null of equal variances. Testing the equality of the average number of correct answers, the Pooled for Equal Variances t-test shows that the difference in productivity between the two groups is statistically insignificant (p-value: 0.4515).

Productivity measurement; correct answers.
This study provides a real-effort experiment to reveal whether there is a link between PRP and low grade stress. Unlike previous research on stress and PRP in an experimental setting, the experimental methodology in this study used random assignment to PRP to circumvent concerns of self-selection and endogeneity bias and measured stress by both subjective and objective means.
The results show that low grade stress increases over the course of a simulated work task for subjects paid by PRP but falls for the non-PRP group. The differences in stress between the start and the end of the experiment show that for the PRP group there is some positive elevation in stress reflected in both cortisol and subjective measures of stress but there are noticeable decreases for the two measures in the non-PRP group. Furthermore, the study reveals that the differences in the changes of objectively measured stress between PRP and non-PRP are statistically significant. In view that the stress level of the PRP group increased, this lends support to previous studies proposing that the PRP payment scheme is inducing stress in subjects.
However, the study reveals that, contrary to the objective measures of stress through cortisol elevation, the small differences in the elevation of subjective stress between PRP and non-PRP are statistically insignificant. This suggests that respondents may not be fully aware of their own body’s response to stress, and that subjective responses may be contaminated by beliefs or cognitive dissonance. This issue needs further investigation. Finally, the results show that, contrary to several studies (e.g. [1]), there appears to be no significant difference in productivity between the two groups.
Although the results are suggestive that PRP generates stress, this is a small study based only on 39 respondents. Future work will expand on this sample to measure the effects of PRP on stress across different groups. For example, as Dohmen and Falk [21] and others suggest, there are important differences in the effects of PRP by gender, perhaps caused by attitudes to risk. Although these issues should be circumvented by the randomisation methodology of this study, further investigation of them is important. Another interesting extension would move beyond random assignment, which is more similar to a ‘real world’ change in the employment contract for a firm. Many employment situations involve the element of choice of PRP or non-PRP (e.g. when people are looking for jobs) and the interaction between stress and PRP may be different in these cases. Allowing subjects to choose a PRP or non-PRP job would allow for this selection in an experimental setting and give an insight into whether selection is actually a mitigating influence on stress in PRP settings, a common critique of survey-based papers on the stress-PRP relationship.
Conclusions
Results suggest that the subjective measure of self-reported stress levels and the objective stress measure obtained by measuring cortisol move in a similar direction for the PRP and non-PRP groups, but only the movement in cortisol shows statistically significant differences between the two groups. This might suggest that individuals are underestimating the stress generated by the PRP payment method, though more work will be needed to test this formally.
Conflict of interest
None to report.
Acknowledgments
The financial support for this study by the Scottish Economic Society is gratefully acknowledged and appreciated. The authors are grateful for helpful comments by participants at the 2016 Scottish Economic Society Conference and seminar participants at the University of Aberdeen and the Université Panthéon-Assas as well as Daniel Powell. Help with z-Tree programming from Maria Bigoni is also greatly appreciated. All errors remain with the authors.
