Abstract
Identifying genuine underpayment of minimum wages is not straightforward. Some well-known statistical issues affect the measurement of compliance rates, but factors such as processing or behavioural influences amongst respondents can also have an impact. We study the quantitative measurement of non-compliance with the minimum wage, using UK apprentices (who have particularly high non-compliance rates) as a case study. We show that understanding the institutional and behavioural context can be invaluable, as can triangulation of different sources. While the binary nature of compliance makes such problems easier to identify and evaluate, this analysis holds wider lessons for the understanding of the characteristics of large and complex datasets.
Introduction
Minimum wages are widespread, particularly in high-income countries (HICs): 22 out of 28 EU countries (26 out of 34 OECD countries) had a statutory minimum wage in 2015, and most of the others had various wage floors determined by collective agreements [1, 2]. Minimum wages are seen by their supporters as a key part of the social infrastructure, and by some policy makers as a way to shift the burden of in-work social security expenditure from the state to the employer [3]. Ensuring that wages paid are compliant with the statutory minimum is therefore not just a legal necessity but important to social policy. Enforcement agencies use estimates of non-compliance to allocate monitoring resources effectively.
Studies generally assume that non-compliance is accurately measured. However, the binary nature of the compliance measure (is the wage lawful or not?) can cause problems for analysis. Consider what happens if the true wage peaks at the minimum but the observed wage is measured with a randomly distributed error with mean zero. Estimates of the mean and median wage are unbiased but non-compliance exceeds the true value: the increased dispersion of wages around the minimum wage level generates more observations falling on the ‘wrong’ side of the line.
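The mechanism can be illustrated with a small simulation. This is only a sketch under assumed values: the wage distribution, the share of workers paid exactly the minimum, the size of the error and the function name `simulate` are all hypothetical and are not taken from any dataset discussed in this paper.

```python
import random
import statistics


def simulate(n=10_000, minimum=6.00, noise_sd=0.10, seed=0):
    """Return (true rate, observed rate, shift in mean wage) for simulated data."""
    rng = random.Random(seed)
    # Hypothetical true wages: 30% sit exactly at the minimum, the rest lie
    # above it. Nobody is truly underpaid, so the true non-compliance rate is 0.
    true_wages = [
        minimum if rng.random() < 0.3 else minimum + rng.expovariate(0.5)
        for _ in range(n)
    ]
    # Observed wage = true wage + mean-zero Gaussian measurement error.
    observed = [w + rng.gauss(0.0, noise_sd) for w in true_wages]

    true_rate = sum(w < minimum for w in true_wages) / n
    observed_rate = sum(w < minimum for w in observed) / n
    mean_shift = statistics.mean(observed) - statistics.mean(true_wages)
    return true_rate, observed_rate, mean_shift


true_rate, observed_rate, mean_shift = simulate()
print(f"true: {true_rate:.3f}  observed: {observed_rate:.3f}  mean shift: {mean_shift:+.4f}")
```

In this setup the mean wage is almost unchanged by the error, while roughly half of the workers paid exactly the minimum are pushed below the line and appear non-compliant.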
There has been little research on the accuracy of non-compliance data in the economics, employment studies or statistics literature. Several papers have looked at non-compliance rates in low- and middle-income countries (LMICs), but these are so high as to make accurate measurement a refinement rather than an essential issue. In contrast, non-compliance in HICs is typically very low [4] and so accurate measurement can make a substantial difference to non-compliance rates and subsequent policy analyses.
Despite the importance of accurate measurement in HICs, the literature is largely limited to a handful of papers from the last century. One might expect policymakers and regulators, who have a direct interest, to fill the gap in the academic literature, but they seem reluctant to do so. With the exception of the UK, very few agencies have carried out specific analyses of non-compliance rates. The US Bureau of Labor Statistics website has no analysis of non-compliance, arguing ([5]: Technical Notes) that deriving an hourly wage leads to excessive measurement error. The New Zealand Ministry of Business Innovation and Employment produces relevant numbers but no analysis [6], while the Australian Fair Work Commission has commissioned over fifty research reports since 2006 but only one [7] directly addresses non-compliance. Moreover, both Antipodean ministries note that there are insufficient data to distinguish genuine non-compliance from misreporting [6, 7].
There is therefore a substantial gap in the literature. This paper aims to address that gap by analysing the sequence of events that lead from wage payment to policy inference, via the collection of quantitative data. We then apply this framework to study a particular case, non-compliance rates for UK apprentices. In the analysis, we pay particular attention to the effect of data collection processes, an area that is often overlooked when studying the statistical characteristics of data.
The rest of this paper proceeds as follows. The next section reviews the relevant literature. Section three revisits the data production process and considers the problems that may arise at each stage; section four then applies this framework to the specific case of UK apprentices. Finally, section five concludes by considering the wider lessons for statistical analysis, particularly given the likely future use of more complex data sources.
In this paper we focus on the measurement of non-compliance through statistical sources. One of the major problems in non-compliance is the existence of the informal, unmeasured, economy where non-compliance is thought to be much more common [8]. This is typically identified through qualitative research methods [9, 10, 11] and so it is outside the scope of this paper.
Previous work on non-compliance
In their extensive survey of the literature on minimum wages, Belman and Wolfson [12] report over 200 policy and academic papers published in English between 1992 and 2013 on minimum wages. However, a formal literature search indicates barely a handful of academic papers on non-compliance, and only two focusing on its measurement. Similarly, a literature review on non-compliance commissioned by the UK Low Pay Commission (LPC) largely turned up only reports and evidence submitted to the LPC itself [13].
Early academic analysis [14, 15, 16, 17, 18, 19, 20] focused on understanding the drivers of non-compliance in HICs. Using US data, Gramlich et al. [14] showed statistical relationships between non-compliance and other variables. Ashenfelter and Smith [15] used the same data but developed a cost-benefit framework that provided the theoretical template for later papers.
For these papers, the accuracy of the non-compliance measure is of secondary importance: small variations do not materially affect the broad functional relationships being identified (for example, between non-compliance rates and industry or occupation). Even in studies that specifically estimate the probability of non-compliance at the individual level (e.g. [1, 21, 22, 23]), the quantitative and qualitative results do not seem to be sensitive to the particular definition of non-compliance used.
These papers recognise potential measurement problems. Three solutions are generally implemented:
Fuzziness: wages ‘close enough’ to the minimum wage may be considered as compliant.
Quality restrictions: only high-quality data are used.
Triangulation: data sources with different characteristics are checked for consistent findings.
For example, Ashenfelter and Smith [15] do all three: they treat a wage as being at the minimum if it is within
Since 2000, the focus of non-compliance analyses has shifted to LMICs, particularly Latin America and sub-Saharan Africa (see, for example [23, 24, 25, 26]). These papers rarely consider issues concerning the accuracy of the data. One reason may be that the data are relatively limited; for example, Strobl and Walsh [24] only had categorical data, while Ye et al. [23] only had annual wages, and supplementary payments were not accurately identified. However, a more likely reason is the very high rates of non-compliance in LMICs. Cross-country studies such as [26, 27, 28, 29] show compliance levels well below 100%; for Mali, Rani et al. [28] estimate compliance at only 10%. Small inaccuracies in the data are less important, as these do not change results substantially. For example, Yamada [25, p. 46] notes that allowing for measurement error does change compliance rates, but with compliance below 40% in all sectors, findings are very robust to alternative specifications.
In recent years there has been a resurgence of interest in non-compliance in HICs, largely driven by OECD-sponsored research and an awareness that the high compliance rates in the OECD are not problem-free. For example, Kampelmann et al. [4] repeatedly highlight the impact of measurement error on non-compliance estimates in Europe. In response, they provide alternative measures of non-compliance where the true minimum wage is reduced by 25% to produce ‘lower bound’ estimates. Garnero et al. [1] allow both 25% and 15% margins of error, although this is partly to allow for approximate compliance in countries where the minimum wage is bargained rather than statutory.
Finally, there is the question of what is meant by ‘non-compliance’. Bhorat et al. [30] argue that there is a qualitative difference between one or two cents below the minimum wage and being paid a dollar or euro below. They propose a weighted index such that trivial violations are ignorable, and which usefully includes counts and linear summations among its special cases. Applying their methodology to South Africa, they find a correlation between the number and scale of violations. Ham [31] comes to the same conclusion using data from Honduras.
In sum, academic researchers are prepared to live with inaccuracies in the measurement of non-compliance because the scale relationships they are interested in are not sensitive to the particular specification. However, this finding may arise because many analyses use ad hoc adjustments to deal with low-quality or missing data, leading to the circular argument that low-quality data show that high-quality data are not needed.
This seems to have influenced regulators and policy analysts, who have largely ignored the issue of accurate measurement of non-compliance. One would expect those producing and using compliance statistics to be concerned about the accuracy of the measures. However, as noted in the Introduction, there seems relatively little interest among such bodies. The exception is the UK: the literature review [13] confirmed that most of what is known about non-compliance in the UK is derived from reports commissioned by the Low Pay Commission (LPC), or from evidence submitted to the LPC’s annual consultation exercise:
“…although it has rarely been an explicit part of our remit from the Government, we have always reported on the evidence gathered on compliance and enforcement matters through our consultation processes, and regarded this as an integral part of our role in advising on the minimum wage.” [32, p. 199].
Analyses such as the LPC’s 2017 review [33], based on quantitative and qualitative research, explicitly reference concerns over the accuracy of non-compliance estimates, particularly since early reports [34, 35] described problems of processing and rounding. Le Roux et al. [21] were the first to explicitly test the robustness of non-compliance estimates to data inaccuracies, and concluded that, like earlier academic analyses, multivariate relationships were not sensitive to these data limitations (although simple counts were). However, while there is an element of peer-review in the LPC’s annual research workshops, these findings have stayed within the realm of ‘government reports’ rather than adding to the academic research canon.
The aim of the analyst is to identify the amount and, ideally, the causes of non-compliance, but there are several processes that occur between employer decisions and observable evidence:
(1) Employers decide what to pay workers.
(2) Employers pay workers.
(3) A subset of those worker-employer interactions are identified for sampling.
(4) Hours and earnings information is gathered from those sampled.
(5) Data are processed into suitable measures.
(6) Data are weighted.
(7) Compliance is calculated.
(8) Inferences are drawn.
The first two stages differ – the wage the employer intends to pay may not be the one that is actually paid; quantitative and qualitative findings from Ritchie et al. [49] show that intended compliance is higher than actual compliance – and each subsequent stage can affect the reliability of information available. This section considers stages (3) to (8) in turn as a potential source of error. Many of the examples come from the UK experience as this is the most scrutinised minimum wage regime, but we concentrate on the general scope for error.
The first concern is that there may be under-sampling of those below the minimum wage. For example, employers paying below minimum wages are less likely to respond to official surveys. Moreover, employees being paid below minimum wages are likely to be in jobs where the employer is ‘powerful’, in a psychological and sociological sense, and so may be discouraged from providing potentially negative information about their employer [8].
A larger problem is the existence of the informal economy, where non-compliance is more likely to be found [8, 11], and the population is by definition unknown. Previous research suggests that non-compliance is much more widespread amongst groups such as migrant workers, agricultural labourers, or family workers, particularly if these workers are illegally employed (see [28] for a range of developing countries; [31] for Honduras; [9, 11, 32] for the UK; however, [36] find no supporting evidence in South Africa). Workers in these groups are likely to face substantial pressure not to respond to official requests for information; or they may collude with employers to exploit social support systems [9].
There are concerns about de facto employees misclassified as ‘self-employed’ specifically to avoid employment regulations [11]. Employee surveys typically ask individuals to self-report their employment status; it may not be easy for an agency worker to accurately identify this, even with knowledge of her contract.
Finally, there is the question of timing. Consider collecting wage data shortly after the minimum wage has increased, and identifying wages that fall below the new minimum, but not the old one. There are three possibilities:
(1) The ‘reference period’ of the interview is the period when the old minimum applied (the wage is compliant).
(2) The uprating has been delayed but will be implemented with back pay (the wage is not compliant at present, but the issue will be resolved).
(3) The uprating has not been paid and will not be backdated if it is paid (the wage is non-compliant).
The second case is problematic: it is not clear whether this should be marked as ‘compliant’ or not. It is also unclear how to distinguish between the last two cases. No large-scale survey of earnings in any country appears to collect information later in the year that would allow the analyst to distinguish delayed payment from non-payment. Ormerod and Ritchie [35] noted that, in the UK, simple non-compliance rates fall continuously through the year, but that the pace of this fall settles down after two quarters. However, while the implication is that this is a form of delayed wage-setting, it cannot be tested directly and cannot identify whether back pay is being paid.
The standard assumption is that employer data are of higher quality since they are derived from pay records. There is some evidence supporting this (for example, [34, 37] in the UK; [15] in the US; [6, 12] elsewhere). However, this assumption has also been questioned: as paying below a statutory minimum wage is an offence, employers may not want to submit such non-compliant wages, and they may adjust the data before reporting [22].
LPC [32] also notes the scope for collusion between employer and employee: an employee may be paid below the minimum wage but the reported hours may at the same time be adjusted downwards to meet the minimum wage threshold; this could, for example, be used to allow the employee to claim welfare benefits based on part-time working. Metcalf [8] notes that in family-operated businesses, an hourly wage is not relevant: workers receive a daily or weekly wage and are required to put the hours in as necessary. Hours are then adjusted as necessary if reporting is requested by an authority. Similar to the sampling issue, the scale of such deliberate misreporting is fundamentally unknowable, because by definition the data collected are all that the respondent feels able to deliver.
Even if the data are honestly provided, recall bias presents problems. In the UK the rounding of wage data in employee surveys is a well-established phenomenon affecting non-compliance [35, 38, 39]. Fry and Ritchie [38] predicted distributions for a major employee survey, arguing that a minimum wage of £4.98 would lead many employees to report £5.00. Their estimates were too conservative: all employees reported £5.00 or above, despite employer data showing considerable numbers at £4.98.
The evidence from the UK (to date, the only country where this appears to have been studied in detail) suggests that rounding in employee surveys is persistent, time- and scale-invariant, and occurs at the point of data collection: hourly-paid workers round hourly wages, the weekly paid round weekly wages, and salaried workers round their salaries. Checking documentation when responding to the survey reduces the chance of rounding [39, 40], and Le Roux et al. [21] show that this can have a considerable impact on the compliance rate.
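The effect of rounding at the point of data collection can be sketched numerically. The figures below are hypothetical, apart from the £4.98 example mentioned above and the 2011 adult rate of £5.93; the focal steps (10p for hourly wages, £5 for weekly pay) and the helper `round_to_step` are assumptions made for illustration only.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to_step(value: Decimal, step: Decimal) -> Decimal:
    """Round `value` to the nearest multiple of `step` (halves round up)."""
    return (value / step).quantize(Decimal("1"), rounding=ROUND_HALF_UP) * step


# An hourly-paid worker on 4.98 who reports to the nearest 10p states 5.00.
reported_hourly = round_to_step(Decimal("4.98"), Decimal("0.10"))

# A weekly-paid worker earning exactly the minimum (5.93 for 38 hours) who
# reports weekly pay to the nearest 5 pounds (assumed focal step).
minimum = Decimal("5.93")
hours = Decimal("38")
true_weekly = minimum * hours                       # 225.34
reported_weekly = round_to_step(true_weekly, Decimal("5"))
derived_hourly = reported_weekly / hours            # just under 5.93

print(reported_hourly, reported_weekly, derived_hourly < minimum)
```

The same rounding habit therefore moves the hourly-paid worker above the minimum, but makes the compliant weekly-paid worker appear underpaid once an hourly rate is derived.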
These findings in terms of wages are so far restricted to the UK but the persistence of such results suggests that this is a human phenomenon that would be reproduced in other countries. A similar preference for human-friendly, approximate responses is, for example, well established in fields such as environmental evaluation, where respondents gravitate towards focal points in round numbers of pounds, dollars or euros ([41], for example).
Notwithstanding the problems of dishonest or rounded responses, respondents may simply fail to provide accurate information through, for example, inability to account for breaks accurately, or failure to understand the questions being asked. Such errors are unlikely to be corrected by the respondents; human thinking is less focused on evaluating actions and much more focused on rationalising them post-factum [42].
Sometimes these may be amenable to retrospective adjustment. Griffiths et al. [34] improved accuracy in an employer survey by studying inconsistencies in wage components (for example, overtime hours matched with overtime pay). In other cases, there may be only the identification of the problem: Ritchie et al. [43] show that hours data are clearly missing from an employee dataset, but there is no information which can fill in the gaps.
Not all employees are paid time-based earnings. Employees may be paid piece-rates, for example in agriculture, textiles, or home-based work; these appear to be more common in developing countries. There is a conceptual problem with asserting whether a piece-rate worker is under-paid or not: does a slow worker have the right to the same hourly wage as a fast worker? In some countries, minimum wages for piece-rate workers are based on an average time to completion, but calculating compliance requires much more information. The practical problems associated with piecework are clear: Gittleman and Pierce [44], for example, note that only total piecework remuneration is available in US data, with no data on hours worked. Finally, piece rates may also be associated with informal or hidden work [45], making identification doubly difficult.
Other problems include the time spent travelling to work (for example, for domiciliary care workers), tips, and bonuses or other incentive payments. It may not be clear (a) whether these should be included in wage calculations, and (b) whether they are actually included in collected data. If this information is not available (and in OECD countries, it seems rare), accurate rates cannot be calculated.
Labour market data collection is still dominated by surveys, but these are increasingly being replaced by administrative data sources, particularly in high-income countries. Administrative data present different quality issues. They are less likely to be subject to recall bias, rounding, or other errors associated with questionnaire or survey data. However, they are also likely to have been cleaned and checked to meet the original data collector’s administrative purpose, not statistical purposes; and they require individuals to have interacted consistently with the administrative source. As Oberski et al. [46] point out, it is in the nature of administrative data that no ‘gold standard’ for quality checking exists: every administrative source will have errors deriving from a unique data collection process.
Processing
Researchers are in general aware of the statistical problems noted above, but rarely consider the impact of statistical production processes. Statistics New Zealand is unusual in flagging up processing errors to researchers [47]. For a binary question such as compliance, minor variations can have disproportionate effects. These need not be errors: Griffiths et al. [34] demonstrated that changing precision to five or six decimal places increased non-compliance rates from 1.2% to 1.7%.
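A minimal sketch of how numerical precision alone can flip a compliance flag is given below. The pay and hours figures are hypothetical and the function `is_below_minimum` is illustrative; it is not the procedure used in [34].

```python
from decimal import Decimal, ROUND_HALF_UP


def is_below_minimum(pay: Decimal, hours: Decimal, minimum: Decimal, places: int) -> bool:
    """Flag a derived hourly wage as below the minimum after rounding to `places` decimals."""
    quantum = Decimal(1).scaleb(-places)  # places=2 gives Decimal('0.01')
    hourly = (pay / hours).quantize(quantum, rounding=ROUND_HALF_UP)
    return hourly < minimum


pay, hours, minimum = Decimal("219.40"), Decimal("37"), Decimal("5.93")
# 219.40 / 37 = 5.929729...: at two decimal places this rounds to 5.93,
# at six decimal places it stays below the minimum.
print(is_below_minimum(pay, hours, minimum, places=2))
print(is_below_minimum(pay, hours, minimum, places=6))
```

The same record is compliant at penny precision and non-compliant at higher precision, without any error in the underlying data.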
Processing poses particular difficulties: it is not clear where to look for problems, particularly when using data from statistical offices who are not normally resourced to look at minor variations. In recent years the dominant model has become ‘statistical editing’, where potential errors are only investigated if they substantially change the statistical aggregates. This generally applies to very large respondents, or ones in very small sample groups. Neither of these applies to low wage employees: the wage estimates that statistics offices produce are not materially affected by such errors. As a result, data on the low-waged are less likely to be checked.
For the statistical office to address this requires willingness, sufficient knowledge to design checks correctly, and appropriate resources. For example, the UK’s Office for National Statistics (ONS) has a specific rule: check input data if it leads to an hourly wage being below the minimum wage. However, ONS is only resourced to check the accuracy of the data recorded on its form, not whether the respondent arrived at the information correctly.
With household surveys (as opposed to data from employer administrative records), an extra element of processing error comes from the presence of the interviewer. One of the authors of this paper has observed household data collection in the UK, and noted that interviewers sometimes do not record the full detail offered by respondents; there seems no reason to suspect that interviewers in other countries are any more or less diligent. On the other hand, it is debatable how important this is, given the other problems in answering accurately; the ONS did change its interviewer instructions after the publication of [35], but this appeared to have had little impact.
Allowance for processing error is usually done through allowing wages within a band to be ‘compliant’. However, the band varies considerably. In the UK, employer responses are typically rounded to the nearest penny [39] because the data are generally assumed to be precise. In contrast, Garnero et al. [1, 22] treated wages up to 25% below the minimum as ‘compliant’, to allow for wider measurement error.
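The sensitivity of the headline rate to the chosen band can be sketched as follows. The wages are hypothetical; the 15% and 25% margins are those mentioned in the literature reviewed above, and the function name and the proportional definition of the band are assumptions.

```python
def non_compliance_rate(wages, minimum, tolerance=0.0):
    """Share of wages below minimum * (1 - tolerance)."""
    threshold = minimum * (1.0 - tolerance)
    return sum(w < threshold for w in wages) / len(wages)


# Hypothetical hourly wages around a minimum of 5.93.
wages = [5.93, 5.90, 5.50, 4.50, 7.00]
for tolerance in (0.0, 0.15, 0.25):
    print(tolerance, non_compliance_rate(wages, 5.93, tolerance))
```

With these illustrative values, the measured rate falls from 60% with no tolerance to 20% with a 15% margin and to zero with a 25% margin.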
Weighting
Most data sources are sampled, and so contain weights to produce population estimates. These are based upon the ex ante sampling weights, adjusted for actual responses. The ex post adjusted weights should reflect the expected sampling problems addressed above.
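As a rough illustration of why the ex post adjustment matters, the sketch below uses a simple stratum-level non-response adjustment (population divided by achieved responses). This is one common textbook approach and an assumption here, not a description of how any of the surveys discussed in this paper are weighted; all counts are hypothetical.

```python
def adjusted_weight(population: int, respondents: int) -> float:
    """Ex post weight for a stratum: population count over achieved responses."""
    return population / respondents


def weighted_rate(records):
    """`records` is a list of (weight, is_non_compliant) pairs."""
    total = sum(weight for weight, _ in records)
    flagged = sum(weight for weight, flag in records if flag)
    return flagged / total


# Two hypothetical strata of 1,000 workers each. Stratum A has a low response
# (50 replies, 10 non-compliant); stratum B a high one (100 replies, 5 non-compliant).
w_a = adjusted_weight(1000, 50)    # 20.0
w_b = adjusted_weight(1000, 100)   # 10.0
records = (
    [(w_a, True)] * 10 + [(w_a, False)] * 40
    + [(w_b, True)] * 5 + [(w_b, False)] * 95
)
unweighted = sum(flag for _, flag in records) / len(records)
print(unweighted, weighted_rate(records))
```

Here the unweighted rate is 10% and the weighted rate 12.5%, because the stratum with lower response also has higher non-compliance.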
Nevertheless, there is often limited opportunity to verify the accuracy of ex post sampling weights. For example, in the UK there are three sources of information on the workforce: the Annual Survey of Hours and Earnings (ASHE), the Labour Force Survey (LFS) and the decennial Census. However, these are not independent: ASHE weights are derived from the LFS, whose weights are in turn derived from the decennial Census, with some intercensal adjustment. Hence, while these weights are the ‘best available estimate’, triangulation between these sources to establish a ‘robust’ estimate is not statistically valid.
It is possible to use administrative records (such as tax information) to provide an independent source, but this tends not to have the necessary detail on respondent characteristics to allow stratification of estimates. Where data are wholly based on an administrative population, such as Statistics New Zealand’s Linked Employer-Employee Data [47], weighting is theoretically unnecessary but other surveys are still referenced for triangulation purposes.
Calculation of the non-compliance rate
The typical reported ‘rate’ is the proportion non-compliant over the whole workforce, or at least the relevant age group. This answers the question “what proportion of workers (in that age group) are being paid below the statutory minimum?”
Some analyses use a subset closer to the minimum wage. The rationale is that the minimum wage is not relevant for high-earners: non-compliance should instead answer the question “Out of the very low paid, what proportion of those are not getting the statutory minimum?” The difficulty here is that the number of ‘low paid’ employees is a subjective measure. Should the denominator include all those at or below the minimum wage, or should it also include some above the minimum?
In the UK, the LPC defines a ‘minimum wage worker’ as one earning the minimum wage plus 4p (5p band including the rate). In its 2016 report the LPC began reporting non-compliance as a proportion of both all workers and ‘minimum wage workers’. However, Fry and Ritchie [39, p. 6] argue that the LPC definition ignores the human propensity to pay (and report) at rounded numbers. They recommended using “NMW up to the next 10p rounding point”, claiming that this would reduce fluctuations in non-compliance rates caused by arbitrary shifts in the denominator. Under this definition, an adult worker would be a “minimum wage” one if being paid £6.00 in 2011 when the adult NMW was £5.93, whereas the LPC definition would exclude these workers.
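The effect of the choice of denominator can be sketched as follows. Wages are in pence and hypothetical, apart from the 2011 adult NMW of 593p. Two assumptions are made: the denominator includes everyone paid at or below the upper bound of the band, and the "next 10p rounding point" is read as the smallest multiple of 10p at or above the minimum.

```python
def upper_bound_lpc(nmw_pence: int) -> int:
    """LPC band: the minimum wage plus 4p (a 5p band including the rate)."""
    return nmw_pence + 4


def upper_bound_next_10p(nmw_pence: int) -> int:
    """Smallest multiple of 10p at or above the minimum (assumed reading)."""
    return ((nmw_pence + 9) // 10) * 10


def rate_among_low_paid(wages_pence, nmw_pence, upper_bound):
    """Share of workers paid at or below `upper_bound` who are below the minimum."""
    low_paid = [w for w in wages_pence if w <= upper_bound]
    below = [w for w in low_paid if w < nmw_pence]
    return len(below) / len(low_paid)


# Hypothetical hourly wages in pence; one worker is below the minimum.
wages = [580, 593, 595, 600, 600, 650, 900, 1200]
nmw = 593
rate_all = sum(w < nmw for w in wages) / len(wages)
rate_lpc = rate_among_low_paid(wages, nmw, upper_bound_lpc(nmw))
rate_10p = rate_among_low_paid(wages, nmw, upper_bound_next_10p(nmw))
print(rate_all, rate_lpc, rate_10p)
```

The single underpaid worker accounts for one in eight of all workers, one in three under the LPC band, and one in five once the two workers paid £6.00 are counted as minimum wage workers.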
Overall, it is clear that the definition of the preferred ‘non-compliance rate’ is not unambiguous. Bhorat et al. [30] argue that any single ‘rate’ is of limited information as it does not capture the scale of the problem. They propose weighting underpayment by distance from the minimum, in line with some poverty indices:
The measure of interest,
The difficulty with these measures is interpreting the resulting values, particularly as no agency has yet adopted them and so there is no benchmark. However, informative relative wage effects can be drawn out, to analyse variation over time [48, 49] or changes over both time and countries [50]. Ritchie et al. [49] augment the Bhorat indexes with a ‘penny index’ to show that the absolute size of underpayment appears to be relatively stable, and falling underpayment is due to the rise in the denominator value.
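For readers unfamiliar with this family of measures, the sketch below computes a poverty-style violation index in the spirit of [30]. The exact functional form is an assumption based on the standard poverty-index analogy: each underpaid worker contributes their proportional shortfall raised to a power alpha, and the sum is averaged over all workers. The wages and the function name are hypothetical.

```python
def violation_index(wages, minimum, alpha):
    """Average over all workers of ((minimum - w) / minimum) ** alpha for w < minimum."""
    gaps = [(minimum - w) / minimum for w in wages if w < minimum]
    return sum(gap ** alpha for gap in gaps) / len(wages)


# Hypothetical wages with a minimum of 10.0: two of four workers are underpaid,
# with shortfalls of 20% and 50% of the minimum.
wages = [8.0, 10.0, 12.0, 5.0]
for alpha in (0, 1, 2):
    print(alpha, violation_index(wages, 10.0, alpha))
```

With alpha equal to zero the index reduces to the simple headcount rate; with alpha equal to one it is the average proportional shortfall; higher values give more weight to large underpayments, so that violations of a penny or two become negligible.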
Drawing inferences
What inferences can be drawn from compliance measures? Most work on the minimum wage is carried out by economists, who tend to use data to test pre-identified hypotheses, and then interpret results with reference to those hypotheses. This approach has been criticised by statisticians, who argue that it encourages confirmation bias in analysis.
More generally, econometric or statistical analysis is often plagued by the ‘identification problem’: several alternative theories are consistent with the same statistical finding. For example, the stylised fact that more education is associated with higher wages is consistent with human capital, search and Marxian theories of the labour market.
Alternative data sources can be used to help identification. Several countries produce employer and employee surveys, and these provide scope for triangulation. For example, Ritchie et al. [39] show that, once genuine rounding behaviour by firms (identified in employer surveys) is taken into account, measurement error in employee surveys due to rounding appears to be entirely random. However, it is rare to find clear results unambiguously aligned to theory. Le Roux et al. [21], for example, find that the different data sources tell opposite stories concerning the impact of the recession on non-compliance.
Case study: Measuring non-compliance amongst UK apprentices
The previous section covered the factors affecting the estimates of non-compliance. We now proceed with a case study on measuring compliance among UK apprentices. We study the period 2011–2015 when the first assessments of apprentice non-compliance were made, and we contrast the stories told by different information sources.
We focus on apprentices for four reasons. First, apprentice pay is more complicated than the standard UK minimum wage; this seems to be partially responsible for non-compliance. Second, a dedicated survey of apprentice pay is available to measure non-compliance, allowing comparison with other data sources. Third, the survey was radically overhauled after two waves; the reasons for the overhaul shed light on expectations of respondent behaviour. Finally, the NMW for apprentices is considerably lower than for any other group; this may be a source of non-compliance in itself, as it increases the relative size of rounding errors.
The minimum-wage context for UK apprentices
The statutory National Minimum Wage (NMW) was introduced in the UK in 1999 for all employees aged 16 and over, with age-related rates set each October. Apprentices became eligible for an ‘apprentice rate’ (NMWAR) in 2010. Table 1 presents the relevant rates for each year and group.
NMW rates by applicable group
The NMWAR is unusual in having both age and year-of-training components. The NMWAR applies to apprentices in their first year of training or if they are under 19; otherwise, the standard age-specific NMW applies. Note that the age boundary for the NMWAR (16–18) does not align with that of the standard NMW (16–17, 18–20). Measuring hours for apprentices is also more complicated: a minimum number of on- and off-the-job training hours should be included in paid hours.
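The eligibility rule can be written as a short decision function. This is a sketch of the rule as described in the paragraph above, not a statement of the legislation; the function name and band labels are hypothetical, and the adult band starting at 21 is assumed from the age groups used elsewhere in this paper.

```python
def applicable_minimum(age: int, first_year: bool) -> str:
    """Return a label for the minimum wage band that applies to an apprentice."""
    if age < 16:
        raise ValueError("the NMW covers employees aged 16 and over")
    # The apprentice rate applies to under-19s and to anyone in the first
    # year of training, whatever their age.
    if age < 19 or first_year:
        return "NMWAR"
    # Otherwise the standard age-specific NMW applies. Apprentices reaching
    # this branch are at least 19, so only two standard bands are possible
    # (the adult band from age 21 is an assumption).
    return "NMW 18-20" if age <= 20 else "NMW adult"


for age, first_year in [(17, False), (25, True), (19, False), (22, False)]:
    print(age, first_year, applicable_minimum(age, first_year))
```

A 19-year-old thus moves from the apprentice rate to the much higher 18–20 rate on entering the second year of training, which illustrates how an employer or respondent can easily apply the wrong threshold.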
The ONS’ Annual Survey of Hours and Earnings (ASHE) collects data from employers on 1% of employees. This is the primary data source for low pay analysis in the UK, but it has only collected information to allow the analysis of apprentice pay since 2013.
In 2011, an Apprentice Pay Survey (APS) was commissioned to monitor specifically the impact of the new NMWAR. It was a random sample of all those registered for an apprenticeship in the UK, with different levels of coverage across the four countries of the UK. The 2011 and 2012 APS are described in [51, 52], respectively. These surveys were heavily criticised [43], and in 2014 a completely redesigned APS went into the field [53].
Overall non-compliance with the NMW has been relatively stable at about 1% of the employed workforce. However, since the start of the recession, non-compliance amongst the under-21s has risen substantially: from being relatively stable until 2009 at around 3.5%–4%, it has steadily increased and in 2015 was 8.5% for 16–17 year olds and 7% for those aged 18–20 [32].
Non-compliance for apprentices is much higher than for other workers. LPC [32] argues that most of this is due to the rising number of young apprentices. As well as being higher overall, rates are higher for those not on the NMWAR, particularly for those aged 19–20 (see Table 2).
Overall non-compliance rates for apprentices (in % of total apprentices)
Source: Authors’ calculations; APS 2011, 2012 and 2014, weighted data; ASHE 2013–15, unweighted data.
The 2012 APS shows non-compliance rates of around 50% for 19–20 year old apprentices, while even higher rates are observed for some occupational sub-groups such as hairdressers [43]. These findings are consistent across different datasets, years, and combinations of characteristics (such as apprenticeship framework or gender). As will be discussed below, some of this is almost certainly due to errors in survey design, but even so non-compliance rates for apprentices are significantly higher than for other workers.
We now proceed with the examination of the issues identified in the previous section, evaluating both the APS and ASHE as sources of information.
Sampling
A major criticism of government surveys is that they do not fully incorporate the ‘grey’, cash-in-hand or illegal economy. There are also concerns about de facto employees wrongly classified as ‘self-employed’ specifically to avoid employment regulations. By its nature, the extent of non-compliance arising from these sources is unknowable.
For the APS this problem appears to be addressed effectively. The population of apprentices is known with a great deal of certainty: only those registered count as apprentices, and this is the sampling frame for the APS (with some rare exceptions due to historical anomalies). The APS is voluntary, and response rates are relatively low. Nevertheless, the survey designers argue reasonably [52, 53] that the responses are representative of the apprentice population.
However, the timing of the 2012 survey is a major problem. The 2011 and 2014 surveys took place in the middle of each year, when minimum wage adjustments tend to have settled down [38]. In contrast, the 2012 APS went into the field in October 2012, just after the minimum wage had changed. It was therefore impossible to identify whether wages being paid at the previous year’s rate were legitimate or not.
While the APS response rate is seen as low but representative, more questions are raised by ASHE. ASHE normally samples around 0.75% of the working population, but for apprentices the rate is less than half that (see Table 3).
Sampling rates in ASHE
Source: ASHE data, authors’ calculation; registered apprentices from SFA (2015). Point-in-time apprentices estimated by adjusting to weighted APS estimate (581,000 in 2014).
It seems unlikely that the apprentices in ASHE are missing randomly. First, absences from ASHE are disproportionately likely to be made up of low earners changing jobs frequently [54]. Second, Drew et al. [40] observe that individuals who start an apprenticeship with their current employer are more likely to be paid at or above the relevant minimum wage than those starting with a new employer. Individuals who remain at their employer for longer are more likely to be identified in ASHE, which uses “latest known employer” information from the HMRC to trace respondents. Third, employers with poor administrative processes or using cash-in-hand payments are less likely to be identified by the HMRC. In short, those apprentices recorded in ASHE are likely to be those in stable, long-term employment with an employer, with good and up-to-date record-keeping.
This may explain why the ASHE non-compliance rate is similar to that calculated for those who use payslips to provide information in the APS 2014: we would expect these APS respondents to be in the same subset of organised PAYE-paying employers. Table 4 shows the level of non-compliance in the APS by whether documentation is used or not.
Extent of non-compliance, by APS source
Notes: Source APS 2014, authors’ calculations; weighted data.
When information for both hours and pay is provided from a payslip, non-compliance rates in the APS and ASHE (see last row of Tables 2 and 4) become much more similar. While the APS and ASHE broadly agree on the non-compliance rates for fully documented earnings and hours, the ASHE sample, with its low sampling rate for apprentices, is likely to be biased towards compliant observations. Hence, the ASHE non-compliance estimate can be taken as a ‘lower bound’ for non-compliance.
The 2011 and 2012 APS asked for wage data at the payment level: for hourly-paid workers, an hourly wage was recorded, for weekly-paid a weekly wage, and so on. Standard hours were also recorded, so that an hourly wage could be derived for the non-hourly paid. There is a substantial difference in non-compliance between the hourly and the non-hourly paid (see Table 5).
Non-compliance rates in APS, by pay period
Source: APS data, authors’ calculations; weighted.
As with the other employee surveys in the UK, wages in the APS are rounded at the pay period, which may account for some of the disparity in responses. However, the main cause for the difference between the hourly and the non-hourly paid seems to be a mismatch between hours and earnings: the APS derived wage has a much wider distribution than the hourly rate (see Fig. 1 for the 2014 data; similar results occur for 2011 and 2012).
Part of the confusion may be that the 2011 and 2012 APSs asked for both working hours and training hours (on- and off-the-job). There was some ambiguity on how these questions should be answered, and, hence, the accuracy of the derived value for the hourly pay is questionable. This may explain why the hourly paid non-compliance rate is so much lower: it does not need an accurate estimate of hours. Some further support for the idea that hours is the main cause of the problem comes from comparing non-compliance rates by type of training received. Table 6 reports the relevant estimates.
Non-compliance rates in APS by training
Source: APS data, authors’ calculations; weighted.
Derived hourly pay, payslip information and the stated hourly pay. Source: Authors’ calculations; APS 2014, unweighted data, stated hourly pay sample.
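The suggested mechanism, that errors in reported hours rather than in pay drive the gap between derived and stated rates, can be illustrated with a small simulation. The sketch below is not the authors' analysis: the wage rates, the hours, the ±15% reporting error and all function names are assumptions chosen purely for illustration. Every simulated apprentice is paid at or above the minimum, yet dividing pay by misreported hours pushes a sizeable share of derived wages below the line, while the stated hourly rate is unaffected.

```python
import random


def non_compliance_rate(hourly_wages, minimum):
    """Share of observations strictly below the minimum."""
    below = sum(1 for w in hourly_wages if w < minimum)
    return below / len(hourly_wages)


def simulate(n=10_000, minimum=2.50, seed=0):
    """Return (stated, derived) non-compliance rates for a compliant workforce."""
    rng = random.Random(seed)
    stated, derived = [], []
    for _ in range(n):
        # Assumption: everyone is truly paid at or above the minimum.
        true_rate = minimum + rng.choice([0.0, 0.0, 0.25, 0.50, 1.00])
        true_hours = rng.choice([30.0, 35.0, 37.5, 40.0])
        weekly_pay = round(true_rate * true_hours, 2)
        # Assumption: hours are recalled with a mean-zero error of up to 15%.
        reported_hours = true_hours * rng.uniform(0.85, 1.15)
        stated.append(true_rate)
        derived.append(weekly_pay / reported_hours)
    return (non_compliance_rate(stated, minimum),
            non_compliance_rate(derived, minimum))


stated_rate, derived_rate = simulate()
print(f"stated-rate non-compliance:  {stated_rate:.1%}")
print(f"derived-rate non-compliance: {derived_rate:.1%}")
```

The error in hours has mean zero, so average derived pay is barely affected; it is the binary compliance test that turns symmetric noise into apparent underpayment.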
These concerns were raised in [43], and led to the overhaul of the APS. The revised 2014 APS only asked for total hours, and did not break out training hours. The rationale seemed to be that trying to get accurate detail had been shown to fail, and it was better to concentrate on a better, simpler, total hours measure. This demonstrates the tension between getting more information and getting accurate information.
The 2014 APS also made more effort to get information from reliable sources and, specifically, from payslips. The results supported the idea that employees tend to guess answers at focal points in the absence of any accurate information; non-compliance was much lower where documentation was consulted (see Table 4, above).
Owing to the re-structuring of the survey, only 8% of apprentices in the 2014 APS report an hourly rate. These observations also show the lowest rates of non-compliance (but note that the actual rate derived by dividing pay by hours worked gives a higher non-compliance rate; this implies that the hourly rate is not always being received). They are, however, close to the rates reported for those who do not report an hourly rate but have both hours and pay taken from their payslip (see Table 4). Thus, although wage rates calculated from documented wages and hours might be the best measure of actual wages paid, stated rates appear more accurate than rates calculated from undocumented estimates of wages and hours; this has also been reported for the LFS employee data [38].
Recognising concerns over data quality, Drew et al. [40] carried out a qualitative analysis, interviewing apprentices, trainers and employers. This showed that, for the apprentices, ‘hours of work’ was a concept of little relevance; what mattered was total take-home pay, not the hourly rate. Apprentices saw low wages during training as ‘serving one’s time’; they also assumed that employers were paying the correct wages. Documentation remains the key to getting accurate information from apprentices.
In summary, a large part of the initial non-compliance identified in APS appears to be a result of poor data quality, and possibly a naïve interpretation of the data when aggregate statistics are presented. In response, the 2014 APS has traded off less detail for more accurate information.
Using the information on how the data were collected has shown that the data problems are psychologically plausible and indirectly supported. As data errors are more likely to lead to an overestimation of those paid below the NMW, this suggests that the APS could act as an ‘upper bound’ for the true non-compliance rate.
In contrast to the APS, ASHE should not suffer from the hours/wages problem: it simply asks the employer to record total paid-for hours. However, the question could be misinterpreted, and it could be that ASHE respondents are omitting training hours to keep pay above the minimum. To investigate this we compared hours of apprentices and other employees in ASHE, and concluded that there was no systematic difference. If firms are under-reporting hours, the under-reporting does not appear to be consistent or statistically significant.
A third problem with the APS 2011 and 2012 was the error-checking process. Each reported wage had a check value: if the reported wage was less than that, then the interviewer would ask the interviewee to confirm the response. For the 2011 APS the hourly-paid check was set at the NMWAR for that year, £2.50 per hour. Unfortunately, this was retained in 2012, although the NMWAR had increased. Moreover, the “50p” focal points are extremely popular when guessing responses: a third of employees report wages that are multiples of 50p ([38], Table 12). Thus, in the 2012 APS, many respondents reported a wage of £2.50, but it was impossible to tell if these were:
Accurate and legitimate, as the pay period was before the new NMW came in.
Accurate and unlawful, as the pay period was actually October or November.
Inaccurate reporting by someone being paid at or above the NMWAR.
Inaccurate reporting by someone being paid something below the NMWAR.
This was not picked up in initial processing. This single mass point thus contributed significantly to the infeasibly high non-compliance initially estimated from that year’s data (see Table 2).
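The effect of the stale check value can be sketched in a few lines. This is an illustration only: the function name is hypothetical, and the updated rate of £2.65 is an assumed value standing in for whatever the NMWAR was at the time of fieldwork (the text says only that it had increased). Amounts are held in pence to avoid floating-point comparisons.

```python
def needs_confirmation(reported_pence, check_pence):
    """Interviewer re-asks only when the reported wage is below the check value."""
    return reported_pence < check_pence


STALE_CHECK = 250    # 2011 check value (GBP 2.50), retained in 2012
UPDATED_CHECK = 265  # assumed illustrative value for the increased NMWAR

reported = 250  # a respondent reports the GBP 2.50 focal point

flagged_with_stale_check = needs_confirmation(reported, STALE_CHECK)
flagged_with_updated_check = needs_confirmation(reported, UPDATED_CHECK)

print(flagged_with_stale_check)    # False: the response passes unchallenged
print(flagged_with_updated_check)  # True: the response would have been queried
```

With the check left at the old rate, a report of exactly £2.50 is never queried, so the four possibilities listed above cannot be separated after the event.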
In ASHE, wage calculations are routinely rounded to the nearest penny. Hourly pay in ASHE is calculated by dividing weekly basic earnings by weekly basic hours. For the monthly paid, hours are still often specified on a weekly basis; in these cases, respondents are asked to multiply weekly hours by 4.348 to get a monthly total for both wages and hours. Respondents are also required to convert decimal times (e.g. 37.5 hours) into hours and minutes (37 hours, 30 minutes). Clearly, there is much scope for error, particularly as the ASHE questionnaire only has space for two decimal places.
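A worked example shows how such conversions can produce a one-penny shortfall for someone whose contractual rate equals the minimum. The sketch below rests on assumptions that are not in the source: the minimum of £2.68 is illustrative, the employer is assumed to pay an annual salary (52 weeks) in twelve equal instalments while converting hours with the 4.348 factor, and the hourly figure is obtained by dividing monthly pay by monthly hours rather than via a weekly wage. All function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

PENNY = Decimal("0.01")
WEEKS_PER_MONTH = Decimal("4.348")  # conversion factor quoted in the text


def round_2dp(value):
    """Round to two decimal places, with halves rounded up."""
    return value.quantize(PENNY, rounding=ROUND_HALF_UP)


def reported_hourly_pay(rate, weekly_hours):
    """Hourly pay implied by the monthly figures entered on the form."""
    # Assumption: salary is 52 weeks of pay split into 12 equal months.
    monthly_pay = round_2dp(rate * weekly_hours * 52 / 12)
    # Hours are scaled with the 4.348 factor and kept to two decimals.
    monthly_hours = round_2dp(weekly_hours * WEEKS_PER_MONTH)
    return round_2dp(monthly_pay / monthly_hours)


minimum = Decimal("2.68")  # illustrative rate, not taken from the source
derived = reported_hourly_pay(minimum, Decimal("37.5"))
shortfall = minimum - derived

print(derived)    # 2.67
print(shortfall)  # 0.01
```

Here monthly pay is £435.50 and monthly hours are 163.05, giving £2.67 per hour: the mismatch between 52/12 (about 4.333) and 4.348 is enough to move a compliant contractual rate one penny below the line.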
It was noted above that there were a small but significant number of apprentices being paid one penny below the NMWAR. Table 7 shows the effect of allowing 1p below the NMW to be included as ‘compliant’.
Non-compliance in ASHE allowing for rounding
Source: ASHE 2013-2015 averages, authors’ calculations, unweighted.
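The adjustment amounts to moving the compliance threshold down by one penny. A minimal sketch follows; the wage values and the minimum are made up for illustration and are not ASHE data, and the function name is hypothetical. Wages are held in pence so that the comparison is exact.

```python
def non_compliance_rate(wages_pence, minimum_pence, tolerance_pence=0):
    """Share of wages more than `tolerance_pence` below the minimum."""
    threshold = minimum_pence - tolerance_pence
    below = sum(1 for w in wages_pence if w < threshold)
    return below / len(wages_pence)


# Hypothetical hourly wages in pence; several sit exactly 1p under the minimum.
wages = [268, 267, 267, 250, 300, 268, 400, 267, 268, 260]
minimum = 268  # illustrative minimum, in pence

strict = non_compliance_rate(wages, minimum)
with_tolerance = non_compliance_rate(wages, minimum, tolerance_pence=1)

print(strict)          # 0.5
print(with_tolerance)  # 0.2
```

In this invented sample, the three observations at 267p account for the whole difference between the strict and the tolerant rate.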
All of the apprentices affected by rounding were monthly paid, all had calculated hours, and all had a stated hourly wage equal to the NMWAR. It seems plausible that the multiple calculations (from weekly hours and monthly pay to actual monthly wage; then wages and hours as reported to ASHE; ONS’ calculation of a weekly wage, and then an hourly one) were one of the causes of the apparent under-payment. The numbers are small but sufficient to materially affect compliance rates.
This problem had not been observed in analyses of non-compliance before, as it was almost entirely limited to apprentices. The reasons for this are not entirely clear. The best explanation seems to be that the very low level of the NMWAR provides more scope for rounding errors. This cannot be fully investigated though, as the other NMWs have never been as low as the NMWAR.
Weighting the APS is relatively uncontroversial, as it is felt to be representative of the population. In contrast, weighting the ASHE apprentice data is problematic. ASHE includes weights to produce population estimates, and an additional set of weights calculated specifically for low pay work. As noted above, ASHE is likely to be under-sampling the lower-paid (this is why we use the unweighted data in this paper; see Tables 2 and 7). As a result, Ritchie et al. [49] recommended that the weighted ASHE estimates should be taken as a lower bound for estimates of non-compliance.
Calculation of the non-compliance rate
One aspect that is simplified for the apprentice data is calculation of the non-compliance rate. First, “the proportion of apprentices paid below the statutory minimum” does not suffer from the same relevance problems as “the proportion of all workers paid below the statutory minimum” where ‘all workers’ also includes bankers and football players. Second, as apprentices tend to be very low paid, there is little to be gained from defining a special “low-paid apprentice” category.
It is possible to create indices for Apprentice Pay as described in the formula for
Inferences
Drawing inferences from the apprentice pay data is also somewhat easier. For example, the much lower non-compliance rate for first-year apprentices is found in all breakdowns of the data, whether univariate or multivariate, and whether ASHE or APS is used [40, 49]. While statistical data cannot prove any particular explanation, this suggests that the change in the rate in the second year of the apprenticeship is causing problems. The fact that it appears in both the employer and employee data is also strong evidence that it represents genuine non-compliance in wages paid, and not misreporting. Of course, we still cannot tell whether such non-compliance is deliberate or the result of employer error. To resolve this, qualitative evidence is needed.
Summary
Broadly, the measurement of non-compliance amongst apprentices can be summarised as follows:
A clearly defined population simplifies the analysis.
External information is necessary for understanding samples, such as the likely non-random under-representation in ASHE.
Collecting detailed information proved inconsistent with collecting accurate information from individuals.
A detailed study of the distributions can highlight potential problems.
Interrogation of the data collection instrument helps understand problems, such as the concentration of responses at £2.50 in APS 2012.
Processing rules can affect the outcome.
Triangulating two surveys allows common features and differences to be identified.
Having both surveys also allows the idea of ‘lower’ and ‘upper’ bound estimates to be developed.
Measurement problems can be seen as a technical matter, of interest mainly to data producers rather than researchers, and studied by statisticians rather than end users of data. This paper has shown that a solid understanding of the data collection process can have substantial benefits for research and policy analysis. We have considered an under-researched area of the literature: how can we measure the level of non-compliance with minimum wage legislation? We have taken a step-by-step process which mimics the statistical processes that turn paid wages into population estimates.
Minimum wages are part of the social and economic landscape in an increasing number of countries. Measuring and understanding compliance is important for the success of the policy to be evaluated, and for enforcement to be effectively targeted. The lessons learned here are particularly relevant for high-income countries where a high level of compliance means that tiny variations in data quality can lead to substantial policy differences.
Although the yes-no nature of non-compliance makes it relatively easy to demonstrate the importance of knowledge of the data, these lessons can be applied to a wider understanding of data quality. Factors such as human preferences for round numbers are hard to deal with, but a good understanding of them may allow one to develop mitigating strategies. Many studies talk about ‘measurement error’, but we have demonstrated that this can be composed of several distinct elements:
Inappropriate samples or population estimates
Timing of data collection
Interpretation of questions
Ability to answer accurately
Willingness to answer honestly
Errors introduced by data processing
In principle, all but the last of these are familiar problems, and it is a natural reaction to apply statistical tools to evaluate the data quality. In practice, as this paper has shown, addressing these issues can require an extensive technical knowledge of processes, coupled with a considerable amount of ‘detective work’. In the case of non-compliance with the minimum wage, qualitative studies [40] can also be invaluable in guiding statistical analysis to find weak points in the data. The paper has also illustrated the compromises between data detail and accuracy; more information does not necessarily equal better information.
Finally, the analysis of non-compliance has thrown the value of triangulation into sharp relief. Just as Gramlich et al. [14] found forty years ago, the ability to compare multiple data sources has been invaluable. This is likely to become even more relevant as governments and regulators increasingly rely upon administrative data.
Acknowledgments
This paper brings together results from four research projects funded by the UK Low Pay Commission 2012–2016 [38, 39, 40, 43, 49]. We are grateful to LPC for the comments and discussion on those reports, to participants at the LPC’s research conferences and the Scottish Economic Society conference, and to discussants of our presentations. We are also grateful for the comments of Andrea Garnero and Sarah Brown on early drafts. Statistical results presented in this paper using ASHE or LFS data are Crown Copyright. The use of the ONS statistical data in this work does not imply the endorsement of the ONS in relation to the interpretation or analysis of the statistical data. This work uses research datasets which may not exactly reproduce National Statistics aggregates. All statistical results in the paper are generated from ONS data or the Apprentice Pay Survey by the authors, unless otherwise stated. Access to the ASHE data was given by the Office for National Statistics under project no. 12016. Access to APS data was granted by the Department for Business, Innovation and Skills. The views expressed in this paper are those of the authors and may not reflect the views of the Low Pay Commission, ONS or BIS. All errors and omissions are the responsibility of the authors.
