Abstract
This article quantifies personal income tax compliance by regions for the first time in Spain and identifies the factors explaining differences in tax compliance between regions, an aspect that has scarcely been analyzed in the literature. To this end, and in addition to the dynamic and spatial components considered by Alm and Yunus, this article considers the variables included in the classical tax evasion model of Allingham and Sandmo, as well as tax morale and political-institutional variables, including those linked to the country’s fiscal decentralization. The results obtained confirm, on the one hand, those reached in the very extensive literature studying tax evasion from the individual perspective (including the importance of the dynamic element) and, on the other, the relevance of the spatial component in explaining tax compliance, so that greater or lesser tax compliance is partly explained by factors such as the tax behavior of neighbors or how those neighbors are treated by the public sector.
Introduction
Personal income taxes are usually based on self-assessment systems, in which individuals voluntarily report the income they obtained over the tax period, determine their tax liability, and then pay the tax. Because compliance is voluntary, it is sometimes less than total. This may be due, in the first place, to the taxpayer failing to comply with tax laws and engaging in tax evasion, either because the tax return is not filed (nonfiling gap), because not all income is declared (underreporting gap), or because not all of the tax payable is paid (underpayment gap). Second, the taxpayer can engage in legal tax avoidance activities, such as income shifting or tax deferral, which also reduce compliance. The difference between what taxpayers actually pay and what they should be paying without tax evasion or avoidance (commonly referred to as a tax gap) is an indicator of the degree of tax compliance. In this article, we are interested in the part of noncompliance that is due to tax evasion.
The economic and social consequences of tax evasion are of great relevance for any economy, in terms of equity and efficiency. Tax evasion leads to budget deficits that force spending cuts or higher taxes; it leads to poorly allocated resources when the tax cheaters change their behavior regarding investments, working hours, and so on; it alters income distribution, insofar as some taxpayers exploit the tax system better than others; it leads to mistrust of the law and institutions and a loss of collective values; it affects the identification of the beneficiaries of public services and benefits; its presence means that governments must allocate resources to detecting noncompliance, measuring its scale and penalizing it; and it affects the quality of macroeconomic statistics (Alm and Soled 2017). Additionally, tax evasion represents a taxpayer behavior at least as important as changes in labor or saving supply (Saez, Slemrod, and Giertz 2012; Piketty, Saez, and Stantcheva 2014). For all these reasons, attempts to quantify, explain, and reduce tax evasion have been a constant, especially in developed economies.
Since the work of Allingham and Sandmo (1972), a very large amount of theoretical and empirical literature on tax evasion has been produced (see, e.g., the reviews of Andreoni, Erard, and Feldstein 1998; Hashimzade, Myles, and Tran-Nam 2013; Slemrod 2017; Alm 2019). However, studies that deal with tax evasion at the intermediate (state or regional) levels are more recent and scarce. Only a few papers have estimated differences in tax compliance at the regional level. As we will see shortly, these studies refer to decentralized and federal countries, which is logical, since it is in those countries where the regional perspective is more interesting. Our aim with this article is to add to the literature the analysis of another federal country, Spain, which has particularities that differentiate it from those others, estimating the differences in compliance among Spanish regions in personal income tax and making some contributions to which we will refer below.
For the United States, various authors have used the information provided by the Internal Revenue Service to estimate compliance in the federal personal income tax. At the state level, Dubin, Graetz, and Wilde (1987, 1990) do this for the years 1977–1986; Plumley (1996) for the period 1982–1991. At the level of the first three digits of the US zip code, we can mention the works of Witte and Woodbury (1985) and Dubin and Wilde (1988), both for 1969, and Gentry and Kahn (2009) for 2001. All these papers consider various factors in the explanation of tax evasion, linked to the static tax evasion model of Allingham and Sandmo (1972): audit intensity, effective tax rate, productive structure, education, age, unemployment, income, and so on, and also some variables relating to tax morale or political attitudes.
Alm and Yunus (2009) extend the tax evasion model of Allingham and Sandmo (1972) to include the spatial dimension and estimate the factors explaining evasion in US federal income tax at the state level from 1979 to 1997, taking into account that tax compliance in a state may depend on tax compliance in the neighboring states. Alm, Bloomquist, and McKee (2017) also analyze, in the context of personal income tax, how an individual’s tax behavior depends on the information he or she has on what his or her neighbors are doing, although they do so by conducting laboratory experiments with American college students. Spatial dependence may be due to taxpayers exchanging information among themselves, so that if a taxpayer commits tax evasion and is not detected by the tax authority, this can lead others to also evade, although it could also have the opposite effect, by increasing the probability of being caught if the other person got away with it. McFadden (2006) suggests that individuals may act by simply imitating others in tax matters. Social norms may also generate spatial dependence in the decision to evade, insofar as they may set a standard for taxpayer behavior: individuals will be compliant if they believe tax compliance is the social norm and behave differently if the opposite is true (Gordon 1989; Posner 2000; Sandmo 2005; Benabou and Tirole 2011; Alm 2019; Besley, Jensen, and Persson 2019). Estimates not considering the magnitude of spatial variation in tax evasion would be misleading and only partially informative, especially in the presence of significant regional diversity in economic fundamentals, so that the model would not adequately report the true factors behind tax evasion, 1 and therefore, it would not correctly guide the policies against income tax fraud.
Later, Di Caro and Nicotra (2014) for the period 2007–2011, and Carfora, Vega, and Pisani (2018) for the period 2001–2011, analyze tax compliance for the Italian regions, also using spatial econometric models, although the paper of Carfora, Vega, and Pisani (2018) is not limited to income tax but analyzes the total tax gap.
The cited paper by Alm and Yunus (2009) has a special interest because, as well as adding the spatial dimension to the analysis of tax evasion, it includes its dynamic component (already present in the original contribution of Allingham and Sandmo 1972), that is, the fact that the degree of tax compliance in a region may depend on its compliance in previous years. 2 Carfora, Vega, and Pisani (2018) also take into account the dynamic component of tax evasion but independently of the spatial component. The dynamic component should not be ignored in estimates of tax evasion, since it can explain part of the evasive behavior of the taxpayers. Individuals tend to repeat their patterns and behaviors because of the cost of adjustment that changes in behavior entail. In addition, morality, which is one of the factors explaining tax evasion, is persistent over time. It also seems reasonable to consider that if a taxpayer evades tax in one year and is not detected, he or she will probably evade in the following year. When the problem being analyzed has a dynamic dimension, it must be included in the analysis to avoid biased estimators.
For the Swiss cantons, Feld and Frey (2006) consider that the available evidence supports the existence of a “psychological tax contract” between the administrations and the citizens, so that direct democracy and respectful treatment of taxpayers are factors that foster the citizens’ tax morale and thus tax compliance.
There are no studies on tax compliance at the regional level in Spain, so in this article, we will fill this gap in the literature, quantifying compliance in personal income tax (Impuesto sobre la Renta de las Personas Físicas [IRPF]) at the regional level for Spain and estimating econometrically the factors explaining the differences in tax compliance between regions (Comunidades Autónomas [ACs]). The study is limited to the “common regime” ACs, since the “foral regime” ACs (Navarre and the Basque Country) have a different tax system from the others, and we do not have the necessary information to include them in the analysis. IRPF is a tax partly decentralized to the ACs. From 1994, the ACs received a share of 15 percent of the IRPF paid by residents in their respective territories. After 1997, a further 15 percent was assigned by the State to the ACs as an autonomic (“ceded”) tax. The assigned percentage was increased to 33 percent from 2002 (when the initial tax share disappeared) and to 50 percent from 2009. The central government has the power to regulate the tax base, which is common to the central and regional IRPF, and to manage both the central and autonomic taxes. The regions have the power to legislate on the autonomic tax rate and certain autonomic tax credits. 3
Our research includes several contributions to the empirical literature on tax evasion. First, this is the first study that quantifies the degree of compliance with IRPF in each AC and estimates the factors that explain the difference in compliance between regions.
Second, the exercises presented in this article expand on the results provided by the estimates carried out for other federal countries. Spain has some characteristics that place this country in an intermediate position between the United States and Switzerland, on the one hand, and Italy on the other. Although with notable differences between them, the United States and Switzerland are two countries with greater decentralization than Spain, both in terms of expenditure and, above all, revenue, and Spain is more decentralized than Italy (see http://www.oecd.org/ctp/federalism/fiscal-decentralisation-database.htm#C_3). The American states and Swiss cantons have extensive powers in the design and administration of their personal income tax, while in Spain (and even more so in Italy), the powers of the regions are, as we have seen, more limited, and the personal income tax is managed by the central administration. However, while Spain, Italy, and Switzerland apply equalization systems among their regional governments, the United States abolished its General Revenue Sharing in 1986 and does not currently apply a system of equalization grants to the states. In the latter country, interregional differences in disposable income inequality are lower than in the other three, with Italy being the country with the greatest disparities (Adalet and San Millán 2019). Finally, in Italy and Spain, the mobility of households within the country is much lower than in the United States and Switzerland (Caldera-Sánchez and Andrews 2011).
The above characteristics may help to explain some peculiarities of regional tax compliance in Spain. On the one hand, the literature has shown that there is a positive relationship between fiscal decentralization and tax compliance (Torgler and Werner 2005; Torgler, Schneider, and Schaltegger 2010). Consequently, we can expect that the increase in decentralization experienced by Spain in the period studied has translated into an improvement in tax compliance. On the other hand, the above information suggests that in Spain, there is more concern about regional differences and how the public sector corrects them than in Switzerland and the United States. In the latter country in particular, regional differences are less than in the others, do not seem to be of such concern, and are addressed more by mobility than by public intervention. If the above is true, the spatial dimension has to be especially important in the study of tax compliance in Spain, since the differences (real or perceived) in the treatment of ACs by central government policies can be projected into differences in the degree of tax compliance between regions.
The third contribution of our article concerns the specification of the regional tax compliance model. Unlike Alm and Yunus (2009), in whose application the dependent variable is the federal tax evaded, in our model, the dependent variable is the ratio of declared income to actual income, which affects both the central and regional IRPF. With respect to exogenous variables, we rely on several groups of factors: those relating to the classical model of Allingham and Sandmo (1972), those linked to tax morale and political-institutional factors, and persistence and spatiality factors, which, as far as we know, after Alm and Yunus (2009) have not been considered again in this joint form in any applied work on the subject.
Both the spatial and dynamic factors are relevant in the Spanish case. With regard to spatial dependence, the degree of economic interdependence of the regions is very high, especially among the closest ones. It seems clear that certain groups of taxpayers (such as professionals, business people, executives), which are those most associated with the concealment of tax bases, have information on the activity of the Tax Agency (whose competences extend to the entire national territory) and on the degree of compliance of these groups in other regions, and that this information may affect the decisions they take on their own degree of compliance. And with regard to the dynamic component, it seems clear that the tax behavior of taxpayers in a region today depends on what they did in the past. The success in tax evasion by a taxpayer in previous years will positively affect the decision to evade in the current year. Similarly, if the tax administration detects a taxpayer’s evasion, the taxpayer’s behavior will change for subsequent years. Later, we will show how the estimates are altered if spatiality and persistence are not taken into account in the analysis.
And fourth, from a methodological perspective, our article represents an advance with respect to Alm and Yunus (2009), to the extent that we apply several model selection criteria, which point to a model with spatial dependence in the endogenous and explanatory variables, specifically a dynamic spatial Durbin model (SDM), and we perform a maximum likelihood estimation using spatial econometrics techniques, which take into account both spatial and dynamic effects simultaneously and in an unbiased and consistent way (Belotti, Hughes, and Mortari 2017). As Alm and Yunus (2009) pointed out, a unified approach for considering both spatial and dynamic effects was not available at that time, so they used a simpler alternative method. 4 However, as Anselin (1988) points out, the existence of a functional relationship between what happens at a certain point in space and what happens in another place demands more complex and specific techniques of spatial econometrics. Moreover, we use Driscoll–Kraay standard errors, a nonparametric estimator of the covariance matrix that yields standard errors robust to heteroscedasticity and to the usual patterns of spatial and temporal dependence (Hoechle 2007).
After this Introduction, this paper is structured in the following sections. The second section quantifies the degree of IRPF compliance in the Spanish ACs from 2003 to 2014, using a macroeconomic approach that measures the gap between the income reported by taxpayers in each region in their IRPF returns and the income recorded in Spain’s regional accounts. The third section offers an econometric estimate of the factors explaining regional IRPF compliance. The first part of this section presents the theoretical model, based on Alm and Yunus (2009), the second presents the adopted specification, while the third part explains the estimates and discusses the results. All groups of factors considered are relevant for explaining the differences in tax compliance among ACs. Both the spatial and the dynamic component are found to be significant. The fourth section summarizes the main results obtained and their policy implications.
A Quantification of the Degree of Spanish IRPF Compliance by Regions
Unlike other countries, Spain does not have any quantification of personal income tax compliance by regions. 5 In this section, we will calculate this for the first time for the period 2003–2014, using the macroeconomic approach already used by the Comisión del Fraude Fiscal (1988 [Tax Fraud Commission]; see also Lagares 1990) to estimate evasion in Spanish IRPF at the national level from 1979 to 1987. 6 Essentially, this means comparing the income reported in the IRPF by the taxpayers (which we will call “fiscal income”), grouped by ACs, with the income aggregated by regions as recorded in the Official Spanish Regional Accounts (which we will call “real income”) and expressing this comparison in the form of a quotient. 7
The fiscal income of individuals is obtained from the samples of IRPF filers and nonfilers (Muestras de Declarantes y Muestras de no Obligados no Declarantes del IRPF), published by the Institute of Fiscal Studies (Instituto de Estudios Fiscales, IEF, Ministry of Finance), which contain microdata information from the fifteen common regime regions’ taxpayers. The IRPF samples contain the income reported to the Tax Agency by the taxpayers in the different components of the tax base: wages and salaries, capital income, real estate income, and income from self-employed and business owners.
The real income of individuals is calculated based on the Household Income Accounts of Spanish Regional Accounts (Contabilidad Regional de España), which are macrodata built according to the methodology established in the European System of National and Regional Accounts established by Regulation EU (SEC-2010). 8 Regional Accounts contain the income generated by households in each region computed according to the SEC-2010 regulations: wages and salaries, gross operating surplus, and mixed income. Regional household income calculated in this way is the best proxy we can find for the real income generated in a region. This income may or may not have been declared by the taxpayers in their IRPF returns. The ratio of fiscal income to real income will then be a good proxy for the tax compliance level.
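As a minimal illustration of the quotient just described, the following sketch computes the ratio of fiscal income to real income by region. The region names and amounts are hypothetical and serve only to show the calculation; they are not taken from the IRPF samples or the Regional Accounts.

```python
# Hypothetical aggregates (millions of euros); not actual data.
fiscal_income = {"Region A": 6_900.0, "Region B": 15_200.0}
real_income = {"Region A": 10_000.0, "Region B": 20_000.0}


def compliance_ratio(fiscal: float, real: float) -> float:
    """Return fiscal income as a percentage of real income."""
    if real <= 0:
        raise ValueError("real income must be positive")
    return 100.0 * fiscal / real


# One compliance ratio per region, keyed by region name.
ratios = {
    region: compliance_ratio(fiscal_income[region], real_income[region])
    for region in fiscal_income
}

for region, ratio in ratios.items():
    print(f"{region}: {ratio:.1f}%")
```

In an actual application, the two dictionaries would be replaced by the harmonized regional aggregates for each year of the panel.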
The databases have been harmonized, so that the components of taxable income in IRPF are as close as possible to the income recorded in the Regional Accounts, in such a way that the differences between them that are not due to tax compliance are not relevant. 9
It is true that part of the difference between real income and fiscal income will be due to the existence of legal tax avoidance, but the most significant part of the total gap will be explained by tax evasion because, to a large extent, tax avoidance strategies do not alter the aggregate income reported in each region (although the tax paid will change). One of the most important avoidance strategies in IRPF is income shifting, for example, between labor and capital income (López-Laborda, Vallés-Giménez, and Zárate-Marco 2018), but its effects are neutralized when calculating the total income declared by each taxpayer. Another avoidance strategy is income splitting among taxpayers, for example, among family members. To the extent that these people reside in the same region, the aggregation of declared income also neutralizes the effects of this strategy.
Figure 1 shows the evolution of tax compliance by regions from 2003 to 2014, calculated as the quotient between fiscal income and real income. The results obtained must be taken as an approximation of the level of tax compliance in each region and its evolution, as despite having adjusted the databases used, it is impossible to ensure they are totally homogeneous.

Figure 1. Evolution of IRPF compliance by autonomous communities, 2003–2014. IRPF = Impuesto sobre la Renta de las Personas Físicas.
The evolution of tax compliance is similar in all the ACs: it rises from 2003 to 2014 in all the regions (except Catalonia where it falls by 0.6 percentage points), despite the overall drop in 2011 and 2012. The lowest growth is in Madrid, with 1.8 percentage points, and the greatest in Extremadura, with 12.8 percentage points.
There are two regions whose compliance levels are clearly lower than the others throughout the period (both in labor income and in income from other taxed sources): the Balearic Islands and the Canary Islands; Catalonia joins them from 2009. Asturias stands at the top of the list, with compliance generally higher than the other ACs in the period examined. In any case, the dispersion of compliance levels between regions is not high. The standard deviation does not exceed 4 percentage points in any year and hardly reaches 3 points if we exclude the Balearic Islands from the calculation. However, the socioeconomic characteristics of the “common regime” ACs are very different. Madrid and Catalonia are the regions with the highest per capita income (€33,809 and €29,936 in current euros of 2014, respectively), and Extremadura and Andalusia are at the opposite end (€17,262 and €18,470). Andalusia is the biggest AC in terms of both population (8,426,405 inhabitants) and surface area (87,599 km²), and La Rioja is the smallest AC in population (313,582 inhabitants) and almost the smallest in surface area (5,045 km²), with only the Balearic Islands being smaller (4,992 km²).
Figure 2 shows the degree of compliance for the aggregate of the fifteen “common regime” regions, which is also broken down by the two items permitted by the available information: labor income and other income (from movable capital, real estate, and self-employed and business owners). Tax compliance is always high in labor income and fairly stable: around 90 percent. In contrast, compliance in other income is low throughout the period. It increased from 29.9 percent in 2003 to 45.3 percent in 2009 and then fell to 35.0 percent in 2014. For total income, the profile is as shown in Figure 1. The aggregate tax compliance percentage is 69.3 percent in 2003 and 76.2 percent in 2014.

Figure 2. Evolution of IRPF compliance by income sources, 1979–1986 and 2003–2014. IRPF = Impuesto sobre la Renta de las Personas Físicas.
There is no empirical evidence to compare with our results by ACs. At the national level, the Comisión del Fraude Fiscal (1988) published figures for the years 1979–1986, using real income as provided by the Spanish National Accounts (Contabilidad Nacional de España) and applying a methodology that is not entirely consistent with that of our research. Figure 2 compares the results of both studies. According to the Comisión del Fraude Fiscal, in 1986, Spaniards resident in the “common regime” regions declared 55.1 percent of total income obtained, reaching tax compliance values of 71.3 percent for labor income and 30.4 percent for other income. The results we obtained seventeen years later show tax compliance at 69.3 percent for all income. This increase of 14 percentage points is due exclusively to the improvement in compliance in labor income, which reached 90.1 percent in 2003, as the degree of compliance in other income is practically the same in that year as in 1986: 29.9 percent. Given these results, it seems that the main advances in personal income tax compliance in Spain have taken place in the income that was previously more monitored and hence less prone to concealment.
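The relationship between the aggregate figure and its two components can be made explicit. If labor income and other income exhaust real income, aggregate compliance is a weighted average of the two component rates, with weights equal to each component's share in real income. The sketch below uses the 2003 percentages quoted in the text to back out the labor share implied by this identity. That implied share is an inference from the identity, under the stated assumption, and is not a figure reported in the article.

```python
def implied_labor_share(total: float, labor: float, other: float) -> float:
    """Weight of labor income in real income implied by three compliance rates.

    Solves total = s * labor + (1 - s) * other for s. The identity holds only
    if labor and other income together exhaust real income (an assumption).
    """
    if labor == other:
        raise ValueError("component rates must differ to identify the share")
    return (total - other) / (labor - other)


# 2003 compliance percentages quoted in the text.
share_2003 = implied_labor_share(total=69.3, labor=90.1, other=29.9)

# Rebuilding the aggregate from the components recovers the reported total.
rebuilt_total = share_2003 * 90.1 + (1 - share_2003) * 29.9

print(f"implied labor share of real income: {share_2003:.3f}")
print(f"rebuilt aggregate compliance: {rebuilt_total:.1f}%")
```

The identity also shows why the aggregate rate can move when only the composition of income changes, even if both component rates stay fixed.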
More recently, Pulido (2014), using a similar but not identical methodology to that of the Comisión del Fraude Fiscal, calculated the degree of income tax compliance from 2003 to 2012 for all income, obtaining higher results than ours for the first four years. We have included his estimation in Figure 2.
Other studies have used different methodologies to estimate IRPF compliance in Spain. Esteller (2005) estimates stochastic frontiers at the local (provincial) level, obtaining an average IRPF compliance level of 82.2 percent for 1993–2000. Domínguez-Barrero, López-Laborda, and Rodrigo-Sauco (2015, 2017) estimate evasion in the IRPF for the period 2005–2008 by income sources, applying the methodology of Feldman and Slemrod (2007), itself an adaptation of the approach of Pissarides and Weber (1989), which looks for traces of fraudulent taxpayer behavior in the relationship between some of the items they recorded on their tax return (such as charitable contributions) and the income they report. For the average of diverse scenarios in 2008, assuming that there is no evasion in income from pensions, Domínguez-Barrero, López-Laborda, and Rodrigo-Sauco (2015) obtain full compliance in labor income and much less in other income categories: 39.08 percent in capital income, 45.53 percent in real estate income, 52.60 percent in business and professional income calculated under a direct assessment scheme, and 54.79 percent when this last income is calculated under an objective assessment scheme.
Estimates of the Factors Explaining IRPF Compliance in Regions
In this section, we will estimate the factors explaining the degree of tax compliance (or, from another point of view, the level of tax evasion) by ACs which we have quantified in the previous section. The first part of the section presents the theoretical model on which the applied exercise is based. The second describes the specification used and the independent variables considered. The third part will contain the estimates and discuss the results obtained.
Theoretical Model
The theoretical framework is based on the classic Allingham–Sandmo–Yitzhaki model (Allingham and Sandmo 1972; Yitzhaki 1974), with the adaptation proposed by Alm and Yunus (2009) to include the dynamic and spatial components.
Allingham and Sandmo (1972) develop a model that combines the economics of crime and the economics of risk and uncertainty. An individual decides whether he or she is going to evade part or all of the income he or she has obtained and does so by maximizing his or her expected utility, which is a weighted average of the utility attained in the two situations that can arise, the weights being the respective probabilities he or she assigns to each situation. In the first situation, evasion is not detected, so the individual only has to pay the income tax corresponding to the reported income. In the second situation, the Tax Agency detects the evasion so that, in addition to the tax, the individual has to pay a fine (which, according to Yitzhaki [1974], is established on the evaded tax).
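The decision problem just described can be sketched numerically. The example below assumes logarithmic utility and a fine proportional to the evaded tax, as in Yitzhaki (1974). The functional form, the parameter values, and the function names are illustrative assumptions and do not come from the article. The optimum is found by a simple grid search over declared income.

```python
import math


def expected_utility(declared, income, tax_rate, audit_prob, fine_rate):
    """Expected log utility of declaring `declared` out of true `income`.

    Assumptions (illustrative): log utility; if evasion is detected, the
    taxpayer pays `fine_rate` times the evaded tax, where fine_rate > 1
    covers the unpaid tax plus the penalty.
    """
    evaded_tax = tax_rate * (income - declared)
    not_caught = income - tax_rate * declared
    caught = not_caught - fine_rate * evaded_tax
    if caught <= 0:
        return float("-inf")  # log utility is undefined for non-positive income
    return (1 - audit_prob) * math.log(not_caught) + audit_prob * math.log(caught)


def optimal_declaration(income, tax_rate, audit_prob, fine_rate, steps=1000):
    """Grid search for the declared income that maximizes expected utility."""
    grid = [income * k / steps for k in range(steps + 1)]
    return max(
        grid,
        key=lambda x: expected_utility(x, income, tax_rate, audit_prob, fine_rate),
    )


# Hypothetical parameters: income 100, tax rate 30%, fine rate 2.
low_audit = optimal_declaration(100.0, 0.30, audit_prob=0.30, fine_rate=2.0)
high_audit = optimal_declaration(100.0, 0.30, audit_prob=0.40, fine_rate=2.0)

print(f"declared income with audit probability 0.30: {low_audit:.1f}")
print(f"declared income with audit probability 0.40: {high_audit:.1f}")
```

Under these assumptions, a higher probability of detection raises declared income, which is the deterrence effect at the core of the classical model.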
Allingham and Sandmo (1972) themselves recognize the simplicity of their basic model and note various extensions to it, two of which interest us here. First, they point out that the decision to evade taxes may be affected by the reputation of the individual as a member of a community. Allingham and Sandmo (1972) introduce this reputational factor as an additional argument in the utility function of the individual. More recently, Sandmo (2005) suggests that the individual’s subjective probability that his or her evasion is detected may depend on the tax-evading behavior of everybody else. Second, Allingham and Sandmo (1972) also study the dynamic component in the decision to evade, assuming that when an individual is audited, the evasion he or she committed in that period and eventually also in all previous periods is detected so he or she will have to pay a fine corresponding to all the concealed income.
Alm and Yunus (2009) introduce the elements of persistence and spatial dependence through the subjective probability of detecting fraud,
Taxpayer i will declare in period t the income
where
Following Alm and Yunus (2009), the effect of
where
Consequently, the utility maximizing problem for individual i will have the following general functional form:
In other words, the evaded income,
Specifications
To identify the factors explaining differences in IRPF compliance between regions, we consider a panel of the fifteen common regime regions in the period 2003–2014 and estimate a spatial and dynamic model. It is a spatial model because it takes into consideration that tax compliance in a region may depend on compliance in the other regions, on certain explanatory variables in neighboring regions 10 and also on a combination of omitted variables that may be spatially correlated. And it is also a dynamic model because it considers that compliance in one year may depend on past experience, based on the idea that the tax compliance decision is serially correlated because of the adjustment cost caused by a sudden change in the taxpayer’s filing decision.
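The two ingredients described above, dependence on neighboring regions and on the region's own past, can be illustrated with a small sketch. In generic textbook notation, a dynamic spatial Durbin model includes a time lag of the dependent variable, a spatial lag W·y of the dependent variable, and spatial lags W·X of the regressors. The code below only builds the first two terms. The four-region contiguity structure and the compliance values are hypothetical, and row normalization of the weight matrix is an assumption; the article's own neighborhood definition is not reproduced here.

```python
# Hypothetical contiguity for four regions arranged in a line: 0-1-2-3.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}


def row_normalized_weights(neighbors):
    """Build a spatial weight matrix whose rows sum to one."""
    n = len(neighbors)
    W = [[0.0] * n for _ in range(n)]
    for i, linked in neighbors.items():
        for j in linked:
            W[i][j] = 1.0 / len(linked)
    return W


def spatial_lag(W, values):
    """Weighted average of neighbors' values for each region (W times y)."""
    return [sum(w * v for w, v in zip(row, values)) for row in W]


# compliance[t][i]: hypothetical compliance ratio of region i in year t.
compliance = [
    [0.70, 0.72, 0.68, 0.75],
    [0.71, 0.73, 0.70, 0.76],
]

W = row_normalized_weights(neighbors)
t = 1
own_past = compliance[t - 1]                   # dynamic term: y_{i,t-1}
neighbors_now = spatial_lag(W, compliance[t])  # spatial term: (W y_t)_i

for i in range(len(W)):
    print(f"region {i}: own lag {own_past[i]:.3f}, neighbors {neighbors_now[i]:.3f}")
```

With row-normalized weights, each spatial lag is simply the average compliance of the region's neighbors in the same year.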
Based on Alm and Yunus (2009), we propose the following extended specification:
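The dependent variable is the degree of IRPF compliance in each region and year, measured as the ratio of fiscal income to real income quantified in the previous section.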
The explanatory variables of tax compliance which we introduce in the model are grouped in several blocks. The definitions and descriptive statistics of the variables used in the estimates can be seen in Tables 1A and 2A of the Online Appendix, respectively. The correlation analysis between the variables is shown in Tables 3.1A and 3.2A.
Results of Estimating Regional IRPF Compliance with a Dynamic SDM with Regional Fixed Effects and Temporal Dummies.
Note: IRPF = Impuesto sobre la Renta de las Personas Físicas; SDM = spatial Durbin model.
** Significance at 5 percent.
* Significance at 10 percent.
Direct, Indirect, and Total Effects of the Explanatory Variables.
** Significance at 5 percent.
* Significance at 10 percent.
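Direct, indirect, and total effects in a spatial Durbin model are commonly obtained from the matrix (I − ρW)⁻¹(βI + θW): the direct effect is the average of its diagonal, the total effect is the average of its row sums, and the indirect (spillover) effect is the difference between the two. The sketch below implements this standard calculation for one regressor. The weight matrix and the coefficients ρ, β, and θ are hypothetical and are not the article's estimates, and the time-lag term of the dynamic model is ignored.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(A)
    M = [list(row) + unit for row, unit in zip(A, identity(n))]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[pivot][c]) < 1e-12:
            raise ValueError("matrix is singular")
        M[c], M[pivot] = M[pivot], M[c]
        scale = M[c][c]
        M[c] = [x / scale for x in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]


def sdm_effects(W, rho, beta, theta):
    """Average direct, indirect, and total effects of one regressor."""
    n = len(W)
    I = identity(n)
    A = [[I[i][j] - rho * W[i][j] for j in range(n)] for i in range(n)]
    B = [[beta * I[i][j] + theta * W[i][j] for j in range(n)] for i in range(n)]
    S = matmul(inverse(A), B)
    direct = sum(S[i][i] for i in range(n)) / n
    total = sum(sum(row) for row in S) / n
    return direct, total - direct, total


# Hypothetical row-normalized weights for three mutually neighboring regions.
W = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
direct, indirect, total = sdm_effects(W, rho=0.3, beta=0.5, theta=0.2)
print(f"direct {direct:.3f}, indirect {indirect:.3f}, total {total:.3f}")
```

With a row-normalized W, the total effect equals (β + θ)/(1 − ρ), which offers a quick check on the computation.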
Variables in the Allingham–Sandmo model
The first block includes the variables relating to the classic tax evasion model described in the Theoretical Model section. We have included several variables relating to the probability of detection of fraudulent behavior:
11
the percentage of companies in the region without salaried workers (nonsalaried
We have also included two variables relating to the penalties imposed when evasion is detected. The first represents the ratio between penalties plus enforcement surcharges 12 and the total tax revenue collected in each region for direct and indirect taxes, levies, and other revenue managed by the national Tax Agency (penalty). The second variable is the weight of revenue from IRPF audits plus the tax payable from returns submitted after the deadline and other items in the IRPF tax payable in each region (auditrevenues
We have incorporated the average (national plus regional) IRPF tax rate (averagetr), which we have lagged one period, and the logarithm of per capita income (income) of the region. The sign of both these variables is expected to be undetermined. In the framework of the Allingham–Sandmo–Yitzhaki model (i.e., with fines imposed on evaded tax payments), and in a context of decreasing absolute risk aversion, it is true that larger tax rates reduce evasion. However, the literature has demonstrated that if factors relating to, for example, taxpayers’ honesty or social norms are added to the model, the relationship between tax rates and evasion could be ambiguous (Gordon 1989). Similarly, in this framework of decreasing absolute risk aversion, an increase in individual income increases the volume of evaded income, but the effect on the percentage of evaded income depends on relative risk aversion.
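These comparative statics can be checked in a simple special case. With logarithmic utility (which exhibits decreasing absolute and constant relative risk aversion) and a fine proportional to the evaded tax, the optimal declared share of income has a closed form. The sketch below uses that closed form. The utility function, parameter values, and function names are illustrative assumptions, not the article's specification.

```python
def declared_share(tax_rate: float, audit_prob: float, fine_rate: float) -> float:
    """Optimal share of true income declared under log utility.

    Closed form of the first-order condition when detected evasion costs
    `fine_rate` times the evaded tax (fine_rate > 1); clipped to [0, 1].
    """
    if not (0 < tax_rate < 1 and 0 <= audit_prob <= 1 and fine_rate > 1):
        raise ValueError("parameters outside the admissible range")
    numerator = audit_prob * (fine_rate - 1) - (1 - audit_prob) * (1 - fine_rate * tax_rate)
    share = numerator / (tax_rate * (fine_rate - 1))
    return min(1.0, max(0.0, share))


def evaded_income(income: float, tax_rate: float, audit_prob: float, fine_rate: float) -> float:
    """Income concealed at the optimum."""
    return income * (1 - declared_share(tax_rate, audit_prob, fine_rate))


# Hypothetical parameters: audit probability 0.4, fine rate 2.
evaded_low_tax = evaded_income(100.0, tax_rate=0.30, audit_prob=0.4, fine_rate=2.0)
evaded_high_tax = evaded_income(100.0, tax_rate=0.40, audit_prob=0.4, fine_rate=2.0)
evaded_rich = evaded_income(200.0, tax_rate=0.30, audit_prob=0.4, fine_rate=2.0)

print(f"evaded income at 30% tax: {evaded_low_tax:.1f}")
print(f"evaded income at 40% tax: {evaded_high_tax:.1f}")
print(f"evaded income when income doubles: {evaded_rich:.1f}")
```

In this special case, a higher tax rate lowers evaded income, while doubling income doubles the amount evaded and leaves the evaded share unchanged, as expected under constant relative risk aversion.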
Tax morale variables
Based on the extensive literature showing that the deterrence model is insufficient to explain tax compliance (e.g., Torgler 2007), we have gathered a series of variables relating to tax morale: a qualitative variable, justified, with values between 0 and 1, which indicates how far citizens consider that justifications for evasion exist; the number of people convicted per thousand inhabitants in each region (sentenced); another qualitative variable, civicduty, with values from 1 to 4, which measures taxpayers’ perception of the link between the civic duty to pay taxes and tax compliance; and the percentage of the population with secondary education (educ). We expect a negative sign for the coefficients of the first two variables and a positive sign for the last two.
We have also added some variables indicative of the relationship between the benefits of public spending as perceived by the citizen and the taxes they have paid, as this can influence the tax morale of citizens and thus their compliance (Falkinger 1988; Luttmer and Singhal 2014). For this, we introduced the “fiscal balance” of each region (i.e., the difference between the expenditure executed and the revenues obtained by the central level in the region) in terms of gross domestic product (GDP; balance), lagged one period; a dummy which takes the value 1 when the citizens in the region are on average quite satisfied with the public services they use (satis) and 0 otherwise; a qualitative variable, management, with value from 1 to 4, which captures to what extent citizens believe that public services are managed correctly; and the percentage of the population aged over sixty-five (oldpop), given that this segment of the population receives a large part of public spending, in the form of pensions, care, and healthcare. We assigned an expected positive sign to the coefficients of these four variables.
Political and institutional variables
In accordance with the literature, a block of political and institutional variables was introduced in the specification, which can contribute to explaining differences in tax compliance between regions. The political factors that can influence tax compliance are the color of the ruling party (color), taking the value 1 if left-wing and 0 otherwise; the percentage of votes obtained by the ruling party (votes); and a dummy taking the value 1 if the government of the AC is regionalist and 0 otherwise (reg). The coefficients of these variables have an a priori undetermined sign.
The first group of institutional variables seeks to capture the possible influence of fiscal decentralization on tax compliance. As explained above, in Spain, IRPF is partly ceded to the ACs. Although regions have no power to manage the tax or to define the tax base, they can set the tax rates of the regional IRPF and some tax credits. The IRPF is the tax that raises the most revenue in the country and the one that best represents the tax duties derived from belonging to a political community. In this sense, it can be expected that the greater the degree of IRPF decentralization, the better citizens will perceive the relationship between the taxes they pay and the services they receive from regional governments, and the more committed and motivated they will be to comply with their tax responsibilities (Torgler and Werner 2005; Torgler, Schneider, and Schaltegger 2010).
To reflect the effect of tax decentralization on the degree of compliance, we have constructed the following variables: the maximum (maxtr) and minimum (mintr) tax rates in the regional IRPF; a dummy which captures whether the region has exercised regulatory responsibilities upward in the IRPF, raising the marginal maximum or minimum tax rate (raisedtr); and a variable taking the value 1 starting from 2009, when the amount of the IRPF ceded to ACs was raised from 33 percent to 50 percent (dcession). The expected sign for the coefficients of the first three variables is undetermined, and the fourth is positive: a greater tax decentralization could favor citizens’ connection with the regional government, leading to more compliance. 13
There is another group of three institutional variables, which are common to all regions and consequently predict changes in the compliance of all regions. The first one is taxamnesty, with value 1 in the years 2013 and 2014, after the approval of the last tax amnesty in Spain, and 0 otherwise. On the one hand, an amnesty is expected to improve tax compliance by incorporating individuals who have benefited from the amnesty into the taxpayer census. On the other hand, an amnesty creates a comparative grievance: noncompliant taxpayers are offered advantageous treatment after having cheated, relative to those who paid their taxes on time. This can produce a crowding-out effect that favors noncompliance by reinforcing extrinsic motivations for compliance (determined by inspections and fines) and weakening intrinsic motivations (determined by morality or social norms) (López-Laborda and Rodrigo 2003; Congdon, Kling, and Mullainathan 2011). The expected sign for the coefficient of this variable is therefore undetermined.
The second is the dummy dreform, with value 1 in the years 2006–2014 and 0 otherwise. This variable tries to capture the effects that the IRPF reform of 2007 had on the reported income of individuals. This reform substantially reduced tax rates on capital income, especially for higher income taxpayers. For the same reasons given above for the variable averagetr, we cannot assign a determined sign to the coefficient of the dreform variable.
The last institutional variable is a dummy that captures the recession years at the national level (crisis). The effect of the recession years can alternatively be taken into account through regional variables: the regional unemployment rate (unemploy) and the GDP growth rate in each region (regionalgrowth). 14 All the variables, except income, were constructed in levels, and the monetary variables were deflated.
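The time dummies common to all regions can be built directly from the year ranges stated above. The following is a minimal sketch, assuming a 2003–2014 sample period (taken from the concluding remarks) and a hypothetical helper name `common_dummies`. The crisis dummy is omitted because the text does not state which years it covers.

```python
YEARS = range(2003, 2015)  # assumed sample period: 2003 to 2014 inclusive


def common_dummies(year):
    """Dummies that take the same value in every region for a given year."""
    return {
        # IRPF share ceded to the ACs raised from 33 to 50 percent from 2009.
        "dcession": int(year >= 2009),
        # Years following the approval of the last tax amnesty.
        "taxamnesty": int(year in (2013, 2014)),
        # Years coded as affected by the IRPF reform.
        "dreform": int(2006 <= year <= 2014),
    }


panel_dummies = {year: common_dummies(year) for year in YEARS}
```

Because these dummies do not vary across regions, they play the role that time fixed effects would otherwise play, but with fewer parameters.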
Estimates and Results
We first considered the potential endogeneity of certain independent variables included in the model, specifically revenue from IRPF audits (auditrevenues), revenue from penalties and enforcement surcharges (penalty), and the tax rates (averagetr, maxtr, and mintr), as these variables may be conditioned by reported income. To this end, we applied the conventional endogeneity tests both through a linear model in which one or more of the regressors are endogenously determined (Durbin and Wu-Hausman statistics) and through the spatial and dynamic model specified in equation (5) (Hausman test). Specifically, we perform the two-stage Hausman (1978) procedure, using instrumental variables (IVs). The instruments we use are valid, that is, they are sufficiently correlated with the potentially endogenous variables but not with the error term; following Wooldridge (2019), they are either excluded from the model or exogenous variables; and the instrumental equations used are not misspecified. This is confirmed by the Sargan and Basmann tests shown in Table 4A of the Online Appendix. In addition, we completed the analysis of the potential endogeneity of the suspect variables by carrying out a test that considers all potentially endogenous variables simultaneously (joint endogeneity test). All these tests are shown in Table 4A of the Online Appendix, and they suggest that we cannot reject the null hypothesis of exogeneity of the variables. Table 4A of the Online Appendix also lists the IVs used.
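The logic of a two-stage endogeneity test can be sketched in a simple cross-sectional setting. The example below is not the authors' procedure: it is the regression-based (control function) form of the Durbin–Wu–Hausman test for a single suspect regressor, run on simulated data. The function names and the data-generating process are assumptions made for illustration.

```python
import numpy as np


def ols(X, y):
    """Return OLS coefficients, residuals and conventional standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, resid, se


def dwh_t_stat(y, suspect, exog, instruments):
    """t statistic on the first-stage residual added to the main equation."""
    const = np.ones((len(y), 1))
    # Stage 1: project the suspect regressor on instruments and exogenous controls.
    Z = np.hstack([const, exog, instruments])
    _, v_hat, _ = ols(Z, suspect)
    # Stage 2: include the first-stage residual in the structural equation.
    X = np.hstack([const, exog, suspect[:, None], v_hat[:, None]])
    beta, _, se = ols(X, y)
    return beta[-1] / se[-1]


# Simulated data (hypothetical): x_endog is correlated with the error u.
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=(n, 1))   # instrument
w = rng.normal(size=(n, 1))   # exogenous control
u = rng.normal(size=n)        # structural error
x_endog = z[:, 0] + 0.8 * u + rng.normal(size=n)
y = 1.0 + 0.5 * w[:, 0] + 2.0 * x_endog + u

t_endog = dwh_t_stat(y, x_endog, w, z)
```

A large absolute t statistic on the first-stage residual leads to rejecting exogeneity, as in this simulated case. A statistic that is not significant corresponds to the outcome reported above, where the null of exogeneity cannot be rejected.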
To test the potential spatial dependence of the model hypothesized in the previous sections, we used the Pesaran and Moran tests (Hoyos and Sarafidis 2006), the results of which are presented in Table 5A of the Online Appendix. Both confirm the presence of spatial dependence, so for the estimators to be consistent, spatial dependence models must be used, like those proposed in equation (5). These spatial models take into account that, because of spatial correlation, the sample contains less information than the uncorrelated samples usually used in econometrics. To this end, a matrix of spatial weights must be constructed that describes the connectivity or neighborhood of regions exogenously (Anselin 2002) and is also significant enough to represent dependence in the endogenous variable or the error term. Considering that “everything is related to everything else, but near things are more related than distant things” (Tobler 1970), and to avoid the problem of isolated regions or regions with an excessive number of neighbors, we have defined neighbors as the five nearest regions in terms of distance, using a 15 × 15 spatial matrix. 15
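A five-nearest-neighbor weights matrix of this kind, together with a Moran-type statistic, can be sketched as follows. This is an illustration only: the coordinates are randomly generated rather than the real regional centroids, Euclidean distance and row standardization are assumptions, and the cross-sectional Moran's I shown here is simpler than the panel tests reported in the appendix.

```python
import numpy as np


def knn_weights(coords, k=5):
    """Row-standardised k-nearest-neighbour spatial weights matrix."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    # Pairwise Euclidean distances; a region is never its own neighbour.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    W = np.zeros((n, n))
    nearest = np.argsort(dist, axis=1)[:, :k]
    W[np.arange(n)[:, None], nearest] = 1.0
    # Each row sums to one, so W @ y is the average over a region's neighbours.
    return W / W.sum(axis=1, keepdims=True)


def morans_i(values, W):
    """Moran's I for a variable observed once per region."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)


rng = np.random.default_rng(1)
coords = rng.uniform(0, 100, size=(15, 2))  # hypothetical centroids, not real data
W = knn_weights(coords, k=5)
compliance = rng.normal(size=15)            # hypothetical regional values
moran = morans_i(compliance, W)
```

Fixing the number of neighbors at five guarantees that no region is isolated and none has an excessive number of neighbors, which is the motivation given above. Note that the resulting matrix is generally not symmetric.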
Table 6A of the Online Appendix summarizes the results of estimating regional IRPF compliance with different fixed-effect models. We use fixed effects because they are more appropriate for our data: the sample represents the entire taxpayer population of the Spanish common regime regions rather than a random sample (Elhorst 2014). Moreover, dynamic models require the inclusion of fixed effects and do not allow for random effects. However, problems of simultaneity have led us to incorporate, instead of time fixed effects, a set of temporal variables related to specific events that involved economic, legal, or structural changes that move tax compliance in a similar direction in all ACs and that, as the estimated model shows, are essential in explaining tax compliance in the Spanish regions. This is the case of the institutional and economic cycle variables taxamnesty, dreform, dcession, and crisis, the inclusion of which also preserves degrees of freedom in the estimates, as it implies a smaller number of parameters to be estimated. The fixed effects that we include in the analysis are, therefore, individual fixed effects, which capture the unobserved characteristics of each region, that is, the characteristics that involve differential behavior at the regional level and that can also condition tax compliance. Specifically, we implement the fixed effects variant of the SDM using the bias-corrected maximum likelihood approach described in Yu, de Jong, and Lee (2008) and provide Driscoll–Kraay standard errors, which are robust to heteroscedasticity and to the usual patterns of spatial and temporal dependence.
The spatial correlation coefficient (ρ) and the coefficient of the explanatory variable balance of the neighboring regions (ϕ) are significant and have a clear effect on tax compliance in region i. Also, ρ is quite far from 1, so the equations are unlikely to have a unit root. The spatial autocorrelation model and the spatial error model also show spatial dependence in the error term; however, the Akaike information criterion and the Bayesian information criterion indicate that the best specification is provided by the dynamic SDM, shown in the last column of Table 6A, in which the spatial dimension derives from the endogenous and explanatory variables. As far as possible, 16 this result is corroborated by the LR tests in Table 7A and ratified by the tests of absence of spatial autocorrelation in the error (LM error) and presence of spatial autocorrelation in the spatially lagged dependent variable (LM lag), as shown in Table 8A. Moreover, the SDM lets us take into account, in addition to the spatial component, the dynamic component of the endogenous variable, which was one of the purposes of our research.
Table 1 presents the selected dynamic SDM. In this model, the coefficient of the dynamic component or persistence (γ) is significant, which means that the average taxpayer learns over time. His or her tax behavior today depends positively on what he or she did in the past, as found by Alm and Yunus (2009) and Carfora, Vega, and Pisani (2018), although in the latter, separately from spatial dependence. As mentioned above, the spatial correlation coefficient (ρ) is also significant, indicating that there is a regional interaction in the tax compliance decision and that this interaction is positive, the same result found by Alm and Yunus (2009) for the United States and Carfora, Vega, and Pisani (2018) for Italy. Thus, greater tax compliance in neighboring regions is associated with greater compliance in one’s own region.
The three groups of independent variables included in the specification were found to be relevant in explaining the differences in compliance among ACs, and with the expected sign. However, in SDMs, a change in the explanatory variable of a region has an effect on the same region (direct effect) and, potentially, an effect on all the other regions (indirect effect) via the spatial multiplier mechanism. Because of this, the spatial interrelations that appear in these models are complex, and the effect of each variable zi and zj cannot be interpreted simply from its regression coefficient but requires estimating the direct effects, 17 the indirect effects, 18 and the total effects, as the sum of the previous two (LeSage and Pace 2009). Moreover, when a dynamic model is used, all these effects are determined in the short and long term, as can be seen in Table 2.
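The summary measures of LeSage and Pace (2009) can be sketched for a single regressor as follows. This is a minimal illustration, not the authors' code: it ignores the time dynamics, uses a toy three-region weights matrix, and the parameter values (`rho`, `beta`, `theta`) are hypothetical.

```python
import numpy as np


def sdm_effects(W, rho, beta, theta):
    """Average direct, indirect and total effects of one regressor in an SDM.

    rho   : coefficient on the spatially lagged dependent variable
    beta  : coefficient on the regressor in the own region
    theta : coefficient on the regressor in neighbouring regions
    """
    n = W.shape[0]
    identity = np.eye(n)
    # Matrix of partial derivatives of y in each region with respect to the
    # regressor in each region: (I - rho W)^{-1} (beta I + theta W).
    S = np.linalg.solve(identity - rho * W, beta * identity + theta * W)
    direct = np.trace(S) / n   # average own-region response
    total = S.sum() / n        # average response to a change in all regions
    return direct, total - direct, total


# Toy example: three regions, each a neighbour of the other two.
W = (np.ones((3, 3)) - np.eye(3)) / 2.0
direct, indirect, total = sdm_effects(W, rho=0.4, beta=1.0, theta=-0.5)
```

With a row-standardized matrix the total effect collapses to (beta + theta) / (1 - rho), which offers a quick consistency check on any implementation.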
Both short- and long-term effects are significant, although the long-term coefficients of the variables are greater. In the short term, the direct effects of the explanatory variables are generally greater than the indirect effects, whereas in the long term the opposite is usually the case. In any case, the relevance of the direct and indirect effects confirms the need to introduce spatial analysis in the study of regional tax compliance. For the sake of simplicity, we will now focus on short-term effects.
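The distinction between the two horizons can be written compactly. The expressions below are a sketch under an assumption that cannot be checked against equation (5) here: the model contains the time lag of the dependent variable (coefficient γ) and its contemporaneous spatial lag (coefficient ρ), but no space-time lag. The symbol β_k for the own-region coefficient of regressor k is also an assumption; ϕ_k denotes the coefficient on the neighbors' value.

```latex
\begin{aligned}
\text{short term:}\quad & S_k^{ST} = (I - \rho W)^{-1}\,(\beta_k I + \phi_k W),\\
\text{long term:}\quad  & S_k^{LT} = \bigl((1-\gamma)\, I - \rho W\bigr)^{-1}\,(\beta_k I + \phi_k W).
\end{aligned}
```

With a row-standardized W, the average total effects reduce to (β_k + ϕ_k)/(1 − ρ) in the short term and (β_k + ϕ_k)/(1 − γ − ρ) in the long term, so a positive persistence coefficient makes the long-term effects larger in absolute value.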
The coefficients of two variables measuring the opportunity for tax evasion (or, to put it another way, the probability of detecting evasive behavior) are significant in the model and have the sign predicted by theory and largely confirmed by the empirical evidence (Alm 2019). First, if the percentage of small companies in a region, specifically those without salaried workers (pcompnowork), increases by 1 percentage point, IRPF compliance in this region is reduced by 0.63 percentage points, with the weight of small companies in other regions (indirect effect) being less relevant than that of one’s own region (direct effect). This result is a clear sign of the higher level of evasion in this business segment, as it is subject to less scrutiny by the Tax Agency, which tends to focus more on large companies. Carfora, Vega, and Pisani (2018) also find a negative relationship between company size and evasion in Italy. Second, if the weight of income subject to withholding (withholding
As for the variables relating to fines, the model shows that if revenue from IRPF audits in relation to tax collection (auditrevenues) increases by 1 percentage point in an AC, tax compliance rises in this region by 0.39 percentage points. This result is in line with those documented in the literature for other countries (Plumley 1996; Alm and Yunus 2009; Di Caro and Nicotra 2014; and Carfora, Vega, and Pisani 2018, among others).
An increase by 1 percentage point in the average tax rate applied to income in the region (averagetr) reduces compliance by 0.13 percentage points, although only the direct effect is significant (and only in the short term). The low significance of the coefficient of this variable can be a reflection of the fact that in the Allingham–Sandmo–Yitzhaki model, the empirical evidence is not conclusive. Among the papers dealing with tax compliance at the regional level, Dubin, Graetz, and Wilde (1990), Gentry and Kahn (2009), and Di Caro and Nicotra (2014) find a negative relationship between tax rate and compliance, while Alm and Yunus (2009) obtain a positive relationship. 19
With respect to the variables that capture the citizens’ attitude to evasion, the model shows that when citizens in a region have less tax morale, measured as the number of people convicted per thousand inhabitants (sentenced), compliance is lower. Plumley (1996) and Dubin (2007) find that convictions for economic offences reduce evasion in the United States, and Carfora, Vega, and Pisani (2018) find a positive relationship between the rate of offences and evasion in Italy.
Concerning the relationship between spending, taxes, and tax compliance, the positive sign of the coefficients of satis and management suggests that when the citizens in the region are satisfied with the public services they use and with their management, the level of tax compliance is 0.56 and 0.30 percentage points higher, respectively, than if there is no satisfaction. The variable balance shows that if the difference between what a region receives from the central government and what it contributes, in relation to GDP, increases by 1 percentage point, IRPF compliance increases by 0.06 percentage points in that region i (direct effect). However, if the fiscal balance is improved in the other regions, compliance in region i is reduced by 0.15 percentage points (indirect effect), which includes the local indirect effect capturing the response of taxpayers in region i who feel disadvantaged compared to the residents of neighboring regions. Thus, the total effect of this variable on the endogenous one is negative. Similarly, Güth, Levati, and Sausgruber (2005) provide experimental evidence for Germany that citizens living in territories that make large net contributions to the federal budget have lower tax morale. Differences in income between ACs and differences in the treatment received by ACs in the central government’s revenue and expenditure policy are a core issue of concern in the public debate in Spain. This surely helps to explain the sign and significance of the coefficient of the balance variable. In view of the information on decentralization, regional inequalities, and mobility provided in the Introduction of this article, this does not seem to be such an important factor in other federations such as the United States or Switzerland.
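The sign of the total effect of balance follows from adding the two short-term components quoted above. The figures are the rounded values given in the text, so the result is approximate and may differ slightly from the tabulated value.

```latex
\text{total effect} = \text{direct} + \text{indirect} \approx 0.06 + (-0.15) = -0.09 \ \text{percentage points}
```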
The coefficients of the political variables considered in the estimates are not significant, while the institutional variables are very relevant in the model. The 2012 tax amnesty has a clear effect of reducing compliance, which indicates that the harmful effects that the tax amnesty may have on normally honest individuals seem to outweigh its apparent advantages, such as the relatively rapid recovery of tax liabilities and the inclusion of new taxpayers in the tax authorities’ records. 20 However, the 2007 tax reform has favored tax compliance, perhaps due to the lowering of tax rates, which especially affected the highest incomes.
A number of variables relating to decentralization have a significant coefficient. On one hand, dcession suggests that increasing the assigned percentage of the IRPF from 33 percent to 50 percent since 2009 has raised the level of tax compliance. This result is in line with Torgler and Werner (2005) and Torgler, Schneider, and Schaltegger (2010), who found that greater local autonomy in Germany and Switzerland means higher tax morale and higher compliance. Nevertheless, our result must be interpreted with caution, as we are measuring decentralization by a dummy, which could also be capturing other things. On the other hand, when a region uses its regulatory powers to raise the tax (raisedtr), tax compliance is reduced. This result is corroborated by the significance and negative sign of the coefficient of the variable representing the maximum marginal tax rate set by each region (maxtr).
The negative sign of the coefficient of the variable crisis shows that tax compliance has a pro-cyclical behavior, as would be expected theoretically, and as shown in the empirical evidence (Dubin and Wilde 1988; Alm and Yunus 2009, among others). In periods of economic crisis, many people, especially those with financial problems, tend to work in the black economy and not declare their income.
If the model ignored the spatial and dynamic components, most of the coefficients of the variables in the model would cease to be significant and, therefore, would no longer explain tax compliance (auditrevenues, averagetr, sentenced, satis, management, damnesty, maxregtr, raisedtr, dcession, and crisis), and the magnitude of the coefficients of the variables that remain significant would change. Specifically, on the one hand, the variable balance would change its sign by capturing only the direct effect of the “fiscal balance” in each region and therefore ignoring its indirect effect on neighboring regions, which, as seen, is negative and quantitatively greater. On the other hand, if the percentage of companies without salaried workers rises, the predicted reduction in tax compliance in the spatial and dynamic model is 1.08 times smaller than in a regular ordinary least squares (OLS) estimate (−0.63 vs. −0.68). And if the weight of income subject to withholding rises, the predicted increase in tax compliance in the spatial and dynamic model is 2.45 times smaller than in a regular OLS estimate (0.2 vs. 0.49).
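The two ratios quoted above can be reproduced from the coefficients given in the text. This is a small arithmetic check; the function and variable names are illustrative.

```python
def times_smaller(spatial_dynamic_coef, ols_coef):
    """How many times smaller the spatial-dynamic effect is than the OLS one."""
    return abs(ols_coef) / abs(spatial_dynamic_coef)


# Coefficients as quoted in the text (spatial and dynamic model vs. OLS).
nonsalaried_ratio = times_smaller(-0.63, -0.68)
withholding_ratio = times_smaller(0.2, 0.49)
```

Rounded to two decimals, the ratios are about 1.08 and 2.45, matching the figures in the paragraph.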
Concluding Remarks
This article was intended to make some useful contributions to the empirical literature on tax compliance. First, this research is the first quantification of IRPF compliance in Spanish regions. A macroeconomic approach was used (the only one possible with the information currently available), comparing the income reported by individuals for tax purposes with the income recorded in the household income accounts of the Spanish Regional Accounts. The figures obtained show that tax compliance increased overall from 2003 to 2014 and that there is little variance in compliance levels among the regions.
Second, this article joins a very small number of papers that attempt to identify the factors explaining differences in tax compliance between regions or local entities in decentralized and federal countries. We do this for the previously calculated compliance levels of the Autonomous Communities.
In methodological terms, we have tried to make our approach as complete as possible. As well as including the dynamic and spatial components considered by Alm and Yunus (2009), we considered three groups of variables that can explain differences in compliance: the variables included in the tax evasion model of Allingham and Sandmo (1972), tax morale variables, and political-institutional variables, attributing special importance to those linked to the country’s fiscal decentralization.
The results obtained confirm, on the one hand, those reached in the very extensive literature studying tax evasion from the individual perspective (including the importance of the dynamic element), and on the other, the relevance of the spatial component in explaining tax compliance. This way, both spatiality and persistence must be considered in order to correctly model tax compliance. Our model also reveals that variables relating to decentralization, typical of the Spanish institutional framework, are relevant in the estimate: tax compliance is directly related to the degree of IRPF decentralization and inversely related to the use of the AC’s regulatory powers to raise the tax.
The significance of persistence and spatiality to explain differences in tax compliance between regions must be carefully considered by the tax administration when designing the best policies to fight tax evasion. In this regard, it may be useful to recall the three paradigms for tax administration proposed by Alm (2012, 2019) to identify policies to improve compliance: “enforcement paradigm,” which focuses on policies to increase detection and punishment; “service paradigm,” which focuses on the services of the tax administration to taxpayers; and “trust paradigm,” which looks for a change in the culture of paying taxes.
Two examples may help to illustrate the above statement. First, the literature widely agrees that personal income tax evasion is mostly found among entrepreneurs and professionals (as well as recipients of capital income), but not among salaried workers, who, being subject to withholding, have little chance of successfully evading the personal income tax (Alm 2012, 2019; for Spain, Domínguez-Barrero, López-Laborda, and Rodrigo-Sauco 2017). Our own estimates suggest that when income is subject to withholding, tax compliance is higher. Consequently, and as shown by the relevance and significance of the coefficient of the spatial lag (ρ), we think our estimates are consistent with the hypothesis that certain groups of taxpayers, such as professionals, entrepreneurs, or executives, are aware of the activity of the Tax Agency and of the degree of compliance of the same groups in other regions, especially the neighboring ones, and that this information affects the decisions they make about their own degree of compliance, in such a way that greater tax compliance in neighboring ACs is associated with greater compliance in one’s own AC. 21 These externalities must be taken into account by the tax administration when designing and implementing its audit policy in each region, in accordance with the enforcement paradigm.
And second, the relevance and significance of the coefficients of the balance variable also suggest that citizens translate into their tax behavior the perception they have of how their own AC, and other ACs are treated by central government intervention through its revenue and expenditure policy. Citizens specifically react by reducing tax compliance when their AC’s fiscal balance is adverse or when they perceive that central government treats other regions better. In consequence, central government should clearly inform and explain to citizens about the taxes it requires and the services it provides in each AC, and what the inequalities they perceive are due to. The literature shows that, in general (and with the notable exception of the foral regions, due to their special status), and as might be expected, such differences arise from differences in income and population between regions (Uriel and Barberán 2015). An appropriate step in this direction is the elaboration and publication by the central government of the System of Territorialized Public Accounts (Sistema de Cuentas Públicas Territorializadas; see https://www.hacienda.gob.es/es-ES/CDI/Paginas/OtraInformacionEconomica/Sistema-cuentas-territorializadas-2014.aspx), which provides a detailed picture of the territorial distribution of public budgets on both the revenue and expenditure sides. This kind of information and pedagogy exercise by governments could help to increase the tax compliance of the citizens, and it would fit perfectly into the trust paradigm.
Supplemental Material
Supplemental Material, Supplemental_material_IRSR for Personal Income Tax Compliance at the Regional Level: The Role of Persistence, Neighborhood, and Decentralization by Julio López-Laborda, Jaime Vallés-Giménez and Anabel Zárate-Marco in International Regional Science Review
Footnotes
Acknowledgments
The authors thank Francisco Pedraja and Fernando Rodrigo for their helpful comments, and the Government of Aragon and the European Regional Development Fund (Public Economics Research Group), the Ministry of the Economy and Competition, Project ECO2016-76506-C4-3-R (Julio López-Laborda), and the Ministry of Science, Innovation and Universities, Project RTI2018-095799-B-I00 MCIU/AEI/FEDER, UE (Jaime Vallés-Giménez and Anabel Zárate-Marco) for their funding.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research and/or authorship of this article: This study received funding from the Government of Aragon and the European Regional Development Fund (Public Economics Research Group), the Ministry of the Economy and Competition, Project ECO2016-76506-C4-3-R (Julio López-Laborda) and the Ministry of Science, Innovation and Universities, Project RTI2018-095799-B-I00 MCIU/AEI/FEDER, UE (Jaime Vallés-Giménez and Anabel Zárate-Marco).
Supplemental Material
Supplemental material for this article is available online.
