Estimating persistence in Employee Business Expense correspondence examinations using Hidden Markov and Semi-Markov models 1

Abstract

We use Hidden Markov and Semi-Markov Models to study persistence of the compliance impact for tax examinations of Employee Business Expense (EBE). Using panels of yearly returns for taxpayers reporting EBE, we compare future filing behavior of those audited to those not audited, by fitting and comparing Hidden Markov and Semi-Markov Models for both groups. The Markov state space is EBE reporting compliance. The observation vectors are a function of reported line item amounts for a series of annual returns filed starting two years after the year of the tax audits, the baseline year. The functions used to create the observation vectors are proxies for compliance, and the unobserved Markov state space is true compliance. The observations have a probability distribution that is conditional upon unobserved compliance status. Our fitted models give some evidence that a no change audit may worsen compliance slightly.

Keywords

Hidden Markov Model, tax compliance

1. Introduction

The United States tax system relies in part on individuals and businesses self-reporting their tax liabilities. Given the voluntary nature of the system, there is a gap between the taxes that should be paid and those that are paid – the tax gap. Recent estimates of the underreporting tax gap are $387 Billion, with $264 Billion attributed to Individual Income Tax.2

IRS Examination programs support voluntary compliance with tax laws by addressing compliance issues through audits – field examinations having multiple issues and conducted face-to-face with the taxpayer, and correspondence examinations focused on a single issue and conducted through written correspondence with the taxpayer.

Audits may have direct effects (immediate results from an examination – change/no change for example), and indirect effects (persistence – changes in the future filing behavior after an audit) on voluntary compliance. Given the tax gap and the cost of examinations, audit persistence is becoming increasingly important.

We focus on the question of whether a taxpayer changes his/her future filing behavior because of an audit, and whether any such changes are enduring or only fleeting.

Table 1
Unreimbursed employee expenses filings for tax years 2003–2013

Tax year	1040 filings (count)	Schedule A unreimbursed employee expenses (count)	Schedule A unreimbursed employee expenses ($1,000)	Form 2106/2106EZ filings (count)	Form 2106/2106EZ filings ($1,000)	Percent of 1040 filings
2003	130,423,626	14,896,433	63,210,079	6,813,407	44,791,299	11.42
2004	132,226,042	15,545,955	68,497,230	7,483,103	49,666,017	11.76
2005	134,372,678	15,920,218	75,824,189	7,825,703	56,639,758	11.85
2006	138,394,754	15,985,244	75,600,830	8,664,367	53,303,582	11.55
2007	142,978,806	16,479,370	82,105,794	8,966,892	58,925,639	11.52
2008	142,450,569	15,790,907	82,225,607	9,206,616	63,467,240	11.09
2009	140,494,127	14,942,268	75,607,218	8,704,483	57,855,103	10.64
2010	142,892,051	14,631,890	72,143,485	8,351,710	54,728,296	10.24
2011	145,370,240	14,730,817	76,857,890	8,709,898	58,552,419	10.13
2012	144,928,472	14,604,311	81,428,583	8,757,770	62,064,311	10.08
2013	147,351,299	14,544,643	85,604,965	8,552,245	62,808,743	9.87

Source: IRS Statistics of Income, Individual Income Tax Returns Line Item Estimates, Publication 4801 (Rev. 10–2014).

1.1 Related research

Recent work [1] on audit persistence using Tax Years 2006–2009 National Research Program (NRP)3 examinations together with a randomized control group found that for individuals overall, an audit increased reportable income by an average of $1,000 with a persistence impact of at least six years. However, taxpayers with different income sources and claim eligibility (i.e. wages, self-employment income, refundable tax credits, and certain deductions), appear to respond differently to audits in both the increase in average reportable income and persistence effect. In addition, the persistence effect associated with audit type, correspondence or field, is larger for correspondence audits.

Using a similar approach, but using only operational administrative tax data for the nonfarm self-employed segment, [2] found that the indirect effects of an audit persist for at least three years but depend on the audit outcome: a change with adjustment has a positive impact, and a no change has a negative impact.

1.2 Problem description

Our work builds on the existing literature by using Hidden Markov and Semi-Markov Modeling frameworks, stochastic modeling approaches, to explore patterns of taxpayer filing behavior over time for Employee Business Expense (or Unreimbursed Employee Expenses, herein EBE) Correspondence Exams.4 EBE is reported on Form 1040 Schedule A line 21 under “Job Expenses and Certain Miscellaneous Deductions.”

We chose this work stream for several reasons. First, the EBE audit program started around Tax Year 2003 and continues today so there is a reliable time series of data.5 Second, EBE exams are straightforward, single-issue correspondence exams allowing us to test whether our modeling approach can be successfully applied to understanding audit persistence in this segment over the period TY2003–TY2013. Table 1 provides an overview of EBE filings over the period. EBE has remained fairly stable over time with a slight downward trend.

We track the future EBE filing behavior6 of three groups of taxpayers who reported EBE: those who were audited for EBE and had a positive tax change, those who were audited for EBE but did not have a positive tax change (no-change), and those who were not audited but whose reporting behavior suggested possible EBE non-compliance. We fit several Hidden Markov and Semi-Markov Models to each group and compare the results.

2. Methodology

Prior research suggests that audit impact may persist for six or more years, so to allow for as long a follow-up period as possible we selected study groups of taxpayers who claimed EBE for Tax Year 2003. We created three groups: those whose audited EBE deductions resulted in a positive tax change, those whose audited EBE deductions did not result in a positive tax change, and a random sample of those whose EBE reporting suggested possible non-compliance but were not audited.7 Comparing how these three groups behaved in the years following TY2003 would help us determine how an EBE audit impacts future EBE compliance, how long the impact persists, and how these impacts differ depending on the outcome of the audit.

Most of the taxpayers whose EBE claims for TY2003 were audited were not audited again in the follow-up period.8 Thus, we cannot know their EBE compliance with certainty. Instead, we study their behavior by fitting Hidden Markov Models (HMM) and Hidden Semi-Markov Models (HSMM) to compute indirect indicators of compliance.

The HMM and HSMM frameworks9 posit a Markov process whose state membership cannot be observed directly, only inferred by the behavior of an emission variable whose probability distribution differs across the hidden Markov states. The states we are interested in are valid EBE claims (which we refer to as compliant) and invalid EBE (which we refer to as noncompliant). For our purposes, only over-stated claims are considered noncompliant. A taxpayer who claims less EBE than the amount to which he may be entitled, or a taxpayer who claims no EBE at all, is considered compliant.

The compliance of a nonzero EBE claim cannot be directly observed, but we can tell with certainty when no EBE claim was made, which makes “no-claim” a special kind of directly observable compliance. A given taxpayer may claim EBE in one tax year but not the following year, or vice versa. We tried dealing with the intermittency of EBE claims in two ways. In our first approach, we fit models to data that excluded any taxpayers who did not continuously claim EBE. We fit only HMMs, not HSMMs, to this kind of data.

In our second approach, we included intermittent claimers and added a no-claim Markov state, ensuring that this state would be easily distinguishable from the other states. Wherever there were no EBE claims, we assigned emission function values that were radically displaced from the emission function values we assigned for cases with nonzero claims.10

When we fit HMMs, the two approaches, excluding intermittent claimants vs. including them and using a special no-claim state, yielded similar estimates for the parameters they shared. The second approach allows for a broader range of taxpayer behavior patterns. When we fit updated Hidden Markov and Semi-Markov Models, we always included intermittent claimers, giving their emission function a value sampled from a normal distribution greatly displaced in the direction of compliance.

Our notation for the HMM is as follows: EBE claims fall into one of $k$ compliance categories, labeled $C_{1},\ldots,C_{k}$. In the three-state example, $k=3$ and the categories are $C_{1}=$ “valid or understated EBE $>0$”, $C_{2}=$ “no EBE claim”, and $C_{3}=$ “overstated EBE”. A given taxpayer’s series of EBE claims generates a series of random variables $x(t)$, where each $x(t)=C_{j}$ for some $j=1,\ldots,k$ and $0\leqslant t\leqslant m$. Our HMM assumes these $x(t)$’s are a one-step Markov process with $\{C_{1},\ldots,C_{k}\}$ as the state space. Elements of the transition matrix are denoted:

$\displaystyle p_{ij}\equiv P[x(t+1)=C_{j}\mid x(t)=C_{i}]\quad\text{for all }i,j=1,\ldots,k$

We call the emission function $y(t)$. Each element of the Markov state space, $C_{i}$, generates a conditional probability distribution $F_{i}$ for the $y(t)$ values.

In an HMM, the time a case remains in a given state, called the sojourn time, follows a geometric distribution whose Bernoulli parameter appears on the diagonal of the transition matrix. A Hidden Semi-Markov Model (HSMM) is more general than an HMM, because it accommodates the specification of other waiting time distributions for the sojourn (if the HSMM specifies a geometric sojourn, it reduces to an HMM). Under an HSMM, once a case reaches the end of its sojourn, its transition behavior is like that under a regular Markov model.
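Written out in the notation above, and assuming $T_{i}$ (a symbol introduced here only for illustration) denotes the number of years until a case first leaves state $C_{i}$, the geometric sojourn implied by the HMM is:

$\displaystyle P[T_{i}=n]=p_{ii}^{\,n-1}(1-p_{ii}),\qquad P[T_{i}\leqslant n]=1-p_{ii}^{\,n},\qquad n=1,2,\ldots$

For example, a hypothetical $p_{ii}=0.9$ gives $P[T_{i}=1]=0.1$, $P[T_{i}=2]=0.09$, and $P[T_{i}\leqslant 2]=0.19$.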

HMMs and HSMMs are fit using an Expectation-Maximization (E-M) algorithm11 which performs an iterative fit, requiring several starting values. The first is an initial Markov transition matrix, $[p_{ij}]$, which we always specified with each entry equal to $\frac{1}{k}$, where $k$ is the number of Markov states. We specify initial values for the proportion of units initially occupying the Markov states, which we call $(p_{1}(0),p_{2}(0),p_{3}(0),\ldots,p_{k}(0))$. For our fits, we specified equal proportions in all states. Finally, we require specification of the starting distributions $F_{1},\ldots,F_{k}$ of the emissions variable, conditional on each Markov state. We specified Normal distributions and provided initial location and spread parameters. Fitting an HSMM also requires selecting a type of sojourn distribution (we used gamma and a shifted Poisson) and providing starting values for its parameters.12
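The starting values described above can be sketched as follows. This is a minimal illustration, not the software used for the fits reported here: the function names, the example emission series, and the starting means and standard deviations are all hypothetical. The forward recursion shown evaluates the log-likelihood that an E-M fit would then increase iteratively from these starting values.

```python
import numpy as np

def initial_hmm_parameters(k, means, sds):
    """Uniform starting values for a k-state HMM with Normal emissions."""
    start = np.full(k, 1.0 / k)            # equal initial state proportions
    transition = np.full((k, k), 1.0 / k)  # every entry of [p_ij] equals 1/k
    return start, transition, np.asarray(means, float), np.asarray(sds, float)

def normal_pdf(y, means, sds):
    """Normal density of a scalar y under each state's (mean, sd)."""
    z = (y - means) / sds
    return np.exp(-0.5 * z * z) / (sds * np.sqrt(2.0 * np.pi))

def forward_loglik(y, start, transition, means, sds):
    """Log-likelihood of one emission series via the scaled forward recursion."""
    alpha = start * normal_pdf(y[0], means, sds)
    scale = alpha.sum()
    loglik = np.log(scale)
    alpha = alpha / scale
    for obs in y[1:]:
        alpha = (alpha @ transition) * normal_pdf(obs, means, sds)
        scale = alpha.sum()
        loglik += np.log(scale)
        alpha = alpha / scale
    return loglik

# Hypothetical starting values, ordered: compliant, no-claim, noncompliant.
start, transition, means, sds = initial_hmm_parameters(
    3, means=[0.0, -10.0, 1.0], sds=[1.0, 0.1, 1.0])
example_series = np.array([0.2, -10.0, 0.9, 1.3])  # made-up emissions
example_loglik = forward_loglik(example_series, start, transition, means, sds)
```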

Prior to setting up the data for fitting models, we needed to address a missing data problem. EBE claim amounts are not generally available in IRS research databases for tax years prior to 2006. Using Statistics of Income (SOI) data, we built imputation models to fill gaps in our study subjects’ EBE claim histories. SOI data is a large, multipurpose, complex weighted statistical sample of filed tax returns. It is the primary source of authoritative estimates of tax reporting.13 We used these completed EBE histories both to determine who qualified to be in our nonaudited control group, and when necessary to construct portions of the emissions data vectors. Just as we had two versions of our HMM framework, we also had two versions of our imputation methods. These are described in detail in the Data Set-Up section.

The structure of the Form 1040 Schedule A Itemized Deductions imposes algebraic constraints on EBE reporting, which we exploited whenever possible. EBE is combined with other expenses into a preliminary total we refer to as “Gross Miscellaneous Deductions”. This is reduced by 2% of Adjusted Gross Income (AGI), and the result is recorded as Net Miscellaneous Deductions.14 Although EBE amounts are not available for all Tax Years in our research databases, Net Miscellaneous Deductions and AGI are. The availability of these data fields allows us to cap our imputed EBE claims by Gross Miscellaneous Deductions, and to set to zero the effective EBE claims (imputed or known) for all taxpayers whose Net Miscellaneous Deductions amounts are zero.
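The two constraints can be sketched as a small function. This is an illustration under stated assumptions rather than the procedure actually coded for the study: the function and argument names are hypothetical, Gross Miscellaneous Deductions is assumed to be recoverable as Net Miscellaneous Deductions plus 2% of AGI whenever the net amount is positive, and a negative AGI is assumed to contribute a floor of zero.

```python
def constrain_imputed_ebe(imputed_ebe, net_misc_deductions, agi):
    """Apply the Schedule A constraints to one imputed EBE amount."""
    # No net miscellaneous deduction: the effective EBE claim is set to zero.
    if net_misc_deductions <= 0:
        return 0.0
    # Assumption: gross = net + 2% of AGI when the net amount is positive.
    floor = 0.02 * max(agi, 0.0)
    gross_misc_deductions = net_misc_deductions + floor
    # Cap the imputed claim by Gross Miscellaneous Deductions.
    return min(max(imputed_ebe, 0.0), gross_misc_deductions)
```

For instance, with AGI of 50,000 and Net Miscellaneous Deductions of 1,000, an imputed claim of 5,000 would be capped at 2,000.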

The other components of Gross Miscellaneous Deductions include tax preparation expenses.15 Taxpayers with multiple sources of substantial nonwage income tend to file added schedules, where they report details of their nonwage income. They also tend to incur higher tax-preparation expenses, due to their higher income levels and return complexity. We took advantage of the association between added schedules and the non-EBE components of Gross Miscellaneous Deductions, to improve imputations and to create emissions functions that are correlated with compliance.

Because we are interested in the persistence of the impact of a single audit, we did not allow any observation vectors to extend beyond the year of a second audit, if one occurred within our timeframe. We also truncated the observation vector of any taxpayer who did not file a tax return for any year within our time frame. Although ceasing to file returns is one possible response to an audit, we do not include that option in our Markov state space. We regard filing a return, when required by law, to be a pre-condition to reporting accurate amounts (including EBE claims) on the return.
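A minimal sketch of these truncation rules follows. The function and argument names are hypothetical, and two details are assumptions: the year of a second audit is treated as the last year kept, and the vector is cut at the first year with no filed return.

```python
def truncate_observations(years, filed, second_audit_year=None):
    """Return the follow-up years kept in one taxpayer's observation vector."""
    kept = []
    for year, did_file in zip(years, filed):
        # Stop at the first year with no filed return.
        if not did_file:
            break
        # Do not extend beyond the year of a second audit (assumed inclusive).
        if second_audit_year is not None and year > second_audit_year:
            break
        kept.append(year)
    return kept
```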

To create the data to fit our models we needed to determine what variable associated with noncompliance to use as our emission function.16 At first glance, the criteria used to select returns for audit might seem a good candidate, because they are de facto indicators of likely noncompliance. We decided against using audit selection criteria, because by design, all the cases in the study groups met these criteria in the TY2003 baseline.

What we sought were variables associated with compliance, preferably unconditionally, but where necessary, conditionally upon meeting audit selection criteria. These variables determine what functions we use to populate our data vectors, and how we specify the emissions distributions for the HMM and HSMM fits. The analysis results we used to make these determinations are described in the Data Set-up section. We specified normally distributed emissions for the positive compliant and the positive non-compliant Markov states. The starting parameters were estimated via descriptive analysis of known change and no-change audit cases, with the mean for compliant cases smaller than the mean for non-compliant cases.

For the no-claim state, we specified normally distributed emissions, giving a large negative mean and a tiny variance as our starting values. The no-claim state was thus made to appear ultracompliant and as distinguishable as possible from the other states.

The final normal parameters for each Markov state’s emissions are one component of the fitted HMM and HSMM models. These are included in our Results section. The final fitted parameter values for the emission distributions preserved the initial order of the means for the three states as well as the huge separation between the no-claim emissions and the emissions for the other states.

Our strategy for compensating for any selection bias that results from using operational IRS audit cases for our treatment groups was to use a nonaudited control group that met basic audit selection criteria in TY2003, the baseline year. Our purpose in fitting HMMs was to see if this type of model could help gauge the impact of a particular IRS audit program on future compliance. Audit selection methods for the EBE audit program have been similar from year-to-year, so conclusions based on TY2003 study groups can help gauge the ongoing impact of the program on the portion of the EBE claimant population commonly subject to EBE operational correspondence audits.

3. Data Set-Up

We used several IRS data sources in our analysis. We have already mentioned NRP and SOI data. In addition, we used IRS research databases of filed Form 1040 returns and associated schedules, and operational audit results for the correspondence EBE audits project.

3.1 Imputing missing Employee Business Expenses

Our first models apply only to taxpayers who continuously claim EBE, and the Markov states are compliant and noncompliant. Our first set of imputation models was estimated to use with our first set of HMM fits. SOI data includes panels, so we exploited the EBE claimants subset of that data to estimate a multiyear imputation model, where known EBE claims for TY2006–2008 were available as predictors of missing EBE claims for TY2003–2005. Net Miscellaneous Deductions (NMD) is available for all tax years, so our models impute nonzero EBE only when NMD is nonzero. We performed no out-of-sample validation on this first set of imputation models. Fitted parameters for these models are shown in Table 2.

Table 2
Continuous claim multi-year variable linear regression model

Label	2003	2004	2005
Intercept	–	–	–
Line item #1	0.538	0.513	0.460
Line item #2	1.271	1.353	0.993
EBE claim in 2006	1.141	1.148	1.201
EBE claim in 2007	N/A	N/A	0.376
Binary indicator #3	$-0.670$	$-0.531$	$-0.486$
Binary indicator #4	$-0.334$	$-0.454$	$-0.403$
R-square	0.353	0.501	0.542
Adjusted R-square	0.352	0.500	0.541
Number of observations used	2,510	2,510	2,510

Source: SOI Individual Sample Data Tax Years 2003–2007.

We created another imputation model for our second HMM and HSMM fits. The framework for this set of fits includes “no EBE claim” as one of the states in the Markov space. Our imputation model correspondingly was designed to allow “no EBE claim” as one of its predicted values. The imputation proceeds in two stages. First, a logistic regression predicts whether there is a claim. Conditional upon having predicted there is a claim, a second-stage linear regression predicts the size of the claim. The data set for the stage two model included all EBE claimants, rather than all predicted claimants, which made it unwise to use conventional model fit summaries to assess how our imputation methods work. To address this problem, we reserved a 15% holdout sample to carry out final performance evaluations of our second HMM fit imputation models. We evaluated both the first stage logistic regression and the composite two-stage model. Fitted parameters for this model are shown in Table 3.

Table 3

Logistic results

Label	2003	2004	2005
Intercept	–	–	–
Line item #1	0.720	0.812	0.829
Line item #2	0.492	0.438	0.634
Ratio #1	2.200	3.108	2.336
Line item #3	$-0.138$	$-0.326$	$-0.269$
Line item #4	$-0.350$	$-0.342$	$-0.348$
Line item #5	$-0.397$	$-0.397$	$-0.371$
Line item #6	$-0.197$	$-0.246$	$-0.283$
Line item #7	$-0.162$	$-0.152$	$-0.053$
Line item #8	0.430	0.384	0.296
Percent concordant	92.3	92.4	93.4
Percent discordant	7.7	7.6	6.6
Number of observations used	14,448	14,921	20,059

Source: SOI Individual Sample Data Tax Years 2003–2005.

3.1.1 Continuous claim model

We looked at continuous EBE claimers who were in the SOI data from Tax Years 2003 to 2008. This limited the number of observations used to build the model. We used available variables from the Form 1040 and Schedule A in the imputation target years and the three closest known years (TY2006–2008) to build a linear regression model using the log scale EBE claim. This approach worked best for TY2005 and worst for TY2003, which was the most distant from the known EBE values.

3.1.2 Two-stage model for continuous and intermittent claimers

Including intermittent claimers allows us to include a no-claim state in the HMM, which brings the modeled behavior closer to the available real-world states. For this model, we focused on same-year variables. We reserved 15% of the data for final model testing for each year. We split the remaining data into estimation (70%) and validation (30%) sets. This modeling approach had a couple of advantages: (1) we could include more observations (most people are not sampled continuously for the SOI sample), and (2) we avoided possibly inducing collinearity in the HMM vectors by having the early imputed EBE values be dependent on later known values.
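The partition can be sketched as below. This is only an illustration: the function name and seed are hypothetical, and a simple unweighted random split is assumed, whereas the SOI sample is weighted and the actual partitioning may have accounted for that.

```python
import numpy as np

def split_indices(n_rows, seed=0):
    """Split row indices into estimation, validation, and final test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_rows)
    n_test = int(round(0.15 * n_rows))          # 15% reserved for final testing
    final_test = order[:n_test]
    remaining = order[n_test:]
    n_est = int(round(0.70 * remaining.size))   # 70% of the remainder
    return remaining[:n_est], remaining[n_est:], final_test

estimation, validation, final_test = split_indices(1000)
```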

We also improved the model by using a two-step estimation process. The first step was a logistic regression focused on estimating the probability of an EBE claim. Most returns with positive Net Miscellaneous Deductions have an EBE claim but some do not. The second step was linear regression to estimate the log of EBE claim magnitude similar to the continuous claim model. The claim magnitude was modeled using only positive EBE claims. The variables in each year were similar in magnitude and sign. By multiplying the two predictions together we had a combined EBE estimate with a range that included the zero values.

For both the logistic and claim magnitude regression we found that generally the same line items and indicator variables were significant across the three imputation years.

For our purposes, a false positive imputation was worse than a false negative one so we increased the cut-off for predicting the presence of an EBE claim from 0.5 to 0.75. We tested several cut-offs; 0.75 was the point which best balanced decreasing the false positives against increasing the missed true values.
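The combination of the two stages under the 0.75 cut-off can be sketched as follows. The function and argument names are hypothetical, the logarithm is assumed to be natural, and the back-transformation by plain exponentiation (with no retransformation adjustment) is an assumption made for illustration.

```python
import numpy as np

def impute_ebe(p_claim, log_magnitude, cutoff=0.75):
    """Combine stage-one claim probabilities with stage-two log magnitudes."""
    p_claim = np.asarray(p_claim, dtype=float)
    log_magnitude = np.asarray(log_magnitude, dtype=float)
    # Stage one: predict a claim only when the fitted probability meets the cut-off.
    has_claim = (p_claim >= cutoff).astype(float)
    # Stage two: back-transform the predicted log magnitude to dollars,
    # then multiply so that predicted non-claimants receive zero.
    return has_claim * np.exp(log_magnitude)
```

Raising the cut-off above 0.5 trades some missed true claims for fewer false positive imputations, which is the balance described above.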

The estimated proportion correctly predicted ranged from 0.88 to 0.89 for the three years in the weighted Estimation dataset and was 0.91 for all three years in the weighted final test set.

Table 4 shows a comparison of the true and predicted values between the estimating data and the holdout sample.

Table 4
Logistic results

Confusion matrix for estimation dataset (where $\hat{p}\geqslant 0.75$ $=$ EBE claim)
True	2003 predicted 0	2003 predicted 1	2004 predicted 0	2004 predicted 1	2005 predicted 0	2005 predicted 1
0	10%	6%	11%	5%	11%	5%
1	6%	78%	6%	78%	7%	78%
Confusion matrix for final test data (where $\hat{p}\geqslant 0.75$ $=$ EBE claim)
True	2003 predicted 0	2003 predicted 1	2004 predicted 0	2004 predicted 1	2005 predicted 0	2005 predicted 1
0	11%	5%	10%	4%	11%	4%
1	4%	80%	5%	81%	5%	80%

Source: SOI Individual Sample Data Tax Years 2003–2005.
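The proportions correctly predicted can be recovered from the diagonals of Table 4. The sketch below uses the rounded percentages exactly as printed; the dictionary layout and function name are illustrative choices.

```python
# Rows are true class (0, 1); columns are predicted class (0, 1).
# Entries are the rounded percentages of all cases printed in Table 4.
table4 = {
    ("estimation", 2003): [[10, 6], [6, 78]],
    ("estimation", 2004): [[11, 5], [6, 78]],
    ("estimation", 2005): [[11, 5], [7, 78]],
    ("final test", 2003): [[11, 5], [4, 80]],
    ("final test", 2004): [[10, 4], [5, 81]],
    ("final test", 2005): [[11, 4], [5, 80]],
}

def proportion_correct(matrix):
    """Share of cases on the diagonal (true 0 predicted 0, true 1 predicted 1)."""
    return (matrix[0][0] + matrix[1][1]) / 100.0

accuracy = {key: proportion_correct(m) for key, m in table4.items()}
```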

Our continuous claimant regression for our first HMM fit provided candidate predictors for the stage two magnitude regression. To these predictors we added a ratio and several binary indicators.

To evaluate the composite two-step model, we compared the root mean square error for the training and holdout (test) sample. The results, as with the logistic alone, were similar between the two sets. We used the composite (two-stage) model to impute missing EBE values for the second HMM fit.
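A root mean square error comparison of this kind can be sketched as below. The function names are hypothetical, and the sketch assumes strictly positive actual claims and natural logarithms; it does not reproduce the weighting or the exact treatment of zero predictions used in the study.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error between two equal-length arrays."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def rmse_both_scales(actual_dollars, predicted_log):
    """RMSE on the log scale and on the dollar scale for positive claims."""
    actual_dollars = np.asarray(actual_dollars, dtype=float)
    predicted_log = np.asarray(predicted_log, dtype=float)
    log_scale = rmse(np.log(actual_dollars), predicted_log)
    dollar_scale = rmse(actual_dollars, np.exp(predicted_log))
    return log_scale, dollar_scale
```

Computing the same pair of statistics on the training and holdout samples and comparing them is the check summarized in the text.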

Table 5a summarizes the fitted Stage Two model, and Table 5b compares the Root Mean Squared Error for the training and holdout samples.

Table 5a

Stage two magnitude regression results

Label	2003	2004	2005
Intercept	–	–	–
Line item #1	0.518	0.481	0.489
Line item #2	1.440	1.416	1.355
Ratio #1	2.214	2.216	2.521
Binary indicator #1	0.146	0.218	0.182
Binary indicator #2	$-0.171$	$-0.170$	$-0.169$
Binary indicator #3	$-0.132$	N/A	N/A
Binary indicator #4	0.359	0.226	0.393
R-square	0.698	0.665	0.685
Adjusted R-Square	0.697	0.665	0.685
Number of observations used	6,083	6,280	9,714

Source: SOI Individual Sample Data Tax Years 2003–2005.

Table 5b

Root mean square error table

Training data		Holdout data
2003 log scale RMSE	2.40	2003 log scale RMSE	2.25
2003 RMSE	$4,213	2003 RMSE	$4,026
2004 log scale RMSE	2.29	2004 log scale RMSE	2.39
2004 RMSE	$4,816	2004 RMSE	$4,357
2005 log scale RMSE	2.30	2005 log scale RMSE	2.35
2005 RMSE	$4,444	2005 RMSE	$4,392

Source: SOI Individual Sample Data Tax Years 2003–2005.

3.2 Emissions analysis

3.2.1 HMM fits for continuous EBE claimants only

For our first HMM fit, we used NRP data to estimate the distribution of one year changes in logged EBE claims, among the population of continuous claimers of EBE. NRP data includes audit results for a stratified sample of all individual income tax returns.17 Among the NRP cases audited for EBE compliance, there appeared to be partial separation between compliant and noncompliant cases, so we used the location and spread parameters from these distributions of one-year differences in logged EBE values as our initial values for describing emissions for the two Markov states. Figure 1 shows parallel histograms of one-year differences for compliant and non-compliant claims.
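The one-year difference in logged EBE claims is the log of the ratio of successive claims. A minimal sketch, assuming natural logarithms and strictly positive claims (continuous claimants only); the function name is hypothetical:

```python
import numpy as np

def log_claim_differences(claims):
    """One-year differences in logged EBE claims for a continuous claimant."""
    claims = np.asarray(claims, dtype=float)
    if np.any(claims <= 0):
        raise ValueError("all claims must be positive for continuous claimants")
    # log(c[t+1]) - log(c[t]) equals log(c[t+1] / c[t]).
    return np.diff(np.log(claims))
```

A claim that doubles from one year to the next yields an emission of about 0.693, and an unchanged claim yields 0.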

Figure 1.

Histograms of calculated emission function on NRP TY2010 audits.

3.2.2 HMM fits including intermittent EBE claimants

We could not use the one-year difference in logged EBE claims with data that included intermittent EBE claimants. Since the no-claim state is not hidden, we specified an initial emissions probability density concentrated in a displaced interval where the initial densities of the other two Markov states were close to zero. As we built the input data, we set observed emissions for no-claim years to values near the center of this displaced interval. Doing so made it unlikely that the fitting algorithm would even temporarily infer that a no-claim emission belonged to any but the no-claim Markov state.18
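A sketch of how such an emission vector might be assembled follows. The center of $-10$ matches the no-claim emission described for the study data; the standard deviation of 0.1, the function name, and the argument names are hypothetical.

```python
import numpy as np

def build_emissions(ebe_claims, scores, rng, no_claim_mean=-10.0, no_claim_sd=0.1):
    """Emission vector: score for claim years, displaced draw for no-claim years."""
    ebe_claims = np.asarray(ebe_claims, dtype=float)
    scores = np.asarray(scores, dtype=float)
    # Draws concentrated in an interval far from the positive-claim emissions.
    displaced = rng.normal(no_claim_mean, no_claim_sd, size=ebe_claims.shape)
    return np.where(ebe_claims > 0, scores, displaced)

rng = np.random.default_rng(0)
emissions = build_emissions([1200.0, 0.0, 800.0, 0.0], [0.4, 0.0, 1.1, 0.0], rng)
```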

Using operational audit data from TY2006–2010 we estimated a logistic regression of audit change vs. no change on tax return characteristics.19 This logistic regression provided the basis for formulating the emission functions for compliant and noncompliant positive claim states and for selecting initial parameters for these states’ emission distributions. We compared histograms of the fitted values for the compliant and noncompliant cases in our logistic regression data set (shown in Fig. 2). The two distributions could be described as similar but slightly different mixtures of normal distributions. The compliant distribution had a smaller expectation than the non-compliant, but the variances of both distributions were large enough that the densities had considerable overlap. We used the parameters of these empirical distributions as initial values. The initial emission distribution for the no-claim state was defined to have extremely little overlap with the other two initial densities, centered well to their left and with a small spread.20

When there was a claim, the emission function was based on indicators for Interest and Ordinary Dividends, and Capital Gains and Losses, Total Itemized Deductions as reported on Form 1040, and Net Miscellaneous Deductions as reported on Schedule A.

Figure 2.

Histograms of calculated emission function on multi-year sets of no-change and change EBE audits.

Figure 3.

Histograms of calculated emission function for each analysis set.

Once we had decided on an emission function we needed to create the emission data vectors for our three study groups. The three data files consisted of taxpayers who were audited for EBE in Tax Year 2003 and received an adjustment (the change group); taxpayers who were audited for EBE in TY2003 and received no adjustment (the no-change group); and taxpayers who were eligible for an EBE audit in TY2003, but did not receive one (the no-audit group). Observations began in returns for TY2005, at which point most taxpayers would know the outcome of their audits, and continued as far as TY2012. The string of observations for any taxpayer terminated if they were audited again or stopped filing altogether. Taxpayers were not dropped from the sample if they claimed no EBE for one or more years; instead, “no EBE claim” was defined as a third, technically nonhidden Markov state, with its emission normal around $-10$ with a small variance. Table 6 and Fig. 3 summarize the three study data files for the second HMM fit.

Table 6

Study groups

Group	Number of	Number of
	taxpayers	observations
Audit with tax change 2003	8,692	58,440
Audit with no change 2003	2,384	16,750
No audit 2003	16,269	111,182

4. Results

We fit two sets of HMM models, one set to only continuous EBE claimants, and the second set with intermittent claimants included. For both sets of fitted models we had three groups of subjects. The first group was a random sample of taxpayers with Net Miscellaneous Deductions who were not audited for EBE, but whose imputed EBE claims for TY2003 met the criteria for possible noncompliance. The second consisted of taxpayers whose TY2003 EBE claims were audited but with no resulting tax change. The third consisted of taxpayers whose TY2003 EBE audits led to a tax change.

4.1 HMM fits for continuous EBE claimants

The emissions function for these models was a one year difference in logged EBE claims (equivalently the log of the ratio of two successive claims). We obtained the estimated transition matrices shown in Tables 7–9.

Table 7
Estimated transition matrix for unaudited continuous claimants

	Compliant	Noncompliant
Compliant	0.668	0.332
Noncompliant	0.118	0.882

Source: Form 1040 and Schedule A filings 2003–2012.

Table 8

Estimated transition matrix for audit with no change continuous claimants

	Compliant	Noncompliant
Compliant	0.751	0.249
Noncompliant	0.151	0.849

Source: Form 1040 and Schedule A filings 2003–2012.

Table 9

Estimated transition matrix for audit with tax change continuous claimants

	Compliant	Noncompliant
Compliant	0.602	0.398
Noncompliant	0.225	0.775

Source: Form 1040 and Schedule A filings 2003–2012.

In all three groups, the noncompliant state is “stickier” than the compliant. The probability of staying in the same state from one year to the next in the post audit period is higher for noncompliant taxpayers. Compared to the control group, both audited groups are less likely to remain in the noncompliance Markov state having once entered it, but the decrease is more pronounced for the tax change group. In this sense audits improve compliance, regardless of the outcome. On the other hand, the no-change audit group is less likely to remain in the compliance state having once entered it, even though the tax change group is more likely to do so. In this sense, the no-change audit has a perverse effect on future compliance. Exposing a seemingly compliant taxpayer to an unnecessary audit could worsen his future compliance.

In summary, after a tax change audit, the noncompliant Markov state becomes less “sticky”. After a no-change audit, the compliant Markov state becomes more “sticky”.
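The comparisons can be read directly off the diagonals of Tables 7–9. The sketch below simply tabulates each audited group's stay probability minus that of the unaudited control group; the dictionary keys and function name are illustrative.

```python
# Diagonal (stay) probabilities taken from Tables 7, 8, and 9.
stay = {
    "unaudited":  {"compliant": 0.668, "noncompliant": 0.882},
    "no_change":  {"compliant": 0.751, "noncompliant": 0.849},
    "tax_change": {"compliant": 0.602, "noncompliant": 0.775},
}

def change_vs_control(group, state):
    """Difference in stay probability relative to the unaudited control group."""
    return round(stay[group][state] - stay["unaudited"][state], 3)

differences = {
    (group, state): change_vs_control(group, state)
    for group in ("no_change", "tax_change")
    for state in ("compliant", "noncompliant")
}
```

A negative difference means the state is less “sticky” than in the control group, and a positive difference means it is more “sticky”.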

4.1.1 Sojourn distributions

Tables 10–12 show the computed probabilities for geometric distributions with the diagonal elements from the estimated transition matrices as the Bernoulli parameters. Our discussion refers to how long it takes a taxpayer to switch states. By this we mean how long it takes to make the first switch. Some taxpayers will of course switch back and forth between compliance states. The larger the one-year transition probabilities in the first row of the table, the more volatile state occupancy will be over a set period of follow-up years. In this sense, no-change audited taxpayers are less volatile than unaudited controls, and those audited with tax change are more so.

Table 10
Estimated sojourns for unaudited continuous claimants

	Probability of move from noncompliant	Cumulative probability	Probability of move from compliant	Cumulative probability
	to compliant		to non-compliant
1	11.8%	11.8%	33.2%	33.2%
2	10.4%	22.2%	22.2%	55.4%
3	9.2%	31.4%	14.8%	70.2%
4	8.1%	39.5%	9.9%	80.1%
5	7.1%	46.6%	6.6%	86.7%
6	6.3%	52.9%	4.4%	91.1%
7	5.6%	58.5%	2.9%	94.0%

Source: Form 1040 and Schedule A filings 2003–2012.
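The entries of Table 10 follow from the diagonal of Table 7 through the geometric distribution. A minimal sketch, with hypothetical function names, that also reports the first follow-up year in which at least half of the cases have made their first switch:

```python
def sojourn_table(stay_probability, years=7):
    """Rows of (year, P[first move in that year], cumulative probability)."""
    rows = []
    for n in range(1, years + 1):
        move = stay_probability ** (n - 1) * (1.0 - stay_probability)
        cumulative = 1.0 - stay_probability ** n
        rows.append((n, move, cumulative))
    return rows

def years_until_half_switched(stay_probability):
    """First year in which the cumulative switch probability reaches 0.5."""
    n = 1
    while 1.0 - stay_probability ** n < 0.5:
        n += 1
    return n

noncompliant_rows = sojourn_table(0.882)  # unaudited, starting noncompliant
compliant_rows = sojourn_table(0.668)     # unaudited, starting compliant
```

Applying `years_until_half_switched` to the stay probabilities in Tables 8 and 9 gives the corresponding figures for the two audited groups.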

Table 11

Estimated sojourns for audit with no change continuous claimants

Follow-up year	Probability of move from noncompliant to compliant	Cumulative probability	Probability of move from compliant to noncompliant	Cumulative probability
1	15.1%	15.1%	24.9%	24.9%
2	12.8%	27.9%	18.7%	43.6%
3	10.9%	38.8%	14.0%	57.6%
4	9.2%	48.0%	10.5%	68.1%
5	7.8%	55.8%	7.9%	76.0%
6	6.7%	62.5%	5.9%	81.9%
7	5.6%	68.1%	4.5%	86.4%

Source: Form 1040 and Schedule A filings 2003–2012.

When unaudited continuous EBE claimants start out as noncompliant we estimate they remain so for many years; it takes six years before at least half of them have switched to the compliant state (the cumulative probability for the sixth follow-up year is 0.529). When unaudited continuous claimants start out compliant, over half of them have switched to noncompliance within two years (the cumulative probability for the second follow-up year is 0.554). Apparently, EBE audits are necessary to maintain compliance in the group of taxpayers whose EBE claims meet IRS audit selection criteria (see Table 10).

In contrast to the unaudited controls, we estimate that over half of the no-change taxpayers who become noncompliant return to compliance within five years, rather than the six years it takes for the unaudited group.21 For no-change taxpayers who start out compliant, we estimate that it takes three years, rather than two, before over half have switched to noncompliance (see Table 11).

The tax change group has the shortest estimated sojourn times. We estimate that within three years over half (53%) of the initially noncompliant have switched to compliance, while within two years almost two thirds (64%) of the initially compliant have switched to noncompliance (see Table 12). In comparison to the unaudited, when a taxpayer’s EBE audit has led to a tax change but he continues to claim EBE, his EBE compliance is estimated to be more changeable from year to year. This is not the case for the audited but no-changed continuous EBE claimant.

Table 12

Estimated sojourns for tax change continuous claimants

Follow-up year	Probability of move from noncompliant to compliant	Cumulative probability	Probability of move from compliant to noncompliant	Cumulative probability
1	22.5%	22.5%	39.8%	39.8%
2	17.4%	39.9%	24.0%	63.8%
3	13.5%	53.5%	14.4%	78.2%
4	10.5%	63.9%	8.7%	86.9%
5	8.1%	72.0%	5.2%	92.1%
6	6.3%	78.3%	3.1%	95.2%
7	4.9%	83.2%	1.9%	97.1%

Source: Form 1040 and Schedule A filings 2003–2012.

We noticed some unexpected behavior of higher powers of the transition matrices, which consist of k-step transition probabilities, where k is the power of the one-step matrix. The two audited groups’ k-step matrices converged to very similar matrices within 16 steps, while the unaudited group’s matrix converged to a different matrix with a “stickier” noncompliant state. It is unrealistic to expect an audit impact to persist for as long as 16 years, so we would have expected all three k-step transition matrices to converge to a common matrix.22 The sojourn time distributions implied by the “remain in current state” probabilities for the three groups are given in Tables 10–12.
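For a two-state chain the limiting k-step matrix has a closed form, which gives a quick check on the convergence pattern just described. The sketch below is illustrative only: it assumes the first-year switch probabilities in Tables 10–12 are the off-diagonal elements of each group's one-step matrix, and the function names are hypothetical.

```python
def limiting_noncompliant_share(p_n_to_c, p_c_to_n):
    """Long-run probability of the noncompliant state in a two-state chain.

    Every row of P^k converges to (share, 1 - share), where
    share = p_c_to_n / (p_n_to_c + p_c_to_n).
    """
    return p_c_to_n / (p_n_to_c + p_c_to_n)


def max_gap_after(p_n_to_c, p_c_to_n, k):
    """Upper bound on the distance of any entry of P^k from its limit.

    The second eigenvalue of a two-state matrix is 1 - p_n_to_c - p_c_to_n,
    and deviations from the limit shrink by that factor at every step.
    """
    return abs(1.0 - p_n_to_c - p_c_to_n) ** k


# First-year switch probabilities read from Tables 10-12, as proportions:
# (noncompliant -> compliant, compliant -> noncompliant).
GROUPS = {
    "unaudited": (0.118, 0.332),
    "no-change audit": (0.151, 0.249),
    "tax change audit": (0.225, 0.398),
}

for name, (a, b) in GROUPS.items():
    share = limiting_noncompliant_share(a, b)
    gap = max_gap_after(a, b, 16)
    print(f"{name}: long-run noncompliant share {share:.3f}, "
          f"gap after 16 steps at most {gap:.1e}")
```

Under these inputs the two audited groups settle near 0.62 and 0.64, while the unaudited group settles near 0.74, which is consistent with the audited matrices resembling each other and the unaudited matrix having a stickier noncompliant state.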

Table 10 shows that if we follow a group of unaudited taxpayers who become newly compliant, the fitted HMM estimates that after four years we should expect 80% of them to have switched to noncompliance. Table 11 shows that for audited and no-change taxpayers we should expect it to take almost six years, whereas Table 12 shows that for audited and changed taxpayers the figure is about three years. This seems to indicate that a change audit shortens a taxpayer’s sojourn in the compliant state, contrary to what an enforcement agency would hope. But it should be remembered that a change audit is viewed as causing a fresh entry into the compliant state, whereas many no-change audits are of taxpayers who have occupied the compliant state for some time already. No-change audits also confirm for taxpayers that they or their tax professionals have interpreted the EBE claim rules properly.

4.2 HMM fits for all EBE claimants

The emissions function for cases with an EBE claim was computed based on a logistic regression fit to known compliant and noncompliant claims. When there was no EBE claim, the emissions were set to a large negative value. The fitted transitions for the three groups are given in Tables 13–15.

Table 13
Estimated transition matrix for audited with tax change EBE claimants

	No claim	Compliant	Noncompliant
No claim	0.7790	0.1530	0.0680
Compliant	0.2293	0.7324	0.0383
Noncompliant	0.1800	0.1030	0.7170

Source: Form 1040 and Schedule A filings 2003–2012.

Table 14

Estimated transition matrix for Un-audited EBE claimants

	No claim	Compliant	Noncompliant
No claim	0.7780	0.1460	0.0760
Compliant	0.1980	0.7550	0.0470
Noncompliant	0.1460	0.1090	0.7450

Source: Form 1040 and Schedule A filings 2003–2012.

In all three groups, the no-claim state is the stickiest, and the noncompliant the least sticky, but the difference is noticeable only for the tax change group (see diagonal elements of Tables 13–15). No-change taxpayers are a bit less likely to migrate into the no-claim group than the unaudited, while tax change taxpayers are more likely to do so. Migration into the compliant state is similar for all three groups (the second columns of the matrices look similar). Compared to the unaudited, it is slightly higher for the no-change group from either starting state; for the tax change group it is slightly higher from the no-claim state and slightly lower from the noncompliant state. Migration from compliant to noncompliant gets a little worse when the audit results in no tax change, and a little better when it results in a tax change, using the unaudited as a reference point (see column 3 of Tables 13–15).

We looked again at the behavior of the k-step transition matrices. This time it was the no audit and audit with tax change cases that most closely resembled each other. After 20 steps the audit no-change matrix had uniformly higher transitions to the compliant state and lower transitions to the no-claim state, compared with the no audit and tax change matrices.
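The k-step comparison can be reproduced approximately from the rounded entries of Tables 13–15. The sketch below is a plain-Python illustration under that assumption; the helper names are hypothetical, and results from rounded inputs will differ slightly from those based on the unrounded fits.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(p, k):
    """Return the k-step transition matrix P^k by repeated multiplication."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, p)
    return result


# One-step matrices from Tables 13-15.
# State order: no claim, compliant, noncompliant.
ONE_STEP = {
    "tax change": [[0.7790, 0.1530, 0.0680],
                   [0.2293, 0.7324, 0.0383],
                   [0.1800, 0.1030, 0.7170]],
    "unaudited": [[0.7780, 0.1460, 0.0760],
                  [0.1980, 0.7550, 0.0470],
                  [0.1460, 0.1090, 0.7450]],
    "no change": [[0.7980, 0.1580, 0.0440],
                  [0.1685, 0.7790, 0.0525],
                  [0.1045, 0.1175, 0.7780]],
}

TWENTY_STEP = {name: mat_power(p, 20) for name, p in ONE_STEP.items()}

for name, matrix in TWENTY_STEP.items():
    print(name)
    for row in matrix:
        print("\t".join(f"{value:.4f}" for value in row))
```

Comparing the second column (compliant) and first column (no claim) across the three printed matrices shows the pattern noted above for the no-change group.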

Table 15

Estimated transition matrix for audited with no-change EBE claimants

	No claim	Compliant	Noncompliant
No claim	0.7980	0.1580	0.0440
Compliant	0.1685	0.7790	0.0525
Noncompliant	0.1045	0.1175	0.7780

Source: Form 1040 and Schedule A filings 2003–2012.

Table 16

Three treatment groups – fitted HMM transition matrices – short data vectors

Treatment	Change audit			No audit			No-change audit
	no cl	compl	noncom	no cl	compl	noncom	no cl	compl	noncom
No claim	0.7790	0.1530	0.0680	0.7780	0.1460	0.0760	0.7980	0.1580	0.0440
Compliant	0.2293	0.7324	0.0383	0.1980	0.7550	0.0470	0.1685	0.7790	0.0525
Noncompliant	0.1800	0.1030	0.7170	0.1460	0.1090	0.7450	0.1045	0.1175	0.7780

Source: Form 1040 and Schedule A filings 2003–2012.

Table 17

Three treatment groups – fitted HMM transition matrices – long data vectors

Treatment	Change audit			No audit			No-change audit
	no cl	compl	noncom	no cl	compl	noncom	no cl	compl	noncom
No claim	0.7917	0.1439	0.0644	0.7928	0.1358	0.0714	0.8259	0.1310	0.0431
Compliant	0.2234	0.7325	0.0441	0.1930	0.7556	0.0514	0.1640	0.7779	0.0581
Noncompliant	0.1780	0.0991	0.7229	0.1415	0.1057	0.7528	0.1056	0.1127	0.7817

Source: Form 1040 and Schedule A filings 2003–2013.

Table 18

Three treatment groups – fitted HSMM – geometric sojourn

Treatment	Change audit			No audit			No-change audit
	no claim	compl	noncom	no claim	compl	noncom	no claim	compl	noncom
No claim		0.6908	0.3092		0.6553	0.3447		0.7527	0.2473
Compliant	0.8352		0.1648	0.7898		0.2102	0.7385		0.2615
Noncompliant	0.6425	0.3575		0.5725	0.4275		0.4837	0.5163
Same state	0.7917	0.7325	0.7229	0.7928	0.7556	0.7528	0.8259	0.7779	0.7817
E(sojourn)	4.801	3.739	3.609	4.826	4.091	4.045	5.742	4.502	4.580
Emission mu	-10.00	1.02	2.29	-10.00	0.88	2.29	-10.00	1.28	2.42
Emission sigma	0.0001	0.6860	0.0480	0.0001	0.5176	0.0401	0.0001	0.6538	0.0214

Source: Form 1040 and Schedule A filings 2003–2013.

Table 19

Three treatment groups – fitted HSMM – gamma sojourn

Treatment	Change audit			No audit			No-change audit
	no claim	compl	noncom	no claim	compl	noncom	no claim	compl	noncom
No claim		0.71646	0.28354		0.67874	0.32126		0.77767	0.22233
Compliant	0.85071		0.14929	0.79405		0.20595	0.76289		0.23711
Noncompliant	0.65079	0.34921		0.55820	0.44180		0.47100	0.52900
Gamma shape	1.889	2.036	2.588	1.837	2.100	2.672	1.853	2.159	2.946
Gamma scale	1.475	1.537	1.285	1.395	1.508	1.303	1.530	1.584	1.235
Expected sojourn	2.786	3.128	3.327	2.562	3.168	3.481	2.837	3.419	3.639
Emission mu	-10.00	1.03	2.28	-10.00	1.28	2.41	-10.00	0.88	2.29
Emission sigma	0.0001	0.6939	0.0489	0.0001	0.6564	0.0221	0.0001	0.5271	0.0401

Source: Form 1040 and Schedule A filings 2003–2013.

Table 20

Three treatment groups – fitted HSMM – Poisson sojourn

Treatment	Change audit			No audit			No-change audit
	no claim	compl	noncom	no claim	compl	noncom	no claim	compl	noncom
No claim		0.72410	0.27590		0.69217	0.30783		0.78484	0.21516
Compliant	0.90579		0.09421	0.86866		0.13134	0.82621		0.17379
Noncompliant	0.77236	0.22764		0.67482	0.32518		0.57013	0.42987
Poisson scale	2.507	3.396	3.502	2.062	3.576	3.658	2.559	3.663	3.478
Poisson shift	1	1	1	1	1	1	1	1	1
Expected sojourn	2.507	3.396	3.502	2.062	3.576	3.658	2.559	3.663	3.478
Emission mu	-10.00	1.08	2.28	-10.00	1.34	2.42	-10.00	0.95	2.30
Emission sigma	0.0001	0.7265	0.0509	0.0001	0.6825	0.0222	0.0001	0.5873	0.0392

Source: Form 1040 and Schedule A filings 2003–2013.

4.3 HMM fits for all EBE claimants – extended data sets

After we had fit our first set of Hidden Markov Models, another year of tax return data became available. We extended our data vectors by one year and refit the models to the same set of taxpayers’ data. Tables 16 and 17 present the estimated transition matrices from both fits in compact form. The sets of transition matrices are similar, but not identical. The main difference we noted is that the no-claim state is estimated to be a little “stickier” when we use longer data vectors. Certainly, it is the case that some taxpayers who have once claimed EBE permanently stop claiming this deduction.23 It may be that using longer and longer data vectors as we fit Hidden Markov Models would make the no-claim state look more and more sticky.

4.4 HSMM fits – extended data sets

Using data for the longer observation vectors, we explored the impact of relaxing the strict Markov assumption. We fit two kinds of Hidden Semi-Markov Models, one specifying a gamma distribution for sojourn times, the other a Poisson distribution. The fitted parameters of an HSMM omit the diagonal elements of the transition matrix and instead provide estimated parameters and expected values of the gamma or Poisson sojourns. The off-diagonal elements of each row of the transition matrix are scaled to sum to one, since they are conditional probabilities of entering a new state, given that a sojourn has come to an end. Table 18 reformulates and extends the information in Table 17, providing all of the estimated parameters of the HMMs fitted to the long data vectors and presenting each as an HSMM with a geometric sojourn. Tables 19 and 20 provide analogous estimates for the fitted gamma and Poisson HSMMs.
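The reformulation of a fitted HMM as an HSMM with geometric sojourns can be sketched as below. This is an illustration under stated assumptions, not the procedure used to build Table 18: the function name is hypothetical, and the input is the rounded change-audit matrix from Table 17.

```python
def hmm_to_geometric_hsmm(p):
    """Split a one-step transition matrix into HSMM-style pieces.

    Returns (jump, expected_sojourn). jump[i][j] is the probability of
    entering state j given that a sojourn in state i has ended, with None
    on the diagonal. expected_sojourn[i] is the mean of the geometric
    sojourn implied by the diagonal element p[i][i].
    """
    n = len(p)
    jump, expected_sojourn = [], []
    for i in range(n):
        leave = sum(p[i][j] for j in range(n) if j != i)
        jump.append([None if j == i else p[i][j] / leave for j in range(n)])
        expected_sojourn.append(1.0 / leave)
    return jump, expected_sojourn


# Change-audit matrix from Table 17.
# State order: no claim, compliant, noncompliant.
CHANGE_AUDIT = [[0.7917, 0.1439, 0.0644],
                [0.2234, 0.7325, 0.0441],
                [0.1780, 0.0991, 0.7229]]

jump, expected_sojourn = hmm_to_geometric_hsmm(CHANGE_AUDIT)
for row, mean in zip(jump, expected_sojourn):
    cells = ["" if v is None else f"{v:.4f}" for v in row]
    print("\t".join(cells), f"\tE(sojourn) = {mean:.3f}")
```

For the no-claim row this gives rescaled transitions of about 0.6908 and 0.3092 and an expected sojourn of about 4.80 years, in line with the change-audit columns of Table 18.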

Both the gamma and Poisson Semi-Markov fits have noticeably shorter expected sojourns than the geometric sojourn model (i.e. the full Markov model). The gamma fit’s estimated conditional transition probabilities are closer to the full Markov model’s than are the Poisson fit’s transitions. The estimated gamma shape parameter is however quite far from one, so the gamma sojourn is not even approximately geometric, as it was in the regular HMM. The Poisson estimated transitions are tilted toward the no-claim state, more so than the gamma’s. We noted in Section 4.3 that using longer (as opposed to shorter) observation vectors also made the full Markov model’s transitions tilt towards the no-claim state. The Poisson model fitting may be more responsive to cases where the claimant permanently stops claiming EBE.
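A short calculation shows how the gamma estimates relate to the expected sojourns and why a shape far from one matters. The sketch below uses the change-audit columns of Table 19. It relies on two standard facts, that a gamma distribution with a given shape and scale has mean shape × scale and coefficient of variation 1/√shape; the function names are hypothetical.

```python
import math


def gamma_mean(shape, scale):
    """Mean of a gamma distribution in the shape/scale parameterisation."""
    return shape * scale


def gamma_cv(shape):
    """Coefficient of variation; equals 1 for the memoryless case shape = 1."""
    return 1.0 / math.sqrt(shape)


# Change-audit estimates from Table 19: state -> (shape, scale).
CHANGE_AUDIT_GAMMA = {
    "no claim": (1.889, 1.475),
    "compliant": (2.036, 1.537),
    "noncompliant": (2.588, 1.285),
}

for state, (shape, scale) in CHANGE_AUDIT_GAMMA.items():
    print(f"{state}: expected sojourn {gamma_mean(shape, scale):.3f}, "
          f"coefficient of variation {gamma_cv(shape):.2f}")
```

All three shapes give a coefficient of variation well below one, so the fitted sojourns are less dispersed than a memoryless sojourn with the same mean would be.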

All three fits have similar emission distribution estimates, with somewhat larger values of mu for the noncompliant state, as compared to the compliant. All the models in Tables 18–20 have the same peculiar pattern in the estimated emission sigmas, which are an order of magnitude smaller for the non-compliant claims than for the compliant claims. This is troublesome, since it does not reflect how distributions of the emission function look for known compliant and noncompliant claims. The no-claim groups have tiny estimated sigmas and huge negative means, very close to the starting values we provided. Our strategy of deliberately displacing the observed emissions function values for the no-claim state to keep it distinguishable from the others seems to have worked.
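The separation between the no-claim emissions and the other two states can be checked numerically. The sketch below uses the change-audit estimates from Table 18 and treats the reported sigma as a variance, which is an assumption made here for illustration; the conclusion does not change if sigma is read as a standard deviation. The function names are hypothetical.

```python
import math


def normal_pdf(x, mu, variance):
    """Density of a normal distribution parameterised by mean and variance."""
    return (math.exp(-((x - mu) ** 2) / (2.0 * variance))
            / math.sqrt(2.0 * math.pi * variance))


# Change-audit estimates from Table 18: state -> (mu, sigma).
# Reading sigma as a variance is an assumption of this sketch.
EMISSIONS = {
    "no claim": (-10.00, 0.0001),
    "compliant": (1.02, 0.6860),
    "noncompliant": (2.29, 0.0480),
}


def most_likely_state(x):
    """State whose emission density is highest at observation x."""
    return max(EMISSIONS, key=lambda state: normal_pdf(x, *EMISSIONS[state]))


for x in (-10.0, 1.0, 2.3):
    print(x, most_likely_state(x))
```

An observation at the displaced value of -10 is attributed to the no-claim state, and observations in the range of the claim-based emission function have essentially zero density under the no-claim distribution.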

5. Conclusions

Hidden Markov and Semi-Markov Models produce estimates of post-audit EBE compliance behavior that concur to some extent with the findings of other research on audit persistence [2]. We found some evidence that undergoing a no-change audit can adversely affect a taxpayer’s compliance, but the impact is not large.

Footnotes

https://www.irs.gov/PUP/newsroom/tax%20gap%20estimates%20for%202008%20through%202010.pdf.

https://www.irs.gov/pub/irs-soi/mazur.pdf.

Unreimbursed Employee Expenses include expenses paid or incurred during the tax year for carrying on one’s trade or business of being an employee, and are considered ordinary and necessary. Included are expenses such as limited liability insurance, specialized equipment, or memberships in professional societies. See IRS Publication 529 (https://www.irs.gov/publications/p529) for a more complete discussion.

All following references to years are tax years.

We start our data vectors for the Hidden Markov Models with Tax Year 2005, to allow for lags between filing and auditing of tax returns. An audit can have no impact on future compliance until the taxpayer is aware of it. Correspondence audits have a fairly quick turnaround time, so by the time TY2005 returns must be filed, most audits of TY2003 returns will have been completed.

Returns may be selected for examination based on machine scoring. Both the audited and control groups have comparable scoring distributions.

A small percentage of such taxpayers were reaudited, in some instances several times, until their EBE claim was found to be compliant. These cases were excluded from our modeling dataset. When the repeat audit taxpayers were analyzed separately, the number of repeat audits had an approximately geometric distribution. This small set of taxpayers whose EBE compliance was repeatedly and directly observed did have a sojourn time in the noncompliant state that is at least superficially consistent with the notion that their EBE compliance operates like a two-state Markov process.

We follow the approach described in [].

The emission function for non-zero claim cases was based on a logistic regression fit to cases whose compliance was known. The emission function mean was smaller for compliant cases than for noncompliant cases. We defined emissions for no-claim cases as following a normal distribution with a very large negative mean (i.e. a mean far displaced in the direction of compliance). We also set the variance to a very low value, with the result that the emissions distributions of no-claim and non-zero claim cases had essentially no overlap.

As implemented in the R package mhsmm [4, ].

When the gamma shape parameter is set to one it reverts to the exponential. Thus, the HSMM fit could converge to a memoryless sojourn time, which would indicate that a fully Markov model might have been adequate. The Poisson has the advantage of being a discrete distribution that is under-dispersed compared to the implicit geometric distribution for sojourn time of a fully Markov model.

https://www.irs.gov/pub/irs-soi/14indescofsample.pdf.

If the reduction yields a negative number, Net Miscellaneous Deductions is set to zero. It is thus possible for a taxpayer to record an EBE amount that, because of the 2% of AGI reduction has no effect on his tax. We focus in our analysis only on effective EBE claims, that is, those where the Net Miscellaneous Deductions amount is positive.

None of these other components of Gross Miscellaneous Deductions were available for this research.

In contrast to other HMM fitting packages, mhsmm allows the user to stack short vectors for many subjects into one long vector, and to use that as the data input. Once the data are stacked in this way, some pairs of sequential entries in the long data vector correspond to a change in subjects, rather than representing sequential emissions from the same subject. The user also provides mhsmm with a list of integers that tells the package how many entries in the long data vector sequence are associated with each subject.
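The stacked layout described in this footnote can be illustrated with a small language-neutral sketch, written here in Python rather than R. The keys `x` and `N` are chosen to echo the package's data layout, but that correspondence is an assumption, and the helper names are hypothetical.

```python
def stack_sequences(sequences):
    """Concatenate per-subject sequences into one long vector.

    Returns a dict with the stacked values under "x" and the number of
    entries contributed by each subject under "N".
    """
    stacked = [value for seq in sequences for value in seq]
    lengths = [len(seq) for seq in sequences]
    return {"x": stacked, "N": lengths}


def within_subject_pairs(data):
    """Yield (previous, current) pairs that belong to the same subject.

    Pairs that straddle a subject boundary are skipped, since they do not
    represent sequential emissions from one taxpayer.
    """
    start = 0
    for length in data["N"]:
        for i in range(start + 1, start + length):
            yield data["x"][i - 1], data["x"][i]
        start += length


# Two hypothetical subjects with three and two yearly emissions.
data = stack_sequences([[1.0, 2.0, 3.0], [4.0, 5.0]])
print(data["x"], data["N"])
print(list(within_subject_pairs(data)))
```

The lengths list is what lets a fitting routine ignore the pair (3.0, 4.0), which spans the boundary between the two subjects.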

Under NRP protocols, all line items on a return are potentially subject to audit, even items that do not necessarily meet selection criteria for routine tax audits.

The final fit preserved this large displacement of the no-claim emission distribution.

This logistic regression differs from the one used for imputation.

Recall that the no-claim state is a directly observable type of compliance, hence our decision to center it to the left of the compliant nonzero claim, which is already slightly to the left of the noncompliant nonzero claim.

IRS audits are not infallible, so some small percentage of no-change taxpayers may have been non-compliant. In addition, some compliant no-change taxpayers may immediately switch to non-compliance after their audit.

To check whether our expectation of a common limiting k-step matrix was justifiable, before fitting our second set of HMMs (which allowed for a no claim state), we fit a directly observable Markov model to a two-state space, with the states being EBE claim and no EBE claim. The k-step transition matrices for the three groups took longer to converge, closer to 30 steps, and when they did it was the no audit and audit no-change groups whose matrices resembled each other. The no-claim state was stickier for the audit tax change group than for the other two groups, both in the one-step and the long-term stable k-step matrices. We thus found the behavior of very long-term transition matrices estimated by HMM somewhat contradicted by the behavior of analogous matrices fit to a fully observable claim/no-claim Markov process.

Even compliant taxpayers might do this because they have changed employers or professions.

Acknowledgments

Thank you: Barry Johnson, Director SOI, for the data and support, Stephen Klotz of SB/SE Research Group 4, for the time to do this project, and Thi Nguyen of OCA/RAAS, for portions of the analysis.

References

[1] DeBacker, Heim, Tran, and Yuskavage, Once bitten, twice shy? The lasting impact of IRS audits on individual tax reporting, 2015. Available at: http://conference.nber.org/confer/2015/PEs15/DeBacker_Heim_Tran_Yuskavage.pdf.

[2] Beer, Kasper, Kirchler, and Erard, Do audits deter future noncompliance? Evidence on self-employed taxpayers, Taxpayer Advocate Service 2015 Annual Report to Congress, Vol 2, 2016. Available at: https://www.irs.gov/pub/irs-soi/16resconkirchler.pdf.

[3] Rabiner L.R., A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 1989, 257–286.

[4] O’Connell and Højsgaard, Hidden semi-Markov models for multiple observation sequences: The mhsmm package for R. Journal of Statistical Software, 2011, 1–22.

[5] O’Connell and Højsgaard, mhsmm: Parameter estimation and prediction for hidden Markov and Semi-Markov models for data with multiple observation sequences. R package version 0.4.14, 2015.
