Abstract
Overpayment estimation using a sample of audited medical claims is a commonly used method to determine recoupment amounts. The current practice based on the central limit theorem may not be efficient for certain kinds of claims data, including skewed payment populations with partial overpayments. As an alternative, we propose a novel Bayesian inflated mixture model. We provide an analysis of the validity and efficiency of the model estimates for a number of payment populations and overpayment scenarios. In addition, learning about the parameters of the overpayment distribution with increasing sample size may provide insights for medical investigators. We present a discussion of model selection and potential modelling extensions.
Keywords
Introduction
Medical expenditures are a significant part of governmental budgets. For instance, US health care spending grew 3.6% in 2013, reaching US$2.9 trillion or US$9,255 per person, which accounts for 17.4% of the nation's gross domestic product (CMS, 2014a). It is reported by US governmental agencies that each year 3% to 10% of the overall health care spending is lost to fraud, waste and abuse (Shin et al., 2012). Medical fraud is defined as an intentional deception or misrepresentation made by a person or an entity, with the knowledge that the deception could result in some kind of unauthorized benefit (NHCAA, 2012), whereas waste and abuse differ by the level of intention and knowledge. We will use the term overpayment in reference to fraud, waste and abuse. These overpayments have direct cost implications for the government and the taxpayers. In addition, overpayments diminish the ability of the medical systems to provide quality care to the deserving patients (Anderson and Hussey, 2001). The size and complex nature of the medical data make the use of sampling and estimation methods necessary for extrapolation of the overpayments. This article proposes an overpayment model that is shown to be valid with respect to the governmental guidelines, and discusses a number of cases where it can be an efficient alternative in recovering overpayment.
In the USA, governmental medical services are mainly provided through the federal and state programmes of Medicare and Medicaid, which are administered by the Centers for Medicare & Medicaid Services (CMS). There are a number of initiatives to oversee health care spending. The 2013 annual report (OIG, 2013a), prepared jointly by the Department of Health and Human Services and the Department of Justice, gives a broad overview of the current governmental efforts against overpayments. In its five-year strategic plan of 2013, the Office of Inspector General (OIG) categorizes such efforts as identification and investigation of fraudulent activities that lead to overpayment, obtaining fair recovery amounts from wrongdoers and fraud prevention (OIG, 2013b). The following subsection discusses the use of statistical methods in medical fraud assessment, particularly overpayment estimation.
Statistical methods in medical fraud assessment
Identification of the overpayments is ideally done by domain experts via audits of the medical claims. However, comprehensive auditing is only feasible for a small number of claims and for cases where overpayment can easily be identified due to irrefutable evidence. One such example is providers who bill for dead beneficiaries. In many cases, it is impractical to identify each overpaid claim because of the complex nature and size of the medical data. This makes the use of statistical approaches a necessity in medical fraud assessment. Statistical approaches mainly include the use of data mining, sampling and estimation methods. Various data mining approaches are proposed to reveal existing patterns and flag potentially fraudulent claims; see Li et al. (2008) for a comprehensive review. This article focuses on sampling and estimation methods.
An important consideration in Medicare audits is the fair estimation of the recovery amounts. Sampling design is an important choice in this general framework. In the USA, use of probability sampling methods for medical investigations has been accepted to be a part of the legal framework since 1986. Yancey (2012) provides a comprehensive list about these legal sampling procedures and the parties involved in US governmental medical insurance programmes. Section 8.4 of CMS guidelines (CMS, 2011) lists a number of steps for the construction of valid sampling and estimation methods:
(1) Selection of the provider; (2) Selection of the period; (3) Definition of the universe, sampling unit and sampling frame; (4) Design of the sampling plan, selection of the sample; (5) Review of each sampling unit; and (6) Estimation of overpayment.
A requirement of a valid sampling design, which is listed in the fourth step, is that each unit in the sample must have a known probability of selection that is greater than 0. Simple random sampling, systematic sampling, stratified sampling and cluster sampling, or a combination of these, are listed as the most common acceptable sample designs in Medicare audits (see the relevant discussions in CMS, 2014b; OIG, 2013c; OIG, 2014a). Daniel (2011) presents the advantages and disadvantages of the various sampling designs; see also Cochran (2007) for an overview. To keep the article parsimonious and focused on the overpayment estimation that is listed as the sixth step, we use simple random sampling.
The population of interest in these procedures is usually the payment amounts to a provider, some of which result in overpayments. A payment amount associated with a claim can result in one of three outcomes when audited. A claim can be classified as completely legitimate, completely illegitimate or partially overpaid. Claims data, where each claim is either a legitimate payment or a completely illegitimate payment, are referred to as ‘all or nothing’. According to the current governmental sampling guidelines (CMS, 2001), in most situations the lower limit of a one-sided 90% confidence interval for the total overpayments should be used as the recovery (recoupment) amount from the provider under investigation. Using the lower bound allows for a conservative recovery without requiring the tight precision needed to support the point estimate, the sample mean. The sole application of the central limit theorem (CLT) is based on the assumption that the overpayment population either follows the normal distribution or that the sample size is reasonably large. However, it is very common that medical claims data exhibit skewness and non-normal behaviour, requiring large sample sizes for the valid application of the CLT.
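The CLT-based lower limit described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the guideline's prescribed computation: it uses a normal quantile, omits any finite population correction, and the function name `clt_lower_bound` and the sample values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def clt_lower_bound(sample_overpayments, population_size, confidence=0.90):
    """Lower limit of a one-sided CLT interval for the total overpayment.

    Assumptions of this sketch: normal quantile, no finite population
    correction, simple random sample of audited claims.
    """
    n = len(sample_overpayments)
    y_bar = mean(sample_overpayments)
    s = stdev(sample_overpayments)          # sample standard deviation (n - 1)
    z = NormalDist().inv_cdf(confidence)    # about 1.2816 for 90%
    return population_size * (y_bar - z * s / math.sqrt(n))


# Hypothetical audited sample: mostly zeros with a few overpayments.
sample = [0.0, 0.0, 120.0, 0.0, 80.0, 0.0, 0.0, 200.0, 0.0, 50.0]
point_estimate = 1000 * mean(sample)
lower = clt_lower_bound(sample, population_size=1000)
```

With a small, skewed sample such as this one, the standard error is large relative to the mean, so the lower limit falls well below the point estimate. This is the efficiency concern raised for CLT-based recovery amounts.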
Edwards et al. (2003) showed that methods based on the CLT may not perform well for certain kinds of overpayment populations with small sample sizes. As an alternative, they propose the ‘minimum sum method’, a non-parametric inferential method which makes use of the hypergeometric distribution and computes the respective lower bound estimates. If negative overpayments are non-existent or of negligible frequency, as is often the case, the minimum sum method is shown to be mathematically valid in that it provides a lower confidence bound for total overpayment with confidence level greater than or equal to the nominal level of 90%. A number of extensions are proposed for the minimum sum method. Ignatova and Edwards (2008) utilized it within a two-stage sampling procedure in which they use first stage (probe) samples to decide if the cost of additional sampling is justified. Gilliland and Feng (2010) provided an adaptation in order to address cases of varying payments. Gilliland and Edwards (2010) improved its efficiency via randomized lower bounds in which payment amounts are audited in equal sized packets. Edwards et al. (2015) extended their packet sampling idea by using penny samples. The minimum sum estimates are shown to be efficient in the recovery of overpayment from claims that are essentially ‘all or nothing’. Edwards et al. (2003) discussed a simple extension, the so-called q-adaptation minimum sum method, which is based on a re-definition of illegitimate payments, so that a payment is defined as illegitimate if q per cent of the payment is in error. In addition to these, standard stratified expansion (Buddhakulsomsiri and Parthanadee, 2008) and combined ratio estimators of the total are also proposed. Mohr (2005) also presented a normality based overpayment model to capture the ‘all or nothing’ billing pattern.
These methods are shown to be practically robust and provide a good coverage of lower bound estimates. However, Mohr (2005) acknowledged that they are not able to capture the heterogeneous nature of claims data and therefore may not be efficient for such cases. Edwards (2011) discussed the fact that there are also other kinds of illegitimate provider behaviours, such as overcharging, that correspond to partial overpayments. Attempts to address this issue are rare. King and Madansky (2013) explicitly modelled the overpayment percentage using a two-valued step function and a continuous exponential function. They assumed overpayment to follow a gamma distribution, and proposed a proportional stratification sampling method. Edwards et al. (2015) considered one partial overpayment pattern in the assessment of the penny sample adaptation of the minimum sum method. The beta mixture model of Ekin et al. (2015) addressed the mixed distribution characteristics of claims data. To the best of our knowledge, there is no estimation model that is shown to be valid for a number of payment and overpayment scenarios, and that can potentially be an efficient alternative in the cases of multi-modal overpayment patterns.
Motivation
In some situations, an investigator may need to consider completely legitimate, and completely and partially overpaid claims simultaneously to learn about the overpayment population. Most of the existing models lose efficiency in the presence of partial overpayments in addition to ‘all or nothing’ claims, since they do not explicitly model the characteristics of partial overpayment patterns.
Our interest is in such cases with heterogeneous medical data with multiple patterns and with a spike of zero values. These cases are not rare in medical claims data. For instance, an OIG investigation reports that ‘skilled nursing facilities’ (SNFs) billed one quarter of all claims in error in 2009, resulting in US$1.5 billion in inappropriate Medicare payments (OIG, 2012a). The majority of the claims in error were overcharged, meaning that they were billed at higher levels than warranted. In another OIG investigation, it was reported that Medicare inappropriately paid US$6.7 billion in 2010 for claims of ‘evaluation and management’ (E/M) services (OIG, 2014b), which also included cases of overcharging. Recommendations of OIG to CMS include encouraging contractors to review E/M services billed by overcharging physicians, and following up on claims for E/M services that were paid for in error. CMS agrees to reassess the effectiveness of reviewing claims billed by overcharging physicians. These are some of our motivating examples in which the claims data include partial overpayments in addition to completely legitimate claims. Our model can also be utilized when the population of interest is all claims paid to a given health care provider in a specified time frame, or individual claim lines for a particular procedure code with various modifier codes.
The existing practice includes the use of point estimates of overpayments and does not reveal the actual overpayment pattern. However, capturing and learning about the inflated and mixed characteristics of the overpayment distribution is crucial. This has recently gained more attention because overpayments have been shown to persist despite the increased recovery amounts. As pointed out by Musal (2010), the federal budgetary report of the ‘US Office of Management and Budget’ argued that the existing measures have not resulted in a decreasing trend in the overall level of health care overpayments. It should be noted that one of the ultimate goals of medical fraud assessment is to inhibit the sustained culture of medical overpayments. In a related area, there is an increasing awareness about low-level partial overpayments. For instance, a press release by CMS urges senior citizens to join the fight against fraud despite the fact that the dollar amounts involved with these claims are relatively small (CMS, 2013a).
The proposed Bayesian inflated mixture-based model links the known payment population and the information gathered from a sample of audited investigations. The initial objective of the model is to provide valid estimates that comply with the governmental guidelines. All overpayment methods are expected to provide a lower confidence bound with at least 90% confidence. Second, we investigate the efficiency of the model with respect to the recovery amount, with a focus on partial overpayments. Third, our model can also help reveal the overpayment pattern and quantify learning. It allows the decision-makers to conduct simultaneous estimation of both probability of each overpayment pattern and the percentage of partial overpayments, and can result in further insights. This can potentially have a positive impact on the culture of medical overpayments.
In Section 2, we provide a brief review of the inflated zero and one mixture models, motivate the use of the Bayesian approach and describe how our model fits within the literature. Section 3 explains our modelling framework. Section 4 provides an analysis with respect to validity and efficiency of the model, and comparisons with the current practice. Section 5 discusses model selection, whereas Section 6 presents modelling extensions. The article concludes in Section 7 with a discussion of findings and future work.
Literature review
The review provided in this section is by no means exhaustive and mainly aims to show the proposed model's fit within the literature. We provide a discussion of the existing zero-inflated models, gamma mixture models and show the relevance of Bayesian approaches for audits.
The abundance of a particular value in a dataset is a long-known and well-studied phenomenon. Models for it can be dated back to Aitchison (1955), who considers an inference problem for such a mixture population. The seminal work by Lambert (1992) discusses models for a manufacturing dataset with abundant zeroes for the number of defective items. Min and Agresti (2005) studied biomedical and sociological applications in which there is a spike of zeroes, referring to this as having zero inflation. The authors proposed Poisson and negative binomial distribution-based models with random effects for pharmaceutical and occupational injury prevention. Another such application is the zero-inflated Poisson model of Cruyff et al. (2008), which uses data from a Social Security survey to model the respondent behaviour with self-protection bias. Aside from Poisson and negative binomial distributions, the general class of zero- or one-inflated beta regression models is discussed by Ospina and Ferrari (2012). The zero-inflated gamma distribution is not as commonly applied. Feuerverger (1979) is one of the rare applications and attempts to model rainfall data.
In terms of the use of Bayesian inference, Ghosh et al. (2006) provided a description of the properties of zero-inflated regression models. Neelon et al. (2010) presented alternative zero-inflated models using Poisson and negative binomial distributions. They discussed how Bayesian models allow incorporating expert information and result in improved parameter estimation. Muralidharan (2010) gives an example that fits empirical Bayesian mixture models via the expectation maximization algorithm. On the other hand, Erosheva et al. (2007) utilized individual-level mixture models using variational approximation methods. Gamma mixture models with known and unknown numbers of distributions within the Bayesian framework are elaborated in the work of Wiper et al. (2001), which involves the estimation of the properties of a queue for email data. Webb (2000) presented a Bayesian gamma mixture model for target recognition using data from radar. In the health care context, Venturini et al. (2006) proposed gamma shape mixtures for estimation of the proportion of medical expenditures that exceed a given threshold.
The use of Bayesian models is not common in medical audits, although they are shown to provide better estimates in the domain of tax auditing. Guthrie (1989) presented a report on the use of non-standard distributions in auditing, discussing the need for mixtures of standard and degenerate distributions, which have masses at particular measurements such as zero. He compared different estimation methods used within audits, mainly focusing on the use of dollar unit sampling, which models the total population error as the product of the known payment amount and the mean tainting per dollar unit. He numerically illustrated that Bayesian methods such as those of Cox and Snell (1979) and Tsui et al. (1985) provided better lower bounds for the adjustment population compared to frequentist estimators such as the difference estimator and the separate and combined ratio estimators. Matsumura et al. (1991) proposed a multinomial Dirichlet Bayesian approach with comparisons. In a related work, Ghosh and Meeden (1997) presented an empirical Bayesian estimator of the finite population mean.
Modelling framework
This article presents a hierarchical Bayesian model for overpayment data that can belong to one of the three populations such that an overpayment observation is either zero, is the payment value or is a value between zero and the respective payment. These respectively correspond to completely legitimate, completely illegitimate and partially overpaid claims. When the overpayment is partial, it belongs to one of the mixture sub components with each having different mean values. We assume
We introduce a latent variable vector, z, to indicate the membership of the main overpayment components for all N claims. The indicator vector, z, is a categorically distributed random variable where the ith claim belongs to one of the three main components; zi ∈ {1, 2, 3}. For the ith claim, zi = 1 implies that Yi = 0, and zi = 2 implies that Yi = Xi. When zi = 3, the claim is partially overpaid, with the overpayment taking a value between zero and the respective payment; that is, 0 < Yi < Xi.
We assume the number of mixture sub components, K, to be unknown and utilize a Dirichlet process (Ferguson, 1973) within a semi-parametric framework via the reversible jump Markov chain Monte Carlo algorithm (Green, 1995) in OpenBUGS (Thomas et al., 2006). This algorithm allows us to consider U different models with differing K values and obtain the uncertainty distribution over K. In doing so, a binary vector S with size U is introduced so that the algorithm can switch between models with different values of K. The latent variable that indicates the mixture sub component membership of the ith claim is denoted as mi, and has a categorical distribution with the probability vector η of size K. The vector η has a Dirichlet prior distribution, which takes the form of a uniform distribution for the special case of K = 2. In summary, the distribution of mi in the case of K sub components is mi | η ~ Categorical(η1, ..., ηK).
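As a rough illustration of the membership layer only, the following sketch draws η from a Dirichlet prior and then draws the sub component labels mi from the corresponding categorical distribution. The all-ones concentration vector is an assumption chosen because it reduces to a uniform distribution when K = 2; the function name `draw_memberships` is hypothetical, and the sketch does not reproduce the reversible jump step over K.

```python
import numpy as np


def draw_memberships(n_claims, K, rng):
    """Draw eta ~ Dirichlet(1, ..., 1), then m_i ~ Categorical(eta).

    The all-ones concentration is an assumption of this sketch.
    Labels are returned as integers 1..K.
    """
    eta = rng.dirichlet(np.ones(K))
    m = rng.choice(np.arange(1, K + 1), size=n_claims, p=eta)
    return eta, m


rng = np.random.default_rng(0)
eta, m = draw_memberships(n_claims=50, K=2, rng=rng)
```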
In addition, we explicitly modelled the mean and standard deviation of partial overpayments. Partially overpaid claims are assumed to follow one of the K gamma distributions with mean μi and standard deviation σk. A linear equation helps model μi as:
The choice of the gamma distribution is appropriate because it is defined on the positive scale and is able to represent skewness (Wiper et al., 2001). Alternative distributions that are defined on the positive scale include the log-normal and the exponential distributions. The mean and variance of the log-normal distribution are not as easily separable and are harder to interpret than those of the gamma distribution. The exponential distribution, on the other hand, imposes a rather strict relationship: its variance is simply the square of its mean. The gamma distribution is utilized since it can be easily re-parametrized to accommodate a flexible distribution with separate mean and variance terms.
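The re-parametrization mentioned above follows from the standard gamma moments: with shape a and rate b, the mean is a/b and the variance is a/b². Solving for a and b given a mean and a standard deviation gives the mapping below. This is a sketch of that textbook identity, not the OpenBUGS code used for the model; the function name and the numeric values are illustrative.

```python
import numpy as np


def gamma_shape_rate(mean, sd):
    """Convert (mean, sd) to gamma (shape, rate): shape = (mean/sd)^2, rate = mean/sd^2."""
    shape = (mean / sd) ** 2
    rate = mean / sd ** 2
    return shape, rate


shape, rate = gamma_shape_rate(mean=40.0, sd=10.0)

# NumPy parametrizes the gamma by shape and scale, where scale = 1 / rate.
rng = np.random.default_rng(1)
draws = rng.gamma(shape, 1.0 / rate, size=200_000)
```

Setting the standard deviation equal to the mean gives a shape of 1, which recovers the exponential distribution and its fixed mean-variance relationship.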
Having defined μi and specified σ as
Given the samples
This section presents the application of the proposed model with a number of real world payment populations and various overpayment scenarios. Overpayment scenarios are constructed using a variety of probability vectors in order to assess the model versatility. First, we describe the payment populations and explain the construction of the overpayment scenarios. Then we discuss the model specifications with a focus on prior selection. This is followed by the analysis with respect to validity and efficiency of the model as well as the evaluation of learning aspect of the method.
Data
We use four payment populations that represent characteristics of real world claims data. Table 1 lists the descriptive statistics and Figure 1 displays the box plots for these four payment populations.
Descriptive statistics of the payment populations
Box-plots of the payment populations
The first two populations are replicated using the motorized wheelchairs claims data of Edwards et al. (2003). The first population has left skewness with low standard deviation and an inter-quartile range (IQR) value of zero, whereas the second population has a higher standard deviation and range, with strong separation from 0. The third population corresponds to the case where all payment values are the same. The fourth represents a population with mixed characteristics, and it has a higher range and standard deviation. Payment data for the fourth population are retrieved from the servers of CMS (CMS, 2013b). They correspond to claims from the 2008 outpatient procedures file that are billed for the procedure code ‘J9041’ (injection of bortezomib 0.1 mg). This procedure is selected because it was identified as having frequent overpaid billings in recent investigations (OIG, 2012b; Noridian Healthcare, 2015; Youngstrom, 2015).
The density plot in Figure 2 shows the skewness of the payment distribution with multiple local maxima.
Density plot of the fourth payment population
We consider a total of 13 overpayment scenarios to evaluate a variety of patterns. Table 2 reports the values for the joint probability vectors,
Joint probability vectors for each overpayment scenario
Scenarios 1–10 represent populations in which each claim is either completely legitimate or completely illegitimate, the so-called ‘all or nothing’ case. The analyses in the literature have focused on these cases, for example Edwards et al. (2003) and Mohr (2005). On the other hand, scenarios 11–13 include partial overpayments in addition to completely legitimate and completely illegitimate claims. It is assumed that there are two partial overpayment sub populations in addition to the spikes that represent completely legitimate and completely illegitimate claims. These two partial overpayment sub populations are simulated randomly using two Beta distributions, Beta(0.15, 0.85) and Beta(0.85, 0.15), which result in mean overpayment percentages of 0.15 and 0.85, respectively. Specifically, scenario 11 represents the pattern in which the first partial overpayment pattern has a higher probability than the second, whereas in scenario 12 the second partial overpayment pattern has the higher probability. Scenario 11 also represents a population with a higher probability of overpayment compared to no overpayment. In scenario 12, these probabilities are equal to each other. Scenario 13 represents a case in which the majority of the population consists of completely legitimate claims.
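A scenario of the kind described above can be simulated along the following lines. This is a minimal sketch: the payment values are synthetic, the probability vector is hypothetical and is not taken from Table 2, and the function name `simulate_overpayments` is our own. Only the two Beta distributions come from the text.

```python
import numpy as np


def simulate_overpayments(payments, probs, rng):
    """Assign each payment to one of four groups and return its overpayment.

    probs = (completely legitimate, completely illegitimate,
             partial pattern 1, partial pattern 2).
    """
    payments = np.asarray(payments, dtype=float)
    n = payments.size
    group = rng.choice(4, size=n, p=probs)

    fraction = np.zeros(n)                  # group 0: no overpayment
    fraction[group == 1] = 1.0              # group 1: whole payment is overpaid
    first = group == 2
    second = group == 3
    fraction[first] = rng.beta(0.15, 0.85, size=first.sum())    # mean 0.15
    fraction[second] = rng.beta(0.85, 0.15, size=second.sum())  # mean 0.85
    return fraction * payments, group


rng = np.random.default_rng(4)
payments = rng.uniform(50.0, 500.0, size=1000)   # synthetic, not real claims data
probs = [0.25, 0.25, 0.30, 0.20]                 # hypothetical, not from Table 2
overpayments, group = simulate_overpayments(payments, probs, rng)
```

Because both Beta shape parameters are below one, the simulated overpayment percentages within each partial pattern pile up near 0 and 1 rather than near their means.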
This subsection describes the details of the model specification with a focus on prior selection. The prior selection in this article reflects a relative lack of knowledge on the part of the modeller about overpayments. The choice of weakly informative prior values lets the data drive the learning process about the posterior distribution of parameters.
It is assumed that the modeller lacks knowledge of the frequency of main components and mixture sub components, and the respective probability vectors of π and η. The probability vector,
The prior for
We set the hyper parameters for
Analysis
This subsection presents a discussion of the validity of the model estimates with respect to the current governmental guidelines. Then it proceeds to provide an efficiency comparison with the basic use of the CLT. Lastly, an evaluation of the learning aspect of the proposed method is presented. The proposed model is referred to as M.1, and the CLT approach that uses the sample to retrieve the relevant estimates is referred to as M.2.
The proposed model is run for a sample size of 50 for each payment population and overpayment scenario listed in Table 2. Markov chain Monte Carlo (MCMC) simulation is conducted using the algorithm of Thomas et al. (2006) within the OpenBUGS software, and the resulting estimates are analyzed using the R software (R Core Team, 2014). The MCMC simulations are run for 3 independent chains, where 20 000 samples from these chains are analyzed after discarding the first 200 000 samples as burn-in. The Brooks–Gelman–Rubin (BGR) statistic (Brooks and Gelman, 1998), computed via the R coda package of Plummer et al. (2006), is utilized to assess convergence. The chains are judged to have practically converged when BGR statistics are less than 1.05 for all parameters.
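To make the convergence check concrete, the sketch below computes the basic potential scale reduction factor for a single parameter from several chains. It is an illustrative Python version of the textbook formula, not the computation performed by the coda package, which applies further corrections; the function name and the synthetic chains are assumptions of the example.

```python
import numpy as np


def potential_scale_reduction(chains):
    """Basic Gelman-Rubin statistic for one parameter.

    chains: array of shape (m, n) holding m chains of n post-burn-in draws.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()    # average within-chain variance
    B = n * chain_means.var(ddof=1)          # between-chain variance
    var_hat = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_hat / W)


rng = np.random.default_rng(2)
mixed = rng.normal(size=(3, 5000))                    # three chains, same target
stuck = mixed + np.array([[0.0], [2.0], [4.0]])       # chains stuck at different levels

r_mixed = potential_scale_reduction(mixed)
r_stuck = potential_scale_reduction(stuck)
```

Chains that explore the same distribution give a value close to 1 and pass the 1.05 threshold, while chains centred at different levels give a value well above it.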
The posterior overpayment mean,
Next, we present the computation of the lower bound of a one-sided 90% confidence interval of the overpayment,
Validity assessment
In order to assess the validity of the methods, average coverage probabilities are computed for both models. For a given number of simulation replications, the proportion of replications in which the sample mean overpayment value is greater than the lower 90% bound provides the estimated average coverage probability.
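The coverage computation can be sketched as a simple proportion over replications. In the example below the reference value is the known mean of a synthetic ‘all or nothing’ population and the bound is a CLT-style lower limit for the mean; both choices, along with the function name `coverage`, are assumptions made for illustration and do not reproduce the simulation design of this article.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def coverage(lower_bounds, reference):
    """Share of replications whose lower bound does not exceed the reference value."""
    return sum(lb <= reference for lb in lower_bounds) / len(lower_bounds)


# Synthetic population: half of the claims fully overpaid (100), half legitimate (0).
population = [100.0] * 1000 + [0.0] * 1000
true_mean = mean(population)
z = NormalDist().inv_cdf(0.90)

random.seed(3)
bounds = []
for _ in range(2000):
    sample = random.sample(population, 50)               # simple random sample
    bounds.append(mean(sample) - z * stdev(sample) / math.sqrt(50))

estimated_coverage = coverage(bounds, true_mean)
```

An estimated coverage at or above 0.90 is consistent with a valid 90% lower bound, whereas values clearly below 0.90 indicate that the bound is too aggressive for that population.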
Figure 3 provides the average coverage probabilities for the proposed model compared to the nominal 90% confidence level, which is represented as a dashed horizontal line. For payment populations 1 and 2, the proposed model is found to be valid for all cases other than scenario 2. In fact, our lower bound estimates for most scenarios are more conservative than necessary. In the two exceptions, the model results in average coverage probabilities of 82% and 86%. This can potentially be explained by the relatively high standard deviation due to the low overpayment rate and lack of learning. CLT-based methods are shown to have coverage levels that are lower than the nominal 90% level for some populations, including the ones with a high overpayment rate (Edwards et al., 2003). For such cases, the proposed model is found to be conservative for many populations of interest.
Average coverage probabilities for overpayment scenarios 1–10
Next, we analyze the validity with respect to the overpayment scenarios 11–13 that have partial overpayments. Table 3 lists the average coverage probabilities which show evidence for validity for the model.
Average Coverage Probabilities for Overpayment Scenarios 11-13
In the case of scenarios 12 and 13, for all payment populations the average coverage probabilities are found to be higher than or very close to the nominal level. This provides evidence for the validity of the model. However, for scenario 11 the model resulted in average coverage probabilities of 80% and 81% for payment populations 1 and 3, respectively. This can be potentially explained by the existence of 75% overpaid claims, a third of which have partial overpayments.
Next, we explore the efficiency of the proposed model compared to the CLT-based method. Particularly, we compare the overpayment recovery estimates of the proposed models
Figure 4 presents the MAPE values for scenarios 1–10 and each of the four payment populations for the proposed model. For M.1, the overall average efficiency is found to increase (MAPE values decrease) for the scenarios that have high overpayment rates.
Mean absolute percentage error for overpayment scenarios 1–10
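For reference, the mean absolute percentage error in its standard form can be computed as below. The roles given here to the actual and estimated values (a known total overpayment and the recovery estimates from repeated samples) are an assumption of this sketch, and the numbers are made up.

```python
def mape(actual, estimated):
    """Mean absolute percentage error, in per cent (standard definition)."""
    terms = [abs(a - e) / abs(a) for a, e in zip(actual, estimated)]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical values: a known total overpayment and three recovery estimates.
true_total = 5000.0
recovery_estimates = [4000.0, 4500.0, 3500.0]
error = mape([true_total] * len(recovery_estimates), recovery_estimates)
```

Lower values indicate recovery amounts that are closer to the true overpayment, which is the sense in which a method is called more efficient here.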
Table 4 lists the MAPE values of both methods for scenarios 11–13. In general, these values imply that the proposed model outperforms the CLT-based method when partial overpayments exist.
In the cases of scenarios 11 and 12, M.1 provides better estimation compared to M.2 for all payment populations. However, the advantage of M.1 is small for populations 1 and 2, while the two methods perform the same for the third population. For the fourth payment population with mixed distribution characteristics, the proposed model provides highly superior efficiency compared to M.2.
Mean Absolute Percentage Error for Overpayment Scenarios 11–13
Despite the increase in efficiency, the errors are still relatively significant from the modelling perspective. Therefore, we also provide a discussion of a couple of modelling extensions in Section 6.
When we repeat the analysis with a number of sample sizes (n = 25, n = 50, n = 75 and n = 100), there are no significant changes in the results. As expected, the efficiency of both methods improves with sample size due to better estimation. However, the patterns stay similar.
This article focuses on the validity and efficiency of the Bayesian posterior estimates to be used as the recovery amount in medical audits. In addition, Bayesian estimation and inference also provide probability interpretations of quantities of interest such as hypotheses, intervals of parameters, membership of a subject and model selection (Jackman, 2009). The proposed model can help the modeller to quantify the learning about a certain overpayment characteristic. For instance, the changes in the posterior distributions of
In order to illustrate this trait of the model, a new overpayment scenario is constructed with
Descriptive Statistics of the sample with sample sizes 25 and 100
Parameter estimates of M.1 for Scenario 5 with Sample Sizes 25 and 100
Table 6 reports the posterior descriptive statistics of the parameters for samples with size of 25 and 100 respectively. In addition to the changes in the posterior means, the standard deviation values of the parameters decrease with an increase in sample size, providing evidence of learning about the parameters with additional data. This is further illustrated via Figure 5 that shows the posterior distributions of the parameters. The differences between the prior and posterior distributions indicate the extent of learning about the uncertainty of these parameters. Especially in the cases of heterogeneous and large size of claims data, investigators may want to use probe samples to have an initial validation on their hypotheses. Ignatova and Edwards (2008) presented such a sampling plan called 30-6-3, where 30 payments are randomly sampled and the first 6 are examined. If at least 3 of them are found completely illegitimate, the rest of the sample is examined.
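The 30-6-3 probe plan summarized above amounts to a short decision rule, sketched here. The audit itself is represented by a hypothetical callback, `is_fully_illegitimate`, standing in for the expert review of a payment; the function name and the example identifiers are likewise illustrative.

```python
import random


def probe_30_6_3(payment_ids, is_fully_illegitimate, rng):
    """Draw 30 payments at random, audit the first 6, and decide whether to go on.

    Returns (sample, expand): expand is True when at least 3 of the 6 probe
    payments are found completely illegitimate, so the remaining 24 get audited.
    """
    sample = rng.sample(list(payment_ids), 30)
    probe = sample[:6]
    hits = sum(1 for pid in probe if is_fully_illegitimate(pid))
    return sample, hits >= 3


rng = random.Random(5)
ids = range(1, 501)                                   # hypothetical payment identifiers
sample, expand = probe_30_6_3(ids, lambda pid: pid % 2 == 0, rng)
```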
The evidence from learning can also be used for model validation. The change between the prior and posterior distributions, that is, learning, can help the modeller to validate the prior choice, mixture size and sample size. Figure 5 indicates that there is evidence of learning for π and ρ3. Whereas the negligible change in distributions of
Density plots of posterior distributions
Edwards et al. (2015) suggested that model-based approaches may be vulnerable to abuse and that they are open to arguments in a legal proceeding. In addition, they point out unethical practices by some government contractors that are aimed at increasing recoupment in their extrapolations. This section provides suggestions and remarks about the use of our model with a focus on model selection.
Model uncertainty and selection are important issues in statistics (Hoeting et al., 1999). For instance, one can argue that medical overpayments are already complex as they stand, so simple and parsimonious models might be preferred. However, in some cases the heterogeneous nature of the claims data results in inefficiencies, which motivates the need for more complex models. The proposed method is such a model: it considers zero and one inflation as well as mixtures. It is shown to be an efficient alternative in the case of partial overpayments. However, our model may not be preferred in a number of cases because of its complexity. For instance, when the sample contains no observations of partial overpayments, our model may be more complicated than necessary. For such ‘all or nothing’ cases, we instead recommend that auditors use other approaches based on the CLT or the hypergeometric likelihood. Development of a more comprehensive method based on model averaging (Hoeting et al., 1999) is beyond the scope of this article.
Another important matter in a Bayesian model is the choice of prior distributions. As Lindley notes in section 8 of his seminal paper (Lindley, 2000), the prior distribution is the quantification of a researcher's uncertainty over model parameters, which may have explicit physical interpretations. This lets auditors utilize the available knowledge via elicitation of priors from experts. In a related example, Matsumura et al. (1991) used an informative prior that places more weight on zero and full overpayments. However, this should be used with caution, since it can certainly be argued that an analyst may unethically select priors that lead to his/her desired outcome. As a check and balance, the prior and posterior distributions can be compared to assess the impact of the data and whether learning has occurred. Nevertheless, in this article, we recommend the use of weakly informative prior distributions. For instance, the prior over π quantifies our relative lack of knowledge about the probability of overpayment of a given claim. As suggested by Gelman (2009), weakly informative prior values are used to allow the data to drive the learning process about the posterior distribution of the parameters.
Edwards et al. (2015) also pointed out that Bayesian approaches may be abused in that the provider might be asked for recoupment even though the sample shows no evidence of impropriety. Our computational results show that the proposed model with weakly informative priors is valid for such cases. In computing the 90% lower confidence bound, the small but positive posterior mean (0.01) is offset by the posterior standard deviation and the use of the lower bound of the 90% confidence interval. Furthermore, the probability that no impropriety exists in a randomly drawn claim is computed to be higher than 99%. This indicates a conservative result, which provides evidence for the fair treatment of the provider. Since we assume non-negative overpayments, potentially negative lower bound values are truncated to 0 as the recovery amount. Therefore, a case with a sample of all legitimate claims should be dismissed, since the provider can be assumed to be innocent beyond a reasonable doubt. To give a concrete example, suppose we are provided with a sample of 25 claims, all of which are found to be legitimate. We assume that the prior for π is weakly informative, following a Dirichlet distribution with parameters {0.01, 0.01, 0.01}. This prior distribution represents our relative lack of knowledge about zero, full and partial overpayments, and it implies that these overpayment categories are equally likely. Due to the conjugacy of the Dirichlet and categorical distributions, the posterior probabilities are
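The conjugate update in this example can be checked numerically. The sketch below uses the standard Dirichlet-categorical result that the posterior is again Dirichlet, with each prior parameter increased by the observed count in that category. The function name `dirichlet_posterior` and the category ordering (zero, full, partial) are assumptions made for illustration, and the posterior mean is used as the probability that a randomly drawn claim falls in each category.

```python
def dirichlet_posterior(prior_alpha, counts):
    """Conjugate Dirichlet-categorical update (illustrative helper).

    Returns the posterior Dirichlet parameters and the posterior mean
    probability of each category.
    """
    if len(prior_alpha) != len(counts):
        raise ValueError("prior_alpha and counts must have the same length")
    post_alpha = [a + c for a, c in zip(prior_alpha, counts)]
    total = sum(post_alpha)
    post_mean = [a / total for a in post_alpha]
    return post_alpha, post_mean


# Category order assumed here: zero, full, partial overpayment.
prior_alpha = [0.01, 0.01, 0.01]   # weakly informative prior from the text
counts = [25, 0, 0]                # all 25 sampled claims are legitimate

post_alpha, post_mean = dirichlet_posterior(prior_alpha, counts)
print(post_alpha)   # posterior Dirichlet parameters
print(post_mean)    # posterior mean probability per category
```

Under these assumptions the posterior is Dirichlet(25.01, 0.01, 0.01), so the posterior mean probability of a zero overpayment is 25.01/25.03, roughly 0.9992, which is consistent with the "higher than 99%" figure quoted above.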
Modelling extensions
This section presents two extensions of the proposed model, and provides brief, but not comprehensive, assessments of validity and efficiency.
Zero-one inflated finite mixture model
The proposed model with an unknown number of mixtures may be unnecessarily complex if the auditor has enough evidence about the number of partial overpayment patterns. We introduce a simpler zero-one inflated model with a known number of mixture sub components and refer to it as M.3. For the sake of brevity, we do not present a comprehensive validity and efficiency analysis for these extensions. The results for payment population 4 are presented. We have chosen scenario 4 as a representative of the overpayment scenarios without any partial overpayments and consider overpayment scenarios 4, 11, 12 and 13.
The average coverage probabilities with respect to the scenarios 4, 11, 12 and 13 for a sample size of 50 are computed as 90%, 95%, 96% and 95%, respectively. Table 7 lists the MAPE values of recovery estimates for sample sizes of 25, 50, 75 and 100. As expected, the MAPE values for both models improve and become smaller for increasing sample sizes. M.3 outperforms the CLT-based model (M.2) for the first two scenarios based on the smaller MAPE values. The MAPE results are comparable for the third and fourth scenarios.
MAPE of the Overpayment Estimates
In addition to the magnitude, an analyst may also be interested in the direction of the estimation errors. This can be measured by the mean percentage error (MPE) values. We compute MPE with T = 100 replications using the equations
Positive MPE values indicate that the overpayment is underestimated, whereas negative MPE values correspond to cases where the provider may unfairly be asked to pay back more than the actual overpayment amount. In order to have conservative recoupment demands from providers, we would prefer the MPE values to be positive rather than negative for the same magnitude of error. This is similar to the reasoning that leads to the recommended use of conservative estimates such as the lower bound of the 90% confidence interval. We note that ideally the MPE values would be as close to zero as possible, which would provide evidence for the unbiasedness of the estimates.
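Because the MPE equations are not reproduced in this text, the following Python sketch assumes the usual definition with the sign convention described above: each replication's error is (actual − estimate) / actual, expressed as a percentage, so that positive values mean underestimation. The function names and the toy numbers are hypothetical, and the MAPE is included only as the absolute-value counterpart of the same errors.

```python
def percentage_errors(actual, estimates):
    """Signed percentage error of each replication.

    Assumed convention: positive when the estimate falls below the
    actual overpayment (underestimation), negative when it exceeds it.
    """
    if actual == 0:
        raise ValueError("actual overpayment must be non-zero")
    return [100.0 * (actual - est) / actual for est in estimates]


def mpe(actual, estimates):
    """Mean percentage error over all replications."""
    errors = percentage_errors(actual, estimates)
    return sum(errors) / len(errors)


def mape(actual, estimates):
    """Mean absolute percentage error over all replications."""
    errors = percentage_errors(actual, estimates)
    return sum(abs(e) for e in errors) / len(errors)


# Toy usage with made-up numbers: true overpayment of 1000 and three
# replicated estimates (the article uses T = 100 replications).
toy_estimates = [900.0, 1100.0, 800.0]
print(mpe(1000.0, toy_estimates), mape(1000.0, toy_estimates))
```

With this convention, two estimators with the same MAPE can still be distinguished by the sign of their MPE, which is the basis for the preference for positive over negative errors stated above.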
Figure 6 presents the box-plots of MPE values for both models when the sample size is 50 for all 4 overpayment scenarios. The long horizontal lines that accompany each box-plot in Figure 6 correspond to 0. For scenario 4, the proposed model has lower positive median values and a lower inter-quartile range (IQR), and therefore performs better. For scenarios 11–13, the median values are comparable, although the variance and IQR of the proposed model are lower than those of M.2. In general, it can be argued that the smaller number of negative MPE values for M.3 indicates that the proposed model is more conservative than M.2. Overall, it can be suggested that even the simpler model M.3 provides estimates that are at least as efficient as those of M.2, if not better.
Mean percentage errors (MPEs) of M.1 and M.2 for all 4 scenarios
Next, we explore the feasibility of a modelling extension of the proposed model that utilizes covariate information. The main difference is the explicit modelling of the inverse mean overpayment ratio
We have selected the data of payment population 4 and scenario 13 for demonstration. The main characteristic of scenario 13 is the limited number of claims with partial overpayments in the sample. For instance, in a sample of size 50, the expected number of partial overpayments is 7.5. This results in poor estimation and a lack of learning about the model parameters. As a potential remedy, we utilize a binary covariate, such as provider type Xp, while modelling partial overpayments. Table 8 presents the descriptive statistics of payment and overpayment values of different components and the covariate for the sample of size 50.
Descriptive statistics of the payment(overpayment)
Table 9 displays the descriptive statistics of the posterior parameters. The maximum number of mixture sub components U is assumed to be 8, since there are 8 partially overpaid claims in this particular sample. As can be seen in the table, K, the number of mixture sub components, has a median of 2, which corresponds to two partial overpayment patterns. Although η is a vector of length 8, we only report the descriptive statistics of the top 3 elements of η, since they cumulatively account for 91% of the mixture sub component probabilities. The standard deviations of the mixture sub components are also reported, where σk is the standard deviation of the k’th mixture sub component. The elements of σ do not differ much from each other; however, considering the posterior value of K and the median differences between σ3 and σ4, we can infer that there are two sub-mixture populations.
Descriptive Statistics of Posterior Parameters
We obtained the estimates of ρ as 0.53 and 0.16 for the cases of
This article proposes a Bayesian zero–one-inflated mixture-based overpayment estimation model. First, we show the validity of the model with respect to the governmental guidelines. Second, we show that the model is efficient for some overpayment scenarios, including the ones with partial overpayments. In addition, it can describe the uncertainty about the overpayment population. The learning that takes place with increasing sample size can be used by investigators to improve their understanding of a given claims dataset. For instance, the proposed model can be valuable for OIG when asked to investigate potentially fraudulent activities in a hospital that has submitted claims within many different domains. Methodologically, this is one of the rare applications of gamma-inflated mixture models in the literature. We also present a discussion of model selection and discuss potential extensions.
We should emphasize that our approach requires explicit modelling of the overpayment probabilities and percentages, and that it comes at an additional computational cost and requires some statistical knowledge. Therefore, we recommend that the proposed model be used for claims data with multi-modal overpayment patterns, for which the existing models may not be as efficient. One barrier to application can be model specification, since medical auditors are relatively unfamiliar with Bayesian modelling. This article makes the prior selection with an assumption of relative lack of knowledge on the part of the modeller about overpayments. Although the use of Bayesian approaches is not common in medical audits, we believe this is a modest but crucial step towards representing overpayments better. Achieving MCMC convergence can also be an issue, especially for cases with a high number of partial overpayment patterns. We recognize that for scenarios with many legitimate claims, the model may not be expected to perform well. The main reason is that little learning occurs because of the lack of non-zero overpayments. We have presented modelling extensions as potential remedies. The model fit can potentially be improved by considering other covariates, such as the monetary amount of the billing.
It would be of interest to incorporate the model within a dynamic sampling decision making procedure by utilizing posterior predictive distributions. As future research, a number of modelling extensions can be proposed. For instance, use of ordered Dirichlet parameters can improve the estimation of partial overpayment mean values and can resolve potential identifiability issues associated with the use of latent variables. Clustering algorithms can be considered to segment types of claims as part of the modelling framework. Lastly, the use of the Bayesian inflated mixture models in the domain of tax auditing may also be an interesting research topic.
