Abstract
Poor performance of on-site wastewater treatment systems (OWTS) poses local and regional risks to public health and environmental quality. In the United States, local regulations control system design through permits issued at the time of installation. However, regulatory focus on one-time controls does not account for factors that influence performance after installation, notably asset management choices made by residential property owners. We develop a statistical method to predict performance over the OWTS life cycle to identify vulnerabilities and potential controls that reduce the risk of failure and contaminant release. A regression model based on Generalized Additive Models for location, scale and shape uses data from public records of reported OWTS failures, repairs and replacements, inspections, and assessed property values from Boulder County, Colorado, which has 14,300 OWTS. Severity of required system repairs and replacements over a 40-year period was associated with five factors: structural value, house square footage, number of required inspections, homeowner expenditures, and frequency of OWTS upgrades. Model results suggested that mandatory inspections through a mechanism such as renewable permits would significantly reduce life cycle repair/failure frequency and severity, lowering OWTS costs to owners and reducing public exposure to wastewater contaminants.
Introduction
Publicly owned treatment works (POTWs) and OWTS have a historical relationship. As the association of fecal waste with outbreaks of cholera and other water-borne diseases became known in the latter half of the nineteenth century, unregulated on-site systems for collection, passive treatment, and storage using privies, vaults, and cesspools were replaced by centralized sewer systems better able to handle increasing wastewater flows as residential piped water supply and use of flush toilets became widespread (Cosgrove, 1909; Burian et al., 2000). As a result, centralization and modernization of wastewater treatment have been linked, and while the technologies for on-site systems have improved, scientific knowledge, tradition, and prevailing public opinion lead many engineers, public health officials, policy makers, and the general public to favor centralized infrastructure over owner/user operated systems (EPA, 1997, 2002b; Burian et al., 2000; Etnier et al., 2005, 2007). Because of this history, OWTS may be inaccurately associated with isolated rural residences or considered a temporary solution to be replaced by centralized collection and treatment.
In fact, reliance on OWTS to provide a significant portion of global sanitation is growing, especially in regions where no sanitation infrastructure exists. Currently, decentralized, typically user-owned and maintained OWTS serve ∼25% of the population in the United States (EPA, 2003, 2008a), with over 25 million OWTS installed in suburban, high population density areas (EPA, 2008a). Due to continued land development outside existing collection system boundaries, limited treatment plant capacity, and the cost of new centralized wastewater infrastructure, ∼30% of new residential developments in the United States are constructed with septic systems (EPA, 1997, 2002a, 2002b, 2003). In 2005, it was estimated that OWTS served ∼26 million homes, many in urbanized communities, discharging four billion gallons of effluent per day (U.S. Census Bureau, 2006). The City of Los Angeles, for instance, has over 11,500 residential OWTS, many of which are located near impaired water bodies with elevated levels of nitrate and coliform indicator bacteria (“LA Sewers,” 2013).
However, unlike centralized facilities operating with National Pollutant Discharge Elimination System (NPDES) permits, the performance of OWTS is not routinely documented, since monitoring of operations or discharged water quality is not required. The U.S. EPA estimates that 10% to 20% of OWTS do not treat wastewater to acceptable levels (EPA, 2003), and some states estimate failure rates to be as high as 50% (EPA, 2002b). While one individual system failure may not pose a public health threat because the impacts are localized, the aggregate contaminant release from a cluster of poorly performing OWTS can have negative local and watershed-scale consequences. A survey of state water quality agencies ranked OWTS as the third greatest threat to groundwater quality, behind underground storage tanks and landfills (EPA, 1998).
Part of the challenge of ensuring OWTS functionality stems from their character as impure public goods, which have attributes of both common property resources associated with public goods, such as environmental quality, and open access resources (OARs), which have undefined ownership and are often privatized. As an OAR, an OWTS creates externalities that generate public costs associated with environmental degradation or threats to public health (Turvey, 1963; Hardin, 1968). Without controls, the private responsibility of OWTS operation and maintenance required for the protection of the public resource, that is, the environment, creates an incentive to opt out of potentially costly maintenance activities, since opting in may offer no immediate benefit to the individual owner/user and opting out is seen as having a small or even undetected impact. Public Goods Theory implies that some regulation of the private right to use the environment as a waste sink is necessary to prevent the general degradation of environmental quality (Mohamed, 2009).
A comprehensive review of state-based OWTS regulations conducted by D'Amato et al. (2008) found that while most states regulate OWTS, regulations are by no means uniform. Furthermore, those states that have more extensive controls focus on installation, siting, and design factors contributing to OWTS failure, which have been widely studied. Existing OWTS performance models incorporate spatial, topographical, hydrologic, and other physical characteristics to identify locations where installing an OWTS may result in a high risk of failure (Hudson, 1986; Joubert et al., 2004; Cromer, 1999a, 1999b; Brown and Root Services, 2001; Kenway et al., 2001; Chen and Herr, 2002; McGuinness and Martens, 2003; Carroll et al., 2006; Oosting and Joy, 2011). However, a significant fraction of properly designed OWTS fail, indicating that technology-based design standards are not the only factors influencing system performance over what is presumed to be a very long life cycle on the order of 30 years or more (EPA, 2002b). OWTS may be designed and operated in a way that optimizes treatment performance and decreases both the frequency and resulting reparative costs of OWTS malfunctions, ultimately improving the overall system performance life (McKinley and Siegrist, 2011).
The premise of this research is that reliance on design-based standards is insufficient to ensure OWTS performance over the system life and that the role of factors associated with system maintenance, owner knowledge, and usage must be considered. Moreover, as demand for use of increasingly sophisticated OWTS technologies for removal of nutrients or emerging trace contaminants grows, the number and complexity of owner-dependent operations will likely increase. Increasing dependence on OWTS as a global sanitation strategy underscores the need to understand how human/social variables such as organization, user motivation, and knowledge influence operation of OWTS (Kaminsky and Javernick-Will, 2013).
Statistical modeling has been applied only recently to performance-based diagnostics of wastewater systems. Multivariate regression analysis using Generalized Linear Models (GLM) has been used to model treatment system response to water quality and wastewater variables (e.g., Weirich et al., 2011, 2015; Towler et al., 2013), adapting an approach used more widely in stochastic weather generation (e.g., Furrer and Katz, 2007; Kleiber et al., 2012, 2013; Verdin et al., 2015). Weirich et al. (2011) used GLM to predict the likelihood of POTW compliance with NPDES permit discharge limits using Discharge Monitoring Report data in the U.S. EPA Integrated Compliance Information System (ICIS) (EPA, 2008b). Applying a similar inductive statistical approach to characterize OWTS performance would allow us to observe what practices, if any, differentiate well- and poor-performing OWTS based on the historical performance of real systems over a defined life.
Objectives
The objectives of this study were to (1) identify user-associated factors that affect lifetime OWTS performance and (2) develop a predictive model to guide effective management of these systems by owners, environmental and public health agencies, and servicers.
Since no public data analogous to ICIS are available for on-site systems, the first component of this study is acquisition of data from OWTS permit and tax assessment records. In the absence of effluent quality data, information on costs associated with inspection, maintenance, repair, and replacement of OWTS components, which are public records, serves as a surrogate measure of poor performance. It should be noted that modeled outcomes based on this type of data also may provide highly communicable cost/benefit information to stakeholders, especially OWTS owners.
Because OWTS records are typically collected and maintained by county health departments, Boulder County, Colorado, was chosen as the site for data collection. The Boulder County Public Health Department oversees permits for 14,300 OWTS and also conducts comprehensive permit and public education programs (“SepticSmart Program,” 2015). Variables generated from collected data are then subjected to a regression modeling approach, Generalized Additive Models for Location, Scale and Shape (GAMLSS) (Rigby and Stasinopoulos, 2005), to select the postinstallation factors associated with the level of OWTS performance over the system's expected life. The predictive model will enable regulators to define actionable management guidelines for postinstallation practices that can be incorporated into county or state regulations to improve OWTS reliability and save owners the costs of catastrophic failure. In addition, the model will provide a means to communicate wastewater management alternatives and associated financial trade-offs to communities in a quantifiable and comparable way.
Methods
The methods of this study are discussed in the following three sections: (1) data collection; (2) selection of the performance indicator and ten user-associated independent variables; and (3) development of the GAMLSS method for simulation of OWTS performance.
Data collection
The Boulder County OWTS study site is located in the Boulder Creek-St. Vrain Creek watershed in northeastern Colorado, encompassing an area of ∼190,000 hectares and 300,000 residents (U.S. Census Bureau, 2015). There are over 14,300 OWTS in the County serving ∼50,000 residents as well as 21 POTWs with a total capacity of 110,000 m3/d serving the rest of the population (BCPH, 2013; EPA, 2008b). Estimated flow from the 14,300 OWTS is ∼27,000 m3/d, based on a residential flow of 2 m3/d (EPA, 1980). OWTS are located in highly variable terrain, including mountain communities at elevations exceeding 2,600 m and residents on the eastern plains at 1,600 m. Approximately two-thirds of the OWTS for which the County has any records received permits at the time of installation.
The OWTS sample consisted of failed or poorly functioning systems selected by searching the database of repair permits maintained by the Boulder County Public Health (BCPH) Department. Repair permits were screened to select for OWTS having conditions associated with visible failures such as wastewater surfacing, odor, or mechanical malfunctions resulting in system repairs, replacement, or reported functional breakdown. The search returned 215 properties with reported OWTS failures from 2003 to 2013. While the permit database contains applications dating back over 50 years, records of specific reasons for permits did not begin until 2003, limiting the documented failures to repair permits issued from 2003 onward. Many of the earlier repair permits did document visible failures, but the evidence exists in the form of handwritten notes on the permit applications and therefore was not searchable in the database. From the 215 OWTS in the original sample, only systems that had a County-approved inspection at the time of installation were selected for analysis. This reduced the number of properties in the sample from 215 to 120. Using only permitted sites provides a control for siting and design criteria set by the County. The fact that over half the failed systems in the County met initial design and siting standards supports the premise of this study that design standards are not sufficient to ensure performance.
The selected OWTS sample is distributed throughout the County area (Fig. 1) and captures variability in neighborhood/community affluence, housing density, and distance to professional OWTS servicers. The breadth of geographic and demographic attributes of the sample is selected to enable the application of results in other communities/regions.

Fig. 1. Geographic distribution of sampled on-site wastewater treatment systems (OWTS) repair permits in Boulder County, Colorado (Boulder County Public Health, 2013).
Information in the repair permit applications for each of the 120 OWTS, primarily scanned handwritten originals, was coded to define a failure measure and to quantify attributes of the properties and the owners' operation and maintenance history.
Variable definition
Dependent variable
Failure has been generally defined as not meeting some designated performance standard (Etnier et al., 2005). However, there is no recognized definition of OWTS failure. Failure has been commonly used in the OWTS industry and in the literature to specify major operational faults described above and loss of equipment integrity, such as cracking of septic tanks and/or piping. For this study, the failure measure for each OWTS is the sum of the estimated cost of all repairs requiring a permit, annualized to a 40-year period of record from 1973 to 2013. Thus, failure incorporates both the frequency of major repairs and their magnitude, producing the dependent variable (Y) of annual repair severity expressed in U.S. dollars.
Repairs were classified into minor, moderate, and major, using cost estimates provided by BCPH staff, as described in the next section. The 40-year life cycle was selected based on the average length of time between the installation inspection and the date of the most recent repair permit.
For this study, a minor repair, defined by BCPH, is any repair to the septic tank or pipes. Moderate repairs refer to extraordinary maintenance or replacement of the soil treatment unit (STU). Failure of both the septic tank and STU constitutes a major repair often requiring replacement of all OWTS components.
The cost estimates for minor, moderate, and major repairs are based on the results of an informal survey of OWTS installers and servicers conducted by the BCPH over 10 years ago. Due to property slope, size, water table levels, soil substrate, and location, the cost of any type of repair can vary widely. For example, the BCPH estimated cost of a moderate repair of the STU ranged from $4,860 to $21,800. However, the estimates provide relative benchmarks for minor, moderate, and major repair costs adequate for modeling the severity distribution, which can be updated as new cost information is available. The average of the range of estimated repair costs for each category was as follows: minor, $3,066; moderate, $9,173; and major, $14,866. The repair cost does not include the cost of the repair permit, which is uniform for all repairs, or the cost to hire an engineer for more significant restorations. Engineering costs were excluded from this study because they vary greatly based on attributes associated with the system's location and design complexity. While this information affects the total cost of repair, the infrastructure and labor costs of each repair type included in our analyses differentiate between the types of failures and their relative severity.
Figure 2 shows the sample distribution of the annualized repair severity variable as a histogram along with admissible probability density functions, which will be further discussed in the OWTS Repair Severity Model section. The data appear to be categorical in nature, even though cost of repair is a continuous variable. Approximately 60% of the sample has a repair severity value estimated to be $372 per year.
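The annualization described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the category costs and the 40-year period come from the text, while the function name and the example repair histories are hypothetical. Under these assumptions a single major repair annualizes to about $372, which matches the most common value in the sample; the source does not state that this is the reason for the mode.

```python
# Average repair cost by category (USD), as reported in the text.
REPAIR_COST_USD = {"minor": 3066, "moderate": 9173, "major": 14866}
PERIOD_YEARS = 40  # period of record, 1973 to 2013


def annual_repair_severity(repairs, costs=REPAIR_COST_USD, years=PERIOD_YEARS):
    """Sum the category cost of each permitted repair and annualize it."""
    return sum(costs[category] for category in repairs) / years


# Hypothetical repair histories, for illustration only.
one_major = annual_repair_severity(["major"])
minor_then_moderate = annual_repair_severity(["minor", "moderate"])

print(round(one_major))            # 372
print(round(minor_then_moderate))  # 306
```

Because only three cost levels are combined, the resulting severity values cluster at a few discrete amounts, which is consistent with the categorical appearance of the histogram.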

Fig. 2. Distribution of OWTS annual repair severity data measured as cost (USD) compared to the Weibull and gamma regressions for the 120 OWTS in the sample.
Independent variables
Using information from the Boulder County Assessor's property tax database, property inspection documents, and repair/replacement applications, ten independent variables were defined, which map to six categories that previous researchers have related to long-term OWTS performance (Kaminsky and Javernick-Will, 2013). The categories are technical, referred to herein as physical status (PHYS); organizational (ORG), related to the degree of community or institutional control; economic status of owner (ECON); owner knowledge of system operation (KNOW); user motivation (UM), related to interests affecting owner choices; and other (OTHER). The ten variables related to these categories are described in Table 1. The independent variable values for each OWTS for the 40-year period from 1973 to 2013 are coded and stored with the associated annual repair severity values and unique land parcel ID for each OWTS site.
OWTS repair severity model
We used a variation of the Generalized Linear Model (GLM) regression method to evaluate the relationship between the annual repair severity of the sample OWTS population and the independent variables in Table 1. The identified relationships are then used to predict the estimated annual repair severity of the larger OWTS population (Dowdy et al., 2004).
The GLM provides a more flexible approach to regression than a standard linear regression model. In the GLM, the response variable, Y, is assumed to be a realization from any distribution in the exponential family,
$$Y \sim G(\theta)$$
where G(.) is any exponential type distribution with parameters θ, and
$$\eta\big(E(Y)\big) = X\beta + \varepsilon$$
where η(.) is the link function, X is the set of independent variables, β is the vector of model coefficients, and ε are the residuals. The residuals are assumed to be normally distributed and uncorrelated, as with a standard linear regression (McCullagh and Nelder, 1989). E(Y) is the expected value of Y determined from the model.
Standard linear models require Y to be from a normal distribution. Consequently, variables that are non-negative, positively skewed, discrete, or binary violate the normality assumption and cannot be readily modeled using a standard linear model. Assuming a binomial distribution for the response variable reduces the GLM to a logistic regression; a Poisson distribution makes it a Poisson regression model, and so on (McCullagh and Nelder, 1989). Of course, a normal distribution assumption reduces this to a standard linear regression. The ability to model a variety of distributions in the exponential family is the major advantage of GLM (McCullagh and Nelder, 1989). Replacing G(.) with an extreme value distribution allows this approach to model extremes, as shown in the modeling of extremes in water turbidity by Towler et al. (2013).
Since every OWTS in the sample was repaired at least once, the response variable Y for all 120 systems is greater than zero and has a positively skewed distribution, as displayed in Fig. 2. The gamma distribution has been used in a GLM of POTW compliance data that also are positively skewed (Weirich et al., 2011). However, the OWTS data have long tails due to members of the sample with extremely high or low values of repair severity, and the Weibull distribution is better suited to capture tail behavior compared to the gamma distribution (e.g., Katz et al., 2002). While GLM provides computational flexibility, the assumption that the dependent variable Y must be represented by a distribution in the exponential family restricts its ability to model data using, for example, a Weibull distribution, which is not part of this family. Furthermore, the GLM framework is largely for modeling a single parameter of the distribution with a link function. To overcome these limitations, Rigby and Stasinopoulos (2001; 2005) and Akantziliotou et al. (2002) introduced generalized additive models for location, scale, and shape (GAMLSS). GAMLSS relax the exponential family distribution constraint for the response variable and allow a larger distribution family, including those with long tails such as Weibull. Furthermore, they can admit additive functions of the independent variables that can be linear or nonlinear, providing more flexibility in modeling.
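Like GLM, for GAMLSS, a smooth link function, g_k(.), transforms the expectation of each parameter in the representative Y distribution. Following Rigby and Stasinopoulos (2005), each parameter, θ_ik, is related to the set of explanatory variables through an additive predictor,
$$g_k(\theta_k) = \eta_k = X_k\beta_k + \sum_{j=1}^{J_k} h_{jk}(x_{jk})$$
where θ_k = (θ_1k, ..., θ_nk) is the vector of the kth distribution parameter over the n observations, X_k is the matrix of explanatory variables, β_k is the vector of coefficients, and h_jk(.) are optional smooth functions of the explanatory variables x_jk.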
To determine the impact of user operations on the expected annual repair severity of OWTS over a 40-year period, all independent variables and variable interactions were incorporated in the GAMLSS using the GAMLSS package in the open-source statistical program R (www.r.project.org), and all combinations of variables were tested to determine the distribution parameters, θi, using the Akaike Information Criterion (AIC; Akaike, 1974). The AIC selects a best model by considering all the possible subsets of the independent variables from the fitted model. For each model, the generalized Akaike Information Criterion (GAIC), specific to GAMLSS, is calculated as follows:
$$\mathrm{GAIC}(\kappa) = -2\,\ell(\hat{\theta}) + \kappa \cdot df$$
where ℓ(θ̂) is the fitted log-likelihood, df is the total effective degrees of freedom of the model, and κ is the penalty applied to each degree of freedom; κ = 2 recovers the AIC.
Cross-validation
To evaluate the model performance on an independent data set, a random subset of observations, ∼15% of the total data set or about 18 points, is dropped. The model is fitted to the remaining ∼85% of the data and used to predict the dropped values, and performance measures such as R2 (square of the correlation coefficient between the observed and model predicted values) and RMSE (root mean squared error) are computed. This is repeated a thousand times, and the measures are displayed as boxplots to provide insights into the variability of the model skill.
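The drop-and-refit loop described above can be sketched as follows. This is an illustrative outline only: an ordinary least-squares line stands in for the GAMLSS fit, the data are synthetic, and all function names are hypothetical. What it shares with the text is the procedure itself (hold out about 15% of 120 points, fit on the rest, score the held-out predictions with R2 and RMSE, repeat 1,000 times).

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (a stand-in for the GAMLSS fit)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(obs, pred):
    """Square of the correlation between observed and predicted values."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


def rmse(obs, pred):
    """Root mean squared error of the predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def cross_validate(xs, ys, drop_frac=0.15, repeats=1000, seed=0):
    """Repeatedly hold out a random subset, refit, and score the predictions."""
    rng = random.Random(seed)
    n = len(xs)
    n_drop = round(drop_frac * n)
    scores = []
    for _ in range(repeats):
        held = rng.sample(range(n), n_drop)
        held_set = set(held)
        train = [i for i in range(n) if i not in held_set]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        obs = [ys[i] for i in held]
        pred = [a + b * xs[i] for i in held]
        scores.append((r_squared(obs, pred), rmse(obs, pred)))
    return scores


# Synthetic data for illustration only (not the Boulder County sample).
data_rng = random.Random(42)
xs = [data_rng.uniform(0, 10) for _ in range(120)]
ys = [50 + 30 * x + data_rng.gauss(0, 40) for x in xs]
scores = cross_validate(xs, ys)
print(len(scores), round(0.15 * len(xs)))  # 1000 18
```

The list of (R2, RMSE) pairs is what would be summarized as boxplots; with only about 18 held-out points per repetition, a wide spread in both measures is to be expected.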
Sensitivity analysis
The range of cost estimates for minor, moderate, and major repairs motivated us to consider the sensitivity of the Weibull model to changing costs. Because the BCPH cost estimates are over 10 years old, the possibility of underestimating the annual repair severity is the greater concern. Therefore, the sensitivity analysis consists of generating two new models using the following repair cost conditions. Model II: sensitivity to the cost of major repairs only by retaining the average cost for the minor and moderate repairs used in the original model (Model I), but increasing the cost for the major repairs to the high value of the range; Model III: increasing the cost of repairs for all repair categories to the high value in the range.
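The three cost scenarios can be laid out as a small lookup, sketched below. The average costs and the high moderate value ($21,800) are taken from the text; the high values for minor and major repairs are not reported in this section, so the numbers used here are hypothetical placeholders, as is the example repair history.

```python
PERIOD_YEARS = 40

# Average cost per category (USD), from the text (used in Model I).
AVERAGE_COST = {"minor": 3066, "moderate": 9173, "major": 14866}

# High end of each cost range. Only the moderate value is reported in the
# text; the minor and major values are HYPOTHETICAL placeholders.
HIGH_COST = {"minor": 5000, "moderate": 21800, "major": 30000}

SCENARIOS = {
    "Model I": dict(AVERAGE_COST),                              # all averages
    "Model II": {**AVERAGE_COST, "major": HIGH_COST["major"]},  # major only raised
    "Model III": dict(HIGH_COST),                               # all raised
}


def annual_severity(repairs, costs, years=PERIOD_YEARS):
    """Annualized repair severity for one repair history under one cost table."""
    return sum(costs[category] for category in repairs) / years


history = ["minor", "major"]  # hypothetical repair history
severity = {name: annual_severity(history, costs) for name, costs in SCENARIOS.items()}
for name, value in severity.items():
    print(f"{name}: {value:.2f} USD/yr")
```

Recomputing the dependent variable under each cost table and refitting the regression is what produces the three sets of coefficients compared in the sensitivity analysis.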
Results
A GAMLSS belonging to the Weibull distribution family was fit to the independent variables and annual repair severity data, using the log link function. The GAMLSS of repair severity is represented as follows:
$$Y \sim \mathrm{WEI}(\mu, \sigma)$$
where Y is the response variable, annual repair severity, and μ and σ are the scale and shape parameters, respectively, of the Weibull distribution (WEI). The best model of the scale and shape parameters based on GAIC resulted in the following expressions.
Equations (6) and (7) indicate that OWTS performance is related to five of the ten individual independent variables and four combinations of the variables. The individual factors selected are the assessed structural value of the home (SV) and the square footage of the home (LA), both considered proxies for household affluence or ability to pay; the number of complete required system inspections (PT) as a result of the 2008 Boulder County Property Transfer regulation for OWTS; the total number of sales after 2008 where a property transfer inspection was not documented (PS); and the frequency and cost of OWTS upgrades resulting from adding a bedroom (UP).
The expected value of the shape parameter E(σ) is a constant estimated by fitting a Weibull distribution to the observed data. In models fit to a small number of data, as is the case here, it is common for the GAIC to retain a constant shape parameter, as varying the shape parameter can make the model fit highly variable and the results difficult to interpret. The nonlinear combination of the five variables describing both the scale and shape parameters creates the Weibull distribution for each observation; therefore, each OWTS observation has a unique scale with a constant shape.
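To make the "unique scale, constant shape" idea concrete, the sketch below turns a linear predictor into Weibull summaries under a log link. It assumes the scale/shape parameterization with density f(y) = (σ/μ)(y/μ)^(σ-1) exp(-(y/μ)^σ), for which the mean is μΓ(1 + 1/σ) and the p-quantile is μ(-ln(1 - p))^(1/σ). The intercept, the PT coefficient, and the shape value are hypothetical and are not the fitted values from this study.

```python
import math


def weibull_summary(linear_predictor, shape):
    """Summaries of a Weibull(scale=mu, shape=sigma) with a log link on mu."""
    mu = math.exp(linear_predictor)  # inverse of the log link

    def quantile(p):
        return mu * (-math.log(1.0 - p)) ** (1.0 / shape)

    return {
        "scale": mu,
        "mean": mu * math.gamma(1.0 + 1.0 / shape),
        "median": quantile(0.5),
        "interval_95": (quantile(0.025), quantile(0.975)),
    }


# HYPOTHETICAL coefficients, for illustration only.
INTERCEPT = 6.0   # baseline log-scale
COEF_PT = -0.5    # effect of one property transfer inspection on log-scale
SHAPE = 2.0       # constant shape shared by every observation

for pt in (0, 1):
    s = weibull_summary(INTERCEPT + COEF_PT * pt, SHAPE)
    low, high = s["interval_95"]
    print(f"PT={pt}: mean={s['mean']:.1f}, median={s['median']:.1f}, "
          f"95% interval=({low:.1f}, {high:.1f})")
```

With a negative coefficient, one inspection lowers the scale and therefore the mean, the median, and both interval limits together; the interval is asymmetric about the median because the Weibull distribution is skewed.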
The Analysis of Variance (ANOVA) table (Model I in Table 2) shows the significance of each variable in the model based on its p-value. The significance threshold was set at 0.1 (i.e., 90% confidence) for this study. The four combined variables, UP*PT, LA*PT, SV*PT, and PT*PS, are significant at greater than 90% confidence. Of the individual variables, only the number of property transfers after 2008 when a full inspection was required met the significance criterion. While not all of the individual variables are significant at 90% confidence or higher (i.e., p-value <0.1) in the best model, the best model from GAIC or other such criteria selects the group of variables that jointly improve the estimation of annual repair severity. However, the variables that are not significant tend to have coefficient values close to zero. With one exception, an increase in the value of individual and combined factors, including the number of required inspections after 2008, PT, was associated with a decrease in annual repair severity, as reported under Model I in Table 2, column 2.
Shaded entries are the variables that are consistently associated with repair severity in all three scenarios.
GAIC, generalized Akaike Information Criteria.
Model diagnostics
The GAMLSS [Eqs. (6) and (7)] provide the best estimate of the two parameters of the Weibull distribution describing the repair severity for each observation as a function of the selected independent variables. Consequently, the median and the 95% confidence intervals can be obtained from the estimated sample distribution. While a predictive model encompassing both extreme and the median results would be ideal, the primary concern is whether the model is a good predictor of annual repair severity for the average OWTS. Average system performance as reflected by the expected annual repair severity over the system's life is a potential decision factor for homeowners and can aid community and regional planning and management decisions when comparison of the costs of on-site and centralized systems is desirable.
The observed values and the predicted annual repair severity costs have an R2 of 0.406, indicating that the model captures ∼40% of the overall variability. R2 values less than 50% are acceptable in behavioral and social science fields, where typically the percentage of variance accounted for is smaller given the inherent variability of human nature (Hunter and Schmidt, 2004). Furthermore, even with a low R2 value, the presence of statistically significant predictors still allows important conclusions to be drawn about how variations in the predictor values are associated with changes in the annual repair severity.
Figure 3 shows the annual repair severity in dollars for each system in the sample compared to the model expected value and 95% confidence interval. The figure shows that the model underpredicts the extremes and predicts well near the central values for annual repair severity. This suggests that there are additional variables that potentially contribute to the likelihood of highly performing systems (those requiring few repairs) and of systems requiring frequent and costly major repairs that are ultimately more vulnerable to failure. The confidence intervals (Fig. 3) are asymmetric and shifted in the same direction as the repair severity, suggesting that they are able to capture the variability well, unlike traditional regression approaches, which provide symmetric intervals. Figure 4 shows the observed and modeled values for annual repair severity as boxplots, as a function of the significant and highly influential variable PT, the number of property transfers after 2008 with required inspections. No property had more than one required inspection after 2008; however, even one inspection had a clear beneficial impact on annual repair severity, as captured in the Weibull regression.

Fig. 3. Predicted annual repair severity (USD) and 95% confidence limits using the fitted Weibull distribution parameters compared to the observed values for all 120 OWTS in the sample.

Fig. 4. Relationship between predicted (Weibull) and actual annual repair severity costs and the number of property transfer inspections for 120 OWTS.
Figure 5 has spatial plots of the observed and predicted values of annual repair severity. While OWTS requiring moderate to major repairs are distributed throughout the County, the model does capture some spatial clustering of systems that had a history of higher cost repairs.

Fig. 5. GIS map of observed annual repair severity
As mentioned in the model description, the model residuals have to satisfy the assumptions of normality, independence, and constant variance (i.e., homoscedasticity). Fig. 6 shows a set of diagnostics for testing model adequacy against these assumptions. The histogram supports the assumption of normally distributed residuals, and the autocorrelation function (ACF) is small at most lags, suggesting approximate independence of the residuals. The plot of residuals against the estimated value of Y shows no clear trend, consistent with constant variance. Some of the structure in this latter plot is perhaps due to the categorical nature of the data. The autocorrelation plot does, however, show significant autocorrelation at the first lag, that is, each residual is correlated with its neighbor, which is indicative of spatial correlation. This could be due to local factors such as topography and age of the neighborhood. Hierarchical models, where the residuals are modeled as a spatial Gaussian process, are attractive alternatives for capturing the residual structure.
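The lag-1 check mentioned above can be illustrated with the standard sample autocorrelation and the usual approximate 95% band of ±1.96/√n for uncorrelated residuals. This is a generic sketch with hypothetical function names and an artificial residual series, not a reproduction of the diagnostics in Fig. 6.

```python
import math


def autocorrelation(residuals, lag=1):
    """Sample autocorrelation of a residual series at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((r - mean) ** 2 for r in residuals)
    num = sum(
        (residuals[t] - mean) * (residuals[t + lag] - mean)
        for t in range(n - lag)
    )
    return num / denom


def white_noise_bound(n, z=1.96):
    """Approximate 95% band for the autocorrelation of uncorrelated residuals."""
    return z / math.sqrt(n)


# Artificial, strongly alternating residuals (illustration only).
residuals = [1.0 if t % 2 == 0 else -1.0 for t in range(120)]
acf1 = autocorrelation(residuals, lag=1)
bound = white_noise_bound(len(residuals))
print(round(acf1, 3), round(bound, 3))  # -0.992 0.179
print(abs(acf1) > bound)                # True
```

For a sample of 120 residuals the band is roughly ±0.18, so a lag-1 value outside that range would be flagged as significant in the sense used above.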

Fig. 6. GAMLSS diagnostic plots.
Model cross-validation
The variability of the predicted RMSE and R2 skill measures during cross-validation is shown as boxplots in Fig. 7. The relatively large variability in R2 and RMSE is to be expected since the extremes cannot be modeled well without including them in the model fitting. The spread of both the skill indicators illustrates that the model is best applied to prediction within the original sample range and is least efficient in predicting extreme values of annual repair/replacement severity. The R2 values of the cross-validation models range from approximately 0.1 to 0.8 and reflect the possibility that in each simulation some number of extreme values could be dropped resulting in under- or overprediction of annual repair severity. However, the low median RMSE value of ∼$134 indicates that a significant portion of the predictions differs only slightly from the annual repair severity observations, even after dropping 15% of the data.

Fig. 7. Skill of the Weibull model of annual repair severity as distributions of R2
Model sensitivity to cost estimates
Table 2 has the results of the sensitivity analysis comparing the original simulation (Model I), based on the average cost in each repair category, to results from simulations using cost estimates in Models II and III. The covariates consistently associated with annual repair severity are UP, PT, PS, and the interaction between the number of upgrades and property transfer inspections (UP:PT). In addition, their influence on the likelihood of a high-cost annual repair severity is consistently negative, with one exception in Model II where the likelihood of a high annual repair severity increases with an increased number of system upgrades. As the severity weighting is shifted using higher repair cost estimates, some of the covariates relevant in the initial GAMLSS, for example, LA, become insignificant in determining annual repair severity. However, the coefficient estimates of those variables in the initial model (Model I) are approximately zero, indicating that even in the initial model they have a lesser influence on annual repair severity than the highlighted, more robust variables. Figure 8 shows the adjusted annual repair severity distributions based on the different cost estimates as well as the representative Weibull fit to each distribution. Figure 9 illustrates the sensitivity of the residuals of the GAMLSS to changes in cost estimates. Positive residuals in Fig. 9 indicate an overestimation of annual repair severity, whereas negative residuals reveal an underestimation of repair/replacement costs. Residuals near zero indicate where the predicted value is close to the observed annual repair severity value. In general, both positive and negative residuals increased along with cost estimates for both major and all repairs, implying greater uncertainty in predictions of repair severity. This is not unexpected, as the predicted annual repair severity values also increase as a function of higher costs.

Estimated cost distributions for Models I, II, and III and their Weibull representation.

Sensitivity of model skill to repair cost estimates used to calculate annual repair severity. Model I is the original Weibull regression using the average of BCPH cost ranges for all repair categories; Model II is the Weibull regression substituting a higher cost estimate for major repairs; Model III is the Weibull regression with the higher cost estimate for all three categories of repairs.
Discussion
Public Good theory provides a policy framework for recommending increased regulation of OWTS (Mohamed, 2009). Health and environmental impacts of appreciable levels of OWTS failure and repair, aggravated by the growing density of these systems, are an additional motivation for new regulations requiring regular inspection and maintenance. However, new regulations have associated costs for both individual owners and for local health departments charged with maintaining records and processing permits. Results of this study provide data-based support for mandatory inspections that reduce repair frequency and cost over the OWTS life and a credible estimate of the benefits of increased oversight. In fact, the predicted benefit of inspections is conservative, since the Boulder County inspection/repair requirement only takes effect when a property is transferred. It may well be the case that a universal requirement for regular OWTS inspections through means such as a renewable permit would have an even larger benefit in improving OWTS reliability.
Two factors, SV and LA, have been considered as indicators of household affluence (Harlan et al., 2009). The GAMLSS showed a slight positive association of both with annual repair severity. Interestingly, this result runs counter to the common assumption that a greater ability to pay increases the likelihood of homeowner attention to maintenance and system performance. This counterintuitive result could be attributed to the location of these homes. In Boulder County, many larger high-value homes outside POTW service areas are located in mountainous areas far from maintenance services. Some may have STUs located in terrain where a failure may not be noticed by either residents or neighbors. In addition, more expensive homes may be sold less often, so that structural value is negatively related to the number of mandatory inspections.
Finally, UP was associated with a decrease in predicted annual repair severity for OWTS. One explanation is that when a home's size is increased through remodeling and construction, the County requirement for expanded or appropriately upgraded systems would have an effect similar to property transfer inspections resulting in less frequent and less severe repairs. The relationship suggests that increasing system capacity also may benefit performance.
While individual variables such as inspections and repairs associated with property transfers, sales after 2008, and system upgrades decrease the likelihood of a high annual repair severity, inclusion of combined variables, especially those containing post-2008 property transfers, improved the Weibull model skill. An interaction between individual variables with the same influence on the likelihood of a high annual repair severity amplifies those individual performance effects and increases the overall skill of the model. However, the small coefficient values for some of the combined variables, such as structural value and regulated property transfers, indicate that their effect on the expected value of annual repair severity was small. In general, LA and SV in combination with PT allow the model to discriminate between what might be considered a moderate range and a high range of annual repair severity values. While removing the two variables is an option given their p-values, they not only improve the skill of the model but also minimize GAIC compared to model versions without them. This indicates that while household affluence may not directly relate to annual repair severity, both indicator variables provide a nonarbitrary amplification of the highly significant covariate, PT, and capture the nonlinear effect of the interaction on OWTS performance.
Some spatial clustering of OWTS repair severity (Fig. 5) and autocorrelation of residuals (Fig. 6) suggest that not all factors determining OWTS failure are captured in the Weibull regression. Both geographic clustering and autocorrelation may be explained by factors such as weather, soil and groundwater conditions, and distance from OWTS servicers and from other residences. As an attractive extension of this research, the relationship between OWTS repair severity and location could be explored through spatial modeling of the GAMLSS residuals using a hierarchical modeling approach or a Bayesian method (e.g., Verdin et al., 2015). Another potential extension arises from the discretized character of the data (Fig. 2), which lends itself to a categorical modeling approach using binomial or multinomial logistic regression (e.g., Towler et al., 2013) to estimate risk, in particular to quantify the likelihood of high repair severity occurrences.
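The categorical extension suggested above can be sketched as a binomial logistic regression of a "high severity" indicator on a single covariate. This is a minimal illustration only: the records are invented, the choice of inspection count as the sole predictor is an assumption made for the example, and the plain gradient-ascent fit stands in for whatever estimation method a full analysis would use.

```python
import math

# Invented records for illustration only: (number of inspections,
# 1 if the system's annual repair severity was labeled "high", else 0).
records = [(0, 1), (0, 1), (1, 1), (1, 0), (2, 1),
           (2, 0), (3, 0), (3, 0), (4, 0), (4, 0)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(data, learning_rate=0.1, iterations=5000):
    """Fit P(high) = sigmoid(b0 + b1 * x) by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(data)
    for _ in range(iterations):
        # Error term (observed label minus fitted probability) for each record.
        errors = [(x, y - sigmoid(b0 + b1 * x)) for x, y in data]
        b0 += learning_rate * sum(e for _, e in errors) / n
        b1 += learning_rate * sum(x * e for x, e in errors) / n
    return b0, b1


def probability_high(inspections, b0, b1):
    """Fitted probability of a high annual repair severity."""
    return sigmoid(b0 + b1 * inspections)


b0, b1 = fit_logistic(records)
for k in range(5):
    print(f"{k} inspections: P(high severity) = {probability_high(k, b0, b1):.2f}")
```

On this toy data the fitted slope is negative, so the estimated probability of a high-severity outcome falls as the inspection count rises. A multinomial version would replace the single indicator with several severity classes.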
Summary
As the use of owner-operated on-site sanitation technologies increases, life cycle costs and long-term sustainability, including environmental and health impacts, become more relevant to reducing the risk of human and environmental exposure to wastewater contaminants. This research identified factors unique to minimally regulated OWTS that may guide future planning to enable better OWTS management over the system life. In the absence of comprehensive monitoring data, the product of the cost and frequency of system repairs and replacement, annualized over a 40-year lifetime, denoted as “annual repair severity,” is proposed as a measure of system failure. Data from 120 OWTS in Boulder County, Colorado, are fit by regression (GAMLSS) modeling, with the best fit provided by the Weibull distribution. In general, variables associated with more frequent owner management of OWTS were predictive of long-term system integrity. The most important was the frequency of inspections by professional servicers, typically accompanied by maintenance and minor repairs. This result suggests that mandatory inspections through a mechanism such as renewable permits would significantly reduce life cycle repair/failure frequency and severity, lowering OWTS costs to owners and reducing public exposure to wastewater contaminants. The statistical model is skilled at predicting repair severity in the midrange of the data distribution, with an expected annual repair/replacement cost between $350 and $400 per year, over the 40-year life cycle. The observed and modeled annual repair severity values were correlated with an R2 value of 0.406, with larger discrepancies at the high values of annual repair severity, which fell in the range of $600 to $800 per year. The model dependence on inspection frequency as a principal determinant of repair severity was not sensitive to the cost estimates assigned to each category, indicating general applicability of the model results.
Acknowledgments
This material is based on work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE 1144083. The authors would like to acknowledge the assistance of the staff of the Boulder County Public Health Department, who provided data and other information on OWTS in Boulder County.
Author Disclosure Statement
No competing financial interests exist.
