Abstract
Nonprofits compete in donation markets for resources and are expected to report on the financial stewardship of the organization. Without a clear comparative signal to differentiate organizations in this resource market, simple financial ratios have been used as proxy measures of relative organizational efficiency. Two conceptual models can be applied to the use of these ratios: first, as dichotomous conformance thresholds that identify poor performers who are unable to meet some minimum standard, or second, as directly comparable scales of performance where more optimized ratios can be used to distinguish the best performers. These two different conceptual models imply two different managerial approaches and potential organizational outcomes. This research assesses the extent to which nonprofits that are evaluated by an external evaluator appear to use the ratios as thresholds to pass or as scales to optimize.
Introduction
To be competitive in obtaining funds, nonprofits are expected to be accountable and to provide evidence of superior program performance both in terms of effectiveness and efficiency (Forbes, 1998; Krishnan & Yetman, 2011; Pina & Torres, 1992). On one hand, nonprofits provide public services, and thus must provide evidence of the effectiveness of their programs (Moulton & Eckerd, 2012). On the other hand, nonprofits compete for donor resources, so they also need to provide evidence of organizational efficiency and good financial stewardship (Brooks, 2004; Eckerd & Moulton, 2011; Froelich, Knoepfle, & Pollak, 2000; Verbruggen, Christiaens, & Milis, 2011). The focus of this study is on the latter set of evaluation techniques, in particular, investigating two different measurement approaches for nonprofit organizational financial stewardship: ratios of expense categories that can serve either as conformance thresholds that distinguish between “good” and “bad” performers or as interval scales where more optimal scores are intended to identify the “best” performers.
As performance evaluation has gained importance in the nonprofit sector, financial ratios, usually measured by third-party organizations, have gained prominence (Bies, 2001; Carman, 2009; Ritchie & Kolodinsky, 2003). The ratios emerged from a regulatory structure in the early twentieth century, with an ideal use as straightforwardly comparable scores (Eccles, 1991; Neely, 1999) which could be used by donors to compare the relative financial management of organizations. The measures, such as the ratio of program expenses to total expenses (Carman, 2009; de Leon, Pettijohn, & De Vita, 2012), tend to be used by watchdog groups who conduct evaluations for two distinct purposes: as an indicator of attaining some minimal standard relative only to the threshold, or as a measure of comparative efficiency relative to other organizations (Ashley & Van Slyke, 2012; Cnaan, Jones, Dickin, & Salomon, 2011; Herman & Renz, 2008; Roa, 1998).
One of the first independent watchdogs, the Better Business Bureau (BBB), views its standards as conformance thresholds, with two possible outcomes: either a nonprofit meets or it does not meet each of the BBB standards. Other evaluators, most prominently Charity Navigator (CN), use an interval-scaled rating system that rewards higher ratios, which is commonly how the ratios are viewed in the general media. 1 Given the prevalence of this latter category, there are pressures for organizations to score well (DiMaggio & Powell, 1983; Westphal, Gulati, & Shortell, 1997). In fact, responding to these pressures is perhaps the main reason why nonprofits use the ratios (Eckerd & Moulton, 2011).
Regardless of an evaluator’s intent, nonprofits could have an incentive not only to conform to a threshold, and therefore meet the minimum qualifications to be worthy of a donation, but also to exceed a threshold by as much as possible to be more worthy of a donation. The problem is that, even as acknowledged by the evaluators, 2 spending more on programs and less on administrative overhead is indicative of superior performance only if programs are shown to be effective. Due to uncertainty of program outcomes, the complexity of outcome measurement, and the cost of high-quality program evaluation, good information on program effectiveness is difficult to come by. While there may be some threshold under which a charity is so inefficient that a donation is not worthwhile, there is scant evidence that the most efficient charities are also the most effective (Herman & Renz, 2008; Rushing, 1974).
As these third-party evaluations become more prominent, they can influence both the management practices and ultimately the mission attainment of nonprofits. Even if donors are not especially aware of the evaluators and their approaches, nonprofits perceive an opportunity to gain credibility and legitimacy by receiving a recommendation from the evaluators (Eckerd & Moulton, 2011). The validity of this legitimacy-conferring role is dependent upon the extent to which the evaluation procedures are consonant with organizational effectiveness. This is far from clear, and given the different approaches of evaluators, the effect on the managerial priorities of evaluated organizations could be quite different depending on whether the goal is to achieve some minimal financial ratio score or to optimize ratio scores. In the former case, typical thresholds are not especially onerous and meeting them may result in little substantive change in behavior (Bhattacharya & Tinkelman, 2009). In the latter case, while optimizing is unlikely to be the only organizational goal, incentives to optimize could alter the services that nonprofits are willing to deliver and potentially contribute to mission drift (Jones, 2007). To achieve better efficiency ratios, nonprofits may focus on routinized services or focus on outputs rather than mission-related outcomes (Bevan & Hood, 2006; Hofstede, 1981; Lune, 2002). They may also eschew important capacity-building efforts (Kanter & Summers, 1994; Light, 1994); limit overhead by focusing on a small number of revenue sources, potentially putting the organization in a dependent position; or game the system to ensure that what is reported signals an efficient organization (Hood, 2006; Keating, Parsons, & Roberts, 2006; Krishnan & Yetman, 2011). In any case, these shifts in priorities could lead to programs that look good on paper but potentially put the organization at risk of resource constraints.
More importantly, unless the organization’s programs are as effective as they possibly can be, maximizing a program expense ratio may be risking the organization’s mission rather than providing an indication of organizational efficiency (Jones, 2007).
This research assesses the potential prevalence of a ratio optimization strategy by investigating whether organizations that are evaluated by the BBB veer away from the BBB’s intended purpose of assessing for conformance and instead optimize expense ratios. Results show that evaluated organizations tend to report more optimized program and administrative ratios. In the sections that follow, the distinction between the conformance and optimization conceptual models is explained, and the implications for the incentive structures are discussed. Then, the data and analyses are described, followed by a discussion of the results and potential implications.
Nonprofit Organizational Evaluation
Nonprofits exist in a complex evaluation environment—the public is interested in determining how effective programs are at providing public services (Boyne, Meier, O’Toole, & Walker, 2005; Heinrich, 2003; Ring & Perry, 1985). At the same time, the resource environments for nonprofits have a market orientation, as nonprofits compete against a diverse set of other nonprofits for grants, donations, and volunteer labor (Venkatraman & Ramanujam, 1986). Competition for resources necessitates differentiation, innovation, and marketing (Barman, 2002; Froelich, 1999; Saidel, 1991). In contrast to the outcomes-based orientation of program evaluation, organizational-level evaluation tends to focus on the means, usually through an operationalization of an efficiency construct that can provide a signal of proper stewardship of donations (Eckerd & Moulton, 2011).
While donors and grantors consider personal experiences or congruence between the nonprofit’s mission and their individual values when making donation decisions, resource providers and evaluators have also pushed for more objective criteria to assess performance (Bies, 2001; Carman, 2009; Ritchie & Kolodinsky, 2003). In contrast, for-profit organizations use a common set of measurements that investors can use to assess the relative efficiency of one firm versus another, for example, the rate of profitability, or shareholder value (Venkatraman & Ramanujam, 1986). However, nonprofit missions are diverse, using a variety of different activities to serve a diverse set of stakeholders (Eckerd & Moulton, 2011). Easily comparable, aggregate measures of performance or effectiveness are elusive (Cameron, 1981). In fact, nonprofits may be the prototypical complex organizations for which the traditional evaluation techniques are difficult to apply (Cameron, 1980). With ill-defined goals, an ambiguous connection between means and ends, and differing criteria for success (Cameron, 1980), diversity has long been recognized as a defining feature of the sector (Frumkin, 2005). Although the financial ratios are intended to be widely applied, nonprofits have been assessed according to the clients they serve, major activities, and public service roles (Gronbjerg, 1994; Moulton & Eckerd, 2012; Salamon & Anheier, 1992). In short, there is no agreed upon measure of organizational effectiveness or performance, even within particular subgroups, let alone across the entire sector.
Nevertheless, the demand for generalized evaluative information remains. This has led to the emergence of external evaluation organizations that purport to enable donors to assess nonprofit performance in simple terms. These evaluators, like the BBB, CN, and the American Institute of Philanthropy (AIP), use some variation of a scorecard or set of quantitative and/or qualitative standards to assess the relative performance of nonprofits. These standards include ratios that compare the proportion of expenses that a nonprofit allocates to program services versus overhead (fundraising and administrative) expenses. The ratios are not embraced as valid measurements of performance; nevertheless, the measures are institutionalized indicators of which nonprofits (if not necessarily donors) are very aware and to which they feel pressure to conform (Bhattacharya & Tinkelman, 2009; Carman, 2009; Cnaan et al., 2011; Eckerd & Moulton, 2011; Sloan, 2009). Depending upon the evaluator, the comparative approaches are different: in one case organizations are compared with some minimum baseline threshold, while in the other, organizations are compared with one another. Here, these two conceptual approaches are called the conformance model and the optimization model, respectively.
The Conformance Model
The conformance model, which has been the historic norm for use of the financial ratios, is illustrated by the BBB approach (Irwin, 2005; Steinberg, 1997). 3 The BBB assesses nonprofits on 20 distinct standards, covering governance/oversight, finances, fundraising practices, and evaluation policies. 4 All of the standards are weighted equally insofar as failing to meet any one standard results in a report indicating failure to “meet standards.” Although this research focuses on financial ratios (Standards 8 and 9), the BBB standards in total are conceived as indicating minimum thresholds of performance; an organization evaluated by the BBB has two possible outcomes on each standard: meet or fail to meet. The BBB explicitly states that its reports and evaluations are only intended to show whether an organization has met a minimum threshold and should not be used for comparative purposes beyond conformance/nonconformance (Murray, 2001). As the financial ratios are relatively easy to meet (Bhattacharya & Tinkelman, 2009), the BBB approach is intended to identify poor performers, but does not granularly differentiate between mediocre, better, and best. If an organization spends at least 65% of its resources on program expenditures, it meets BBB Standard 8, and if it spends no more than US$0.35 to raise US$1, it meets BBB Standard 9. Within this framework, as the quote below illustrates, nonprofits do not necessarily perceive a significant instrumental operational benefit of the standard, beyond conformance.
We clear that hurdle and then the real work starts. (Wayne Pacelle, President of the Humane Society of the United States [HSUS]; personal communication, November 11, 2012)
If the goal is to meet a dichotomous threshold, there is little implied incentive to exceed the threshold, except as a buffer against future nonconformance. Meeting the standards is a signal, but the standards themselves are not particularly credited with improving organizational performance (Eckerd & Moulton, 2011). The observed ease of meeting the standards (a finding corroborated in this study) suggests that a conformance approach would have little influence on the operational choices made by managers. The conformance thresholds are institutionalized such that nonprofits pass as a matter of course. Very poor performers can be identified, but for most nonprofits, there would appear to be little need to follow a conformance managerial strategy. Meeting the thresholds can largely be taken for granted.
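The dichotomous logic of the conformance model can be sketched in a few lines of Python. The two thresholds (at least 65% of expenses on programs, no more than US$0.35 spent per US$1 raised) are taken from the text above; the function name, argument names, and dollar figures are hypothetical, and the sketch covers only Standards 8 and 9, not the remaining BBB standards.

```python
def meets_bbb_financial_standards(program_expenses, total_expenses,
                                  fundraising_expenses, contributions):
    """Return True if both financial thresholds described above are met.

    Illustrative only: the names are hypothetical, and a real BBB
    evaluation covers 20 standards, not just these two ratios.
    """
    program_ratio = program_expenses / total_expenses
    fundraising_ratio = fundraising_expenses / contributions
    # Standard 8: at least 65% of expenses go to programs.
    # Standard 9: no more than US$0.35 is spent to raise US$1.
    return program_ratio >= 0.65 and fundraising_ratio <= 0.35


# Two hypothetical organizations: one barely clears the thresholds,
# the other clears them by a wide margin.
barely = meets_bbb_financial_standards(66_000, 100_000, 30_000, 90_000)
easily = meets_bbb_financial_standards(92_000, 100_000, 2_000, 90_000)
print(barely, easily)
```

Under a conformance reading, both hypothetical organizations receive the same result; the size of the margin carries no additional information.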
The Optimization Model
The CN and AIP approaches demonstrate the optimization model. The larger of the two, CN has been evaluating nonprofits for 10 years and has altered its evaluation criteria several times during this period. Like the BBB’s, the CN rating system goes beyond financial ratios, to cover accountability and transparency issues, such as board composition and established conflict of interest policies. 5 CN’s financial ratios (Performance Metrics 1-4) acknowledge that different types of organizations conduct business under different constraints, and CN utilizes a set of financial ratios that vary with the National Taxonomy of Exempt Entities (NTEE) 6 classifications (Gronbjerg, 1994) in determining where an organization falls on its 0 to 4 star rating system. CN also uses minimum thresholds to identify poor performers (0, 1, or in some cases, 2-star charities), but the distinction between a 3- and 4-star organization is, in part, dependent on having higher efficiency ratios. To receive 4 stars, organizations must exceed the compliance threshold by, in some cases, a considerable margin.
As the BBB and CN allow nonprofits to publicize the results of an evaluation, and both are well known, meeting the BBB standards and receiving 4 stars from CN can be extremely important symbolically (Eckerd & Moulton, 2011). While the BBB logo can be used for any nonprofit exceeding thresholds, the CN logo can only be used by 4-star nonprofits—not by 3- or 2-star-rated organizations that may well meet thresholds established by the BBB. As illustrated in the quotes below, organizations likely have an incentive to publicize such information.
More than 83 percent of expenses are devoted to the Foundation’s mission—far exceeding the program service allocation standards set forth by the nation’s leading charity watchdog groups, including the Council of Better Business Bureaus. (Make a Wish Foundation 7)
The Scripps Research Institute has achieved a coveted 4-star rating on Charity Navigator. We are incredibly proud of this achievement and thrilled for what it means for our donors in terms of accessing important information about our organization. Scripps Research has earned 4 stars from Charity Navigator 9 out of the last 10 years. (Scripps Research Institute 8)
As a majority of charities meet the full set of BBB standards on the national level, and only about one third of the organizations that CN evaluates receive the 4-star designation, 9 the CN 4-star designation represents a potential differentiator in the donation marketplace, and optimized financial ratios are a necessary (but not sufficient) condition to receiving 4 stars. However, the nature of an organization’s mission (Ebrahim, 2003) and task set (Moulton & Eckerd, 2012) may constrain how optimized a ratio can be for some types of nonprofits. For those nonprofits carrying out complex services or serving hard-to-reach clients, low overhead may be difficult to achieve, leaving them with a tough choice: accept a lower program ratio than other nonprofits in their donation market, which may jeopardize resource acquisition, or offer services that are more routinized and less individualized, and thus potentially of lesser quality to an individual client, but that enable reporting of more optimized ratios. They could also react by focusing on a smaller set of donors who can provide more resources at less expense rather than gathering a diverse donor base, a resource strategy that potentially exposes the organization to resource risks (Barman, 2008). Finally, organizations may game the system, strategically reporting their expenses (Hood, 2006) or asserting implausibly low administrative or fundraising expenses (Keating et al., 2006; Krishnan, Yetman, & Yetman, 2006), in which case the standards may be counterproductive, undermining the informational gains they were meant to provide.
While following an optimization strategy may not be the only (or even primary) reason for an organization to have relatively “good” ratios (Westphal et al., 1997), over time, this incentive structure could still alter the nature of the nonprofit sector (Frumkin, 2005; Salamon, 2003). Taken to the logical conclusion, the nonprofits that thrive are those that focus on relatively easy-to-provide services, often for easy-to-reach clients, or those that strategically report expenses. In either case, nonprofits could be failing to serve the public purpose for which the sector exists (Frumkin, 2005). Although there is evidence that nonprofits, either for strategic purposes or for lack of knowledge of accounting rules, underreport fundraising expenses, thus minimizing (in other words, optimizing) one of the three key ratios (Keating et al., 2006; Krishnan et al., 2006), it is an open question as to whether organizations actually attempt to optimize the ratios together. This research aims to address this gap in the research.
Optimizing the Financial Ratios
Three of the most common standards seen across all the evaluators are variations on financial ratios the BBB has used for decades, and that originated with government regulations in the mid-twentieth century. The program expense ratio measures the expenses allocated for mission-related activities as a proportion of total organizational expenses. The BBB standard is a proportion of at least 65%. In contrast, AIP refers to 60% as a “C-level” indicator, while proportions over 75% are indicative of “highly efficient charities.” 10 Although CN uses different scoring systems for organizations in different subsectors, it also states that a higher proportion is better, and expects 3- or 4-star organizations to achieve a program ratio greater than 2/3 across all subsectors, with 4 stars being awarded for higher scores, albeit at different maximization points depending on subsector. Fundraising and administrative ratios function similarly, but inversely. The BBB and AIP ratios measure the total cost of fundraising as a proportion of public contributions (that is, the amount of money spent soliciting public contributions divided by total public contributions), while CN measures fundraising expenses as a proportion of total expenses as well as total public contributions. The BBB standard is less than 35%, and AIP and CN award higher scores for lower ratios. Finally, administrative expenses are assessed by CN, and were assessed in past incarnations of the BBB standards, with expectations that such expenses be no more than 10% of total expenses, and for CN, the lower the better.
Although the literature does not generally distinguish between the third-party evaluators (Szper & Prakash, 2011), the conformance model and the optimization model have potentially different implications for evaluation and, as previously stated, nonprofit strategies. If nonprofits perceive an incentive to follow an optimization strategy, the effect of the BBB evaluation process could be very different from the intent (Cnaan et al., 2011). The stated mission of the BBB is to evaluate according to its standards as a threshold:
The [BBB Wise Giving] Alliance does not rank charities but rather seeks to assist donors in making informed judgments about charities soliciting their support.
11
However, there is anecdotal evidence that some nonprofits use performance on the BBB standards not as a benchmark, but as evidence of superior performance. For example, the following quotes were taken from the web sites of two organizations in the study area:
Our performance exceeds the Better Business Bureau Standards for Charity Accountability, due to our stringent cost controls. Our extremely low administration costs ensure that the vast majority of your donation goes to [our program services]. An impressive 92 cents of every dollar that you donate goes directly to [our programs], far exceeding the BBB recommendations.
12
To further illustrate why an optimization strategy might be preferred to conformance, consider the two pie charts shown in Figure 1, which are similar to those used in both BBB and CN reporting.

Figure 1. Expense categories for two hypothetical nonprofits.
Both of these hypothetical organizations have ratios that meet the BBB standards. However, the visual depiction is rather stark. The charity on the left has the following expense ratios: 67% on programs, 23% on fundraising, and 10% on administrative. On the right, the breakdown is 92% on programs, 6% on administrative, and 2% on fundraising. Although the BBB would specify that there is no implied ranking and that both of the charities meet the financial standards, the visual depiction could tell a different story. This discrepancy results in an additional star from CN, and higher odds of a donation to the charity on the right (Sargeant, West, & Ford, 2004). While not suggesting that this is the only or primary information a donor may consider, it is not unreasonable to suspect that, for some donors, this difference may be key. Therefore, an incentive exists for nonprofits to optimize financial ratios regardless of the intent of the evaluator.
Data and Method
The main expectation driving this research is that organizations that undergo an external evaluation by a third-party evaluator will respond to an incentive to optimize expense ratios by reporting higher scores on program ratios and lower scores on fundraising and administrative ratios than organizations that are not evaluated by a third-party evaluator. As the data used are at the local level, the third-party evaluator in this case is the BBB, which was more active locally than CN at the time the data were collected. There are two ways that the BBB selects the nonprofits it evaluates: either because the nonprofit requests such an evaluation or because the BBB has received inquiries about the organization. In either case, evaluated organizations are those that actively solicit the public for donations or external organizations for grants and contracts—in other words, organizations assessed by the BBB are trying to differentiate themselves in the donation/grant market, and are the very organizations likely to see value in optimization of ratios. It is expected that the majority of organizations will meet BBB thresholds, and that optimization strategies may be indicated by the presence of consistently higher program and lower overhead ratios.
Nonprofit data were collected from the Columbus (Ohio) Foundation’s effort to create a central repository of nonprofit information for local donors. 13 This sample includes 290 organizations in the 2006 fiscal year. These data were supplemented with evaluations conducted by the Central Ohio BBB during the prior year, 2005. As can be seen in Table 2, about 18% of the organizations have been evaluated. Of the 53 organizations that were evaluated by the BBB, 24 (45%) met all 20 of the BBB standards, 18 (34%) did not disclose enough information for a full evaluation to take place, 14 and the remaining 11 organizations disclosed all information, but failed to meet at least one of the BBB standards.
The BBB evaluation can be thought of as a “treatment” that affects the outcome of scores on financial ratios. However, selection into this “treatment” is likely nonrandom; for example, human service organizations are overrepresented in the BBB-evaluated group relative to the total sample, which may introduce a selection bias problem. Furthermore, the full sample is not representative of the full breadth of a nonprofit organizational environment. Besides being located in one geographic market, the data set underrepresents religious organizations, which are often among the largest nonprofit organizations at the local level (Gronbjerg & Paarlberg, 2001). However, the sample includes the largest, most well-known local organizations that are actively soliciting both the public and local foundations. That is, this is an illustrative set of organizations for which area donors will have information to compare relative merits of organizations against one another. However, the data may exclude nonprofits for which donors focus on congruence between an organization’s mission and their own values when making donation decisions. Under the latter circumstances, which one would expect to be relevant for religious organizations, solicitation of the external environment is less critical to resource acquisition (Voss, Cable, & Voss, 2000). Finally, 2006 was selected as a representative year to study. The financial collapse in late 2007 and 2008 both increased demand for nonprofit services and decreased nonprofit donations, 15 so in 2006 nonprofits were less likely to be conducting business in an austere manner, which could affect ratios independent of environmental pressures to optimize.
The research question can be conceived as an experimental evaluation of the effect of a BBB evaluation on the following expense ratios:
Program expense ratio (program expenses/total expenses)
Fundraising expense ratio (fundraising expenses/public contributions)
Administrative expense ratio (administrative expenses/total expenses)
If there were an incentive to optimize financial ratios, then higher program expense ratios, lower fundraising, and lower administrative expense ratios would be expected for organizations that have undergone a BBB evaluation.
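As a minimal sketch, the three outcome ratios defined above can be computed as follows. The function and argument names are hypothetical, and the sketch assumes that total expenses equal the sum of program, fundraising, and administrative expenses, which may not hold for every reporting format.

```python
def expense_ratios(program, fundraising, administrative, contributions):
    """Compute the three expense ratios used as outcome variables.

    Assumption (not from the source): total expenses are the sum of the
    three expense categories passed in.
    """
    total = program + fundraising + administrative
    return {
        # Program expenses as a share of total expenses.
        "program": program / total,
        # Fundraising expenses relative to public contributions.
        "fundraising": fundraising / contributions,
        # Administrative expenses as a share of total expenses.
        "administrative": administrative / total,
    }


# Hypothetical organization with US$1,000 in expenses and US$1,000 raised.
ratios = expense_ratios(program=670, fundraising=230,
                        administrative=100, contributions=1000)
print(ratios)
```

Note that the fundraising ratio uses a different denominator (public contributions) than the other two ratios, so the three values need not sum to one.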
Given the potential selection bias issue, propensity score matching (PSM) methods are most appropriate. Considering the way the organizations are selected for a BBB evaluation, evaluated organizations may be more likely to have optimized ratios independent of the evaluation. The PSM procedure uses a comparison group to determine the average effect of the treatment on the treated (ATT) by matching organizations that are similar on key covariates but differ with respect to whether an evaluation was conducted (Rosenbaum & Rubin, 1983). A key assumption of PSM is that the covariates used in the matching procedure are exhaustive to the extent that the treatment assignment, while not random, is “as good as random” (Heinrich, Maffioli, & Vazquez, 2010, p. 16). The ATT estimates the effect of a treatment, conditional on this set of covariates X; the covariates are used to match an observation that has been treated with one or several observations that have not been treated, but are statistically comparable in terms of X. The ATT can be expressed as
$$\mathrm{ATT} = E(Y_1 - Y_0 \mid T = 1) = E(Y_1 \mid T = 1) - E(Y_0 \mid T = 1)$$
where E estimates the average change in one of the outcomes, Y (program ratio, fundraising ratio, or administrative ratio) given administration of the treatment, T (an evaluation by the BBB). The problem is that for any individual evaluated organization, Y1 is observed while Y0 is an unobserved counterfactual. In PSM, similar organizations are matched to create functional counterfactuals of Y0 such that unevaluated organizations’ outcomes can be compared with the true outcomes for organizations that have received the BBB evaluation, Y1.
Using X, the matching procedure estimates the propensity of organizations to be evaluated and creates stratifications of similar organizations with similar propensities to be evaluated. This is done through a logistic regression with the treatment variable (a dichotomous indicator of BBB evaluation) as the dependent variable, and the aforementioned covariates X as the independent variables. Matched organizations have similar characteristics and similar propensities to be evaluated (as estimated by the predicted probabilities obtained in the logistic regression), but vary with respect to whether an evaluation has actually taken place or not. From these (one-to-one or one-to-many) matched pairs, the three ATT effects can be calculated.
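One way to turn stratified propensity scores into an ATT estimate is sketched below. This is an illustration under stated assumptions rather than the exact procedure used in this study: the propensity scores are assumed to come from a previously fitted logistic regression (not shown), the cut points that define the strata are hypothetical, and the records are synthetic values made up for the example. Within each stratum the estimator takes the difference in mean outcomes between evaluated and nonevaluated organizations, then weights each difference by the stratum's share of evaluated organizations.

```python
from statistics import mean


def stratified_att(records, cut_points):
    """Estimate the ATT by propensity-score stratification.

    records: iterable of (treated, propensity, outcome) tuples; the
        propensity is assumed to be a predicted probability from a
        logistic regression fitted elsewhere.
    cut_points: ascending propensity values separating the strata
        (hypothetical; in practice strata are chosen so that covariates
        balance within each one).
    """
    strata = {}
    for treated, propensity, outcome in records:
        # Stratum index = number of cut points at or below the score.
        index = sum(propensity >= cut for cut in cut_points)
        group = strata.setdefault(index, {True: [], False: []})
        group[bool(treated)].append(outcome)

    weighted_sum = 0.0
    treated_count = 0
    for group in strata.values():
        # A stratum lacking either group offers no counterfactual; skip it.
        if not group[True] or not group[False]:
            continue
        difference = mean(group[True]) - mean(group[False])
        weighted_sum += difference * len(group[True])
        treated_count += len(group[True])
    if treated_count == 0:
        raise ValueError("no stratum contains both treated and untreated cases")
    return weighted_sum / treated_count


# Synthetic records: (evaluated?, propensity score, program expense ratio).
records = [
    (True, 0.10, 0.80), (False, 0.12, 0.74), (False, 0.15, 0.76),
    (True, 0.30, 0.85), (True, 0.35, 0.83), (False, 0.32, 0.78),
    (False, 0.60, 0.70),  # no evaluated counterpart in its stratum
]
att = stratified_att(records, cut_points=[0.2, 0.5])
print(round(att, 3))
```

Weighting by the number of evaluated organizations in each stratum is what makes this an estimate of the effect on the treated rather than an average effect over the whole sample.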
The covariates, Xi, fall into three categories. First, as ratio optimization is conceived as a signal to the resource market (Barman, 2008), revenue sources for organization i are included, measured by the proportion of annual revenue received from the following sources: foundations and corporations, federal, state, and local governments, donations, special events, membership fees, earned income, and in-kind contributions. Second, while classification has long been difficult in the nonprofit sector (Barman, 2013; Moulton & Eckerd, 2012; Salamon & Anheier, 1992), it is important to acknowledge the diversity of organizations. Xi therefore also includes a set of indicator variables for the NTEE category for organization i. The major level of the NTEE classification system is used: arts, education, environmental, health care related, human service, and social benefit. The subsectors and number of organizations in each subsector are provided in Table 1.
Table 1. Subsectors and Average Efficiency Ratios.
Finally, Xi also includes total organizational assets, total expenses for the fiscal year, the organization’s age, and a dichotomous proxy of professionalism indicating whether or not financial reports were professionally audited. 16 The full summary statistics for the outcomes, the treatment, and the covariates are provided in Table 2.
Table 2. Summary Statistics (N = 290).
Note. BBB = Better Business Bureau.
Results
Several different PSM procedures can be used, including matching one-to-one or one-to-many cases, with or without replacement, and with nearest neighbor, kernel, or stratification procedures (Heinrich et al., 2010). The stratification procedure described by Dehejia and Wahba (2002) was determined most appropriate. 17 Organizations were divided into strata in which the mean propensity scores were statistically equivalent and then tested within stratum for statistical equivalence between the BBB evaluated group and nonevaluated group on each of the covariates in X. This process resulted in three strata and one unmatched case, which is a sufficient level of stratification given the sample size (Frumkin et al., 2009; Rosenbaum & Rubin, 1983).
Table 3 presents results of several different analyses. First, it includes the ATT for each of the three outcome variables from the PSM procedure. Second, as none of the covariates were especially strong predictors of an organization undergoing a BBB evaluation (thus indicating that selection bias, while present, is likely minimal) and also acknowledging the possibility that the covariates in X may not be exhaustive of all of the explanatory factors that predict whether the BBB conducts an evaluation, regression coefficients are also included for each of the three outcome variables. These regression coefficients were derived from ordinary least squares (OLS) models with the ratio outcomes as dependent variables, and the BBB indicator and covariates as independent variables. Given the similarity of the ATT results and the regression coefficients, both are provided as indications of the robustness of results to different model specifications.
Estimates of the Average Treatment Effects of BBB Evaluated Organizations.
Note. Covariates were included in both models but are excluded from the results presented here. BBB = Better Business Bureau; PS = propensity score; OLS = ordinary least squares; ATT = average effect of the treatment on the treated.
p < .01 difference is statistically significantly different from zero.
Organizations that are assessed by the BBB tend to report about US$.06 or US$.07 (per US$1 spent) more on programs than organizations that are not assessed. Similarly, evaluated organizations report administrative expense ratios that are approximately 6 cents per dollar lower than nonevaluated organizations. Fundraising ratios are not statistically different in either the PSM or the regression models.
Table 4 provides a context for this result and also corroborates Bhattacharya and Tinkelman’s (2009) finding that most organizations meet the subset of financial BBB standards and the BBB governance-related standards as well. Virtually all organizations, whether evaluated or not, met board-related requirements (BBB Standards 2 and 3) and the fundraising standard (BBB Standard 9). Fewer nonevaluated organizations met the program expense (BBB Standard 8), asset accumulation (BBB Standard 10), and professional auditor (BBB Standard 11) standards. 18
Proportion of Organizations Meeting BBB Standards.
Note. Not all BBB-assessed organizations meet the BBB standards; Administrative ratios are not included here as the BBB does not assess an administrative ratio standard currently, although it has in the past, and CN does currently. BBB = Better Business Bureau; CN = Charity Navigator.
Discussion
Given this widespread conformance, the more optimized ratios of the evaluated organizations are consistent with a hypothesis that evaluated organizations tend to follow optimization strategies. However, it is worth noting that this result could be due to any of a number (or combination) of factors beyond a strategic choice. Organizations that are evaluated by the BBB are, presumably, more aware of the standards and may optimize as a buffer against future nonconformance. Furthermore, although the models controlled for various contextual variables and selection bias, organizations that become known enough to be evaluated by the BBB may simply be more efficient, and thus report more optimal scores. Nonetheless, nonprofits may also be following an optimization strategy, which, even though only a partial explanation, has potentially important implications.
As shown in Table 5, most organizations in the sample are in no danger of failing to meet any BBB Standards for which data were available here, with the exception of the professional auditor requirement (which seems relatively easy, if not necessarily inexpensive, to remedy). If almost all organizations conform as a matter of course, then a conformance strategy confers no differentiation benefits in the donation market (Barman, 2002), and conformance may have more to do with institutional pressures (DiMaggio & Powell, 1983) than alignment between the standards and how nonprofits would behave absent the conformance intent (Eckerd & Moulton, 2011). For example, the mean fundraising ratio across the sample is 4%, the median is 1.6%, the maximum is 40%, and both the mode and first quartile are 0%. First, most organizations are not close to the 35% threshold. Second, it is implausible that one quarter of the organizations accrued no fundraising expenses during the year. This raises concerns that fundraising costs are being calculated, at best, in error or, at worst, to prevaricate (Keating et al., 2006; Krishnan et al., 2006; Tinkelman, 2006). Given this result and the public attention that has been focused on fundraising, there may be unique aspects of fundraising expenses that merit further consideration.
Comparison for BBB Assessed Organizations.
Note. BBB = Better Business Bureau.
p < .05 that the two means are statistically equivalent.
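The contrast between reading the fundraising ratio as a pass/fail threshold and reading it as a scale can be sketched as follows. The 35% figure is the threshold cited in the discussion above, treated here as an upper bound; the function names and the eight ratios are hypothetical and do not come from the study data.

```python
from statistics import mean, median

FUNDRAISING_CAP = 0.35  # threshold cited in the text, assumed to be an upper bound


def conforms(ratio, cap=FUNDRAISING_CAP):
    """Threshold reading: an organization either meets the standard or not."""
    return ratio <= cap


def summarize(ratios, cap=FUNDRAISING_CAP):
    """Scale reading: describe where ratios fall relative to the threshold."""
    n = len(ratios)
    return {
        "share_conforming": sum(conforms(r, cap) for r in ratios) / n,
        "share_reporting_zero": sum(1 for r in ratios if r == 0) / n,
        "mean": mean(ratios),
        "median": median(ratios),
        "max": max(ratios),
    }


# Hypothetical fundraising ratios for eight organizations.
ratios = [0.0, 0.0, 0.01, 0.02, 0.03, 0.05, 0.10, 0.40]
summary = summarize(ratios)
```

Under the threshold reading, seven of the eight hypothetical organizations are indistinguishable from one another; only the scale reading separates them, and it also surfaces the share reporting zero fundraising expenses.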
The trends for the program and administrative ratios present interesting considerations about evaluation and management strategies. Despite limited evidence that either conforming or optimizing ratios results in increased revenue (Frumkin & Kim, 2001; Sloan, 2009; Tinkelman & Mankaney, 2007), organizations may perceive benefits of optimizing as a resource acquisition strategy. While evaluated and unevaluated nonprofits report similar fundraising ratios, evaluated organizations tend to categorize an additional US$.06 per US$1 spent in program expenses and a corresponding decrease in administrative expenses. The PSM model suggests that there is a systematic relationship between being evaluated and reporting higher program expense ratios and lower administration ratios. While there could be alternative explanations, this result is consistent with evaluated organizations perceiving some value in optimizing expense ratios.
The results have implications for the evaluators and their different perspectives on the utility of the expense ratios. The BBB and CN have tended to view each other as competitors. This research suggests that their roles may be different. Given the widespread compliance, the BBB approach may be best in helping donors identify very poor performers, while CN can help donors differentiate among the vast majority of organizations that conform to standards, assuming comparable levels of program effectiveness. However, there is no evidence that this is a reasonable assumption and, to be clear, treating program expense ratios as comparable across organizations, and a higher ratio as better, depends on it. Given the implausibility of the assumption, a conformance model is the best approach. The role of the evaluator is thus to raise red flags about particularly problematic organizations, but not to assert a difference in organizational quality based on “better” financial ratios or, for that matter, “better” transparency, “better” governance, or “better” fiscal management. While we may all agree that efficient, transparent, well-governed, and well-managed nonprofits are better than the opposite, there is no particular reason to expect that a more efficient, more transparent, better-governed, and better-managed nonprofit is comparatively superior unless two organizations are roughly similarly effective at achieving their missions. Given that this information is idiosyncratic and usually not available in any widespread manner, donors should be encouraged to investigate charities that meet the evaluators’ thresholds in more detail rather than relying on relative performance on a set of standards that does not reflect actual organizational performance.
Conclusion
Two conceptual models of nonprofit financial ratios have been used: as dichotomous indicators of minimum conformance or as quantitative scores of comparative performance. This research focused on whether nonprofits tended to treat the measures as minimum conformance thresholds or as metrics to optimize as a response to external pressures to differentiate themselves in the resource/donation marketplace. The results show that nonprofits evaluated by the BBB tend to have higher program expense and lower administrative expense ratios. While there could be many reasons for this difference (Westphal et al., 1997), this could be evidence that organizations are employing an optimization strategy. While the explicit use of such a strategy was not directly observed here, the efficiency measured by the ratios is only relevant as a performance indicator if the organization’s programs are effective. Optimization could run counter to an effective strategy as it could affect the extent to which nonprofits are willing or able to provide the complex services for which they are relied upon and that enable nonprofits to achieve their missions. An optimization strategy could, in part, encourage nonprofits to alter both their operational and resource acquisition strategies or, at the extreme, misreport activities—all activities that could lead to mission drift, or worse, put organizational missions at risk.
The questions raised by this study merit additional consideration. Nonprofit organizational evaluation has been a sector norm for two decades, but the empirical effect of these evaluations on the nonprofit sector has not been studied extensively. Many have argued that rigorous, valid, outcome-oriented program evaluation should be the norm for assessing nonprofit organizations. This normative argument is important and crucial, but this research is intended to start a conversation about the implications of how nonprofits are currently being evaluated. It is posited that nonprofits may be following a strategy to optimize performance on measures that were, at least initially, intended only to be dichotomous threshold indicators.
Footnotes
Acknowledgements
The author thanks Stephanie Moulton, Matt Dull, Erika Braunginn, and the anonymous reviewers.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
