Abstract
Surveys provide a critical source of data for scholars, yet declining response rates are threatening the quality of data being collected. This threat is particularly acute among organizational studies that use key informants—the mean response rate for published studies is 34 percent. This article describes several response enhancing strategies and explains how they were implemented in a national study of organizations that achieved a 94 percent response rate. Data from this study are used to examine the relationship between survey response patterns and nonresponse bias by conducting nonresponse analyses on several important individual and organizational characteristics. The analyses indicate that nonresponse bias is associated with the mean/proportion and variance of these variables and their correlations with relevant organizational outcomes. After identifying the variables most susceptible to nonresponse bias, a final analysis calculates the minimum response rate those variables needed to ensure that they do not contain significant nonresponse bias. Heuristic versions of these analyses can be used by survey researchers during data collection (and by scholars retrospectively) to assess the representativeness of respondents and the degree of nonresponse bias variables contain. This study has implications for survey researchers, scholars who analyze survey data, and those who review their research.
Surveys provide a critical source of data for scholars, yet declining response rates are threatening the quality of data being collected (Curtin, Presser, and Singer 2005; Groves and Peytcheva 2008; Massey and Tourangeau 2013). This threat is particularly acute among organizational studies—the mean response rate for published studies is less than 50 percent and declining (Anseel et al. 2010; Baruch 1999; Werner, Praxedes, and Kim 2007). Furthermore, response rates vary by the respondent’s position in the organizational hierarchy; the higher the position, the less likely the person will respond (Baruch and Holtom 2008; Cycyota and Harrison 2002). A meta-analysis of response rates for published studies that sampled organizational leaders found a mean response rate of 34 percent (Cycyota and Harrison 2006). These dismal response rates raise concerns about the representativeness of organizational studies and their heightened potential for containing significant nonresponse bias.
While data on response rates are ubiquitous, substantially less is known about the degree of nonresponse bias present in survey data (Groves and Peytcheva 2008). Even though scholars have developed several innovative methods to assess nonresponse bias within survey data, many of these methods remain underutilized (Werner et al. 2007). This is partly because many scholars have fixated on response rates to the neglect of assessing nonresponse bias (Bradburn 1992; Martin 2004). It is also partly because conducting nonresponse analyses adequately and identifying relationships between survey response patterns and nonresponse bias require studies with sufficiently high response rates (Rogelberg and Stanton 2007).
Despite deteriorating response rates, technological advances have produced several cost-effective strategies researchers can use to increase response rates and improve data quality. This article describes several response enhancing strategies related to involving key stakeholders, leveraging technology, and appealing to respondents’ interests, and it explains how these strategies were successfully implemented in a national study of organizations that achieved a 94 percent response rate.
This article also increases knowledge about nonresponse bias by using data from the national study to conduct four different nonresponse analyses, which examine the relationship between survey response patterns and nonresponse bias. The first two analyses assess whether the early respondents differ significantly from the late respondents on several important individual and organizational characteristics. The third analysis regresses the number of days it took respondents to complete the survey on the key variables to identify which characteristics are related to response time. These analyses indicate that nonresponse bias is associated with the informant’s race, nativity, education level, and employment status, and with the organization’s revenue, number of employees, age, geographic scope, and technological sophistication, and with their correlations with relevant organizational outcomes. After identifying the variables most susceptible to nonresponse bias, the fourth analysis calculates the minimum response rate those variables needed to ensure that they do not contain significant nonresponse bias. Heuristic versions of these analyses can be used by survey researchers during data collection (and by scholars retrospectively) to assess the representativeness of respondents and the degree of nonresponse bias variables contain.
In summary, declining response rates among organizational studies threaten the quality of data being collected, and the limited empirical attention given to assessing nonresponse bias undermines confidence in the data’s external validity. The response enhancing strategies and nonresponse analyses covered in this article advance survey research by helping to improve data quality and bolster confidence in its external validity. As a result, this article has implications for survey researchers, scholars who analyze survey data, and those who review their research.
Organizational Survey Research and Key Informants
In recent years, organizational studies have begun to rely more heavily on key informants to collect data (Cycyota and Harrison 2006). 1 Key informants tend to occupy a formal leadership position within an organization, and they are selected for studies because they often have access to proprietary information and they typically possess the most knowledge about the organization’s history and activities (Gupta, Shaw, and Delery 2000; Snijkers et al. 2013). Organizational leaders, however, are also the least likely to respond to organizational surveys (Anseel et al. 2010; Baruch and Holtom 2008; Cycyota and Harrison 2006). Their low responsiveness is a product of increasing requests to participate in studies (Baruch 1999), decreasing discretionary time (Cooper and Payne 1988), and a general reluctance to provide proprietary information (Falconer and Hodgett 1999). Consequently, surveys that rely on organizational leaders to respond as key informants face unique challenges in achieving sufficient response rates (Cycyota and Harrison 2002).
Nonresponse and External Validity
Many social scientists use response rates to assess the quality of survey data because surveys with higher response rates tend to contain less nonresponse bias and thus produce more accurate estimates (Biemer and Lyberg 2003). 2 When scholars use survey data to make inferences about a population, the statistical methods they use assume that a representative sample is being analyzed. Not only does the sample need to be representative of the population, but the subset of sampled respondents who provided data also needs to be representative (Blalock 1989). To make valid inferences using survey data, the distribution of characteristics of the respondents needs to reflect that of the target population (Fowler 2009). Surveys that fail to collect data from every respondent in the sampling frame jeopardize their representativeness and can undermine their capacity to provide unbiased population estimates and hypothesis tests. If a survey is lacking responses from a substantial number of respondents, it is likely to be less representative of the population and contain nonresponse bias (Groves 2006). Nonresponse bias, however, is not directly related to response rates; rather, it depends on the degree to which respondents differ from nonrespondents (Krejci 2010). When they differ significantly, nonresponse error will bias the sample and undermine its external validity (Rogelberg and Stanton 2007).
As a result, surveys with low response rates do not necessarily contain significant nonresponse bias (Holbrook et al. 2007; Peytchev 2013). 3 Lower response rates bias survey estimates only if respondents and nonrespondents differ in ways that are germane to the study’s objectives. If the reasons for nonresponse are uncorrelated with the variables being measured, then it is possible that the nonrespondents are essentially a random subset of the full survey sample—at least random with respect to the pertinent variables (Gelman and Hill 2007). If there are no systematic differences between respondents and nonrespondents, then the sample remains representative of the target population and can provide valid inferences (Anseel et al. 2010; Groves and Couper 1998). Thus, when using survey data to make inferences about a population, response representativeness is more important than the response rate (Cook, Heath, and Thompson 2000).
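The point that bias depends on respondent and nonrespondent differences, rather than on the response rate alone, can be made explicit with a standard identity for a mean. The notation is introduced here for illustration and does not appear in the original: N is the size of the full sample, R the number of respondents, M = N − R the number of nonrespondents, and the subscripted means are the respondent and nonrespondent means of a single variable.

```latex
\bar{Y} = \frac{R}{N}\,\bar{Y}_R + \frac{M}{N}\,\bar{Y}_M
\quad\Longrightarrow\quad
\bar{Y}_R - \bar{Y} = \frac{M}{N}\left(\bar{Y}_R - \bar{Y}_M\right)
```

Under this framing, the bias of the respondent mean is the nonresponse rate multiplied by the respondent and nonrespondent difference. A low response rate therefore produces no bias in that variable when the two groups have the same mean, and a high response rate can still leave bias when they differ sharply.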
Although low response rates do not necessarily undermine the representativeness of a sample, the two can be significantly related (Groves 2006). People who respond to surveys often differ in significant ways from those who do not respond (Rogelberg et al. 2000), and as response rates decline, the risk of a sample becoming less representative and containing nonresponse bias increases (Groves 2006; Tomaskovic-Devey, Leiter, and Thompson 1994). As a result, many scholars contend that response rates can provide a reasonable estimate of a sample’s representativeness and external validity.
Conducting Nonresponse Analyses
Even though response rates have become a standard proxy for assessing the representativeness of survey data and strategies can be used to substantially increase response rates, most studies will continue to have some percentage of nonresponse. Because missing cases can threaten a sample’s representativeness, it is important for studies to conduct nonresponse analyses. Scholars would expect this to be a standard practice; however, a meta-analysis of articles using organizational survey data found that less than one-third of the articles reported conducting nonresponse analyses (Werner et al. 2007). The rarity of this practice persists even though several methods have been developed to assess nonresponse bias (see Table 1). Despite these methodological developments, it often remains difficult to demonstrate the presence of nonresponse bias in a survey because most methods attempt to assess how respondents differ from nonrespondents when (by definition) data for the nonrespondents were not collected (Rogelberg and Stanton 2007). In some instances, researchers can evaluate nonresponse bias by obtaining data on nonrespondents from other sources (e.g., Lin and Schaeffer 1995; Lynn et al. 2002; Teitler, Reichman, and Sprachman 2003). However, even when data on nonrespondents can be obtained, the information tends to be limited to a few demographic characteristics and this constrains researchers’ ability to perform a comprehensive nonresponse analysis on all of the variables of interest.
Table 1. Methods for Conducting Nonresponse Analyses.
Being able to assess the nonresponse bias contained in each variable is critical because nonresponse bias occurs at the level of individual survey items rather than at the level of a survey (Assael and Keon 1982; Gile, Johnston, and Salganik 2015; Groves et al. 2006). Each variable possesses its own estimate of nonresponse bias and some variables may be more susceptible to nonresponse bias than others because they measure characteristics that are associated with nonresponse. Therefore, a nonresponse analysis becomes more comprehensive as the number of variables it analyzes increases.
Not only can the bias associated with nonresponse vary across variables, it can also have varying effects on different statistics of the same variable (Newman 2009; Peytchev, Carley-Baxter, and Black 2011). Many nonresponse analyses tend to examine only whether estimates of means/proportions contain significant bias; however, estimates of a variable’s variance can also contain nonresponse bias (Goodman and Blum 1996; Newman and Sin 2009). If respondents and nonrespondents have different distributions on the variables of interest, this can lead to biased significance tests. Furthermore, nonresponse bias can be independently associated with correlations between variables, and this can bias estimates of regression coefficients and hypothesis tests (Alexander et al. 1986; Lepkowski and Couper 2002). Because differences between respondents and nonrespondents can bias estimates of means/proportions, variances, and correlations independently, it is important for nonresponse analyses to examine each of these statistics.
Since most studies lack substantial data on nonrespondents, analyses of nonresponse bias tend to compare early respondents with late respondents (e.g., Curtin, Presser, and Singer 2000). For samples with high response rates, researchers can treat late respondents as proxies for nonrespondents (i.e., had data collection been stopped at an earlier point, the late respondents—those who responded after this point—would have been nonrespondents). Then they can simulate an analysis that compares variable estimates for respondents with those of “nonrespondents” and identify which are associated with nonresponse.
Key Informants and Nonresponse
Although several studies have analyzed survey response patterns, most have been conducted on surveys of individuals; only a few have been conducted on surveys of organizations (e.g., Gupta et al. 2000; Smith 1997; Tomaskovic-Devey et al. 1994). In a recent meta-analysis of 30 articles on survey nonresponse, only one was based on a survey of organizations (Groves 2006). Organizational respondents (i.e., key informants) and individual respondents weigh different factors when deciding whether to participate in a study and the processes affecting their response patterns differ as well (Snijkers et al. 2013; Tomaskovic-Devey et al. 1994). Thus, researchers need to consider how informants’ individual and organizational characteristics might uniquely influence their participation in an organizational survey.
Only a few studies have analyzed how key informants’ individual characteristics are associated with their response patterns (e.g., Gupta et al. 2000). Based on Tomaskovic-Devey and his colleagues’ (1994) organizational theory of survey nonresponse, individual characteristics that represent an informant’s capacity and motive to respond will be significantly associated with response outcomes. Informants who are full-time employees and have been in their position for several years will be more likely to have the capacity to complete a survey. In addition, informants who have college degrees are more likely to value academic research and be motivated to participate in a study (Groves and Couper 1998). With regard to organizational characteristics, Tomaskovic-Devey and his colleagues (1994) propose that an organization’s degree of complexity can influence an informant’s response patterns. Larger and older organizations as well as those with a wider geographic scope and greater technological sophistication tend to be more complex, which can inhibit an informant’s ability to collect and report information.
Data
To examine the relationship between survey response patterns and nonresponse bias, this study uses data from the National Study of Community Organizing Coalitions (NSCOC) (Fulton et al. 2011). The population for this study included every institution-based community organizing coalition in the United States that has an office address, at least one paid employee, and organizational members. Based on these criteria, the study identified 189 active coalitions by using databases from every national and regional community organizing network, databases from 14 foundations that fund institution-based community organizing, and archived IRS 990 forms. The NSCOC surveyed the entire field of these coalitions during the second half of 2011 by distributing an online survey to the director of every coalition. Respondents were asked to provide extensive data on their coalition’s history, finances, and activities as well as detailed demographic information on their organizational members, board members, and employees. When directors completed their participation in the study, they received a payment that ranged between US$25 and US$100 depending on the size of their coalition.
This census study achieved a response rate of 94 percent—gathering data on 178 of the 189 coalitions in the country and demographic information on the 4,145 member organizations, 2,939 board members, and 628 employees affiliated with these coalitions (American Association for Public Opinion Research 2011, response rate 5). This article uses the NSCOC to illustrate the effectiveness of implementing response enhancing strategies, to conduct nonresponse analyses for several key variables, and to identify the minimum response rate at which those variables no longer contained significant nonresponse bias.
Response Enhancing Strategies
To obtain high response rates, researchers rely on people’s willingness to participate in surveys. However, with the widespread proliferation of surveys, it is assumed that people have become less willing to participate (Dillman, Smyth, and Christian 2014; Rogelberg and Stanton 2007). 4 Many prospective respondents—especially organizational leaders—are flooded with requests to fill out a survey, which can produce survey fatigue and result in greater nonresponse (Gupta et al. 2000; Porter, Whitcomb, and Weitzer 2004; Weiner and Dalessio 2006). At the same time though, advances in technology have made collecting data more efficient and have produced strategies for increasing response rates and improving data quality.
Although organizational leaders are the least likely to participate in an organizational study, they possess unique motivations for participating in a study about their organization (Baruch 1999; Dillman et al. 2014; Tomaskovic-Devey et al. 1994). As a result, researchers can increase their likelihood of participating by using strategies that tap the leaders’ positional and organizational interests. Throughout the development and implementation of the NSCOC, the research team used strategies designed specifically to increase participation levels among the coalition directors.
Involving Key Stakeholders
To promote the NSCOC, increase its exposure, and cultivate ownership, the research team involved key stakeholders in designing and pretesting the survey. The key stakeholders included national-level organizing leaders, foundation directors, and academics, all of whom are known and respected by the coalition directors. Collaborating with these stakeholders to develop the survey not only generated interest in the study, it also improved the overall quality of the study (Hinkin, Holtom, and Klag 2007). To bolster the credibility of the NSCOC, the research team obtained public endorsements from the key stakeholders and other respected leaders throughout the field. Organizational leaders tend to be more willing to participate in a study and disclose proprietary information when it is endorsed by someone they know and trust (D’Aveni 1995; Falconer and Hodgett 1999). To help allay directors’ concerns regarding the study’s objectivity and confidentiality, the research team emphasized that it was sponsored by a respected research university (Bruvold, Comer, and Rospert 1990; Dillman et al. 2014; Edwards et al. 2002; Fox, Crask, and Jonghoon 1988). Furthermore, many directors recognize that participating in a university-sponsored study can benefit their organization and field because such a study can help to credential, legitimize, and promote their work, especially among lesser known organizational fields (Bonaccorsi and Piccaluga 1994).
Leveraging Technology
In conjunction with involving key stakeholders in developing and promoting the NSCOC, the research team used advances in technology to create a user-friendly online survey. In the last decade, researchers have increasingly used the Internet to administer surveys, not only because it is the least expensive mode, but also because it offers unique functions that can increase response rates and improve data quality (Anseel et al. 2010). Compared to phone and face-to-face surveys, online surveys eliminate inefficiencies associated with scheduling interview times and they allow respondents to participate when it is most convenient. Unlike paper surveys, online surveys can include features that automate skip patterns, carry forward respondents’ answers, and provide an option to “save and finish later,” all of which enhance the user’s experience and can lead to higher completion rates (Couper 2008; Redline and Dillman 2002). Researchers can also prepopulate the survey with data from the respondent’s organization, which can reduce data entry error and the amount of time it takes to complete the survey. Finally, when an online survey is completed, the returned data are already in an electronic format, which facilitates error checking and eliminates the costs and potential risks associated with inputting responses into a database.
Appealing to Respondents’ Interests
The research team also used several response enhancing strategies and procedures when they implemented the survey. Their strategies focused on appealing to the respondents’ personal and organizational interests. They included establishing a personal relational connection between the researcher and respondent, sending an advance notice about the study, highlighting its endorsers, emphasizing the study’s topical salience, and providing monetary incentives (Cycyota and Harrison 2006). Their procedures involved using technology to optimize the data collection process. They included personalizing correspondence, maintaining digital correspondence records, monitoring respondents’ survey progress, sending timely reminders, and providing customized deadlines and extensions. Furthermore, after respondents completed the survey, their responses were already in an electronic format. This enabled the research team to efficiently scan for missing values, conduct internal validation tests, and cross-check responses with external sources of information. As soon as any issues were identified, respondents were contacted and asked to provide the missing data and/or to clarify discrepancies. Although the effectiveness of these response enhancing strategies was not empirically tested via a randomized study, it is likely that implementing them helped to increase the response rate and completion rate of the NSCOC and improve the quality of its data.
Nonresponse Analyses
The second part of this article uses data from the NSCOC to examine the relationship between survey response patterns and nonresponse bias by conducting four nonresponse analyses. Because the degree of nonresponse bias can vary across survey items, each variable is analyzed individually. In particular, the analyses examine the variables that measure the individual and organizational characteristics that are commonly used in organizational research and/or are theorized to be related to response patterns (Gupta et al. 2000; Tomaskovic-Devey et al. 1994). The individual characteristics include the respondent’s gender, race, nativity, age, and education level as well as whether the respondent is a full-time employee and how long he or she has been the director. The organizational characteristics include the coalition’s annual revenue, number of employees, age, geographic scope, and whether the coalition uses Twitter. Because nonresponse bias can have varying effects on different statistics of the same variable, the analyses examine whether estimates for the mean/proportion and variance of these variables contain nonresponse bias. To examine whether nonresponse bias is associated with correlations between the key variables and relevant organizational outcomes, the analyses include the following outcome variables: whether the coalition met with a U.S. Representative, the number of people who attended the coalition’s largest event, and the number of newspaper articles the coalition was referenced in. Table 2 displays descriptive statistics for each of the variables used in the analyses.
Table 2. Descriptive Statistics for the Characteristics of Key Informants and Their Organization, and the Results of Comparing the Mean/Proportion and Standard Deviation of These Variables for Early and Late Respondents.
Source: 2011 National Study of Community Organizing Coalitions.
a Difference = Early Respondents – Late Respondents; Welch’s t-test used to test for equal means/proportions; Levene’s F-test used to test for equal variances.
b Logged values.
*p < .05. **p < .01. ***p < .001.
The first analysis examines whether the order in which respondents completed the survey is associated with their individual and organizational characteristics. It orders the cases based on the number of days it took the respondent to complete the survey and uses that response order to calculate the study’s cumulative response rate. Three graphs are made for each variable. The first set of graphs plots each variable’s cumulative sample mean/proportion as a function of the percentage of surveys completed (i.e., the cumulative response rate). Each point on the graph represents the estimated population mean/proportion after x percent of the respondents had completed the survey. The graphs also include the 95 percent confidence interval for each point estimate and a horizontal line indicating the variable’s final estimated population mean/proportion, which is calculated using responses from 94 percent of the entire population (see Figures 1A–4A for example graphs). The second set of graphs plots each variable’s cumulative sample standard deviation as a function of the percentage of surveys completed (see Figures 1B–4B for example graphs). The third set of graphs plots the magnitude of each variable’s cumulative bivariate relationship with each of the selected outcome variables as a function of the percentage of surveys completed (see Figures 1C–4C for example graphs). 5

Figure 1. (A) Proportion of male directors. (B) Standard deviation of male directors. (C) Association between male directors and number of newspaper articles.

Figure 2. (A) Proportion of white directors. (B) Standard deviation of white directors. (C) Association between white directors and number of newspaper articles.

Figure 3. (A) Mean number of employees (Logged). (B) Standard deviation of number of employees. (C) Association between number of employees and number of newspaper articles.

Figure 4. (A) Mean organizational age (Logged). (B) Standard deviation of organizational age. (C) Association between organizational age and number of newspaper articles.
For some of the graphs (e.g., as displayed in Figure 1A, B, and C), the value for the cumulative estimate converges early in the data collection process to fluctuate randomly just above and below the final population estimate. This indicates that the order in which respondents completed the survey is unrelated to those variables and that their population estimates do not contain significant nonresponse bias, even when based on a relatively low percentage of completed surveys. However, for many of the graphs (e.g., as displayed in Figure 2A, B, and C), the value for the cumulative estimate is systematically higher/lower than the final population estimate. This indicates that the order in which respondents completed the survey is associated with those variables and that their population estimates would contain nonresponse bias, if they were based on a relatively low percentage of completed surveys. Yet, the graphs also indicate that for most of those variables the 95 percent confidence interval of the cumulative estimate contains the final population estimate by the time data had been collected from 34 percent of the sample—the mean response rate for organizational studies that use key informants. This suggests that although estimates for these variables contain nonresponse bias, the bias is not significant.
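The cumulative estimates behind these graphs can be sketched in a few lines of Python. This is a minimal illustration, not the study's own code: the function names, the normal-approximation confidence interval, and the toy data are all assumptions, and the input is taken to be one variable's values already sorted by the number of days each respondent took to complete the survey.

```python
import math
import statistics


def cumulative_estimates(values, population_size, z=1.96):
    """Cumulative mean, standard deviation, and 95% CI in response order.

    `values` holds one variable's responses, already sorted by completion
    time. `population_size` is the number of units surveyed, so k completed
    cases correspond to a cumulative response rate of k / population_size.
    The interval is a plain normal approximation (an assumption here).
    """
    rows = []
    for k in range(2, len(values) + 1):  # a standard deviation needs 2+ cases
        seen = values[:k]
        mean = statistics.fmean(seen)
        sd = statistics.stdev(seen)
        half_width = z * sd / math.sqrt(k)
        rows.append({
            "response_rate": k / population_size,
            "mean": mean,
            "sd": sd,
            "ci_low": mean - half_width,
            "ci_high": mean + half_width,
        })
    return rows


def covers_final(rows, final_estimate, response_rate):
    """Does the CI at the first point reaching `response_rate` contain the final estimate?"""
    for row in rows:
        if row["response_rate"] >= response_rate:
            return row["ci_low"] <= final_estimate <= row["ci_high"]
    raise ValueError("response rate never reached")


# Hypothetical binary indicator (1 = has the characteristic), in response order.
toy_values = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]
toy_rows = cumulative_estimates(toy_values, population_size=10)
toy_final = toy_rows[-1]["mean"]
```

Calling `covers_final(rows, final, 0.34)` for each variable mirrors the check described above: whether the interval at a 34 percent response rate still contains the final estimate. As the next paragraph of the article notes, that interval treats the cases seen so far as a random subset, so a pass on this check does not by itself rule out bias.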
Based on this analysis, the only apparent drawback with obtaining a low response rate is that population estimates will have larger standard errors, and for most organizational studies this limitation can be overcome by increasing the original size of the sample (assuming the response rate remains the same). However, although increasing the sample size will decrease the standard errors and produce more precise estimates, if variables contain nonresponse bias (e.g., as indicated in Figure 2A, B, and C), it is likely their population estimates based on a relatively low percentage of completed surveys will be significantly different from their true population value since the larger sample size will produce a narrower confidence interval that no longer contains the true population value. Moreover, the standard error and subsequent confidence interval calculated for the population estimates at each response rate level assume that the people who had responded by that point are a random subset of the sample (Gelman and Hill 2007). If they are not a random subset and instead possess characteristics that distinguish them from those who had not yet responded, then the population estimates will contain nonresponse bias. More precisely, if the respondents differ significantly from the nonrespondents along certain characteristics, then inferences about the population based on these characteristics will be biased. Most organizational studies, however, lack extensive data on nonrespondents, which inhibits their ability to determine whether and how the respondents differ from the nonrespondents.
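The interaction between sample size and bias can be illustrated with a small worked example. The numbers below are hypothetical and chosen only for illustration: a true population proportion of 0.50 and a respondent-based estimate of 0.58, with a normal-approximation interval for a proportion.

```python
import math


def proportion_ci(p_hat, n, z=1.96):
    """Normal-approximation 95% confidence interval for a proportion."""
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def contains(ci, value):
    """True when `value` lies inside the (low, high) interval `ci`."""
    low, high = ci
    return low <= value <= high


TRUE_PROPORTION = 0.50  # hypothetical population value
BIASED_ESTIMATE = 0.58  # hypothetical estimate from respondents only

# Same biased estimate, two different numbers of completed surveys.
small_sample_ci = proportion_ci(BIASED_ESTIMATE, n=60)
large_sample_ci = proportion_ci(BIASED_ESTIMATE, n=600)

small_covers_truth = contains(small_sample_ci, TRUE_PROPORTION)
large_covers_truth = contains(large_sample_ci, TRUE_PROPORTION)
```

With 60 completed surveys the interval is wide enough to include the true value. With 600 the interval tightens around the same biased estimate and excludes it, which is the situation the paragraph above describes.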
The second analysis addresses this limitation. It splits the NSCOC data set into two subsets, the early respondents and the late respondents, and simulates a nonresponse analysis that treats the late respondents as nonrespondents. The early respondents are defined as the first 34 percent of the sample to complete the survey and the late respondents as the remaining 60 percent of the sample that completed the survey after the early respondents (6 percent of the sample never completed the survey). The late respondents represent those who would have been nonrespondents had data collection been stopped after a 34 percent response rate had been achieved (the mean response rate for organizational studies that use key informants). The analysis then assesses whether the early respondents differ significantly from the late respondents along the same characteristics analyzed in the first analysis. Table 2 reports the difference in means/proportions and standard deviations between the early and late respondents for each variable and indicates whether the difference is significant.
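A minimal sketch of this early versus late comparison follows, using only the Python standard library. The function names and the rounding rule for the cutoff are assumptions made for illustration. The sketch computes Welch's t statistic and its approximate degrees of freedom, the test named in the note to Table 2. Obtaining a p-value requires a t distribution from a statistics package, and the Levene test for equal variances is omitted.

```python
import math
import statistics


def split_early_late(values, population_size, cutoff_rate=0.34):
    """Split response-ordered values at a cumulative response rate.

    Cases completed before the cutoff are "early"; the rest are "late" and
    stand in for nonrespondents. Rounding the cutoff to a whole number of
    cases is an assumption of this sketch.
    """
    n_early = round(cutoff_rate * population_size)
    return values[:n_early], values[n_early:]


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    se2_a = statistics.variance(a) / len(a)  # squared standard error, group a
    se2_b = statistics.variance(b) / len(b)  # squared standard error, group b
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical values for one variable, sorted by days to completion.
toy_values = [1, 2, 3, 4, 5, 2, 4, 6, 8, 10]
toy_early, toy_late = split_early_late(toy_values, population_size=10, cutoff_rate=0.5)
toy_t, toy_df = welch_t(toy_early, toy_late)
```

The difference is computed as early minus late, matching the sign convention in the note to Table 2. In a study with 189 units, a 34 percent cutoff corresponds to roughly the first 64 completed cases under this rounding rule.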
The results indicate that the mean/proportion values for the early respondents differ significantly from those of the late respondents for more than half of the key informant characteristics analyzed. The early respondents were more likely to be white, U.S.-born, college-educated, and full-time employees. Furthermore, the magnitude of the differences is substantial—the average difference for the binary variables is 9 percentage points, and the maximum is 14 percentage points. 6 With regard to the organizational characteristics, the mean/proportion values for the early respondents differ significantly from those of the late respondents for all of the characteristics analyzed. The early respondents were more likely to be leading organizations that have a smaller annual revenue and fewer employees, are younger, have a narrower geographic scope, and do not use social media technologies such as Twitter. On average, the early respondents had an annual revenue that was US$172,000 (43 percent) less than the late respondents and 20 percent fewer employees. Consequently, had the NSCOC stopped collecting data after 34 percent of the respondents had completed the survey, organizations with these characteristics would have been overrepresented in the study.
The results also indicate significant differences in the standard deviation values for the early and late respondents for more than half of the characteristics analyzed. For all but one of these variables, the distribution of values for the early respondents had significantly less variance than those for the late respondents. This means that analyses which use only the early respondents would understate variances for these variables and provide biased inferences about the population. In this case, most of the variables with biased variances also contained bias in the estimates for their means/proportions; however, nonresponse can also bias variance estimates when estimates of means/proportions do not contain bias (Peytchev et al. 2011). Importantly though, the significant differences in estimates demonstrate that the early respondents in the NSCOC are not a random subset of the sample with regard to the variables analyzed.
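As a rough illustration of how a difference in proportions between early and late respondents can be tested, the following sketch applies a pooled two-proportion z-test. The choice of test, the function name, and the counts are assumptions made for illustration; the article does not specify which test was used for Table 2, and the numbers below are not NSCOC data.

```python
import math


def two_proportion_z_test(count_a, n_a, count_b, n_b):
    """Pooled two-proportion z-test; returns (difference, z, two-sided p)."""
    p_a = count_a / n_a
    p_b = count_b / n_b
    # Pooled proportion under the null hypothesis of equal proportions.
    pooled = (count_a + count_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_error
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a - p_b, z, p_value


# Hypothetical counts (not NSCOC data): 80 of 100 early respondents and
# 66 of 100 late respondents share some binary characteristic.
difference, z, p_value = two_proportion_z_test(80, 100, 66, 100)
print(f"difference = {difference:.2f}, z = {z:.2f}, p = {p_value:.3f}")
```

In this hypothetical case the 14 percentage point gap would be flagged as significant at the .05 level, which is the kind of result that would mark a characteristic as overrepresented among early respondents.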
When respondents differ systematically from nonrespondents, this can bias not only estimates of means/proportions and variances but also estimates of correlations between variables. Given the importance of having unbiased regression coefficients, this analysis also examines whether correlations between the key individual and organizational characteristics and relevant organizational outcomes differ for the early and late respondents. Table 3 reports the differences in bivariate regression coefficients between early and late respondents for the relationships between their characteristics and each outcome variable. The results indicate relatively little difference between early and late respondents in their likelihood of having met with a U.S. Representative. On the other hand, several differences exist in both the significance and magnitude of the relationship between characteristics of early and late respondents and the attendance at the organization’s largest event and its number of newspaper references. The significant differences indicate interactions between the relevant characteristics and being an early respondent that moderate their relationship with these two particular organizational outcomes. The implications of these differences can be better understood by comparing the coefficients for early respondents with those of all respondents. Regression analyses using only early respondents indicate that the respondent’s employment status, years in the position, and the age of the organization are significantly associated with the attendance at the organization’s largest event; however, these relationships are not significant in analyses that use all respondents. 
Conversely, analyses using all respondents indicate that the respondent’s nativity, age, and the geographic scope of the organization are significantly associated with the attendance at the organization’s largest event; however, these relationships are not significant in analyses that use only early respondents. Similarly, the number of newspaper references is significantly associated with the respondent’s race, nativity, the age of the organization, and whether it uses Twitter in only one of the two samples. Even though the outcome variables selected for this analysis are not relevant to every organizational study, the findings highlight the need to examine whether nonresponse bias is associated with correlation estimates.
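The comparison of coefficients between the two groups can be sketched with a Wald-type z-test for the equality of two coefficients estimated on separate, non-overlapping samples. This is one common form of the test and is offered as an assumption; the article states only that a Wald test was used. The function name and the input values are hypothetical.

```python
import math


def wald_test_equal_coefficients(b_early, se_early, b_late, se_late):
    """Test H0: b_early == b_late for estimates from two independent samples.

    Returns (difference, z, two-sided p). Independence is assumed because
    early and late respondents are disjoint groups.
    """
    difference = b_early - b_late
    se_difference = math.sqrt(se_early ** 2 + se_late ** 2)
    z = difference / se_difference
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return difference, z, p_value


# Hypothetical coefficients and standard errors, not values from Table 3.
difference, z, p_value = wald_test_equal_coefficients(0.5, 0.1, 0.1, 0.1)
print(f"difference = {difference:.2f}, z = {z:.2f}, p = {p_value:.4f}")
```

A significant result for a given characteristic corresponds to the interaction described above: the relationship between that characteristic and the outcome differs depending on whether the respondent answered early or late.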
Table 3. Differences in the Regression Coefficient Estimates between Early and Late Respondents in the Bivariate Relationships between Their Characteristics and Three Organizational Outcomes.
Source: 2011 National Study of Community Organizing Coalitions.
a. No non-U.S.-born early respondents reported meeting with a U.S. Representative.
b. Logged values.
c. Based on logistic regressions.
d. Based on Poisson regressions.
e. Difference = early respondents − late respondents; Wald test used to test for equal coefficients.
*p < .05. **p < .01. ***p < .001.
Overall, this analysis underscores the importance of examining several key variables and multiple statistics of those variables when assessing differences between early and late respondents. For example, although the mean and variance estimates for the organization’s size differ significantly for early and late respondents, the estimates for the correlation between the organization’s size and the relevant outcome measures do not differ significantly, whereas the opposite is the case for estimates of the organization’s geographic scope and use of Twitter. This analysis also illustrates that if the NSCOC had achieved a lower response rate, the significant differences between early and late respondents would have led to biased estimates for many of the variables analyzed in this study. Furthermore, this analysis provides evidence that these variables in organizational studies that achieve a response rate of 34 percent or less might contain significant nonresponse bias and thus produce biased estimates of means/proportions, variances, and correlations.
Another way to assess nonresponse bias is to analyze whether the number of days it took respondents to complete the survey is significantly associated with any of the respondents’ individual or organizational characteristics. Under conditions where respondents respond at random, there should be no relationship between the respondents’ response time and their characteristics. However, if the key informants who are more likely to respond early in the data collection process have certain characteristics, then one would expect to find a significant relationship between the response time and those characteristics. To test this, the variable days to complete was created to indicate the number of days the respondent took to complete the survey. The mean value is 41 days, and the median value is 32 days. Figure 5 displays the proportion of surveys completed as a function of the number of days it took respondents to complete the survey.

Figure 5. Days to complete survey.
The third analysis regresses the number of days to complete the survey on the key informants’ individual and organizational characteristics to identify which characteristics are significant predictors of response time. Table 4 reports the results of the multivariate Poisson regression model, which includes all of the variables analyzed in the previous analyses. Even when controlling for the other characteristics, survey response time remains significantly associated with the informant’s race, nativity, age, education level, and employment status. All else being equal, white directors and U.S.-born directors took 12 percent fewer days to complete the survey, college-educated directors took 25 percent fewer days, and directors who are full-time employees took 62 percent fewer days. Thus, if data collection for the NSCOC had been stopped once a relatively low response rate had been achieved, directors with these characteristics would have been overrepresented in the study. To illustrate the magnitude of these relationships, the model predicts that a U.S.-born, college-educated, white director will take 29 days to complete the survey whereas an immigrant, noncollege-educated, nonwhite director leading an equivalent organization will take 48 days to complete the survey.
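Percentage statements of this kind follow from the standard interpretation of a Poisson model with a log link, in which a coefficient b implies that the expected count is multiplied by exp(b) for a one-unit increase in the predictor. The sketch below shows that conversion. The coefficient is a hypothetical value chosen to reproduce a 12 percent reduction; it is not taken from Table 4, and the article does not describe how its percentages were computed.

```python
import math


def percent_change(coefficient):
    """Percent change in the expected count for a one-unit increase in a predictor."""
    return (math.exp(coefficient) - 1.0) * 100.0


# Hypothetical coefficient chosen so that exp(b) = 0.88; not from Table 4.
b_example = math.log(0.88)
print(round(percent_change(b_example), 1))  # -12.0
```

Because the model is multiplicative, the combined effect of several characteristics is the product of their individual factors, not the sum of their percentages.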
Table 4. Results of Nonresponse Analyses for the Variables Measuring Characteristics of Key Informants and Their Organization.
Note: b1, b2, and b3 represent the regression coefficient estimates for the likelihood of meeting with a U.S. Representative, the expected attendance at largest event, and the expected number of newspaper articles, respectively.
a. Logged values.
b. For each variable, the underlined response rate indicates the minimum rate needed for all of its estimates to stabilize.
*p < .05. **p < .01. ***p < .001.
The model also indicates that survey response time is significantly associated with the organization’s size and age. When controlling for the other characteristics, organizations with larger revenues and those that are older took significantly more days to complete the survey; however, organizations with more employees completed the survey in fewer days. This analysis demonstrates that the survey response patterns of the key informants in the NSCOC are significantly related to their personal characteristics and the characteristics of their organization. Moreover, it provides evidence that organizational studies that stop data collection after achieving a relatively low response rate might be more likely to overestimate the proportion of organizations with directors who are white, U.S.-born, college-educated, and full-time employees. On the other hand, such studies might be more likely to underestimate the proportion of organizations that are older, have more revenue, and have fewer employees.
Each of the nonresponse analyses indicates that important variables in the NSCOC and their correlations with relevant organizational outcomes would have contained significant nonresponse bias had data collection stopped earlier than it did. Given that studies with lower response rates have a higher risk of containing nonresponse bias, one way to eliminate (or greatly reduce) this risk is to achieve a 100 percent (or unusually high) response rate. In many instances, however, it is cost-prohibitive to achieve an unusually high response rate, and it might not be necessary. Eliminating the presence of significant nonresponse bias in key variables does not necessarily require an unusually high response rate; reducing nonresponse bias to a nonsignificant level can often be attained at a lower response rate.
Another way to think about reducing the nonresponse bias in a variable to a nonsignificant level is to identify the point in the data collection process at which the sample estimates for a variable stabilize around their population value (Hoegeman and Chaves 2008). In other words, as the individual and organizational characteristics associated with the response patterns of key informants become more known, researchers can more accurately estimate the minimum response rate needed to ensure that the respondents are representative of the target population with regard to the study’s variables of interest. Since nonresponse bias is a characteristic of individual variables, the minimum required response rate may differ depending on the characteristic being measured. Once researchers have calculated the percentage of cases needed for their key variables to stabilize and not contain significant nonresponse bias, collecting data on additional cases beyond that response rate threshold for the purposes of reducing nonresponse bias is not necessary.
The fourth analysis illustrates this diagnostic procedure by identifying the point in the data collection process at which the estimates for each variable stabilized. A sample estimate of a variable becomes stable when a sufficient percentage of cases have been collected such that its cumulative value comes within and stays within the 95 percent confidence interval of its final population estimate. To identify the stabilization point for a particular variable, the analysis replicates the process conducted in the first analysis. It repeatedly calculates the variable’s cumulative sample estimates for each new case that is added to the data set. As new cases are added, the cumulative estimates for the variable change and may move in and out of the confidence interval of their final population estimate. Thus, it is important to identify the point at which the cumulative estimates stay within their confidence interval.
Table 4 reports the response rate level at which the estimates for each of the key variables stabilized, and for each variable it indicates the minimum rate needed for all of its estimates to stabilize (see Figures 1A–4C for examples of this analysis). Most of the variables stabilized by the time 83 percent of the respondents had completed the survey. This indicates that extending the data collection effort to obtain additional cases provided only a marginal reduction in nonresponse bias for these variables. Although the additional cases reduced the standard error of the population estimates for these variables, the nonresponse bias they contain had already been reduced to a nonsignificant level. However, the variables measuring the director’s nativity, education level, and employment status did not stabilize until a response rate of approximately 90 percent had been achieved, and since these variables are critical for the NSCOC, it was worth allocating additional resources to achieve a response rate of 94 percent.
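The stabilization rule described above can be sketched for the mean (or proportion) of a single variable as follows. This is a minimal illustration under stated assumptions: the confidence interval is a normal-approximation interval around the final estimate, the function name is hypothetical, and the data are invented for the example and are not drawn from the NSCOC.

```python
import math
from itertools import accumulate


def stabilization_point(values, z=1.96):
    """Return the number of cases after which the cumulative mean enters and
    stays within the confidence interval of the final estimate.

    `values` must be ordered by survey completion time.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two cases are required")
    final_mean = sum(values) / n
    variance = sum((v - final_mean) ** 2 for v in values) / (n - 1)
    half_width = z * math.sqrt(variance / n)
    lower, upper = final_mean - half_width, final_mean + half_width

    cumulative_means = [
        total / (i + 1) for i, total in enumerate(accumulate(values))
    ]

    # Walk backward from the last case; the stabilization point is the start
    # of the final unbroken run of cumulative estimates inside the interval.
    stable_from = n
    for k in range(n, 0, -1):
        if lower <= cumulative_means[k - 1] <= upper:
            stable_from = k
        else:
            break
    return stable_from


# Hypothetical binary indicator in completion order (1 = has characteristic).
completed = [1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0]
print(stabilization_point(completed))  # 8
```

Dividing the returned count by the number of sampled cases expresses the stabilization point as a response rate. The same logic could be applied to a cumulative standard deviation or regression coefficient by substituting the appropriate estimate and interval.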
Although this stabilization analysis was conducted retrospectively, it can be conducted during data collection if the researcher has accurate population estimates for each key variable. For example, a researcher could obtain certain demographic information about a population of organizations via archives, tax records, or previous studies and use these data to assess the respondents’ representativeness (Olson 2013). When relevant population estimates are not available, researchers can use a heuristic version of this analysis (see, e.g., Bengtsson et al. 2012; Gile et al. 2015; Groves and Heeringa 2006). As surveys are completed, they can plot the cumulative estimates for each key variable and visually inspect the plots for stability. Data collection would continue until the cumulative estimates for each key variable appear to stabilize. In this scenario, achieving stability is not based on when the cumulative estimate comes within and stays within the 95 percent confidence interval of the population estimate. Rather, stability is achieved when the slope of the cumulative estimate plot line becomes approximately zero.
To illustrate how this analysis could be implemented during data collection, imagine a researcher wants to conduct a study to test a hypothesis about the race of directors and the amount of media attention their organization receives. As the researcher is collecting data for this study, he or she could construct plots similar to the ones displayed in Figure 2A, B, and C in real time, as surveys are completed. In the early stages of the data collection process, the researcher would notice that the estimates contain nonresponse bias. The real-time plot in Figure 2A would display the cumulative estimate for the proportion of white directors decreasing as the proportion of surveys completed increases. The constant decrease in estimates indicates that white directors are overrepresented among the early respondents. Similarly, the plot in Figure 2B would display the cumulative estimate for the standard deviation of white directors increasing as the proportion of surveys completed increases, indicating that the variance in the directors’ racial composition is being underestimated among the early respondents. Finally, the plot in Figure 2C would display the cumulative estimate for the correlation between the director’s race and the number of newspaper articles decreasing as the proportion of surveys completed increases, indicating that the magnitude of the relationship is being overestimated among the early respondents. To help ensure that the final sample is representative of the population’s racial composition and its correlation with receiving media attention, the researcher would continue to collect data until the cumulative estimates appear to stabilize (i.e., the slope of each plot line becomes approximately zero). In this case, the slope of each plot line becomes approximately zero by the time 61 percent of the surveys have been completed. 
By this point, each estimate appears to have stabilized, which suggests that the respondents who provided data are representative of the population in terms of their race and its correlation with receiving media attention. The researcher could replicate this procedure for each key variable and relevant outcome measure to help determine when enough surveys have been collected. With this approach, a researcher does not need prior knowledge of the variables’ population estimates in order to assess the representativeness of the respondents who have completed a survey by a certain point. Furthermore, any researcher who uses survey data can conduct this type of analysis to provide evidence that the respondents who provided data are representative by showing the stabilization plots for each key variable.
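The article describes this heuristic as a visual inspection of plots. A numeric proxy, offered here only as an assumption, is to fit a least-squares line to the most recent cumulative estimates and check whether its slope is close to zero. The window length, the tolerance, and the function names are hypothetical choices that a researcher would need to set for the variable and sample at hand.

```python
from itertools import accumulate


def cumulative_means(values):
    """Cumulative mean after each completed survey, in completion order."""
    return [total / (i + 1) for i, total in enumerate(accumulate(values))]


def window_slope(ys):
    """Least-squares slope of ys regressed on the positions 0, 1, 2, ..."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


def first_flat_point(values, window=5, tolerance=0.01):
    """Return the first case count at which the slope of the last `window`
    cumulative estimates is within `tolerance` of zero, or None."""
    means = cumulative_means(values)
    for end in range(window, len(means) + 1):
        if abs(window_slope(means[end - window:end])) <= tolerance:
            return end
    return None


# A steadily trending series never flattens under this rule.
print(first_flat_point([10, 9, 8, 7, 6, 5, 4, 3], window=3))  # None
```

A flat stretch early in data collection can be temporary, so a rule like this is best treated as a prompt to inspect the plot and not as an automatic stopping criterion.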
Discussion
Understanding how individual and organizational characteristics are related to survey response patterns can help survey researchers achieve representativeness among sampled respondents and reduce nonresponse bias. This study is among the first to examine those relationships within organizational studies (see also Gupta et al. 2000; Smith 1997; Tomaskovic-Devey et al. 1994). Although the population surveyed for the NSCOC is a particular organizational field, its diverse social composition is characteristic of the growing diversity within organizations (Braunstein, Fulton, and Wood 2014; Wood and Fulton 2015). The individual characteristics most consistently associated with nonresponse bias in this study are those related to the informant’s race, nativity, education level, and employment status. These results support the theory that informants without a college education will be less motivated to participate in a study and informants who are part-time employees will have less capacity to complete a study (Tomaskovic-Devey et al. 1994). Given these constraints, survey researchers could provide informants with these characteristics special attention to increase their likelihood of participating. It remains less clear why an informant’s race and nativity are associated with nonresponse bias. One possible explanation is that although the NSCOC used the strategy of involving key stakeholders in developing and promoting the study, most of the stakeholders were white and almost all were U.S.-born. Had the NSCOC involved stakeholders who more accurately represented the demographics of the field, it might have improved the response times of the nonwhite and immigrant key informants. Further research is needed to develop theories to explain the relationship between informants’ individual characteristics and their survey response patterns.
The organizational characteristics most consistently associated with nonresponse bias in this study are those related to the organization’s size and age. This finding is consistent with the pattern observed by Gupta and her colleagues (2000) that informants from larger and older organizations take longer to respond to surveys. This finding also supports the theory that organizational complexity can inhibit an informant’s capacity to respond (Tomaskovic-Devey et al. 1994), and it highlights the importance of implementing response enhancing strategies that account for organizational characteristics. For example, since larger organizations are slower to participate in studies, researchers can structure financial incentives to correspond with the size of the organization being surveyed. In the NSCOC, the financial incentive for completing the study was tiered in such a way that the largest organizations received four times the amount that the smallest organizations received. Every respondent was initially informed that they would receive US$25 for participating in the study. Once the size of their organization was confirmed, the respondents representing larger organizations were told that they would receive up to US$100. Although the actual dollar amount is not particularly substantial, knowing that they would receive four times the base incentive likely provided additional motivation for larger organizations to complete the study. Had the NSCOC not used this tiered incentive structure, the nonresponse bias among larger organizations likely would have been even greater (Singer and Ye 2013). In general, a better understanding of the mechanisms underlying survey nonresponse among key informants is needed in order to develop customized response enhancing strategies that target informants who are less likely to respond.
In addition to identifying the individual and organizational characteristics most susceptible to nonresponse bias, this study illustrates how survey response patterns can bias correlation estimates between these key variables and organizational outcomes. For example, had the NSCOC achieved a response rate near the mean for organizational studies, analyses would have inaccurately indicated significant relationships between the director’s employment status and tenure and the number of people attending the organization’s largest single event. Conversely, they would have failed to indicate significant relationships between the directors’ nativity and age and attendance. Because scholars often test hypotheses and make policy recommendations by conducting multivariate analyses of survey data, it is important for studies to assess nonresponse as a potential source of error in their significance tests. Even though this study is based on data from a single survey, the consistent patterns of nonresponse bias it identifies in estimates for the mean/proportion and variance of key variables and their correlations with organizational outcomes warrant conducting similar analyses on other surveys.
Conclusion
Because surveys provide a critical source of data for scholars, it is important to address issues that threaten the quality of data being collected and to conduct analyses that bolster confidence in the data’s external validity. In light of declining response rates among organizational studies and the limited empirical attention given to assessing nonresponse bias, this article seeks to improve the quality of survey research by (1) describing several response enhancing strategies, (2) identifying key variables susceptible to nonresponse bias, (3) estimating the minimum response rate needed to ensure that those variables do not contain significant nonresponse bias, and (4) offering methods researchers can use to assess the representativeness of respondents.
Despite the trend toward declining response rates, surveys that implement response enhancing strategies can achieve significantly higher response rates. Even though the NSCOC had a relatively small sample size, its strategies are applicable to studies of varying sizes. By involving key stakeholders in developing and promoting the study, leveraging technology in designing and implementing the survey, and appealing to the respondents’ personal and organizational interests, researchers can increase response rates and improve the quality of organizational survey data.
This study underscores the importance of determining an adequate threshold for response rates—or more precisely, an acceptable amount of nonresponse bias a study can contain. Keeter and his colleagues (2000) suggest that a study’s nonresponse bias is acceptably low if fewer than 15 percent of the variables contain bias. This approach, however, ignores differences in the substantive importance of the variables. Furthermore, Keeter and his colleagues test for nonresponse bias in only the estimates for the variables’ mean/proportion, even though nonresponse bias can have varying effects on different statistics of the same variable. Nonresponse bias can be associated with a correlation estimate between two variables and not be associated with the mean and variance estimates of those variables and vice versa. An alternative approach begins by identifying a study’s key variables of interest and then analyzes multiple statistics of those variables to assess whether they contain significant nonresponse bias.
Although nonresponse bias can be reduced greatly by achieving very high response rates, this goal needs to be weighed against the additional cost required to collect data from a very high percentage of the sample. Moreover, achieving a very high response rate is not necessary if the variables of interest tend to stabilize once data have been collected from a smaller percentage of the sample. The important point is for scholars to know which variables are critical for their analysis and then to determine if the response rate is sufficient to ensure that those variables do not contain significant nonresponse bias. This study indicates that a response rate ranging between 60 and 83 percent was needed for most of the NSCOC’s key variables to stabilize, and it suggests that similar variables in other studies would require a similar minimum response rate. In order to improve estimations for the minimum response rate needed for particular variables to stabilize, similar analyses need to be conducted on other organizational studies. In addition, this analysis illustrates how researchers can use responsive survey designs in which they adjust design features during data collection depending on survey response patterns (Groves and Heeringa 2006; Schouten, Calinescu, and Luiten 2013). Using a responsive design, especially when surveying a relatively unknown population, can both improve data quality and reduce costs.
Relying on survey data to make inferences about a target population rests on the assumption that the respondents who provide data are a representative sample of the population. Low response rates can undermine the external validity of the data because having a large proportion of sampled respondents not complete a survey increases the risk of variables containing nonresponse bias, which will produce biased population estimates and hypothesis tests. Even though editors and reviewers have a responsibility to ensure that studies have sufficient response rates and have been adequately tested for nonresponse bias (Campion 1993; Sullivan, Crocotto, and Carraher 2006), journals are publishing articles that use survey data with low response rates and do not report nonresponse analyses. While this practice may be a tacit recognition of the increased difficulty in obtaining high response rates, it does not explain why studies are not required to provide supplemental analyses assessing the representativeness of respondents. The nonresponse analyses presented in this article illustrate methods scholars can use to estimate the degree of nonresponse bias variables contain.
Acknowledgment
The author would like to thank Richard L. Wood, Mark Chaves, Martin Ruef, Jake Fisher, and the three anonymous reviewers for their helpful comments on earlier versions of the article.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by funding from the Center for Philanthropy and Volunteerism at Duke University. Funding for the NSCOC was provided by Interfaith Funders, along with secondary grants from the Hearst Foundation, the W. K. Kellogg Foundation, and the Louisville Institute.
