Abstract
Questions on voluntary association memberships have been used extensively in social scientific research for decades. Researchers generally assume that these respondent self-reports are accurate, but their measurement has never been assessed. Respondent characteristics are known to influence the accuracy of other self-report variables such as self-reported health, voting, or test scores. In this article, we investigate whether measurement error occurs in self-reports of voluntary association memberships. We use the 2004 General Social Survey (GSS) questions on voluntary associations, which include a novel resource: the actual organization names listed by respondents. We find that this widely used voluntary association classification scheme contains significant amounts of measurement error overall, especially within certain categories. Using a multilevel logistic regression, we predict accuracy of response nested within respondents and interviewers. We find that certain respondent characteristics, including some used in research on voluntary associations, influence respondent accuracy. Inaccurate and/or incorrect measurement will affect the statistics and conclusions drawn from the data on voluntary associations.
Questions on voluntary association memberships have been used extensively in social scientific research for decades. Researchers in sociology, political science, religious studies, development, public health, and medicine employ these questions as both predictors and outcomes. For example, researchers have used voluntary associations to predict individual health outcomes (Kawachi, Kennedy, and Glass 1999; Rönnerstrand 2014) and entrepreneurship (Kwon, Heflin, and Ruef 2013), community post-disaster recovery (Aldrich 2012), and country-level economic growth (Beugelsdijk and van Schaik 2005; Knack and Keefer 1997) and governance (Baggetta 2009; Lee 2007; Paxton 2002; Putnam 1995). Given their benefits, a flourishing subfield of sociology also investigates how participation in associations can, in turn, be predicted by factors such as the life cycle (Knoke and Thomson 1977; Oesterle, Johnson, and Mortimer 2004; Rotolo 2000a) or features of the state (Curtis, Baer, and Grabb 2001; Schofer and Fourcade-Gourinchas 2001). Some scholars use respondents’ overall number of memberships (e.g., Brehm and Rahn 1997; Horowitz 2015; Knack and Keefer 1997), while others make distinctions between types of associations (e.g., Alexander et al. 2012; Bekkers 2005; Coffe and Geys 2007; Cornwell and Harrison 2004; Häuberer 2014; Painter and Paxton 2014; Rotolo 1999; Stolle 2001; Stolle and Rochon 1998).
Despite this extensive scholarly engagement with voluntary associations, there has never been a systematic assessment of their measurement or the measurement of their features (see McPherson and Rotolo [1995] for the sole exception in a small, Nebraskan sample). Although some call for the inclusion of more categories in the GSS questions on voluntary associations (Baumgartner and Walker 1988; Paxton and Rap 2016), researchers generally assume that these self-reports of association memberships are accurate. Respondent characteristics influence the accuracy of other self-report variables. For example, self-reported health is influenced by gender, age, education, and psychological distress (Hughes et al. 1993; Kehoe et al. 1994; Kriegsman et al. 1996). Recent research suggests that the GSS featured substantial interviewer effects in its network generator module (Paik and Sanchagrin 2013; see also Eagle and Proeschold-Bell 2015; Josten and Trappman 2016), which may indicate that measurement error takes place elsewhere, including in the questions on voluntary associations. The potential for measurement error in these self-reports raises appreciable concerns, given that self-reported voluntary associations are the foremost measure of associations in use by social scientists.
In this article, we investigate whether measurement error occurs in self-reports of voluntary association memberships. We use the 2004 GSS questions on voluntary associations, which include a previously untapped data source: the actual organization names listed by respondents. We analyze the accuracy of respondent self-reports of voluntary association memberships compared to memberships as corrected by the researchers. We find that one of the widely used voluntary association classification schemes contains significant amounts of measurement error overall, especially within certain categories. Many organization names were miscategorized by respondents or suffer from other kinds of errors. Using a multilevel logistic regression, we predict accuracy of response nested within respondents and interviewers. We find that certain characteristics of respondents, including characteristics used in research on voluntary associations, influence respondent accuracy. We conclude by reiterating that the enduring importance of voluntary associations for social scientific research makes it crucial that scholars utilize a correct and accurate categorization. Inaccurate and/or incorrect measurement will affect the statistics and conclusions drawn from the data on voluntary associations.
Approaches to Respondent Measurement Error
Under the total survey error approach, error can occur at any stage of a survey, including in the respondent’s answer (Groves et al. 2009; Weisberg 2009). A respondent’s answer to a survey question is further understood to be a two-stage process: First, respondents must retrieve (remember) an initial response, and they then decide whether to edit the initial response to be more in line with a socially desirable answer (Cannell, Miller, and Oksenberg 1981; Sudman, Bradburn, and Schwarz 1996; Tourangeau, Rips, and Rasinski 2000; Tourangeau and Yan 2007). In the following paragraphs, we discuss both retrieval and editing processes generally and how they may pertain to survey questions on voluntary associations, focusing on measurement error.
Retrieval Errors
In the process of retrieving an answer, respondents must first understand the survey question and response categories the way the researcher intended, remember the information they need to answer the question, and accurately map the retrieved information onto one of the response categories (Strack and Martin 1987; Tourangeau et al. 2000; Weisberg 2009). And yet, “researchers often overestimate the familiarity of people with the topic they are studying, and as a result they word questions over the heads of respondents” (Weisberg 2009:74). Words in surveys are open to respondent interpretation (Bailey and Marsden 1999; Sudman et al. 1996), and respondents may “answer questions even when they do not know the meaning of the terms” (Weisberg 2009:74; see also Bishop et al. 1980). Moreover, answering survey questions about events can be cognitively demanding and respondents may be unable to remember all relevant information (Cannell et al. 1981; Krosnick 1991, 1999). The cognitive load of answering a survey question may be lighter for some. Respondents with higher levels of education have been shown to have lower survey error rates (Holbrook, Green, and Krosnick 2003). But answering survey questions may become increasingly fatiguing as the interview goes on, regardless of respondents’ initial cognitive abilities (Krosnick 1999). Finally, respondents can judge incorrectly how to map their answer onto the available answers, especially when answer categories are complicated (Fowler 1995; Weisberg 2009:84).
Retrieval errors in self-reports of voluntary associations
Typical voluntary association questions ask respondents not only to recall their memberships but to categorize them appropriately across many separate response categories. First developed by Verba and Nie (1972), the current GSS voluntary association question asks respondents whether or not they are a member of 16 types of organizations. Respondents are presented with the list of 16 organizations including fraternal groups, veterans’ groups, sports groups, hobby or garden clubs, school fraternities or sororities, and church-affiliated groups among others (exact wording appears in the Data section).
This question relies heavily not only on respondents’ comprehension of the concept of being a member of a voluntary association and their recall of all of their memberships but also on their comprehension of all the provided categories (e.g., fraternal). In addition to typical fatigue, education, or comprehension issues with the question, therefore, there may be a disconnect between the association types as conceived of by researchers and how respondents understand them. Do researchers and respondents share an understanding of a nationality group or a service group? What about a fraternal group? We expect some disconnect between the categories employed by the GSS and how respondents interpret those categories. Respondents may recall memberships that are not voluntary associations or may place a voluntary association in the wrong category. Prior research has in fact documented a difference between a bottom-up view of the voluntary association landscape and researchers’ top-down categories (Baumgartner and Walker 1988:914; Minkoff, Aisenbrey, and Agnone 2008; Paxton and Rap 2016; Walker, McCarthy, and Baumgartner 2011).
What features of respondents or their voluntary associations should influence the retrieval process? To begin, accurate recall of memberships by respondents should be associated with their years of experience with the association. Frequency of occurrence improves memory performance (Hasher and Zacks 1979) and longer-term members are more likely to have cemented an identity as a group member (Tajfel and Turner 1979), making recall less effortful. Thus, respondents with more years in a group should be better able to remember it when prompted. Respondents who are meeting locally with an organization rather than holding a passive membership in a nonlocal organization (Putnam 2000; Skocpol 2003; van Ingen and Dekker 2011) should also exhibit more accurate recall.
We further hypothesize that respondent experience with the overall association landscape in the United States will improve understanding of the question and question categories and increase retrieval and accuracy. There are three ways experience matters. First, the “…characteristics of the town in which a person resides significantly shape levels of social affiliation” (Rotolo 2000b:272). Considering that childhood socialization is an important factor in voluntary association participation in adult life (Glanville 1999; McFarland and Thomas 2006; Verba, Schlozman, and Brady 1995), we expect that where a respondent lived during childhood may affect recognition of the association categories in use by the GSS. Specifically, we expect people who lived in rural settings to have participated in voluntary associations at lower rates (Bell and Force 1956; Rotolo 2000a; Wright and Hyman 1958) and to have less familiarity with the question categories than those who lived in small towns or cities. 1 Similarly, respondents born outside of the United States may be less familiar with the categories in the U.S. GSS, especially considering U.S. exceptionalism in levels and extent of associating (Tocqueville [1840] 1990; cf. Curtis, Grabb, and Baer 1992). Thus, exposure to different forms of associations outside of the United States may influence respondent accuracy on the U.S. categories. 2 Or, immigrants may make an error due to language issues. Finally, given that the Verba and Nie question was developed in the mid- to late 1960s, we might expect generations that came of age before the questions were written to have a particularly good understanding of the response categories. Indeed, some researchers argue that the question is progressively less accurate as an indicator of Americans’ affiliations over time (Baumgartner and Walker 1988).
A related issue concerns question order effects: whether preceding questions influence the content of current questions (Schuman and Presser 1981). According to Schwarz (1999:103), respondents use information in a survey to help them “arrive at a useful and informative answer” to be “cooperative communicators.” But researchers are often unaware of how information in our questionnaires may be used by respondents. Respondents looking to insert their voluntary associations into categories may helpfully attempt the first category that seems plausible, even if it is ultimately not the most accurate.
Response Editing and Social Desirability Effects
After respondents initially consider a given question response, they may adjust their response to tailor it to social norms (Tourangeau et al. 2000; Tourangeau and Yan 2007). Research on social desirability bias in surveys expects that respondents will overreport what they consider to be pro-social behaviors and underreport antisocial behaviors (Bekkers and Wiepking 2011; Bernstein, Chadha, and Montjoy 2001; DeMaio 1984; Tourangeau and Yan 2007). Researchers have found social desirability effects in activities such as donating to charities (Bekkers and Wiepking 2011), church attendance (Presser and Stinson 1998), and drug and alcohol use (Aquilino 1994). It is well established, for instance, that survey respondents overreport voting in elections compared to actual voter turnout (Belli, Traugott, and Beckmann 2001; Bernstein et al. 2001; Brenner 2012; Sigelman 1982).
Self-reports of voluntary association memberships are also likely to induce some overreporting due to social desirability bias. Association memberships have long been a staple of American life (de Tocqueville [1840] 1990) and Americans value participation in civil society, such as having “habits of the heart” (Bellah et al. 1985) or being other oriented (Riesman, Glazer, and Denney 1950). Americans generally understand that “isolation in civic life is rare: a matter of deviance, not normalcy” (Fine and Harrington 2004:353).
The survey research literature notes that social desirability effects are due to either respondents “trying to present themselves to the interviewer in a positive light (impression management) or trying to preserve their own self-esteem (self-deception)” (Weisberg 2009:85). Both are possible in the voluntary association self-reports.
Self-deception in self-reports of voluntary associations
Respondents attempt to maintain their self-esteem when answering survey questions. Certainly people act to maximize their self-esteem in many social behaviors (Solomon, Greenberg, and Pyszczynski 1991; Steele 1988; Tesser 1988). Individuals embrace preferred “ideal selves” (Baumeister 1982) or “possible selves” and avoid feared possible selves (Markus and Ruvolo 1989). Therefore, any discrepancy produced between the real self and the ideal self-image will lead to cognitive and behavioral efforts to reduce the discrepancy (Baumeister and Tice 1986; Higgins 1989), including modifying a response to a survey question.
In self-reports of voluntary associations, we expect people who have fewer social ties or worse mental health to overstate their association memberships in order to present a more socially active sense of self. This should appear as error after these respondents are prompted for the name of their association and are unable to produce one.
Impression management in self-reports of voluntary associations
Respondents will also work to appear agreeable and avoid embarrassment in front of interviewers (Tourangeau et al. 2000, chap. 9). Since Cooley (1902) and Mead (1934), social scientists have understood that individuals anticipate others’ reactions to their actions and therefore normalize their presentation of social information. Goffman (1973:259) explains that “people attempt to regulate and control, sometimes consciously and sometimes without awareness, information they present to audiences, particularly information about themselves.” To a respondent, the audience is the interviewer, and we expect some response editing based on impression management. Indeed, survey researchers have shown that when there is an interviewer in a survey (as opposed to a paper or electronic self-administered survey), respondents are less likely to admit embarrassing behaviors or opinions (Tourangeau and Smith 1996).
In self-reports of voluntary associations, we expect people who report a pro-social orientation, such as valuing good citizenship behavior, to be more likely to overstate their association memberships in order to present a more socially active persona to the interviewer. Overstatement is likely to reveal error during the follow-up phase when these respondents are prompted for the name of their association. Conversely, we would expect greater accuracy when we have evidence of respondent unconcern with impression management. For example, respondents deemed uncooperative by the interviewer were probably relatively unconcerned with managing impressions. Or, respondents who resist normative and social desirability pressures on other questions, such as voting, may be less susceptible to overstatement on the voluntary association questions. That is, someone who is willing to admit that they did not vote in the most recent presidential election is probably also less likely to overstate their voluntary association memberships.
Data
To investigate self-reports of voluntary associations, we use the GSS. Run by the National Opinion Research Center, the GSS is one of the most widely used surveys in the social sciences. It has been conducted since 1972 and consists of a 90-minute interview administered to a random sample of American households. We use data from the 2004 GSS, focusing on the subset of respondents that were asked questions about their participation in voluntary associations (n = 1,467). The GSS voluntary association question reads: “We would like to know something about the groups and organizations to which individuals belong. Here is a list of various kinds of organizations. Could you tell me whether or not you are a member of each type?” These types, in order, are fraternal groups; service clubs; veterans’ groups; political clubs; labor unions; sports groups; youth groups; school service groups; hobby or garden clubs; school fraternities or sororities; nationality groups; farm organizations; literary, art, discussion, or study groups; professional or academic societies; church-affiliated groups; and any other groups. The GSS question is similar to questions asked by the World Values Survey (WVS), the Social Capital Community Benchmark Survey (SCCBS), and other surveys, although correspondence is not exact. 3
The 2004 GSS questions on voluntary associations are novel for two reasons. First, respondents were allowed to provide multiple organizations within each category. Second, for the first time, respondents were asked to give the actual names of their organizations in an open-ended response. Of the 1,467 respondents, 909 answered that they belonged to at least one voluntary association and were then asked: Do you belong to more than one [type of group]? How many [type of group] do you belong to? What is/are the name(s) of the group(s)? How many years have you been a member of [name of group]? Does [name of group] meet in this area?
These listed names allow us to investigate and assess the accuracy of respondents’ self-reports. 4 There are several possibilities. Respondents could accurately report a group membership. Accuracy requires both (1) providing a voluntary association membership and (2) accurately categorizing the association membership. Conversely, respondents could inaccurately report a group membership by either listing a for-profit membership (e.g., a gym membership) or an occupation (e.g., listing the armed forces). Respondents could also miscategorize an actual voluntary association membership, such as reporting a union within the fraternal category and not the union category. Finally, respondents could fail to list an organization name in the open-ended prompt.
To determine whether or not a voluntary association was accurately self-reported, we coded the open-ended responses into the following categories. First, we coded association names as accurate if they were a voluntary association and had been correctly categorized by the respondent. Listing “Masons” under the fraternal category or “Girl Scouts” under the youth category would both be coded as accurate voluntary association self-reports. 5 Second, we differentiated responses that were not voluntary associations, such as listing a Bally’s gym membership, Sam’s Club, or “colleagues.” Third, we recorded where associations had been miscategorized by respondents and noted where they ought to have been categorized instead. For instance, “Softball team” would be coded as miscategorized if listed in the service category with a note that it belongs in the sports category. Fourth, we distinguished responses indicating interviewer error or note-taking. For example, some responses suggest a miscommunication between the respondent and interviewer, such as when an interviewer diligently wrote one frustrated respondent’s response, “I answered this question NO! Why is this follow-up asked?” as an entry for a service organization. Or interviewers recorded notes about the response without providing an accurate response, for example, “doesn’t remember she doesn’t do anything with them anyway.” It is important to recognize that these responses were, in fact, counted as association memberships in the counts created by the GSS and used by researchers. Thus, the frustrated respondent quoted above is counted in the GSS as belonging to a service organization. Fifth, we developed a category for organizations that had “no name” according to the respondents, approximately 2.16 percent of the sample. 
Sixth, we coded those associations with traditional survey error, including when respondents gave a nonresponse or “don’t know.” Nonresponse and don’t know codes were already recorded by the GSS as 9 and 8, respectively. Finally, organizations that we were unable to identify were given a distinct code, as we could not verify whether or not they were correctly categorized. Unidentifiable organizations featured vague names such as “**** Club” or “Wednesday Club,” unverifiable acronyms such as “FNGA,” or groups with unique names like “Beep,” or “the Ritual,” that we could not find with the help of a search engine. These constitute only 4 percent of the sample. 6
We wish to examine respondent accuracy when self-reporting voluntary association memberships. Thus, we create a dichotomous variable measuring accuracy of response. This variable is coded 1 when an association name is accurate, has no name, or was unverifiable. We give respondents the benefit of the doubt by assuming that organizations that could not be verified or reported as “no name” were correctly classified by respondents. 7 The accuracy variable is coded 0 when measurement error of some type is present. According to the total survey error approach, measurement error at the question level can include nonresponse, not knowing the answer to a question, not being familiar with researchers’ categories (i.e., miscategorization), and other miscommunications between respondent and interviewer (Groves 2004; Weisberg 2009). Responses that were miscategorized, were not actually voluntary associations, featured nonresponse, or where the respondent did not know the name of the organization are coded 0, as error.
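The coding rule described above can be summarized in a few lines. The following is only a minimal sketch, not the authors' code: the category labels are hypothetical names for the coding categories described in the text, and treating interviewer error as inaccurate is our reading of the statement that any measurement error is coded 0.

```python
# Hypothetical labels for the coding categories described in the text;
# the actual codes used by the authors are not given in the article.
ACCURATE_CODES = {"accurate", "no_name", "unverifiable"}

# Assumption: interviewer error or note-taking counts as measurement error (0).
ERROR_CODES = {
    "not_voluntary_association",
    "miscategorized",
    "interviewer_error",
    "nonresponse",
    "dont_know",
}


def accuracy(code: str) -> int:
    """Return 1 for responses treated as accurate, 0 for measurement error."""
    if code in ACCURATE_CODES:
        return 1
    if code in ERROR_CODES:
        return 0
    raise ValueError(f"unknown code: {code!r}")
```

Under this sketch, an unverifiable name such as “Wednesday Club” would be scored 1, which reflects the benefit of the doubt the text describes.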
Method
Simple descriptive statistics and tables provide basic information on levels of accuracy overall and within each category of association. Studies of measurement error also employ multilevel analyses to determine the effects of interviewer (Paik and Sanchagrin 2013) and respondent (Hox 1994) characteristics on answers. In this article, we use a three-level multilevel logistic regression to predict response accuracy with characteristics of the answer or response (level 1) and respondent (level 2), with fixed effects for interviewer (level 3; see Hox 1994; Marsden 2003; Paik and Sanchagrin 2013; Van Tilburg 1998). Response at level 1 is the association and therefore features the category in which the organization had originally been placed (fraternal groups, service clubs, hobby clubs, etc.). Responses (coded as accurate or inaccurate) are nested within respondents and then within interviewers. The mean number of responses per respondent is 3.2 with a standard deviation of 2.5. The mean number of respondents per interviewer is 7.3.
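One way to write the model described above is sketched below. The notation is ours rather than the authors': y_ijk equals 1 if response i from respondent j, interviewed by interviewer k, is accurate; x_ijk collects the response-level predictors; z_jk collects the respondent-level predictors; alpha_k is an interviewer fixed effect (with one interviewer as the reference); and u_jk is a respondent-level random intercept. The normally distributed random intercept is an assumption, since the article does not spell out the full specification.

```latex
\[
\operatorname{logit}\Pr(y_{ijk} = 1)
  = \beta_0
  + \mathbf{x}_{ijk}^{\top}\boldsymbol{\beta}
  + \mathbf{z}_{jk}^{\top}\boldsymbol{\gamma}
  + \alpha_k
  + u_{jk},
\qquad
u_{jk} \sim N(0, \sigma_u^2)
\]
```

Exponentiating an element of the coefficient vectors gives the odds ratio for the corresponding predictor.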
We estimate four models to capture response and respondent characteristics that may increase or decrease accuracy. Model 1 serves as a baseline model, which includes level 1 association categories to assess differential accuracy across category. Further, we include three more level 1 variables: the number of years (capped at 5) respondents belonged to the organization, a dummy variable signifying whether the organization met locally according to the respondent, and a measure of respondent fatigue measured as an inclusive count of how many organizations have already been named by the respondent in the open-ended response (range: 1–21). The baseline model 1 also includes several respondent-level characteristics at level 2. We include dummy variables for whether a respondent is a person of color (with white as the omitted category), is female, is employed, has obtained a college degree, and is married, as well as a continuous variable for number of children (truncated at five). 8 There is no specific reason to expect interviewer effects for these questions. However, with known interviewer effects in related questions in the 2004 GSS (e.g., Paik and Sanchagrin 2013), all models conservatively include fixed effects for interviewer at level 3.
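To make the three additional level 1 variables concrete, the sketch below builds them for one respondent. It is illustrative only: the function name and the input layout (a list of tuples in the order the organizations were named) are hypothetical and do not come from the GSS data files.

```python
def level1_rows(named_orgs):
    """Build response-level rows for one respondent.

    `named_orgs` is a list of (years_member, meets_locally) tuples in the
    order the organizations were named (a hypothetical input layout).
    """
    rows = []
    for position, (years_member, meets_locally) in enumerate(named_orgs, start=1):
        rows.append({
            "years": min(years_member, 5),       # years of membership, capped at 5
            "local": 1 if meets_locally else 0,  # dummy: organization meets locally
            "fatigue": position,                 # inclusive count of names given so far
        })
    return rows
```

For example, a respondent naming two organizations, the first held for 12 years and meeting locally, would contribute a first row with years = 5, local = 1, and fatigue = 1.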
Model 2 includes measures of respondent experience with and exposure to the U.S. voluntary association landscape. We include two dummy variables measuring whether a respondent lived in a rural area or large city at age 16 (with living in a small or moderate size town as the omitted category) and whether or not the respondent was born in the United States. We also include a series of dummy variables for generation, in which the silent generation (born 1915–1924) and greatest generation (1925–1942) are combined as a civic generation (omitted), compared to baby boomers (1943–1960), Generation X (1961–1980), and early millennials (1981–1986).
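The generation dummies can be derived from birth year using the cut points given above. This is a minimal sketch under the stated ranges; the function name and the category labels are hypothetical.

```python
def generation(birth_year: int) -> str:
    """Map a birth year to the generation categories described in the text."""
    if 1915 <= birth_year <= 1942:
        return "civic"             # combined omitted category (1915-1942)
    if 1943 <= birth_year <= 1960:
        return "baby_boomer"
    if 1961 <= birth_year <= 1980:
        return "generation_x"
    if 1981 <= birth_year <= 1986:
        return "early_millennial"
    raise ValueError(f"birth year {birth_year} is outside the ranges in the text")


def generation_dummies(birth_year: int) -> dict:
    """Return 0/1 indicators, leaving the civic generation as the reference."""
    label = generation(birth_year)
    return {
        name: int(label == name)
        for name in ("baby_boomer", "generation_x", "early_millennial")
    }
```

A respondent from the civic generation receives zeros on all three indicators, which is what makes that group the comparison category.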
Model 3 adds measures of mental health and social ties to address hypotheses of social desirability effects based on self-esteem maintenance. We include two variables that measure mental health and one measuring social ties in our third model. First, respondents were asked whether “A person who often feels sad and blue” was a good description of themselves. We employ a dichotomous measure coded 1 when the respondent stated it was a poor descriptor. Second, we include a variable measuring positivity as the mean of several questions on optimism and reverse-coded pessimism. 9 Finally, we include the number of alters (individual connections) listed by respondents in the network generator module, ranging from 0 to 6. 10
In our final model, we include measures of respondents’ presentation of a normative self to the interviewer. First, we employ a dummy measure flagging when the respondent was deemed uncooperative by the interviewer. Second, since normative and social desirability pressures may lead to overstatement on other questions for susceptible respondents, we include a dummy variable for whether or not someone admits that they did not vote. Someone who is willing to admit that they did not vote in the most recent presidential election is expected to be less likely to overstate their voluntary association memberships as well. Relatedly, someone who highly values good citizenship will likely want to self-present as belonging to various associations, regardless of whether or not this is actually the case. We measure this variable with a question asking, “There are different opinions as to what it takes to be a good citizen. As far as you are concerned personally, how important is it to be active in social or political associations?”
Results
The first row of Table 1 reports the percentage of self-reported voluntary association memberships that were correctly or incorrectly categorized. Overall, we could verify that respondents accurately reported and categorized 68.6 percent of the organizations listed with another 2 percent listed as having no name. This leaves substantial levels of error, with 25 percent of responses suffering miscategorization, nonresponse, or other errors. We could not verify 107 organization names (4 percent) as being correctly or incorrectly categorized.
Table 1. Accuracy of GSS Voluntary Association Questions.
Source: 2004 General Social Survey.
The bulk of the measurement error comes from miscategorization. Examples of incorrect categorization include “**** Military Academy Cadets,” which was listed by the respondent as a “school fraternity or sorority,” and “AARP,” which was listed by the respondent as a “professional or academic society.” Commonly analyzed types of measurement error also appear in this sample; 5.6 percent of the organizations are a nonresponse and 3.4 percent a “don’t know” or “unknown”; 4.1 percent of organization names are clear examples of interviewee or interviewer error. For example, “DON’T belong” was recorded as an entry for a service organization. This respondent is counted in the GSS as belonging to a service organization.
Rates of miscategorization vary by category. The remainder of Table 1 documents accurate categorization and alternatives for each of the 16 categories of voluntary association. Some categories feature relatively high levels of accuracy in self-reports. Categories such as “farm organizations” and “sports groups” fared relatively well, each with over 80 percent accuracy. Other categories clearly exhibit more measurement error.
Perhaps most worrisome are those categories with less than 50 percent accuracy. The “fraternal groups” category was one such category. Table 2 shows results for fraternal groups. Of the 123 organizations respondents reported, we had enough information to classify 98 of them. What we found was dismaying: 38 percent of these groups were incorrectly categorized. That is, nearly two in five of the classifiable fraternal organizations should have instead been reported under another category. The largest source of miscategorization was respondents listing a Greek organization when prompted for a fraternal one (13 percent). For an individual unfamiliar with traditional fraternal organizations, this makes sense: fraternal sounds like “fraternity.”
Table 2. Measurement Error in the Fraternal Category.
The “service clubs” category also suffered considerable measurement error. Of those group names we could verify (n = 151), respondents miscategorized over half of them (n = 77), as shown in Table 3. The service category represents a clear disconnect between the organizations as intended by researchers and how respondents understand the meaning of “service.” The original Verba and Nie (1972) prompts included examples of organizations as clarifiers: “service clubs, such as Lions, Rotary, Zonta, Junior Chamber of Commerce.” The measurement intent is therefore organizations dedicated to secular, collaborative community service of all types. Respondents, however, frequently included groups in which they may have felt they were doing community service (e.g., volunteering with a Boys and Girls Club) or were “in the service” (e.g., Veterans of Foreign Wars or American Legion). Nearly a quarter of the miscategorized names ought to have been in the “veterans’ groups” category (n = 18). Because the prompt for “service clubs” occurs before “veterans’ groups,” respondents may list those groups that represent being “in the service” (i.e., the armed forces) without realizing that a later category would request their veterans’ memberships. Another reason for the miscategorization may be that while the organizations listed by respondents may entail unpaid service work, these organizations, on the surface, fit considerably better into other categories. This suggests a definitional difference between survey researchers and their respondents.
Measurement Error in the Service Category.
Multilevel Models
Table 4 presents multilevel models that include respondent-level (level 2) variables. All models include dummy variables for each organization category (church-affiliated groups omitted), whether the organization meets locally, how many years the respondent has been in the organization, respondent fatigue, and fixed effects for interviewers. In addition, each model includes a set of respondent characteristics. (The coefficients in Table 4 present each organization category in reference to church-affiliated groups. To better understand error rates across all the GSS voluntary association categories, Online Appendix A displays the odds ratio of correct classification for each category compared to every other category.)
Multilevel Logistic Regression Odds Ratios of Accurate Classification.
Source: 2004 General Social Survey.
†p < .10. *p < .05. **p < .01. ***p < .001 (two-tailed test).
Beginning with the level 1, association-level variables, as already discussed, several voluntary association categories are associated with higher levels of measurement error, even controlling for respondent characteristics. Fraternal, Greek, “other,” school service, youth, and service all feature substantially lower odds of successful categorization compared to church-affiliated groups. But the level 1 results also reveal features of membership that can shape the probability of correctly classifying an organization. When a respondent reports that their organization meets locally, the odds that their answer is accurate are 3.7 times those for an organization reported as extra-local, holding other variables constant. An alternative interpretation focuses on the change in the predicted probability of accuracy. A non-college-educated, unmarried, unemployed man with an average number of children, reporting on a church association that does not meet locally, with average years in the organization, and so on, has a predicted probability of an accurate response of .72. If that respondent instead reported that the association met locally, the predicted probability would be .90, a change in the predicted probability of .18, which is a large effect. Furthermore, for each additional year in an organization, we expect a 13 percent increase in the odds of accurate categorization. The coefficients (not shown) and odds ratios for these variables are very similar across all four models. Thus, both organization-level hypotheses are supported: years of experience with an association and belonging to an association that meets locally both increase accurate recall of that association. Our measure of respondent fatigue is also statistically significant in all four models, with approximately a 6 percent decrease in the odds of accuracy for each additional organization name listed by the respondent.
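The link between an odds ratio and a change in predicted probability can be checked with a few lines of arithmetic. The sketch below is only an illustration that uses the rounded figures reported in the text (a baseline probability of .72 and an odds ratio of 3.7); the function name is an assumption, and the result is approximate because the inputs are rounded.

```python
def apply_odds_ratio(p_baseline: float, odds_ratio: float) -> float:
    """Multiply the baseline odds by odds_ratio and convert back to a probability."""
    odds = p_baseline / (1.0 - p_baseline)   # probability -> odds
    new_odds = odds * odds_ratio             # apply the odds ratio
    return new_odds / (1.0 + new_odds)       # odds -> probability

# Rounded values taken from the text: baseline .72, odds ratio 3.7
p_extra_local = 0.72
p_local = apply_odds_ratio(p_extra_local, 3.7)
change = p_local - p_extra_local

print(round(p_local, 2), round(change, 2))
```

Working through the odds rather than the probabilities matters here: the same odds ratio produces a smaller change in probability when the baseline is already close to 0 or 1.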
Respondent demographic characteristics (level 2) were generally not significant predictors of correct categorization. Number of children and being married were the main exceptions. With each additional child, the odds that a respondent accurately categorizes an organization rise by 10–11 percent. Being married increases the odds of accurate categorization by 27–39 percent in three of the four models. This is a .03 increase in the average predicted probability for those who are married (in model 1). We suspect that being married or having children increases the number of organizations respondents come into contact with through their alters. Such respondents may also be more likely to participate in organization categories with higher respondent-researcher correspondence, such as school-service clubs.
Model 2 incorporates experience with the voluntary association landscape and produces several notable effects on the accuracy of self-reported voluntary association membership. Respondents who were not born in the United States have around 50 percent lower odds of accurately reporting their voluntary association memberships compared to those born in the United States (a reduction in the average predicted probability of .10). Living in a rural area at age 16 decreases accuracy in the second and last models, reducing the odds of accurate categorization by about 30 percent compared to those who lived in medium-sized towns (a .04 difference in the average predicted probability of accuracy). These findings support two of the hypotheses suggesting that a respondent’s knowledge of or mental categorization of voluntary associations may differ depending on where they were brought up. Native-born American respondents who grew up in towns and cities appear to have knowledge of American associational life that allows them to answer accurately more often than other respondents. Finally, model 2 also included controls for generation effects, which were not significant. Our findings therefore do not support the hypothesis that members of the so-called civic generation were any more familiar with the GSS association categories.
Model 3, the social desirability (self-esteem) model, incorporates mental health and isolation variables, with some significant results. Support for the hypothesis that better mental health increases accuracy was partial. People who felt strongly that “sad and blue” was a poor description of their orientation to life had 59 percent greater odds of being accurate compared to respondents who felt this was a fair or good description of themselves. In terms of the average predicted probability, this is a .06 difference in probability of accuracy across those with good and poor mental health. But a positive worldview did not predict accuracy in response. As for Hypothesis 6, reported social ties matter to accuracy as well. Each additional alter listed by respondents in the social network generator questions increases the odds of correct reporting by 12 percent. It may be that people with more social ties come into contact with more voluntary associations via their networks, increasing knowledge, enabling better categorization, and reducing error. Or it may be that people who are more socially isolated are more likely to overstate their number of voluntary associations when asked, resulting in more “don’t knows” or nonresponses in the follow-up. Given known interviewer error in the “important matters” network generator (Marsden 2003; Paik and Sanchagrin 2013), it is important to remember that these models also include fixed interviewer effects. Even with these fixed effects, the alters finding should be viewed with caution.
Model 4 addresses impression management processes on the part of respondents. To begin, it supports findings in other areas that respondent cooperativeness can influence the accuracy of self-reports. An interviewer’s assessment that the respondent was noncooperative significantly decreases the odds of accuracy by 81 percent (a decrease of .25 in the average predicted probabilities of accuracy). This was not what our impression management Hypothesis 9 predicted, however. We had hypothesized that uncooperative respondents would be less concerned with managing the interviewer’s impressions and therefore more likely to be accurate in their responses. Instead, we find that noncooperative respondents were less accurate. In contrast, a respondent willing to admit that they did not vote in the last election is expected to have 84 percent greater odds of accuracy compared to others. This finding suggests that someone who is willing to admit honestly to nonnormative behavior will also be more honest when asked their number of voluntary association memberships, even if that number is low. Put another way, respondents who resisted normative or social-desirability pressures when answering the voting question are also likely to be able to resist such pressures in the voluntary association question. That was the only impression management hypothesis that was supported. In contrast, the importance of being active in civic associations is a nonsignificant predictor of accuracy in model 4. We had hypothesized that a pro-social orientation, as evidenced by answering that being active in civic associations is important, would cause respondents to overstate their association memberships and increase error. But, if a pro-social orientation also increases the odds of honestly answering, it could explain the lack of such an effect.
Discussion and Conclusion
Obtaining accurate estimates of voluntary association membership is far from simply an academic exercise. Scholars have discussed America’s long history of participation in voluntary associations since Tocqueville [1835, 1840] and have tracked membership rates since the advent of modern large-scale surveys (Verba and Nie 1972). In addition to the perennial worry about declines in social capital (Paxton 1999; Putnam 1995, 2000), researchers and the general public have recently become concerned about social isolation in the United States (see Parigi and Henson [2014] for a review). Voluntary associations play an important role in social connectivity in the United States (Putnam 2000). If our measurements of voluntary associations are flawed, “it is not clear whether levels of social connectedness are stable, changing, or even reliable” (Paik and Sanchagrin 2013:2).
Our study does suggest significant levels of measurement error in respondent self-reports of voluntary association membership. Overall, a respondent who answers that they belong to a voluntary association is likely to be accurately reporting a membership only about 70 percent of the time. Accuracy varies across categories as well, with some categories providing quite accurate assessments of membership and others quite inaccurate assessments. This finding has rather disconcerting implications for research that analyzes the distribution of types of voluntary associations (e.g., Cornwell and Harrison 2004; Painter and Paxton 2014; Paxton 2002; Rotolo 1999). The actual distribution of memberships across the 16 categories would be quite different if there were no error, especially errors of miscategorization. Figure 1 provides the original and researcher-corrected counts of memberships by category.

Figure 1. Original vs. corrected estimates of voluntary associations in the 2004 General Social Survey, by type. The corrected estimates include associations with “no name” but do not include associations we were unable to verify.
The results suggest that some categories are especially problematic and may not be meaningful to respondents. Fraternal organizations are one such category. They have been at the center of many of the debates about social capital (Putnam 2000), associational participation (Skocpol 2003), and organizational segregation (Kaufman 2002). But they appear to be measured extremely poorly by the self-report questions. Even giving respondents the benefit of the doubt by allowing “no name” and unverified organizations, only 44.7 percent of associations were correctly categorized by the respondent in the fraternal category in the GSS. Moreover, this miscategorization was not offset by fraternal organizations being incorrectly placed in other categories, as only 11 fraternal organizations were misplaced in other categories. It is therefore quite possible, if not likely, that we have been overestimating the number of fraternal memberships in the United States for some time. Mismeasurement may also imply that observed declines in fraternal organizations are even steeper than previously estimated.
While we have focused on those categories featuring the most measurement error in this article, the good news is that many of the categories featured little to no measurement error. For instance, it was straightforward for respondents to determine whether or not they belonged to a farm association. This category featured no miscategorization at all.
The measurement error we have identified in the voluntary association questions will also affect both simple counts of voluntary association memberships and binary measures of membership/nonmembership. The original and researcher-corrected codes we used to create Figure 1 allow us to compare an original overall count to a revised count for each respondent and to see how many respondents who reported one or more organization memberships actually have no memberships. The original count across respondents ranges from 1 to 21, with a mean of 3.2 and a standard deviation of 2.5. The corrected count ranges from 0 to 15, with a mean of 2.6 and a standard deviation of 2.2. On average across respondents, researchers using the original, uncorrected counts would overcount by about two thirds of an organization. Further, 77 respondents (8.5 percent of the sample) who reported at least one membership have no membership in the corrected counts. Thus, both counts and binary measures are affected by this measurement error.
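The two comparisons described above, the average overcount and the share of reported members who have no verified membership, are simple to compute once each respondent has an original and a corrected count. The sketch below uses made-up counts for six hypothetical respondents purely to show the calculation; none of these values come from the GSS data, and the variable names are assumptions.

```python
from statistics import mean

# Hypothetical (original, corrected) membership counts, one pair per respondent
counts = [(1, 0), (3, 3), (2, 1), (5, 4), (1, 1), (4, 3)]

# Average number of memberships by which the original count exceeds the corrected count
average_overcount = mean(original - corrected for original, corrected in counts)

# Respondents who reported at least one membership but have none after correction
no_verified_membership = sum(
    1 for original, corrected in counts if original >= 1 and corrected == 0
)
share_no_verified = no_verified_membership / len(counts)

print(round(average_overcount, 2), no_verified_membership, round(share_no_verified, 3))
```

The first quantity bears on research that uses counts of memberships; the second bears on research that uses a binary member/nonmember indicator.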
It is also likely that the measurement error we have identified in the GSS voluntary association question is present in other surveys that ask similar questions, such as the WVS or the European Social Survey (ESS). However, the extent of error cannot be determined at this time. These surveys use different stem questions, include different numbers and types of categories, and measure intensity of membership differently. All these differences might yield higher or lower error rates. But neither the WVS nor the ESS asks follow-up questions about the names of the organizations. Determining error in other surveys and how best to ask questions on association memberships (Maloney, Van Deth, and Roßteutscher 2008) are useful questions for future research.
Helpful to researchers using the voluntary association questions is the fact that, across categories, organizations that met in the respondent’s area of residence were much more likely to be correctly categorized by the respondent. Those associations that did not meet locally were more often miscategorized or prone to nonresponse. Given the rise of tertiary (or “checkbook”) associations (Painter and Paxton 2014; Skocpol 2003), which by definition are not local, we may expect to see measurement error increase going forward.
Also important for voluntary association or social capital researchers (Coleman 1988; Lin 2002; Putnam 2000) is that many of the basic demographic variables were not significant. This is heartening for research using such variables to predict association membership, as our findings suggest there is unlikely to be systematic measurement error or bias in results using such demographic variables. But other more intangible variables such as mental health were significant predictors of accuracy in self-reports of association membership. More research is needed to understand the implications of the relationships we identify here for existing knowledge on voluntary association membership.
Even though we look only at the 2004 GSS, our finding of significant error in self-reports of voluntary association membership has implications for researchers using other years of the GSS or other surveys that also use such self-reports. We have little reason to believe that the WVS, SCCBS, or other surveys relying on respondent self-reports drawn from fixed categories would be better. It is best for researchers using self-reported voluntary associations to proceed cautiously.
Suggestions Going Forward
This article’s assessment of error serves to measure the reliability of the voluntary association questions. We suggest that researchers using these questions adjust estimates for accuracy of response, especially for the most problematic categories such as fraternal or service. Researchers should consider incorporating reliability estimates into their models. This could be done through structural equation models with some fixed parameters (Bollen 1989; see, e.g., Paxton 2002). As discussed above, counts and binary measures are also affected by measurement error and could also be corrected for reliability. Without adjustment, researchers are working with measures that reflect an interpretational disconnect between researchers and respondents, that are sometimes severely miscategorized by respondents, or that suffer from other kinds of errors.
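One crude way to see what adjusting a single category for accuracy of response involves is a proportional correction: multiply the reported count by the share of reports judged correct. The sketch below is only an illustration and is not the structural equation approach cited above. The reported count of 100 is hypothetical, the .447 share is the benefit-of-the-doubt figure for the fraternal category reported earlier, and the function name is an assumption. The correction also ignores memberships that belong in this category but were reported under another one.

```python
def adjust_count(reported: int, share_correct: float) -> float:
    """Scale a reported category count by the share of reports judged correct."""
    if not 0.0 <= share_correct <= 1.0:
        raise ValueError("share_correct must be between 0 and 1")
    return reported * share_correct

# Hypothetical: 100 reported fraternal memberships; 0.447 is the share of
# fraternal reports counted as correctly categorized in the text.
adjusted_fraternal = adjust_count(100, 0.447)

print(round(adjusted_fraternal, 1))
```

A fuller adjustment would also add back memberships misreported under other categories, which requires the organization names that most surveys do not collect.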
Several changes to the way we ask self-report voluntary association questions could reduce measurement error considerably (McPherson and Rotolo 1995) not just in the GSS but in other, similar, surveys such as the WVS or the ESS. Most important, surveys should ask respondents the names of their organizations going forward so that researchers can verify and/or correct them. Interviewers could also probe respondents to give the full name of an organization, instead of an acronym, as obscure acronyms preclude a researcher’s ability to verify respondent affiliations.
If categories are retained, one simple change would be to reorder the categories in such a way as to preempt misunderstandings. For instance, a handful of respondents listed their veterans groups as a service club; however, no one made the opposite mistake, and veterans groups were generally well categorized. Ordering the veterans category before the service category could prevent such miscategorization. Similarly, prompting for Greek organizations before fraternal clubs could prevent these groups from winding up in the fraternal category. Thoughtful attention to ordering could produce more accurate responses.
Another small change would be to include examples of associations in the category prompts. In the original Verba and Nie (1972) questionnaire that introduced the self-report voluntary association format, the question on service clubs reads, “Service clubs, such as Lions, Rotary, Zonta, Junior Chamber of Commerce,” rather than simply “Service clubs.” For whatever reason, these examples were not included in the GSS version of the questionnaire, which began in 1972. By giving examples of relevant associations, surveys may prevent measurement error in those categories where respondents are uncertain.
Finally, a more radical change would be to simply ask respondents the names of any groups, clubs, or organizations to which they belong. Several researchers have argued that the current categories are systematically missing important kinds of affiliations such as environmental groups (Baumgartner and Walker 1988) and informal associations (Paxton and Rap 2016). The most effective way to avoid systematic bias and measurement error would be to drop the category prompts and allow researchers to categorize respondents’ voluntary association memberships according to their own needs.
Supplemental Material
online_appendix for How Accurate Are Self-reports of Voluntary Association Memberships? by Robyn Rap and Pamela Paxton in Sociological Methods & Research
Footnotes
Authors’ Note
Both authors contributed equally to this article.
Acknowledgment
We thank Anthony Paik and Marc Musick for their comments on an earlier version of this article.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study has been supported by the Science of Generosity (University of Notre Dame/Templeton Foundation) and by the Population Research Center at the University of Texas (5 R24 HD042849, NICHD).
Supplemental Material
Supplemental material for this article is available online.
