Abstract
Social networking sites and online questionnaires make it possible to do survey research faster, cheaper, and with less assistance than ever before. The methods are especially well-suited for snowball sampling of elusive sub-populations. This note describes my experience surveying thousands of Catholics via Facebook in less than a month, at little expense, and without hired help. Although the respondents were disproportionately female, young, educated, and religiously active, their responses preserved key correlations found in standard surveys conducted by Gallup and the GSS. I relate my methods to existing web-based methods and offer concrete suggestions for future work.
Online social networking sites (SNSs) offer new ways for researchers to conduct studies quickly, cheaply, and single-handedly—especially when seeking to construct “snowball” samples for exploratory work. Facebook is currently the SNS best suited for this type of research, thanks to size (currently exceeding 845 million users worldwide), features, intensive use, and continuing growth. Each Facebook user is directly linked to his or her personal “friends,” while also having access to membership in one or more of the millions of Facebook groups that connect other users throughout the world. Facebook groups are virtual communities linking people with some shared interest, attribute, or cause. Researchers can readily sample populations of interest by working through existing groups or creating new ones.
Although researchers and journalists devote much attention to social networking, I have yet to locate any work that exploits SNSs as a tool of research. Existing SNS research focuses on questions related specifically to the phenomenon of online social networking: What functions do SNSs serve for those who use them, and what benefits do users derive (Joinson 2008)? Is the accumulation of social capital one of those benefits (Ellison, Steinfield, and Lampe 2007)? Do SNS users behave differently or look differently than nonusers (Hargittai 2007)? What privacy concerns does the rise of SNSs raise (Dwyer, Hiltz, and Passerini 2007; Jones and Soltren 2005), and to what extent do these concerns influence online behavior (Tufekci 2008)? Can online social interactions predict tie strength (Gilbert and Karahalios 2009)?
My work shifts the emphasis from research about SNSs to research through SNSs. By working through existing networks of Facebook users, I recruited nearly 4,000 baptized Catholics to participate in an exploratory investigation of the role of affective bonds in religious commitment. Because existing surveys lack questions that adequately gauge these affective bonds, I needed to construct a new instrument. And because I was testing a new theory, a nonrepresentative sample was appropriate for my investigation (and necessary because of my lack of financial means).
This methodological note describes my experiences using Facebook for survey research and argues for its usefulness in certain contexts. Within five days of releasing a 12-minute online survey to a Facebook group of potential volunteers, I harvested 2,788 completed questionnaires. Within a month, the total number of respondents increased to about 4,000. Total monetary costs averaged less than one cent per survey—vastly less than the cost of surveys obtained through mail, phone, or even email. 1 Moreover, the responses became available for review the moment they were entered. Hence, if the survey turned out to contain any substantial errors or omissions, I could repair the damage within minutes. Although the respondents by no means constituted a random sample of the relevant (Catholic) population, their responses preserved many of the statistical relationships obtained by traditional means. These and other advantages described below suggest that, in some contexts, Facebook may be a useful tool for exploratory work and for rapid pretesting of surveys destined for dissemination via traditional methods. It may likewise be an effective tool for reaching some hidden populations. And while the present study was quantitative in nature, the recruitment method can easily be used for qualitative studies as well.
The article proceeds as follows. The first section reviews the relevant literature on chain-referral sampling and electronic survey methods, highlighting strengths and limitations of both of these methods. The second section follows with a description of the Facebook features that make it an effective tool for snowball sampling. The third section discusses attempts to recruit study volunteers, and the fourth section details the results of those efforts by offering survey and sample statistics. The fifth section addresses issues related to sample representativeness, and the sixth section offers suggestions for others interested in replicating this method.
Related Research
Chain-referral sampling first emerged in response to the neglect of social structure and interpersonal relationships in survey research methods. As Coleman (1958:28) notes, most early analyses overlooked the role of relationships, “never including (except by accident) two persons who were friends.” Snowball sampling is a chain-referral technique that accumulates data through existing social structures. The researcher begins with a small sample from the target subpopulation and then extends the sample by asking those individuals to recommend others for the study. Chain-referral techniques have the added benefit of providing relatively easy access to “hidden” subpopulations that are almost impossible to sample by standard (phone, mail, or door-to-door) methods because of their small size or distrust of outsiders. Examples include studies of prostitutes (Faugier and Sargeant 1997), the homeless (Anderson and Calhoun 1992), AIDS victims (Martin and Dean 1990), members of the LGBT community (Browne 2005), drug users (Griffiths, Gossop, and Strang 1993; Biernacki and Waldorf 1981), and religious “cults” (J. R. Lewis 1986).
Sample bias is the principal downside of the chain-referral approach. On one hand, study volunteers may try to protect their friends by not referring them, a problem known as “masking.” On the other hand, “referrals occur through network links, so subjects with larger personal networks will be oversampled, and relative isolates will be excluded” (Heckathorn 1997:175). Thus, Faugier and Sargeant’s (1997) study of prostitutes undersampled women who were new to the business or who had been ostracized by their peers. Participants may also recruit inappropriate volunteers, especially if they misinterpret the study’s design or purpose (Biernacki and Waldorf 1981). And response rates are difficult to define, much less estimate, when participation spreads through forwarded surveys and undocumented invitations.
Despite these limitations, no one disputes the value of chain referral methods for studies of elusive subpopulations and exploratory work (Penrod et al. 2003; Faugier and Sargeant 1997). Moreover, new techniques can help to overcome some of the problems discussed above. 2
Facebook and other SNSs allow us to carry chain-referral methods into the age of the Internet, while also exploiting the strengths of web-based questionnaires. The elimination of labor, paper, and postage drastically reduces survey costs, making large studies affordable to conduct (Weible and Wallace 1998; Schmidt 1997). Although postal delivery time slows the pace of studies conducted with paper surveys, instantaneous data transmission greatly reduces the time needed to carry out web-based surveys (Evans and Mathur 2005; Wilson and Laskey 2003). Web-based surveys are faster because they eliminate the need to manually input data into data analysis programs (Evans and Mathur 2005). Technological advances allow researchers to construct complex skip patterns that reduce response burden and perhaps increase response rates (Shropshire, Hawdon, and Witte 2009). Online surveys can also increase willingness to answer sensitive questions (Tourangeau 2004) and reduce socially desirable responding (Chang and Krosnick 2009).
SNS sampling shares most of the limitations associated with other forms of web-based research. We cannot reach those who lack the requisite computer skills and equipment (Couper et al. 2007; Best and Krueger 2002). As a result, studies that use online surveys underrepresent those with limited financial resources, members of some racial and ethnic groups, older people, and the less educated (Couper et al. 2007; Best and Krueger 2002). Nor are we likely to reach many people with serious concerns about Internet privacy (Evans and Mathur 2005). The layout and readability of surveys can vary across hardware and software (Evans and Mathur 2005). Electronic surveys can easily reach unintended recipients and are more readily taken multiple times (Smith and Leigh 1997). And response rates tend to be lower than those associated with phone, mail, and interviews (Converse et al. 2008; Greene, Speizer, and Wiitala 2008; Cole 2005; Evans and Mathur 2005; Griffis, Goldsby, and Cooper 2003; McDonald and Adam 2003).
Other factors mitigate some of these concerns. By using software that logs the IP (Internet protocol) address of the computer on which the survey was taken, researchers can identify repeat responders (Gosling et al. 2004). Moreover, as the Internet penetration rate continues to increase, web-based samples become increasingly representative of the population of interest. Finally, researchers can capitalize on the strengths of various methods by utilizing mixed-mode approaches (Greene et al. 2008).
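The IP-based check described above can be sketched in a few lines. This is a minimal illustration, not the procedure used by any particular survey package: the function name `flag_repeat_ips`, the `(response_id, ip_address)` log layout, and the sample addresses are all hypothetical.

```python
from collections import defaultdict


def flag_repeat_ips(responses):
    """Group response IDs by IP address and keep only IPs seen more than once."""
    by_ip = defaultdict(list)
    for response_id, ip in responses:
        by_ip[ip].append(response_id)
    return {ip: ids for ip, ids in by_ip.items() if len(ids) > 1}


# Hypothetical log rows: (response_id, ip_address).
# The addresses come from ranges reserved for documentation.
log = [
    (1, "192.0.2.10"),
    (2, "198.51.100.7"),
    (3, "192.0.2.10"),
    (4, "203.0.113.5"),
]

repeats = flag_repeat_ips(log)
print(repeats)
```

A shared address does not prove a repeat responder, since households, campuses, and offices often sit behind one IP. Flagged cases are therefore candidates for manual review rather than automatic deletion.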
Bearing in mind all these considerations, let us turn to a specific SNS-based project.
Facebook as a Sampling Frame
Facebook’s size, growing popularity, and features make it the preferred SNS for constructing a snowball sample in the United States. According to the web analytics company Compete, Facebook had 171 million unique visitors to its website in December 2011, more than four times as many as its closest competitor, Twitter, which had just over 40 million unique visitors during the same month. 3 In fact, according to this metric, Facebook has been the most popular SNS in the United States since December 2008, when it surpassed MySpace in unique site visitors (Compete 2009).
The value of Facebook as a snowball sampling frame extends beyond its size. Equally important is how quickly, easily, and diffusely users communicate information with each other—both directly and indirectly. The average adult user has 229 “friends” on the site (Hampton et al. 2011), and interactions occur through private messages and public (“wall”) postings. When Jack posts a message to Jill’s wall, that information is visible to those with access to Jill’s page.
But thanks to other Facebook features, other users might acquire the information in the wall posting even without visiting Jill’s page. Every Facebook user has a “News Feed” that contains aggregated content posted by friends, along with photo tags, friend requests, event RSVPs, and group memberships. Depending on one’s privacy settings, the message that Jack posts to Jill’s wall might show up in the feeds of Jill’s friends, even if Jill’s friends are not friends with Jack. See the discussion section for more information on privacy settings.
Another Facebook feature that is relevant to constructing snowball samples is the Facebook group. Users can create new groups and join existing groups based on anything, ranging from specific interests to special events or shared workplaces, regions, high schools, or colleges. There is no monetary cost to joining or creating a group, and a given user can belong to as many as 200 groups at one time. Group administrators control the content and the membership of the group. Among other things, they decide whether a group is “open” (anyone can join and invite anyone else to join), “closed” (administrators must approve requests of nonmembers who desire to join the group), or “secret” (membership is by invitation only). Administrators also have the ability to send mass messages to all group members in groups that do not exceed 5,000 members.
Starting with one or more groups or networks of friends, researchers can create snowball samples by gathering respondents via links to additional friends and groups. To illustrate the potential of this simple approach, consider the result of one enterprising Facebook user who administers the open group “Six Degrees of Separation: The Experiment.” To maximize the number of group members, he invited all his friends to join and encouraged all of them to do likewise ad infinitum. The group recently numbered more than six million.
Recruiting Study Participants
How, then, might a researcher navigate this elaborate web of relationships to recruit study volunteers? To some extent this depends on the population of interest and the nature of the study. In this section, I share details of my approach, based on my needs.
My study investigated the role of affective bonds in the religious commitment of baptized Catholics in the United States. The strategy was straightforward and contained just three basic steps. First, I created a new Facebook group that potential study volunteers could join. Second, I populated the group by soliciting the help of my personal network of friends and the administrators of existing Facebook groups of Catholics. Third, once the group reached sufficient size (not knowing what to expect, I was modestly hoping for 500–1,000 members), I sent the survey link to the study volunteers via the “message all group members” feature. In the message, I encouraged volunteers to send the survey link to other friends who were not members of the group, including those who do not have Facebook profiles.
I began my search in December 2008 by creating a new group named “Please Help Me Find Baptized Catholics!” The group’s description explained the purpose of the group, outlined eligibility requirements, and provided instructions on how to be involved. Though I wished to survey only baptized Roman Catholics, the text of my group page invited any viewer to join the group and likewise encouraged everyone to forward invitations to all their Facebook friends and groups. This strategy was designed both to maximize sample size and to avoid the biases associated with sampling down social chains composed entirely of Catholics.
The group page also provided a platform where I attempted to legitimize and humanize the study, two ingredients shown to increase survey response rates (Dillman 2007; Kittleson 2003). On the group’s page, I included my institutional affiliation and my university email address. I also included the names, email addresses, and university websites of the members of my dissertation committee. I posted a scholarly review of the research that justifies nonprobability sampling to the group’s discussion board, along with a list of answers to anticipated questions. I also responded to additional questions that individuals posted to the group’s wall, and this information was available for all to see. Each post that I made included a picture of myself, putting a human face on the project.
With the research group in place, I then turned to administrators of existing groups of Catholics for help in advertising my group and recruiting study volunteers. The keywords Catholic, Catholics, and Catholicism returned thousands of results, so I selected the 50 largest groups that best represented Catholics generally. To that end, I excluded groups that appeared to have large proportions of foreign members and groups with narrow membership criteria, such as those created for specific Facebook networks, college alumni groups, or ethnic groups.
I then contacted the administrators of this subset of existing Catholic Facebook groups, soliciting their help in recruiting volunteers for the study. All administrators received a personal message that explained the purpose of the research and asked them to send a message to the members of their groups with an invitation to join the research group. I also encouraged them to contact me with any questions or concerns (Figure 1).

Figure 1. Message to group administrators
Over the course of three days, I sent personal messages to 43 of the 50 administrators of the Catholic groups. On the first day of solicitations, I contacted 9 administrators, and 6 of them responded positively. 4 As a result, my group initially grew quite quickly. Over the next few days, however, my requests for help yielded fewer responses. Of the 25 administrators contacted on the second day, 18 failed to follow up, and no one responded to messages sent on the third day. I suspect that this rapid decline in responses was a consequence of my own initial success. As my group grew, the administrators of other groups may have concluded I no longer needed their help. In any case, in light of the rapid growth of my own group and the rapid decline in responses from other group administrators, I decided not to contact the last 7 of 50 administrators.
Table 1 summarizes the outcome of these efforts. Of the 43 administrators contacted, 15 agreed to help. Assuming each member of each group became aware of the study, and assuming no single person was a member of more than one group, the maximum recruitment potential from these 15 groups was 37,463.
Table 1. Results of Group Solicitation Efforts
In all, 43 administrators were contacted, but one group was misclassified. The group appeared to be composed of Catholics, but it was not.
The number of members who actually learned about the study, however, was almost certainly much lower. Although I asked administrators to send messages to their members, five elected to post the information to their groups’ pages instead. Based on personal communications with these administrators, I learned that they did so for one of two reasons: either they considered messaging their members obtrusive, or their group size exceeded 5,000 members, inhibiting their ability to send mass messages. Because viewing the posting required users to visit the group page, and because many Facebook users appear to join groups that they rarely if ever return to, 5 this approach left users less likely to learn about the study. Nevertheless, I preferred some assistance to no assistance. Moreover, one person who posted the link administers a group with more than 30,000 members. Even if the rate of awareness in his group was low, the potential for recruitment from his group in absolute numbers was substantial.
After working through existing Catholic groups, I sent a mass message to all of my 200 Facebook friends, seeking their help in recruiting volunteers. This message went directly to each person’s inbox. Though I failed to collect systematic information on the results of this process, the feedback that I received via their reply messages suggests that many of them passed my group link along to others. A few also posted information about the study on their personal Facebook pages.
As mentioned above, Facebook is designed in such a way that group administrators lose the ability to send mass messages to their membership if the group size exceeds 5,000. Because I planned to use mass mailings to direct people to the online survey, I closed the initial group when membership approached 4,500 and opened a second group for additional volunteers.
After 2,500 people joined the second group, I closed it and opened a third and final group. 6 Altogether, nearly 7,500 people joined one of the three groups over the course of one month and received the link to the survey.
Figure 2 traces the recruitment process for the study. Recruitment occurred in two stages: I first corralled volunteers into my Facebook research groups; I then disseminated the survey link via mass message. Boxes with double lines indicate action taken directly by me. My 200 Facebook friends and the 43 administrators of existing Catholic Facebook groups were the only individuals I directly solicited for assistance. Volunteers arrived in the research groups through several paths. Some were members of the Catholic Facebook groups from which I recruited. Others were my personal Facebook friends. Presumably, still others arrived via additional degrees of separation: they were Facebook friends of my friends, they were friends of others who were already in (or aware of) the research group, they were friends of friends of friends. It is also possible that some arrived independently. For example, suppose a new Facebook user was looking for a Catholic group to join. That person might stumble on my research group by doing a keyword search of existing Facebook groups.

Figure 2. Where did the survey takers come from?
After populating the research group, I released the survey to group members by direct message (Figure 3). Some—likely many—of the survey takers were members of my research groups, but others were not. Presumably, some group members forwarded the link to other Facebook friends. Others forwarded the link via email to friends who did not have Facebook accounts.

Figure 3. Message to research group members
Figure 2 also aims to clarify why I recruited study volunteers in two stages, rather than one. With so many existing Catholic Facebook groups to tap into, why not simply circulate the link to the survey? Why bother with the intermediary step of creating the research group?
Creating the research group allowed me to capitalize on the indirect information channels built into Facebook. Much information is circulated among networks very passively through public wall posts and feeds. When an individual joins a group, that information appears in that person’s profile and, depending on one’s privacy settings, may appear in other users’ feeds. 7 If Jill joins my research group, her friend, Jack, might learn about the study when he visits her page and views her wall. He might also learn about the study without even going to Jill’s page if her status update appears in his feed. Thus, simply by joining my group that individual might recruit friends on my behalf. If instead I opted to merely circulate the link to the survey, that individual would have to actively forward the link to others. And as previously noted, by using the research group as a platform where I tried to both legitimize and humanize the study, I hopefully increased the survey response rate (Dillman 2007; Kittleson 2003).
Survey and Sample Statistics
As with most studies that employ chain-referral sampling, I am unable to calculate a response rate. One might be tempted to compare the number of members in the research groups with the number of survey takers, but this would create an inflated figure of uncertain size. We would not only ignore those who took the survey without being in the research group, but we would also ignore all those individuals who were invited to the research group but who chose not to join.
However, I am able to calculate the study’s completion rate. The web-based survey software QuestionPro collects basic statistics that indicate the completion rate among those who start the survey. The survey was started 4,709 times, yielding 4,016 completed surveys for a completion rate of 85.3 percent, comparable to rates achieved in other web-based studies. 8 Only 18 responses came from ineligible participants, leaving a total of 3,998 usable surveys. On average, respondents completed the survey in 12 minutes, and the high completion rate suggests this was a reasonable request of their time.
Speed of response represents a significant advantage of web-based surveys, 9 one that this research captures quite well. Figure 4 plots the number of completed surveys within the first month of its release. Although I kept the survey link active for 100 days, the vast majority of completed surveys arrived within days. 10 Members of Groups 2 and 3 received the survey link at Time 0; members of Group 1 received the survey link on Day 3. Just five hours after releasing the survey, I tallied 426 completed surveys, a little more than 10 percent of the total number of completed surveys. Within 5 days, the number grew to 2,788, 70 percent of the total. Within 10 days, the figure increased to 80 percent. Participation continued to taper off, and 90 percent of the data arrived by the end of the first month. 11

Figure 4. Number of completed surveys during the first month of survey release
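The counts reported in this section can be checked with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable names are chosen for illustration.

```python
# Counts reported in the text.
started = 4709           # times the survey was started
completed = 4016         # completed surveys
ineligible = 18          # completed surveys from ineligible participants
done_after_5_hours = 426
done_after_5_days = 2788

completion_rate = completed / started
usable = completed - ineligible
share_5_hours = done_after_5_hours / completed
share_5_days = done_after_5_days / completed

print(f"completion rate:     {completion_rate:.1%}")
print(f"usable surveys:      {usable}")
print(f"share after 5 hours: {share_5_hours:.1%}")
print(f"share after 5 days:  {share_5_days:.1%}")
```

Both shares are computed against all 4,016 completed surveys, which is the base the text uses for the "total number of completed surveys."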
To assess the representativeness of my sample, I compared the Facebook adults who were raised Catholic to their counterparts in the 2008 General Social Survey (GSS). Compared to the general population of U.S. adult Catholics and ex-Catholics, the Facebook sample is younger on average (30.5 years vs. 44.7 years in the GSS), more female (69.9 percent vs. 53.0 percent), and less likely to identify as Latino or Hispanic (5.7 percent vs. 32.5 percent). The Facebook sample is also much better educated and more religious. Two thirds of Facebook respondents have at least a bachelor’s degree (vs. almost one fourth of GSS respondents), and 65.9 percent claim to attend Mass at least once per week (vs. 26.7 percent in the GSS). Table 2 summarizes these comparisons.
Table 2. Sample Characteristics: Facebook Sample Versus General Social Survey (GSS) 2008 (adults raised Catholic)
Religious activity was especially striking among the Facebook males—27.6 percent report attending Mass more than once per week, versus 21.6 percent for females (Table 3). This result directly contradicts not only the GSS data but also a huge and varied body of research demonstrating greater religiosity among women than men for all aspects of religious behavior, all regions of the world, and all known eras (Miller and Stark 2002; Beit-Hallahmi 1997).
Table 3. Attendance at Religious Services by Gender: Facebook Samples Versus General Social Survey (GSS) 2008 (adults raised Catholic)
Column totals may not equal 100.0 percent because of rounding.
The sampling frame and sampling technique help explain the bias. Like Facebook users generally, the research sample is younger and better educated than the general population. The religiosity bias, however, likely stems from targeting Facebook groups of Catholics. While some groups cater specifically to inactive or “lapsed” Catholics, the majority appear to attract more devout individuals. The latter also appear to be larger and have more group activity. 12 From a sociological standpoint, this is to be expected. To the extent that a group exists to bind together those who share a common identity, the notion of a Facebook group of indifferent Catholics is an oxymoron: they have little to rally around. Thus, because I used the groups as my starting point for the snowball sample, I obtained volunteers who were disproportionately religious. This bias persisted even after weighting the data by age and gender (results not shown).
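Weighting by age and gender can be illustrated with simple post-stratification, in which each respondent's weight is the population share of his or her cell divided by that cell's share of the sample. This is one common approach and is assumed here for illustration; the text does not state which weighting procedure was used. The function name, the cells, and the population shares below are hypothetical and are not data from the study.

```python
from collections import Counter


def poststratification_weights(sample_cells, population_shares):
    """Return one weight per respondent: population share / sample share of the cell."""
    n = len(sample_cells)
    counts = Counter(sample_cells)
    return [population_shares[cell] / (counts[cell] / n) for cell in sample_cells]


# Hypothetical (gender, age group) cells for six respondents.
sample = [
    ("F", "18-29"), ("F", "18-29"), ("F", "18-29"),
    ("F", "30+"), ("M", "18-29"), ("M", "30+"),
]
# Hypothetical population shares for the same cells (they sum to 1).
population = {
    ("F", "18-29"): 0.10, ("F", "30+"): 0.40,
    ("M", "18-29"): 0.10, ("M", "30+"): 0.40,
}

weights = poststratification_weights(sample, population)
for cell, weight in zip(sample, weights):
    print(cell, round(weight, 2))
```

Weights of this kind correct only for the variables that define the cells. If respondents within every age-by-gender cell are more devout than their population counterparts, the religiosity bias survives the weighting, which is consistent with the result reported above.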
Preserving Correlations of Interest
In short, the Facebook respondents cannot possibly serve as a representative sample of the general Catholic population. (Pollsters should view Facebook findings with extreme caution.) However, like many researchers, I wanted to understand the relationships between certain population characteristics rather than the prevalence of individual characteristics. For example, rather than wanting to know how frequently Catholics attend religious services on average, I hoped to expose the factors that influence this behavior.
As others have shown, biased samples drawn from the web often preserve measures of statistical relationships quite well. For example, Best et al. (2001:143) separately analyzed a random telephone sample and an Internet sample and found that the same factors influence political attitudes in both. Consequently, the authors believe they “would have reached the same conclusions about the determinants of particular political attitudes by relying on a diverse convenience sample of Internet users as [they] would have by using a more expensive, time-consuming, probabilistic telephone sample.” Bainbridge (2007) also replicated correlations of interest found in the GSS with those in a nonprobability web-based sample.
But as both Best et al. (2001) and Bainbridge (1999) point out, not all nonrandom samples preserve correlations of interest. Specifically, researchers must be certain that the source of the bias in the data does not correlate with the relationships of interest. With Internet surveys, for instance, researchers must assume that the relationships of interest are not influenced by patterns of Internet use—an assumption that obviously fails when studying something like attitudes toward technology or Internet privacy.
Therefore, the relevant question of the current investigation is this: does the relationship between Catholic commitment and other variables of interest—the topic of my dissertation—correlate with Facebook usage? To address this question, Table 4 reports the Facebook correlations between variables of interest in my study and compares them to corresponding correlations derived from two high-quality probability-sample surveys: the GSS and the Gallup Poll of Catholics. As noted above, my dissertation explored the role of affective bonds in sustaining religious commitment. Because existing surveys do not gauge affective bonds (hence the Facebook study), I could not compare this exact relationship across samples. However, we might understand affective religiosity as a dimension of religiosity, much like religious orthodoxy and religious social networks are dimensions of religiosity (Lenski 1961). Consequently, the first nine variables in Table 4 offer the best available proxies for affective bonds. The outcome variable is a common measure of religious commitment—a dichotomous variable indicating whether the respondent claims to attend Mass at least weekly. The table also correlates Mass attendance with standard demographic controls, which have also been shown to influence religious commitment.
Table 4. Correlations With Current Religious Attendance
To make the data comparable to those of the Gallup study, the Facebook and GSS analyses are restricted to those who currently identify as Catholic.
Childhood salience, Catholic parents, and Catholic spouse were asked only in the 2008 GSS.
The Facebook survey posed these questions somewhat differently than did Gallup. For each item, Gallup asked, “Please tell me if you think a person can be a good Catholic without performing these actions or affirming these beliefs.” Response categories included yes, no, don’t know, and refuse. The Facebook respondents were asked to indicate the extent to which they believe Catholics are obliged to engage in these acts or hold these beliefs.
In the GSS, childhood religious salience is measured by frequency of Mass attendance at age 12; in the Facebook sample, it is measured by response to the question, “How important was religion in your family when you were growing up?” Studies show these measures of childhood religious salience correlate very highly.
p < .10. *p < .05. ***p < .001.
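For readers who wish to reproduce this kind of comparison, the sketch below computes a correlation between a dichotomous attendance indicator and a dichotomous predictor. It assumes Pearson's r, which the text does not specify, and the eight toy respondents and variable names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation; with two 0/1 variables this equals the phi coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical respondents: 1 = attends Mass at least weekly, 0 = does not.
weekly_mass = [1, 1, 1, 0, 0, 0, 1, 0]
# Hypothetical predictor: 1 = registered in a parish, 0 = not registered.
registered = [1, 1, 0, 0, 0, 1, 1, 0]

r = pearson_r(weekly_mass, registered)
print(round(r, 2))
```

Running the same function on each sample and comparing the sign and rough size of the coefficients mirrors the side-by-side layout of Table 4.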
As shown in Table 4, the Facebook data preserved the correlations between the variables of primary interest. In both Facebook and Gallup samples, respondents are more likely to attend Mass weekly if they expect Catholics to adhere to traditional church teachings, if they are registered in a Catholic parish, and if they believe Catholicism contains a greater share of the truth than other religions do. The matchup is equally good for Facebook and GSS.
The correlations between Mass attendance and many demographic variables are more difficult to evaluate because of frequent lack of agreement between the GSS and Gallup data. For example, the GSS data capture the tendency for women to be more religious than men, but both the Gallup and Facebook data fail to preserve this relationship. Gallup and the GSS similarly disagree over how education and Latino identity relate to religious attendance, again making it difficult to evaluate the corresponding correlations in the Facebook data.
One area where GSS and Gallup agree while Facebook differs is the relationship between age and Mass attendance. Although a strong positive correlation exists in both probability samples, the corresponding Facebook correlation is zero.
Closer analysis reveals a curvilinear relationship between age and attendance in the Facebook data (Figure 5). Although overall rates of Mass attendance are much higher in the Facebook sample than in the GSS and Gallup, the correlation for those aged 30 and older is positive and significant in all three samples (the Facebook correlation is .14, p < .001). A statistically significant, negative correlation (−.10, p < .001) exists for those between age 18 and 29 in the Facebook sample.

Figure 5. Relationship between weekly Mass attendance and age: Facebook, Gallup, and General Social Survey (GSS) samples.
Note: Smoothed means created in Stata using lowess.
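The age pattern described above, in which correlations of opposite sign within two age bands weaken the correlation in the pooled sample, can be checked by computing the correlation separately for each band. The following Python sketch illustrates the idea. The (age, attendance) pairs are made-up toy values, and the helper names `pearson` and `split_correlations` are illustrative assumptions; none of this is the study's data or code.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def split_correlations(rows, cutoff=30):
    """Return (overall, younger, older) correlations of age with attendance."""
    younger = [r for r in rows if r[0] < cutoff]
    older = [r for r in rows if r[0] >= cutoff]

    def corr(subset):
        return pearson([age for age, _ in subset], [att for _, att in subset])

    return corr(rows), corr(younger), corr(older)

# Hypothetical toy data: (age, attends weekly as 0/1). Not the study's data.
toy = [(18, 1), (20, 1), (22, 1), (24, 0), (26, 1), (28, 0),
       (30, 0), (40, 0), (50, 1), (60, 1), (70, 1), (80, 1)]

overall, younger, older = split_correlations(toy)
print(f"overall={overall:.2f} under 30={younger:.2f} 30 and older={older:.2f}")
```

In this toy example the under-30 correlation is negative, the 30-and-older correlation is positive, and the pooled correlation is weaker than the older group's. This is the kind of masking that motivates analyzing the two age bands separately.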
Social network theory may explain this unusual finding among young respondents. Contemporary research has repeatedly demonstrated that social ties and interaction are key to religious commitment (Cornwall 1989; Welch 1981). Applying this insight to the college setting, we might expect heightened religious involvement among those who join faith-based campus groups where social ties can flourish. However, after graduation religious commitment may suffer as these social networks dissolve. Anecdotal data lend credence to this hypothesis. An extensive study of young adult Catholics revealed that significant numbers of Catholic college students who were active in campus ministry were often frustrated after graduation because they found it difficult to find a parish experience as vital and engaging as the collegiate one (Hoge et al. 2001). This frustration may lead some to attend Mass less frequently. Traditional surveys may not detect this pattern of religious involvement if members of these faith-based campus groups compose a relative minority of the population. Perhaps the pattern is evident in the Facebook data because the sample overrepresents these individuals.
Certainly this is just a working hypothesis, but the data illustrate a case where Facebook usage does correlate with a pattern of Catholic commitment. If age were central to the investigation, Facebook sampling would not be a viable approach (at least for those younger than 30).
For the present study, however, the correlations that held were more important than those that failed. As noted, although the demographic variables were control variables in the model, the religion variables (the first nine variables in the table) offer the best available proxies for affect, the variable of interest in the study. Moreover, because of the unusual relationship between age and attendance among people younger than 30, I ran the dissertation model separately for younger (younger than 30) and older (30 and older) respondents and found very similar results. 13
Table 4 also reveals the limitations of probability-based studies and underscores the importance of considering cost–benefit trade-offs when selecting the best method of data collection. Although the Facebook data failed to preserve the correlation between gender and Mass attendance found in most empirical studies, this relationship was likewise absent in the Gallup data. Correlations between Mass attendance and both education and being Latino likewise differed in the GSS and Gallup samples. While these findings do not make the Facebook method superior to a probability-based technique, they highlight the fact that even national samples often fall short of true randomness (Bainbridge 1999, 2002), yet they are also quite expensive to conduct. In some studies of certain populations, a nonprobability approach may therefore offer an attractive cost–benefit trade-off.
In short, the viability of Facebook sampling depends on both the population of interest and the particular research question. Those interested in replicating this method are advised to first consider whether they can reasonably expect that Facebook usage does not influence the relationship between the variables of interest. If, ex ante, they anticipate that a Facebook sample will produce the same correlations found in the population, the Facebook approach may be worthwhile. Those who then choose to move forward can subsequently evaluate the veracity of this assumption by including questions from probability-based surveys. If the assumption holds, the researcher has a useful alternative to more traditional means of pretesting surveys or for conducting exploratory studies. Moreover, the preliminary results may provide good evidence in support of a working hypothesis that helps the researcher secure a grant for a traditional study. In the unfortunate case that the assumption fails, the monetary cost to the researcher is small because Facebook sampling is relatively inexpensive.
Discussion
Because my method of data collection appears to be unprecedented, I could not avail myself of any established procedures or “best strategies.” I relied instead on standard insights, personal intuition, and pure luck. I have summarized my experience in the hope of helping others exploit the unique strengths of sampling through SNSs while avoiding or at least understanding their limitations.
The researcher should bear in mind that the suggestions below are based on Facebook’s current functionality and popularity. At the present time, Facebook offers a useful platform for social-scientific research, but any one of many factors could easily undermine its utility. Although Facebook has experienced incredible growth since its inception, there is no guarantee that this will continue. Just as Facebook once usurped MySpace, so might another innovation undermine Facebook’s popularity. Furthermore, Facebook frequently changes its feature set, often with little or no announcement, such as when it removed its geography-based networks in 2009 (Facebook 2010). If Facebook were to similarly remove the “message all members” feature that is available to group administrators, this would eliminate what was the primary recruitment method for my study.
Facebook users can also alter the viability of the site for research by changing their privacy settings. In fact, concern about privacy has been a direct result of some of the changes to the Facebook site, including the addition of the “News Feed” in 2006 (Boyd 2008) and, more recently, changes to users’ default privacy settings (Boyd and Hargittai 2010). Although studies show considerable variation in how frequently users alter their privacy settings (Madden and Smith 2010; Stutzman and Kramer-Duffield 2010; K. Lewis, Kaufman, and Christakis 2008; Ellison et al. 2007), one longitudinal study finds that engagement with privacy settings is on the rise, having increased significantly between 2009 and 2010 (Boyd and Hargittai 2010).
For all of these reasons, scholars who plan to pursue a similar method are advised to evaluate the current features of the site before undertaking their investigations.
Recruitment
Keeping the size of my groups under 5,000 members was critical to the success of the study. As noted above, group administrators lose the ability to send mass messages to members when the group size exceeds this number. Had this occurred, I would have needed to rely on my volunteers to return to the group page to access the survey link when it became available. Like many Facebook groups, however, “Please Help Me Find Baptized Catholics!” had little activity. The groups that keep members coming back are those that keep the content fresh—frequently updating the group’s “Recent News,” posting new photos and videos, adding current events and links—so members have a reason to return. Even had I attempted this, there is little chance I would have captivated the interests of all 7,500 members, prompting them to return to the page on a regular basis. Therefore, closing the group well before the size reaches the 5,000 member threshold is a vital strategy for a successful study. 14
The size restriction also suggests the method of corralling study volunteers into a research group is not viable for large studies. I was able to gather 7,500 members only by creating multiple groups, but the process of managing membership quickly became inefficient. In theory, I could have acquired more volunteers by continually opening new groups and moving members across them, but I found this unmanageable. Therefore, I do not recommend this approach to those who seek more than a few thousand respondents.
Those with financial means might circumvent this problem by taking out a targeted Facebook ad. Researchers can target study volunteers by location, sex, age, keyword, relationship status, job title, workplace, or college. Facebook ads appear in the right-side column on Facebook pages, and up to three ads may show at one time on any given page. There is no set cost for Facebook ads; rather, advertisers compete with one another by bidding for advertising space. Rather than creating a research group, one could link the advertisement directly to the survey, so that members of the target audience who click on the ad are taken straight to the questionnaire.
The experience of at least one researcher suggests targeted ads provide a useful way to recruit survey respondents, although it remains unclear whether this approach leads to a more or less representative sample (unpublished work at The Association of Religion Data Archives http://www.theARDA.com). This researcher was interested in attitudes toward fate, and he set up a daily budget, which determined how many people would receive the ad. He budgeted $1,000 each day for 30 consecutive days and always reached the budget by day’s end. Although his sample was even more biased toward females and young respondents than was the sample in the Catholic study, 15 this disparity may reflect the topic of the investigation more than the recruitment method. For instance, perhaps women are relatively more interested in fate than in Catholicism and hence more willing to complete a survey about fate. In any case, the experience of the fate researcher points to a potential for heightened bias of which others should be cognizant, should they choose to use Facebook ads for recruitment.
Assessing and Mitigating Sample Bias
Researchers planning to use Facebook as a sampling frame can also take measures that enable them to assess, and perhaps mitigate, the bias that will necessarily arise. Above I noted that including several items in my survey pulled directly from two probability-based surveys enabled me to assess the extent to which Facebook usage correlated with relationships of interest. These benchmarks also enabled me to measure the extent of the bias in the data as well as to evaluate the ability of survey weights to improve their representativeness. Although weights were ineffectual in the present study, other researchers might achieve different results. Indeed, if the experiences of those who undertake more traditional web-based studies provide any guide, we should expect mixed success (Dever, Rafferty, and Valliant 2008; Loosveldt and Sonck 2008; Valliant and Lee 2005).
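To make the idea of survey weights concrete, the sketch below shows one common approach, post-stratification, in which each respondent is weighted by the ratio of a cell's benchmark share to its share in the sample. This is an illustration under stated assumptions, not necessarily the weighting scheme used in the study. The cell labels, counts, benchmark shares, and attendance rates are hypothetical, as are the function names.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight for each cell = benchmark (population) share / sample share."""
    total = sum(sample_counts.values())
    return {
        cell: population_shares[cell] / (count / total)
        for cell, count in sample_counts.items()
    }

def weighted_mean(values_by_cell, sample_counts, weights):
    """Weighted mean of a cell-level statistic across all respondents."""
    numerator = sum(
        values_by_cell[c] * sample_counts[c] * weights[c] for c in sample_counts
    )
    denominator = sum(sample_counts[c] * weights[c] for c in sample_counts)
    return numerator / denominator

# Hypothetical figures for illustration only; not taken from the study.
sample_counts = {"women": 700, "men": 300}      # respondents per cell
population_shares = {"women": 0.5, "men": 0.5}  # benchmark shares
attendance = {"women": 0.60, "men": 0.40}       # share attending weekly

weights = poststratification_weights(sample_counts, population_shares)
unweighted = (
    sum(attendance[c] * sample_counts[c] for c in sample_counts)
    / sum(sample_counts.values())
)
adjusted = weighted_mean(attendance, sample_counts, weights)
print(f"unweighted={unweighted:.2f} weighted={adjusted:.2f}")
```

Weights of this kind correct only for the variables used to define the cells. If respondents differ from the population in unmeasured ways, such as religious activity within each demographic cell, the weighted estimate can remain biased, which is consistent with the mixed results reported for web-based studies.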
The diversity of Facebook groups lends itself to a stratified sampling approach that may increase the representativeness of the sample. Earlier I mentioned that I sampled only groups that appeared representative of the target population: all individuals baptized in the Catholic Church in the United States. In retrospect, this was an ineffectual approach because the sampling frame itself (i.e., Facebook users) is biased in several ways, and my sample reflected that bias. Had I employed a stratified approach, I might have mitigated that bias by targeting specific groups, such as Latino Catholics and groups composed of older respondents.
Other SNSs?
One might also wonder whether expanding the sampling frame to include other SNSs would lead to a more representative sample. Although Facebook is currently the world’s largest SNS, many others are also quite large and appeal to a different demographic. According to data from compete.com, LinkedIn received more than 24 million unique visitors to its site in December 2011, while MySpace received 21 million. A recent study by Lenhart (2009:6) summarizes demographic variations in the types of people who tend to use each of these three SNSs. While both Facebook and MySpace appeal to a younger audience (median age of 26 and 27, respectively), MySpace users are more likely to be Hispanic or Black, to be female, and to have a high school education or some college. Facebook users tend to be White and to have a college degree. LinkedIn users are also likely to be White, male, and well educated, but this SNS appeals to working professionals who are relatively older (median age of 40). Perhaps I could have increased the representation of Latinos and older respondents by including MySpace and LinkedIn users, respectively.
How effectively I would have reached those individuals, however, remains uncertain. A variety of factors leave me pessimistic. Perhaps most important, neither LinkedIn nor MySpace equips group administrators with the “message all members” feature as Facebook does. Even individuals who expressed interest in the study would have to return to the group page to receive the survey link. Both LinkedIn and MySpace also lack the equivalent of Facebook’s “feeds” that transmit information passively and diffusely among networks. Facebook has more than seven times as many groups as MySpace, 16 leaving the researcher many fewer sources of potential volunteers on MySpace. Finally, I am unsure how interested LinkedIn users would have been in my study, given that the site exists primarily for professional networking.
In short, while in theory opening the sampling frame to include users of other SNSs should have increased the representativeness of my sample, in practice I believe I would have achieved limited success, given the nature of my study and its design. The extent to which other SNSs present viable options for other researchers depends largely on their needs and the scope of their studies.
Other Best Strategies and Recommended Uses
Releasing the survey in multiple waves can mitigate traffic-related problems and offers an opportunity to correct unforeseen problems. My experience offers an imperfect attempt to avoid these pitfalls. As displayed in Figure 1, when I first released the survey, I sent the link to approximately one third of all group members; the remaining volunteers received the survey after three days. Within minutes of disseminating the link in the first wave, I received an email from a volunteer identifying a problem with one of the questions that I had not caught, even after pretesting the survey. 17 I corrected the problem immediately, but several minutes passed before the change took effect, presumably because of the large volume of individuals on the website at the time. Fortunately, the erroneous item was not critical to my primary analysis (I did not use the variable in my model), so it was not necessary for me to exclude these cases. Future researchers can learn from this mistake by releasing the survey to only a few dozen respondents initially. This experience offers another lesson as well: researchers conducting surveys via the web must be mindful of the limitations of the web-based software they choose.
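The staged release recommended above, a small pilot batch followed by larger waves, might be planned with a helper like the following Python sketch. The function name `release_waves`, the batch sizes, and the volunteer list are hypothetical; the sketch only partitions a list of volunteers and does not interact with Facebook or any survey software.

```python
def release_waves(volunteers, pilot_size=30, wave_size=1000):
    """Split volunteers into a small pilot batch followed by larger waves."""
    pilot = volunteers[:pilot_size]
    rest = volunteers[pilot_size:]
    # Chunk the remaining volunteers into consecutive waves of wave_size.
    waves = [rest[i:i + wave_size] for i in range(0, len(rest), wave_size)]
    return ([pilot] + waves) if pilot else waves

# Hypothetical volunteer IDs; the sizes are illustrative assumptions.
volunteers = list(range(2530))
waves = release_waves(volunteers)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {len(wave)} volunteers")
```

Sending the first batch alone and waiting for feedback before releasing the next gives the researcher a chance to correct a faulty item while only a few dozen responses are affected.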
The Facebook method has other potential uses besides exploratory studies. Given its low cost, the quick turnaround time, and the ability to revise items on the fly, I strongly recommend the use of Facebook for pretesting survey items.
Facebook sampling may also be a viable option for research of some hidden populations. Because it is a chain-referral method, Facebook sampling provides access to some populations that are absent from standard samples because they are too small or too difficult to reach. Consider a hypothetical study of members of one Protestant denomination: the Lutheran Church–Missouri Synod. A researcher who turns to the 2008 GSS will find just 30 members of this group, but I found more than 14,000 members spread out over 16 Facebook groups. Suppose instead that a health researcher is trying to locate victims of thyroid cancer and their families. A quick keyword search returned several dozen thyroid cancer groups on Facebook composed of thousands of members. Thanks to widespread use and niche groups, Facebook offers researchers a way to easily reach many otherwise hidden populations.
However, the Facebook approach may not work when the hard-to-reach population is stigmatized. Many of these groups—prostitutes and drug users, for example—are unlikely to have a public presence on Facebook or other SNSs, so it would be difficult for a researcher to locate them. Even if one could identify the population of interest, establishing trust in an online medium would likely pose quite a challenge to the researcher. For example, I found several groups of Jehovah’s Witnesses, but the administrators of these groups may be skeptical of an unsolicited message like the one I sent to my Catholic groups. However, if the researcher had a preexisting relationship with a member of the stigmatized group, then Facebook or another SNS may still provide a useful platform to navigate close, intimate networks.
Although the present study restricted the sampling frame to the United States, Facebook is an increasingly attractive option for international research as well, particularly in nations with widespread Internet use. Among today’s 845 million active Facebook users, 80 percent live outside the United States (Facebook 2012). Although international research greatly increases the cost of conducting survey research via traditional methods like mail and telephone, the Facebook method poses no additional costs. The best candidates for international research are nations with high rates of Internet diffusion generally and high rates of Facebook use specifically. Coverage error becomes a greater concern as the penetration rate falls.
As a final note, my research project aroused no serious concerns from the Institutional Review Board (IRB) at my university. Anonymity, confidentiality, and secure data transmission present challenges to researchers who collect data via the Internet. The potential to record the IP address threatens to reveal the identity of respondents, and the data are susceptible to hacking and corruption (Benfield and Szlemko 2006). I was able to allay human rights concerns with limited effort. In terms of the survey itself, my IRB simply required that my survey begin with an informed consent page and that I ask respondents to indicate that they were at least 18 years old. The board also required me to submit information on the privacy and data security measures taken by QuestionPro, the company that provided the web-based survey software for the study. Presumably, the ease with which I received IRB approval reflected the safety and security of QuestionPro software rather than a lax review process. 18
Conclusion
Over 35 years ago, Mark Granovetter (1973:1371) illustrated how weak ties transmit information more quickly and more diffusely than do strong ties. Those to “whom we are weakly tied are more likely to move in different circles from our own and thus will have access to information different from that which we receive.” Weak ties are the bridges between small clusters of close friends, linking us together to form an elaborate web of social relationships.
Facebook offers researchers a way to capitalize on the strength of these weak ties. According to a recent study, the average adult Facebook user has 229 friends (Hampton et al. 2011). While some of these relationships constitute “strong ties,” many are also acquaintances, old friends from high school or college, even total strangers (DiMicco et al. 2008; Joinson 2008; Lampe, Ellison, and Steinfield 2008). Millions of Facebook groups also exist on the site, linking users to countless others whom they do not even know.
The present study offers a modest attempt to exploit these features, which make Facebook a useful tool for snowball sampling. By navigating through existing groups of Catholics and by tapping into personal friends, I recruited nearly 4,000 individuals to participate in my research study. Data collection was extremely fast and incredibly cheap. Within five days 2,700 individuals completed the survey, and a modest license fee for the web-based software was the only expense. The data also preserved the statistical relationships between the variables of interest, despite being biased in several ways. While snowball sampling via Facebook is no substitute for probability-based techniques, the fact that the relevant correlations among variables hold suggests Facebook may be a valuable tool for exploratory research of certain populations.
Because the present study is the first of its kind, it remains unclear how broadly this technique can be applied. As similar studies of other populations are conducted, we can better evaluate the merits of Facebook as a general tool for social-scientific research.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
