Abstract
Mixed methods researchers share a commitment to knowing their sampling frames and minimizing discovery failure, especially when using surveys. Notwithstanding advances in sampling strategies, the geographic clustering of perceptions has not been fully considered for its relevance to sampling. This article examines the value of spatial autocorrelation analysis to guide sampling decisions. Spatial autocorrelation refers to the clustering of (dis)similar phenomena and signals the likely existence of perception subgroups. Through a spatial autocorrelation analysis of Dallas, Texas, the authors identify sampling frames for collecting data about perceptions of West Nile Virus eradication measures. They furnish some empirical confirmation of the geographic clustering of perceptions and argue for designs that identify perception clustering, which can affect qualitative sampling as well as advance the integration of quantitative and qualitative research.
Although significant progress has been made in advancing mixed methods (MM) designs across the social and behavioral sciences (Albright, Gechter, & Kempe, 2013; Creswell, Klassen, Plano Clark, & Smith, 2011; Creswell & Plano Clark, 2007; Fielding & Cisneros-Puebla, 2009; Greene, Benjamin, & Goodyear, 2001; Hesse-Biber, 2010; Johnson & Onwuegbuzie, 2004; Palinkas et al., 2011; Terrell, 2012), efforts to understand the geographic clustering of perceptions, the theoretical mechanisms driving this clustering, and the implications of this clustering on approaches to sampling have been limited (Curtis, Gesler, Smith, & Washburn, 2000; Trotter, 2012). MM researchers originating from different paradigms (positivist, constructivist) share a commitment to knowing their sampling frames and increasing their confidence in minimizing discovery failure. This commitment is strongest in survey-based research (nested in different MM designs) crafted to elicit the widest range of possible views about topics relevant to the general public.
Efforts to create strong MM research designs have been aided by new taxonomies that help envisage the ways in which qualitative data could “mix” with quantitative data to furnish the best answer to a given research question or set of questions. Among a variety of options, for example, qualitative data could be collected with a purposive sample to refine measures that would be used to collect a probability-based sample. Alternatives include using qualitative data to explain quantitative findings or to complement quantitative findings by answering the same research question qualitatively (Aarons, Fettes, Sommerfeld, & Palinkas, 2012; Creswell & Plano Clark, 2007; Johnson & Onwuegbuzie, 2004; Palinkas et al., 2011). Alongside the development of such taxonomies, several influential methodologists have elucidated a range of sampling approaches (Sandelowski, 2000; Tashakkori & Teddlie, 2010). Teddlie and Yu (2007), for example, provide guidance about various probabilistic and purposive designs, and the time ordering of each (i.e., sequential or concurrent), acknowledging that research resources are always limited and “[s]ampling issues are inherently practical” (Kemper, Stringfield, & Teddlie, 2003, p. 273).
Notwithstanding this richness of methodological guidance, the potential for a geographic understanding of perception clustering to inform sampling decisions has not been fully explored. This knowledge gap exists despite widely known theories in urban ecology (Schelling, 1969, 1971) and empirical analyses in geospatial sciences demonstrating the existence of perception clustering in particular attitudinal domains, including political ideology. Building on this latter work, this article describes and applies a new technique for identifying potential perception clusters at the outset of MM studies with survey components. We further identify the untapped potential for qualitative elements within MM designs to explain the underlying mechanisms determining perception clustering.
The empirical focus of this article is a Dallas-based pilot study aimed at exploring whether analyses of spatial autocorrelation (SA) could aid in the identification of perception subclusters. SA refers to the geographic clustering of (dis)similar phenomena. In quantitative geography, phenomena ranging from socioeconomic/demographic characteristics of residents, to health/disease, to crime are tagged to locations on the earth’s surface. Whereas positive SA describes similar phenomena clustering together, negative SA describes dissimilar phenomena clustering together. Overwhelmingly, the most commonly observed SA for socioeconomic/demographic phenomena is moderate and positive in nature. This clustering is in contrast to a random mixture of phenomena across a geographic landscape. The clustering can be indexed by a correlation coefficient. Because this clustering is in geographic space, this correlation is attributable to relative locations, and hence is spatial. Because a geographic distribution involves a single phenomenon, this geographic relationship between nearby values is self-correlation (i.e., autocorrelation; Griffith, 1992). The Nobelist Schelling’s (1969, 1971) model furnishes one of the principal conceptualizations for expecting subgroup perceptions to cluster in geographic space, as indexed by positive SA.
The next section provides a brief review of the established sampling literature relevant to research with survey components. It then explains the potential value of geographic analyses of perception clustering to the growing body of guidance about MM sampling designs. The subsequent section illustrates this value through application of an SA analysis to a study of public perceptions in Dallas, Texas, surrounding government efforts to eradicate West Nile Virus (WNV). The findings of this exploratory study suggest that geography is not determining people’s perceptions, but rather, that a particular autocorrelation mechanism often is inducing patterns in the geographic distribution of perceptions. We therefore argue that MM designs incorporating qualitative components either sequentially or concurrently need to acknowledge this context and could further illuminate and explain such mechanisms and their impacts across a range of topics in areas such as public health and public safety.
Geographic Perception Clustering and Its Implications for Sampling
Survey research seeks to elicit the maximum number of views about a topic, often in terms of either percentages holding or discovery of these views, in reference to a clearly specified population. In efforts to achieve both representativeness and comparability in their results, researchers strive to identify all theoretically relevant attributes of the subject pool to delineate a sampling frame. In turn, decisions are made among a range of sampling strategies designed to ensure the judicious use of research resources (Teddlie & Tashakkori, 2009). Such strategies include simple random sampling, stratified random sampling (proportional and nonproportional), systematic sampling, and cluster random sampling (Lee & Forthofer, 2006, Chap. 2).
Geography plays a varying, if not indirect, role in these extant strategies. It is the least relevant to simple random sampling, where each individual in the population has an equal chance of recruitment. Stratified sampling focuses on theoretically relevant attributes at the group level (e.g., gender), which then establish strata from which individuals are recruited either proportionately (based on their level of representation in the wider population) or nonproportionately. Systematic sampling can exploit geography by selecting every kth subject along a street (i.e., sequencing the population by address). Cluster sampling moves closer to geography as the unit of analysis with its focus on groups that emerge organically at a particular spatial unit, such as neighborhoods or organizations (e.g., schools). In both simple cluster and multistage cluster sampling, the clusters are selected, in the first instance, through randomization. Within each cluster, all individuals within a group may be recruited, or people may be recruited randomly (Teddlie & Tashakkori, 2009).
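The four strategies just described can be contrasted in a short sketch. The code below is illustrative only: the function names, the fixed seeds, and the representation of a population as a Python list are assumptions made for this example and are not part of the cited sampling literature.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Each unit has an equal chance of selection; geography plays no role."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, k, start=0):
    """Every k-th unit of an ordered list (e.g., addresses along a street)."""
    return population[start::k]

def proportional_stratified_sample(population, stratum_of, n, seed=0):
    """Recruit from each stratum in proportion to its share of the population.
    Rounding the quotas can make the total differ slightly from n."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        quota = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample

def cluster_sample(clusters, n_clusters, seed=0):
    """Select whole clusters at random, then recruit everyone within them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]
```

In this sketch only `systematic_sample` and `cluster_sample` use location at all, and neither uses information about how similar neighboring units are, which is the gap the SA-based approach is meant to address.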
Building on these strategies, the approach discussed here moves closer to the link between geography and the clustering of perceptions to improve the precision and efficiency of sampling. In particular, it offers a systematic technique for estimating the spatial clustering of perceptions that, in the future, could be integrated with qualitative research to explain the mechanisms determining such clustering. Tobler’s (1970) first law of geography is that “[e]verything is related to everything else, but near things are more related than distant things” (p. 234). Perceptions, too, are not randomly mixed in local areas. Rather, people with similar perceptions tend to concentrate in geographic space. This notion is becoming more widely recognized by social science researchers. Motyl, Iyer, Oishi, Trawalter, and Nosek (2014, p. 1) argue that “individuals choose to live in communities with ideologies similar to their own,” signaling the presence of positive SA. In particular, studies by these researchers indicate that residential mobility tends to produce an increase in the degree of positive SA and ideological clustering. They note that, given the geographic distribution of ideology by ZIP codes, individuals’ views tend to better conform to local views in their current than in their past locations. The Economist (“The Politics of North and South,” 2013) recognizes this geographic concentration of perceptions in England by commenting that
The north [of England] has wealthy suburbs, like South Wirral, west of Liverpool. They vote Labour. The south has impoverished pockets, like north-east Kent. They vote Conservative. It is as though political opinions derive from the air people breathe. (p. 16)
The Schelling model describes this situation: Simulations that begin with a random mixture of households across locations, and that give each household a preference for locating near other households with similar attributes, result in increasing clustering of similar households over time. Griffith (2013) outlines some qualitative sampling design implications based on this model.
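A minimal simulation in the spirit of the Schelling model can make this dynamic concrete. The sketch below is a simplification under stated assumptions, not Schelling's original specification: two groups labeled "A" and "B", a square grid with no wrap-around, a tolerance `threshold`, and dissatisfied households relocating to a randomly chosen vacant cell. All names and default values are hypothetical, and `vacancy` must leave at least one empty cell.

```python
import random

def neighbors(grid, r, c):
    """Values of the up-to-eight cells surrounding (r, c); no wrap-around."""
    n = len(grid)
    return [grid[r + dr][c + dc]
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr or dc) and 0 <= r + dr < n and 0 <= c + dc < n]

def share_similar(grid, r, c):
    """Share of occupied neighboring cells holding the same group as (r, c)."""
    occupied = [v for v in neighbors(grid, r, c) if v is not None]
    if not occupied:
        return 1.0
    return sum(v == grid[r][c] for v in occupied) / len(occupied)

def mean_similarity(grid):
    """Average of share_similar over all households (a crude clustering index)."""
    n = len(grid)
    cells = [(r, c) for r in range(n) for c in range(n) if grid[r][c] is not None]
    return sum(share_similar(grid, r, c) for r, c in cells) / len(cells)

def simulate(n=20, vacancy=0.1, threshold=0.5, sweeps=30, seed=1):
    rng = random.Random(seed)
    n_each = (n * n - int(n * n * vacancy)) // 2
    cells = ["A"] * n_each + ["B"] * n_each + [None] * (n * n - 2 * n_each)
    rng.shuffle(cells)  # random initial mixture of households
    grid = [cells[i * n:(i + 1) * n] for i in range(n)]
    start = mean_similarity(grid)
    for _ in range(sweeps):
        moved = False
        for r in range(n):
            for c in range(n):
                if grid[r][c] is None or share_similar(grid, r, c) >= threshold:
                    continue
                # Dissatisfied household relocates to a random vacant cell.
                vacant = [(i, j) for i in range(n) for j in range(n)
                          if grid[i][j] is None]
                i, j = rng.choice(vacant)
                grid[i][j], grid[r][c] = grid[r][c], None
                moved = True
        if not moved:
            break
    return grid, start, mean_similarity(grid)
```

Comparing the second and third returned values shows how the average share of like neighbors changed between the random starting map and the final map for a given set of assumed parameters.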
Geography, in terms of proximity, appears to eclipse socioeconomic/demographic differentiation. But geography is not determining people’s perceptions. Rather, some positive SA mechanism often is determining the geographic distribution of perceptions. This perception clustering exists beyond the domain of political ideologies. For instance, the return of outbreaks of nearly eradicated childhood diseases (e.g., scarlet fever, measles, polio, whooping cough, and mumps) is attributable to growing geographic clusters of parents who delay or refuse standard vaccinations for their children,
creating communities where vaccination rates may have dropped below the levels needed to keep infectious diseases at bay. Some of the lowest rates occur in affluent, well-educated communities like Boulder, [CO], and Marin County, [CA], where parents [often focus] on being environmentally conscious and paying close attention to every aspect of their children’s development. (Mnookin, 2012, p. 9)
The statement that “Sometimes, when you’re surrounded by people who think the way you think, it’s easy to believe” (Mnookin, 2012, p. 9) highlights positive SA at work here, too.
The presence of positive SA has implications for MM research with survey components designed to elicit the widest range of possible views about themes relevant to the general topic as well as understand the mechanisms driving perception differences. The procedure discussed here seeks to build on and advance existing MM sampling strategies designed to achieve representativeness or comparability by offering a systematic approach to identifying where perceptions cluster and why. This procedure addresses a larger concern of MM researchers originating from different paradigms (e.g., positivist, constructivist) about understanding their sampling frames and increasing levels of confidence pertaining to when, where, and how discovery failure is minimized. The following sections describe an effort to use SA analysis as a basis for sampling designs seeking to reduce the probability of discovery failure. The notion of interest here is that as groups become more homogeneous, the range and diversity—and hence variability—of their perceptions tend to shrink; positive SA furnishes one index of this shrinkage. Logan and Zhang (2004) also suggest this perspective. Goals of other purposeful sampling designs, such as ones based on maximum variation, have this same concern, and sampling in homogeneous neighborhoods may be ill-advised for them, too.
Quantifying Spatial Autocorrelation: The Pilot Study in Dallas, Texas
The authors conducted an experiment concerning the role SA plays in sampling designed to uncover perceptions, in this case, about local government interventions to control the spread of WNV infections, a controversial topic in Dallas, Texas, during 2012 to 2013. This experiment begins with a census tract resolution factorial ecology of Dallas County socioeconomic/demographic variables extracted from the 2010 U.S. Census and its equivalent American Community Survey data to quantify SA. Next, local indices of SA (LISAs) calculated for factor scores from a factorial ecology 1 allow the use of concomitant geographic clusters to identify potential neighborhoods for survey administration. Overlaying Dallas County street maps on selected clusters furnishes the basis for selecting neighborhoods to canvass. Finally, an analysis is presented of the sequencing of perceptions collected with the survey to highlight the potential for MM designs to further illuminate mechanisms driving such sequencing on this topic or a range of others of significance to the public.
The study summarized here sets out to quantify SA as a basis for determining a sampling frame designed to generate a range of prevailing perceptions about measures to eradicate the outbreak of WNV. Geospatial scientists employ many procedures to analyze spatial data, some of which are multivariate statistical techniques used to display and analyze the principal dimensions measured by a suite of variables (Rees, 1971). Factorial ecology (Irwin, 2010) is one methodology built on this approach. Its goal is to summarize a massive amount of multicollinear data (where predictor variables are correlated) for urban spatial units, and relate the uncovered empirical dimensions to theories of urban spatial structure. The factorial term refers to the factor analysis multivariate statistical technique used in this type of analysis to identify latent variables that explain variability exhibited by a set of correlated variables (Johnson & Wichern, 2002). Factors, which are explanatory or concise descriptive elements of a population, may describe patterns of association among variables or attributes across measurements for observations such as areal units. These patterns are identified in a way that groups subsets of variables, with these groupings tending to maximize correlations within the groups and minimize correlations between the groups. The ecological term refers to an analysis of summary statistics for grouped data in general, geographically grouped (i.e., areal unit) data in particular, and urban census tracts in this case. The next section summarizes details of the factorial ecology for Dallas, Texas (see the appendix). For this study, SA in the factor scores supports the identification for sampling purposes of neighborhoods with certain homogeneous features.
Factorial Ecology
Factor scores are observation (e.g., census tracts) measures for artificial variates constructed as weighted sums of original variables that adjust for correlations among these original variables. Moran coefficients were calculated to index the global nature and degree of SA latent in the Dallas, Texas, factor score census tract maps. Because the census tracts comprise a set of contiguous polygons, neighboring census tracts were defined as those sharing a common nonzero length edge (i.e., contiguity edges only in ArcMap jargon, and the rook’s definition in terms of a chess move analogy). Table 1 summarizes these global SA measures, which tend to be moderate in degree and positive in nature; this is a common result for socioeconomic/demographic data. All of these index values are statistically significant.
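The global index described above can be sketched in a few lines. The example assumes binary rook contiguity (shared edges only) and uses a regular grid as a stand-in for census tract polygons; the function names are hypothetical, and the study's own calculations were made on the 529 Dallas tracts rather than on a grid.

```python
def rook_neighbors(n_rows, n_cols):
    """Binary rook contiguity for a regular grid: cells sharing an edge."""
    nbrs = {}
    for r in range(n_rows):
        for c in range(n_cols):
            candidates = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            nbrs[r * n_cols + c] = [rr * n_cols + cc for rr, cc in candidates
                                    if 0 <= rr < n_rows and 0 <= cc < n_cols]
    return nbrs

def moran_coefficient(values, neighbors):
    """Global Moran coefficient with binary weights:
    (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, where z_i = x_i - mean."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(len(js) for js in neighbors.values())  # sum of all weights
    cross = sum(z[i] * z[j] for i, js in neighbors.items() for j in js)
    return (n / s0) * cross / sum(d * d for d in z)

chain = rook_neighbors(1, 4)                       # four units in a row
print(moran_coefficient([1, 1, 0, 0], chain))      # similar values adjacent
print(moran_coefficient([1, 0, 1, 0], chain))      # dissimilar values adjacent
```

For the four-unit chain, the first map (like values side by side) yields a positive coefficient of one third, and the alternating map yields -1, matching the definitions of positive and negative SA given earlier.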
Table 1. Global Moran Coefficients for the Rotated Factors Based on the Box–Cox Transformed Data.
Because the ultimate goal here is to identify neighborhoods for surveying, once factor scores were calculated and choropleth maps of them constructed for the 529 Dallas census tracts, LISAs 2 were computed in order to identify clusters of extreme positive and negative SA (Anselin, 1995). A LISA essentially is the individual observation calculation that goes into a global Moran coefficient calculation for the entire Dallas map; for Dallas, a Moran coefficient essentially is an average of 529 LISAs. This type of analysis is a favored hot spot identification method, identifying clustering pairs of neighboring values that are HH or LL. Respectively, these are clusters of census tracts that contain high factor scores surrounded by high factor scores, or low factor scores surrounded by low factor scores. Both indicate positive SA and comprise census tract values that bolster the nature and degree of SA indexed by the global Moran coefficient value for a map (i.e., they closely align with a positive sloping scatterplot trend line). Substantively, these clusters identify regions on a map with similar aggregate statistics. For the density data (Figure 1), HH clusters tend to be near the center of the map (near downtown Dallas), whereas the LL clusters tend to be on the periphery, but with a preponderance of them in the southeast corner of the county. Meanwhile, for the percentage data (Figure 2), clusters of high factor scores (HH) still tend to be near the center of the map (near downtown Dallas), whereas clusters of low factor scores (LL) tend to be on the periphery, but without displaying a comparable concentration in any one part of the county.
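The relationship between local and global indices, and the HH/LL labeling, can be illustrated as follows. This is a sketch under assumptions: it uses row-standardized weights (so that the mean of the local values equals the global coefficient computed with those same weights), it requires every unit to have at least one neighbor, and it omits the significance testing that a full LISA analysis (Anselin, 1995) applies before declaring a cluster. The function name and the label "none" are hypothetical.

```python
def local_moran(values, neighbors):
    """Local Moran statistics with row-standardized weights.

    Returns (lisa, labels): lisa[i] = z_i * lag_i / m2, where lag_i is the
    mean deviation among unit i's neighbors and m2 is the variance of values.
    labels[i] is the scatterplot quadrant: HH, LL, HL, LH, or "none".
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n
    lisa, labels = [], []
    for i in range(n):
        lag = sum(z[j] for j in neighbors[i]) / len(neighbors[i])
        lisa.append(z[i] * lag / m2)
        if z[i] > 0 and lag > 0:
            labels.append("HH")      # high value surrounded by high values
        elif z[i] < 0 and lag < 0:
            labels.append("LL")      # low value surrounded by low values
        elif z[i] > 0 and lag < 0:
            labels.append("HL")
        elif z[i] < 0 and lag > 0:
            labels.append("LH")
        else:
            labels.append("none")    # value or lag exactly at the mean
    return lisa, labels

# Six units in a row: three high scores followed by three low scores.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
lisa, labels = local_moran([5, 5, 5, 1, 1, 1], chain)
print(labels)                  # quadrant label for each unit
print(sum(lisa) / len(lisa))   # mean of the local values
```

In this toy map the two units at each end fall in the HH and LL quadrants, while the two units at the boundary between the groups have a neighbor average exactly at the mean and receive no label.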

Figure 1. Factors for transformed density data. Top left (a): Factor 1. Top right (b): Factor 2. Bottom left (c): Factor 3. Bottom right (d): Factor 4.

Figure 2. Factors for transformed percentage data. Top left (a): Factor 1. Top right (b): Factor 2. Middle left (c): Factor 3. Middle right (d): Factor 4. Bottom left (e): Factor 5.
Survey Topic Selection
The survey administration experiment required the identification of contrasting neighborhoods with substantial degrees of SA, and a survey topic of which people are aware, and for which people should hold a variety of perceptions. The chosen topic was the recent Dallas, Texas, disease outbreak of WNV. The U.S. Centers for Disease Control and Prevention recognizes WNV as a very costly disease (Staples, Shankar, Sejvar, Meltzer, & Fischer, 2014). A sizeable 2012 outbreak highlighted Dallas as an epicenter (Murray, Ruktanonchai, Hesalroad, Fonken, & Nolan, 2013), creating considerable local media coverage of this disease (Merchant, 2013). Given this context, the subject of WNV prevention appeared to furnish a good topic for surveying perceptions about government intervention to control and prevent it.
WNV is a mosquito-borne neuropathogen (Petersen & Marfin, 2002) that first appeared in the United States in 1999, with the first case in New York City. Since then, this disease has diffused across most, if not all, of the country (Centers for Disease Control and Prevention, 2014; Griffith, 2005). The main method of controlling the spread of WNV is mass spraying of pesticides to kill its mosquito vector (Karpati et al., 2004), an approach that has been controversial. Spraying controversies are one reason the appearance of this disease receives considerable media coverage and creates a range of perceptions about government intervention to control and prevent it.
Neighborhood Selection
Logan and Zhang (2004, p. 124) comment that “neighborhoods typically extend across many census tracts, and little error is introduced by spatial variation within tracts,” concluding that census tracts furnish a convenient areal unit resolution for identifying neighborhoods. These researchers also note that LISAs provide a useful tool for identifying such neighborhoods, a tool yielding similar results to those obtained with traditional tools used by urban sociologists. 3 Consequently, calculating and mapping LISAs for the preceding factorial ecology, which is based on census tracts, produces a sound basis for identifying neighborhoods with particular degrees of SA for a survey administration.
Those neighborhoods of interest here have the highest concentrations of clustered high factor scores (HH) and clustered low factor scores (LL). Moreover, these two clusterings identify neighborhoods at opposite ends of the synthetic household attributes measurement scale. The factorial dimensions, which are adjusted for the presence of correlation among the original variables, were combined to create a composite LISA map identifying neighborhoods exhibiting simultaneous clustering of extreme factor scores (i.e., HH and LL clusters; Figure 3).

Figure 3. Tracts for survey administration. Left (a): overlay merging of factor dimension LISA maps to identify the combination with the most HH and LL scores. Right (b): A zoom-in to the part of Dallas County with the selected tracts (circled), overlaid on a street map, for survey administration.
Figure 3 highlights the candidate neighborhoods from which two tracts were selected for survey purposes. Of the low factor score cluster (LL) tracts, none were identified as having the characteristics of all nine factors. Some tracts contained four of the nine possible combinations of factors. Three factors for both data sets (i.e., percentage and density) were identified for the high factor score cluster (HH) tracts, with three tracts containing that number of factors. One tract was selected from the subset of three overlaid high factor score clusters (HH), and one tract was selected from the subset of four overlaid low factor score clusters (LL). The survey was administered in these two selected neighborhoods (Figure 4).
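The overlay step amounts to counting, for each tract, the number of factor maps on which it carries a given cluster label, and then retaining the tracts with the highest count. A hedged sketch follows; the tract identifiers, factor names, and labels are invented for illustration and are not the Dallas results.

```python
from collections import Counter

def composite_counts(labels_by_factor, target):
    """Count, per tract, how many factor maps assign it the target label."""
    counts = Counter()
    for labels in labels_by_factor.values():
        for tract, label in labels.items():
            if label == target:
                counts[tract] += 1
    return counts

def top_tracts(counts):
    """Tracts sharing the highest overlay count (candidates for surveying)."""
    if not counts:
        return []
    best = max(counts.values())
    return sorted(t for t, k in counts.items() if k == best)

# Hypothetical LISA labels for three tracts on three factor maps.
labels_by_factor = {
    "density_f1": {"t1": "HH", "t2": "LL", "t3": "HH"},
    "density_f2": {"t1": "HH", "t2": "LL", "t3": "none"},
    "percent_f1": {"t1": "HH", "t2": "HL", "t3": "LL"},
}
print(top_tracts(composite_counts(labels_by_factor, "HH")))
print(top_tracts(composite_counts(labels_by_factor, "LL")))
```

Ties are returned together, which mirrors the situation described above in which several tracts shared the maximum number of overlaid clusters and one was then chosen from that subset.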

Figure 4. Google images of the selected neighborhoods for surveying. Left (a): a HH neighborhood slightly north of downtown Dallas. Right (b): a LL neighborhood near the Cotton Bowl.
The factorial ecology–SA analysis revealed two neighborhoods suitable for the sampling experiment, one a high factor score (HH) cluster and the other a low factor score (LL) cluster (Figure 4). The former neighborhood contains mostly condominiums, duplexes, townhomes, apartments, and a few small lot homes. This neighborhood houses a primarily Caucasian population, with more than half of the inhabitants being working class and having an age ranging between 20 and 40 years.
This neighborhood’s population experiences a relatively low unemployment rate together with relatively high median income earnings and education levels. In contrast, the low factor score cluster (LL) neighborhood contains a larger African American population, mostly houses with a few apartment complexes, and a lower population density. A higher percentage of the population is under 20 years of age, with a more uniform spread of persons over the full range of ages. This neighborhood also has a relatively higher unemployment rate and relatively lower income and education levels.
For the survey instrument, a variety of media sources and local newspapers were consulted to identify the set of perceptions one would expect to encounter when surveying Dallas residents. Once county health departments declared a WNV epidemic, we scoured metro-wide and local neighborhood distributed hardcopy newspapers and web-available digital news sources for coverage of government responses to this crisis, including The Dallas Morning News (August 13, 2012; August 17, 2012), The Huffington Post (November 4, 2013), and Neighborsgo.go—Allen|Frisco|McKinney (June 5, 2013). This approach is in keeping with the tradition of content analysis of newspaper articles. But, because perceptions are of primary interest here, our review involved both articles and “letters to the editor.” A first screening was based on titles, focusing on the following keywords/phrases: WNV and mosquito spraying. We laboriously reviewed the content of every news article or published letter we could identify during the time frame of the crisis. Having only two researchers involved in this process simplified coordination to ensure that both employed the same selection rules and content coding. Twelve perceptions were gleaned from stories and surveys about perceptions of how government agencies dealt with the 2012 and 2013 WNV outbreaks. Next, for survey purposes, these perceptions were noted on a questionnaire, 4 which also included an additional option of other perceptions not identified during the examination of media articles (the presentation here is as it appears on the survey instrument):
_____ ineffective and leaving the public vulnerable to contracting WNV.
_____ does not matter because nothing can be done if I am predestined to contract WNV.
_____ adequately protecting my family and me from getting WNV.
_____ adequately protecting others from getting WNV.
_____ spraying harmful chemicals that hurt my family.
_____ spraying harmful chemicals that hurt my pets.
_____ spraying harmful chemicals that hurt the homeless.
_____ spraying harmful chemicals that pollute my landscaping and family/community gardens.
_____ harming the economy (e.g., terminates outdoor dining at restaurants, harms bees that produce honey).
_____ implementing an intelligent intervention that avoids spraying harmful chemicals.
_____ fails to adequately publicize how to protect against contracting WNV.
_____ mosquitoes have a right to live, too.
_____ other ______________________________________________________________
This survey was administered in the two selected neighborhoods to see what perceptions were uncovered. Respondents were presented with a page that listed the question and the 13 responses, and asked to indicate the ones with which they agreed, including “other” (“do not know” occurred as a response here; either a respondent entered this response or had the survey administrator enter it). Because the question is open-ended, respondents were able to provide oral responses other than those exactly stated on the survey form. These open-ended responses are noted in Table 2, but were not analyzed explicitly for our purposes here. The following is the open-ended question posed to respondents:
How do you feel about the local government and health department’s intervention actions taken last year and again this year to protect the public from contracting West Nile Virus, and why?
Table 2. Survey Results.
Note. HH = high–high; LL = low–low. Left-hand columns: HH neighborhood results. Right-hand columns: LL neighborhood results. Selection 13 denotes other; specific comments furnished with this response appear in parentheses, whereas the lack of a parenthetical statement indicates a response of “do not know.”
a The government should do more spraying of mosquitoes. b This response was discarded because the respondent did not seem to comprehend the survey questions. c The government allows too much standing water (e.g., old tires, driveway puddles, unfiltered fish ponds, empty flowerpots; any item that can hold water for more than a few days at a time) to exist.
During the months of November and December 2013, both neighborhoods were surveyed. Twenty-nine responses were obtained in the HH neighborhood, and 31 in the LL neighborhood. 5 In both neighborhoods, only the neighborhood, the within-neighborhood sequence number of a response, and the perceptions expressed were recorded for each respondent.
Results
Survey results (Table 2) from the HH neighborhood show duplication with the second respondent of the perception that government intervention was adequate, and duplication with the sixth respondent of the perception that government intervention was ineffective. Five respondents filled in the additional option with a response of “do not know.” Three respondents selected responses indicating that they were against the spraying for varying reasons. Over half of those surveyed (15) indicated that mosquito spraying adequately combated WNV, whereas only a third of the respondents indicated dissatisfaction with the spraying strategy. Three of the 12 possible responses were not expressed by any of those surveyed in this neighborhood, and 4 possible perceptions were expressed once but never duplicated.
Survey results (Table 2) from the LL neighborhood include a supplemental question to help index the degree of social desirability bias. To ascertain this bias, the research question was supplemented with the following preliminary question 6 for survey administration in the second neighborhood:
Is the U. of Texas at Dallas (UTD) football team better known than the Southern Methodist U. (SMU) football team? 7
_____ YES _____ NO _____ DON’T KNOW
The expectation is that inhabitants of Dallas neighborhoods would be aware of the Southern Methodist University (SMU) football team, especially given SMU’s relatively close location to the surveyed neighborhoods. The University of Texas at Dallas has no football team. Consequently, a response of “yes” is not sensible. Nevertheless, duplication of a response of adequacy of government intervention occurred in the LL neighborhood after the 2nd survey as well, duplication of inadequacy was reached after the 10th survey, and duplication occurred with the fifth respondent with regard to the perception that government intervention was adequate but also harmful to pets and homeless people. Only one respondent expressed an opposition to mosquito spraying, which is contrary to perceptions expressed in the HH neighborhood (three respondents noted that, for varying reasons, they did not like spraying). Also, the LL neighborhood had more respondents characterizing the spraying plan as being inadequate. Seven respondents believed the spraying was appropriate; this is the second most prevalent answer expressed in the LL neighborhood. In addition, four respondents indicated a mixture of satisfaction and dissatisfaction or concern, indicating that they were both happy with the spraying and apprehensive about how it may affect other urban phenomena; only one mixed response was obtained in the HH neighborhood. Finally, this LL neighborhood had only one response that was answered but never duplicated and four possible perceptions that were not expressed by any of those who answered, as well as five respondents who answered “don’t know.”
The HH neighborhood produced nine of the expected perceptions, with three not being duplicated in the sample; the LL neighborhood produced eight of the expected perceptions, with one not being duplicated in the sample (Table 3). But, when it did occur, saturation—even with the weak standard of simple duplication—was not fast to occur. This outcome furnishes limited documentation of the presence of an SA factor in sampling. No respondent had a fatalistic perception (#2).
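The duplication standard used here can be expressed as a simple tally over the sequence of responses: for each listed perception, record whether it was ever expressed and, if so, the position of the respondent at which it was first repeated. The sketch below uses invented responses, not the Table 2 data; the function name and the coding of perceptions as integers 1 to 12 are assumptions.

```python
def duplication_summary(responses, n_options=12):
    """Summarize expression and first duplication of listed perceptions.

    responses: sequence (in survey order) of the perception numbers each
    respondent selected. Positions are counted from 1.
    """
    counts = {}
    first_duplicated_at = {}
    for position, selected in enumerate(responses, start=1):
        for perception in set(selected):
            counts[perception] = counts.get(perception, 0) + 1
            if counts[perception] == 2:
                first_duplicated_at[perception] = position
    listed = range(1, n_options + 1)
    return {
        "expressed": sorted(p for p in listed if counts.get(p, 0) > 0),
        "never_expressed": sorted(p for p in listed if counts.get(p, 0) == 0),
        "expressed_once": sorted(p for p in listed if counts.get(p, 0) == 1),
        "first_duplicated_at": first_duplicated_at,
    }

# Hypothetical sequence of five respondents.
summary = duplication_summary([[3], [3], [1], [5, 3], [1]])
print(summary["first_duplicated_at"])
print(summary["expressed_once"], len(summary["never_expressed"]))
```

In this invented sequence, perception 3 is duplicated by the second respondent, perception 1 only by the fifth, and perception 5 is expressed once and never repeated, which is the kind of pattern the neighborhood comparisons above summarize.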
Table 3. Frequency of Perceptions Expressed by Respondents in the Surveyed Neighborhoods.
Note. HH = high–high; LL = low–low; WNV = West Nile Virus.
Discussion and Implications for Mixed Methods Sampling Strategies
Despite modest response rates in both HH and LL neighborhoods, the survey findings reveal subtle indicators of SA effects. Most notably (as seen in Table 3), a greater proportion of respondents from the LL neighborhood expressed dissatisfaction with government efforts to control WNV. The socioeconomic/demographic attributes used to quantify SA appear theoretically relevant to this finding, but they likely do not tell the full story about the predominance of this perception. Recall that compared with the HH neighborhood, the LL neighborhood contains a larger proportion of African American residents, a relatively higher level of unemployment, and relatively lower educational levels. At the same time, a relatively higher percentage of residents are younger than 20 years of age, with the remainder of the sample spread across a range of ages. Together, these attributes could, in theory, signify a neighborhood with greater structural disadvantage and possibly more negative views about the effectiveness and legitimacy of government overall.
In the future, SA analyses could be fruitfully incorporated into MM sampling strategies concerned with identifying locations of perception clustering and explaining the mechanisms predictive of this clustering. Based on the MM sampling typology developed by Teddlie and colleagues (Teddlie & Tashakkori, 2009; Teddlie & Yu, 2007), both sequential and multilevel MM sampling strategies are particularly relevant to the use of SA analysis. Sequential MM sampling incorporates probability and purposive sampling techniques in consecutive order, as in studies that incorporate quantitative data collection (QUAN) followed by qualitative data collection (QUAL), that is, QUAN → QUAL. In a replication and expansion of the Dallas study, the SA analysis would serve as simply the first step of an MM study, beginning with the identification of probable perception clusters. Then survey findings could inform a purposive sampling strategy to better understand the mechanisms driving the predominance of certain views within the HH and LL communities. For instance, a homogeneous sampling strategy could be used for focus groups with individuals who are nested in the HH and LL communities, respectively, and who share particular attributes, such as age, race/ethnicity, level of education, and housing status. Such focus groups could refine, confirm, or even invalidate the presumed mechanisms explaining perceptions of WNV eradication measures.
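As a rough illustration of how an SA analysis might flag candidate HH and LL areas as the first step of such a design, the following sketch computes a global Moran coefficient with row-standardized weights and assigns each area a Moran scatterplot quadrant. It is not the procedure used in the Dallas study: the attribute values, the chain-shaped neighbor structure, and the function name are hypothetical, and no significance testing is included.

```python
def moran_quadrants(values, neighbors):
    """Return (global Moran's I, quadrant labels) for areal data.

    values    -- one attribute value per area
    neighbors -- for each area, a list of indices of adjacent areas
                 (assumes every area has at least one neighbor)
    Weights are row-standardized, so the spatial lag of an area is the
    mean deviation of its neighbors.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    lag = [sum(z[j] for j in nbrs) / len(nbrs) for nbrs in neighbors]
    # With row-standardized weights, I reduces to sum(z * lag) / sum(z * z).
    global_i = sum(zi * li for zi, li in zip(z, lag)) / sum(zi * zi for zi in z)
    # First letter: the area's own value relative to the mean;
    # second letter: its neighbors' average. Ties are labeled "L".
    labels = [("H" if zi > 0 else "L") + ("H" if li > 0 else "L")
              for zi, li in zip(z, lag)]
    return global_i, labels


# Hypothetical attribute values for six areas arranged in a chain.
values = [10, 9, 8, 2, 1, 0]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]

global_i, labels = moran_quadrants(values, neighbors)
print(global_i, labels)
```

Areas labeled HH or LL would be candidates for survey administration in a design like the one described above; in practice, a permutation-based significance test would normally accompany the quadrant labels.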
SA analysis also could be incorporated into more complex MM designs. For example, the preceding sequential design could inform the parameters of a respondent-driven sampling (RDS) component that would generate a wider and more refined sampling frame. RDS is a relatively new sampling approach that seeks to improve on the merits of snowball sampling, especially in relation to topics where the views of hidden or vulnerable populations are critical to ascertain (Heckathorn, 1997, 2011; Rudolph, Young, & Lewis, 2015; Volz & Heckathorn, 2008). Although similar in fundamental ways to snowball sampling, RDS incorporates mathematical calculations designed to minimize chances of discovery failure that can emerge when respondents refer researchers to like-minded individuals. In their study of injection drug users’ perceptions and experiences of police encounters in New York City, Beletsky et al. (2014, p. 106), for example, used an RDS approach “that allows for statistical weighing of results to adjust for recruitment biases common in peer-referral designs.” An initial sample of injection drug users served as “seeds,” with each sampled individual referring researchers to three additional peers, who in turn referred the researchers to three of their peers, and so on. At the seeding stage, the researchers sought to achieve diversity in demographic and geographic characteristics, and recruitment was operated from four different neighborhood storefront locations. SA analysis potentially could precede a sampling approach like this by identifying, in the first instance, neighborhoods with a moderate to high degree of positive SA. Then, specific SA mechanisms theoretically relevant to this study could inform RDS criteria as well as the analysis of findings.
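The referral structure described above, in which each participant refers three peers, implies geometric growth in the number of recruits per wave. The short sketch below works through that arithmetic only; the number of seeds and waves is hypothetical, every referral is assumed to succeed without overlap (so the figures are upper bounds), and the statistical weighting that distinguishes RDS from snowball sampling is not implemented.

```python
def recruitment_by_wave(seeds, referrals_per_person, waves):
    """Upper-bound recruits per wave when every participant refers the
    same number of new, non-overlapping peers. Wave 0 is the seed stage."""
    sizes = [seeds]
    for _ in range(waves):
        sizes.append(sizes[-1] * referrals_per_person)
    return sizes


# Hypothetical design: 5 seeds, 3 referrals each, 3 referral waves.
sizes = recruitment_by_wave(seeds=5, referrals_per_person=3, waves=3)
total = sum(sizes)

print(sizes, total)
```

Realized samples are typically smaller than this bound because some referrals fail or point to people already recruited.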
SA analysis also could be incorporated into what Teddlie and colleagues describe as multilevel MM sampling, where units of analysis are nested within one another (Teddlie & Tashakkori, 2009; Teddlie & Yu, 2007). Depending on a research topic, different georeferenced attributes could signal various theoretical mechanisms driving the clustering of perceptions. Moreover, such attributes could be analyzed at different spatial scales. For example, a study focusing on fear of crime could be guided by an SA analysis at the census tract scale. The level of violence, as measured by violent crime incidents recorded by police, constitutes georeferenced data that could be examined at the scale of street blocks and intersections. The survey sampling procedure could be guided by an SA analysis that incorporates violent crime, and if degrees of fear prove to be associated with proximity to violent locations, interviews could be conducted with individuals working or living in and around the most violent street blocks and intersections, commonly referred to as “hot spots” (Ratcliffe, 2004; Sherman, Gartin, & Buerger, 1989). Such interviews could be focused on eliciting information about a range of possible other mechanisms driving fear of crime, beyond the sheer presence of violence, including environmental features such as presence of alcohol outlets, existence of abandoned buildings, or inadequate lighting. Currently, such features are georeferenced in many large American jurisdictions and, based on interview findings, could be subject to a more complex SA analysis that could refine survey samples in the future. Depending on the research topic, a richer set of georeferenced attributes could inform SA analyses in advance of determining purposive sampling strategies (including deviant/extreme or typical case samples) designed to understand the most extreme or typical views about an issue.
In summary, results of the SA analysis used in the Dallas pilot study inspire confidence in the prospect of minimizing discovery failure in survey-based research designed to elicit the widest possible range of views about matters of interest to the public. However, the socioeconomic/demographic characteristics indicative of SA that characterize the geographic clustering of individuals provide us with a starting point, not an end point, in enhancing the precision of sampling decisions. As such, further sampling experiments using SA analysis should adopt MM sampling strategies with purposive and qualitative elements that can explain the mechanisms signaled by SA.
Conclusions
Although SA is known to signal the presence of perception subgroups, the potential for analyses of SA to inform sampling designs, and to advance the integration of quantitative and qualitative research through sampling (see Fetters & Freshwater, 2015), has not been adequately explored. We refer in particular to MM sampling strategies that incorporate both surveys, to identify the range of views on a topic, and qualitative components (e.g., interviews, focus groups), to expand on the mechanisms driving the spatial clustering of views. We argue that analyses of SA can help in the geographic targeting of survey administration, while the qualitative components can deploy purposive sampling strategies informed by the SA analysis and survey findings.
A principal goal of this research was to test a systematic approach to sampling in a survey-based design using an analysis of SA. The study demonstrated that the frequency of expressed perceptions varies between the two Dallas, Texas, neighborhoods studied. Because the study is exploratory, its findings invite future experiments aimed at discovering, in a fuller way, the mechanisms that mediate perceptions (and variations between them) within geographic clusters. The study provides evidence of the need to account for SA effects in MM sampling, and suggests that a strong analytic basis for sample size determination must be grounded in the best of spatial statistical techniques combined with a sophisticated theoretical understanding of how, and through which mechanisms, perception subgroups form and persist within and beyond defined spatial units.
Authors’ Note
Daniel A. Griffith is an Ashbel Smith Professor. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the U.S. National Science Foundation.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Research for this article was supported by U.S. National Science Foundation grant BCS-1262717.
