Abstract
Researchers increasingly take advantage of the comparative case design to build theory, but the degree of case dependence is only occasionally discussed and theorized. We suggest that the comparative case study design might be subject to an often underappreciated threat, namely dependence across cases, under certain conditions. Using research on innovation diffusion as an illustration, we explore the role of social linkages across cases when building theory through comparison and contrast between cases. We develop an agent-based simulation, grounded in comparative case research about innovation diffusion, as a novel way to study the implications of case dependence in theory building using multiple-case study research. Our simulation results suggest that the degree of case dependence has a nontrivial bearing on innovation diffusion experienced by case entities, specifically when the researcher draws a few case entities operating in a highly interconnected industry. Under these conditions, overlooking the degree of case dependence might weaken newly built theory against commonly held standards of internal validity and external validity in inductive research. We conceptualize the issue of case dependence as a concern about researchers’ bounded rationality. Accordingly, we build on our findings to provide actionable advice aiming to alleviate this concern while being amenable to the variety of approaches to build theory from multiple cases in social sciences.
It was extremely desirable, for the sake of those who may wish to study the evidence for Dr Tylor’s conclusions, that the full information should be given as to the degree in which the customs of the tribes and races which are compared together are independent. It might be that some tribes derived them [i.e., the customs] from a common source so that they were duplicates of the same original.
Sir Francis Galton is known for his work in anthropology, for developing the correlation measure, and for devising a classification method for fingerprints. He is also known for his celebrated comment on Dr. Tylor’s research at an academic conference. As described in the introductory quote, Sir Galton challenged Tylor’s theory about institutions across societies on the grounds that a high degree of social linkages or dependence between tribes (i.e., cases) undercuts theory validity. Sir Galton did not object to dependence between cases. Rather, he cautioned against researchers invoking cross-case evidence to support the veracity and generalizability of their claims when knowledge about the degree of case dependence is limited.
In multiple-case research, social dependence broadly refers to meaningful linkages between cases, such that conditions set for one case influence the other cases. One example is when managers who work in a case organization share social ties with members in another case organization. When dependence remains undisclosed or no attempt is made to unveil its role, researchers may treat two cases as highly different when these cases are in fact two illustrations of the same phenomenon. While case dependence might manifest in several ways, social linkages between members across cases are a pervasive and probably the most determinant form of case dependence (George and Bennett 2005:33). The researcher’s knowledge about social linkages between cases is fundamental in establishing whether cases represent duplicates of the same original or different originals. The comparison and contrast of cases is a central feature in theory building using a comparative case design (Dyer and Wilkins 1991; Eisenhardt 1989; Ragin 1992). The discussion about the researchers’ assumptions about case dependence is relevant because case research designs are increasingly used to build theory (Bartunek, Rynes, and Ireland 2006; Beach and Rohlfing 2018; Elman, Gerring, and Mahoney 2016). Comparative case research is suitable to examine emerging phenomena or aspects that are difficult to study (Bresman 2013; Edmondson, Bohmer, and Pisano 2001; Majchrzak, Cooper, and Neece 2004).
Broadly, the conjunction between the centrality of case comparison in inductive reasoning and the growth of comparative case research is such that the researchers’ knowledge about whether the cases are linked in ways that are meaningful to the research question merits further scrutiny. To put it more directly, this article asks whether and under which conditions Galton’s challenge is relevant for today’s comparative case researchers.
In this article, we build on research across the social sciences to clarify—and to some extent demystify—why and how issues of case dependence in theory building are distinct from those in statistical analyses. Following an inductive approach ourselves, we draw on research using comparative case design for the study of innovation diffusion to develop a systematic analysis of the role of case dependence. Our findings show two generalized research designs that, based on the analysis of prior research, relate to confusion—or incomplete reporting—rather than to variants of the comparative case design. One research design displays high dependence between cases (we call this “complete” case design to indicate social linkages between cases), while the other design displays low case dependence (we call this “ego” case design to refer to a more egocentric approach without social linkages between cases). We provide illustrations of confusion in past research about the rationale for the use of comparative case research and evidence of case dependence through social linkages when building theory on innovation diffusion (George and Bennett 2005).
To examine the degree of case dependence, we draw on past research on innovation diffusion using a comparative case design to develop an agent-based simulation that provides us with a “virtual lab” (Bruch and Atwell 2015; Davis, Eisenhardt, and Bingham 2007; Fioretti 2013). The purpose of the simulation is to help us explore how a researcher’s assumptions affect what qualifies as cross-case evidence about innovation diffusion, according to varied levels of case dependence. Indeed, as we will elaborate in this article, we provide a novel way to use agent-based simulations as meta-theoretical tools in sociological methods and research. Acknowledging the variety of approaches to theory building from multiple cases, we outline the general conditions under which case dependence undercuts the insights derived from cross-case comparisons. Specifically, we discuss implications about the veracity of the study’s presumed causal or logical relationships (i.e., internal validity) and the extent to which the conclusions are generalizable or transferrable beyond the cases being examined (i.e., external validity; Cook and Campbell 1976:37; Yin 1994:33). The experimental findings further help us devise actionable suggestions for studying and communicating case dependence in future research. Innovation diffusion provides an instructive context, but our insights enhance reasoning in the social sciences in general. This study is probably one of the first to take advantage of agent-based models (ABMs) to address central questions in research methods about qualitative research designs. 1 Our study also aims to facilitate transparency and promote the diversity of approaches to theory building in social sciences.
This article proceeds as follows. It first presents the conceptual framework for the study and then shows the development of an ABM simulation grounded in prior research on innovation diffusion. Next, it reports the results and their consequences for research. Finally, it discusses the implications for researchers who are interested in multiple-case research.
Conceptual Framework
We first synthesize the literature on the role of multiple cases in theory building and then use the literature on innovation diffusion to illustrate the sources of social linkages between members of a social system. 2 Next, we propose that case dependence in building theory from multiple cases is best conceptualized as an issue about the researchers’ bounded rationality.
The Role of Case Comparisons in Theory Building
Cases are central for theory building in comparative case research. However, the definition of a “case” is occasionally surrounded by confusion (Flyvbjerg 2006; Glasser 1965; Ragin 1987, 1992). In an illuminating discussion, Ragin (1992) argues that what a case is might often be unclear to experienced researchers and novices alike. Ragin (1992) compares one study that conducts interviews about an organization’s informal structure with another study that uses interviews about employees’ job satisfaction. These two hypothetical studies are similar, but the former focuses on the organization (i.e., case) while the latter concerns the employees (i.e., cases).
Several definitions of a case are available (Eisenhardt 1989; Glasser 1965; Lijphart 1971), but these definitions converge toward a class of, for example, events, organizations, and places that are of interest to a researcher. A case might be an organization (Bansal and Roth 2000), an episode (Kapoor and Furr 2015), or a tribe, like the example from the introductory quote. 3
Why do case comparisons matter in theory building?
Cross-case comparisons guide the researcher in identifying data patterns that support emerging propositions. Early methodological contributions stress the importance of case comparisons in theory building. This idea is present in Mill’s ([1843] 2002) method of agreement (i.e., two cases are similar in the outcome variable and dissimilar in all but one explanatory variable) and method of difference (i.e., two cases are the same in all but one explanatory variable and differ on the outcome variable). The comparative case study method entails a systematic analysis of a small number of cases, a number deemed too small for an analysis using statistical techniques (Eisenhardt 1989; Lijphart 1971). Case comparisons are useful (1) to uncover explanatory linkages between concepts, (2) to illustrate theory, and (3) to compare and contrast different contexts (Collier 1993; Lijphart 1971; for an overview of the case method, see Elman et al. 2016).
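The two comparison logics described above can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes each case is encoded as a set of binary explanatory conditions plus a binary outcome, and the function names, case labels, and condition names are hypothetical rather than taken from any study cited here.

```python
from typing import Dict, Optional

# Hypothetical encoding: condition name -> present (True) / absent (False).
# Both cases in a comparison are assumed to share the same condition names.
Case = Dict[str, bool]


def method_of_agreement(a: Case, a_outcome: bool,
                        b: Case, b_outcome: bool) -> Optional[str]:
    """Same outcome, dissimilar in all but one condition: return that condition."""
    if a_outcome != b_outcome:
        return None
    shared = [name for name in sorted(a) if a[name] == b[name]]
    return shared[0] if len(shared) == 1 else None


def method_of_difference(a: Case, a_outcome: bool,
                         b: Case, b_outcome: bool) -> Optional[str]:
    """Different outcome, alike in all but one condition: return that condition."""
    if a_outcome == b_outcome:
        return None
    differing = [name for name in sorted(a) if a[name] != b[name]]
    return differing[0] if len(differing) == 1 else None


# Hypothetical cases (organizations) and conditions, for illustration only.
firm_a = {"supplier_association": True, "public_funding": True, "star_engineers": False}
firm_b = {"supplier_association": True, "public_funding": False, "star_engineers": True}
firm_c = {"supplier_association": False, "public_funding": True, "star_engineers": False}

agreement = method_of_agreement(firm_a, True, firm_b, True)
difference = method_of_difference(firm_a, True, firm_c, False)
print(agreement, difference)
```

Both comparisons single out the same candidate condition here. Note that neither function can tell whether the two cases are linked to each other, which is precisely the information that the discussion of case dependence is concerned with.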
There is a wealth of approaches to comparative case research (Eisenhardt, Graebner, and Sonenshein 2016; Glasser 1965; Ragin 1992), but they unite around the importance of case comparisons as a prime source of theory building from cases (Eisenhardt 1989; Leonard-Barton 1990). In a study of eight ailing organizations, Harris and Sutton (1986) address the aspects of case dependence by declaring the selection of organizations (i.e., cases) according to four categories: private, dependent; private, independent; public, dependent; and public, independent. As Harris and Sutton (1986:8; italics added) state, “[t]he varied sample served to strengthen theory-building about constant elements […] and the similarities observed across a diverse sample offer firmer grounding for such propositions.” The multiple-case study approach provides an instructive basis for discussion as probably one of the most frequently used case designs to build theory among case researchers (Gehman et al. 2018). The central idea of the multiple-case approach hinges on an (assumed) low degree of case dependence. Eisenhardt (1989:541) notes that “the juxtaposition of seemingly similar cases by a researcher looking for differences can break simplistic frames. In the same way, the search for similarity in a seemingly different pair can also lead to more sophisticated understanding.” Furthermore, Eisenhardt and Graebner (2007:25) add that “each case serves as a distinct experiment that stands on its own as an analytic unit. Like a series of related laboratory experiments, multiple cases are discrete experiments that serve as replications, contrasts, and extensions to the emerging theory.” The corollary is that case comparisons are instrumental for theory building by revealing similarities and differences between cases.
However, the multiple-case study approach has also been criticized for attempting to mimic deductive research (Dyer and Wilkins 1991; Gehman et al. 2018). Even so, case comparisons play a central role in theory building from cases, regardless of the exact approach followed by the researcher.
Demystifying case dependence in multiple-case research
At first, case dependence would appear to be a matter for deductive research only. After all, inductive and deductive approaches are distinct and used for different purposes in social research. In deductive research (Goodman and Kruskal 1959; Wood 2008), linkages between observations undermine statistical inference by violating the assumption of independence (Kenny and Judd 1996).
In inductive research, cases are not used to test hypotheses in the statistical sense. Although uncontroversial, this observation should not result in the dismissal of the role of case dependence in theory building. In referring to this point, George and Bennett (2005:33) conclude that “[t]he statistical version of the problem does not apply to case studies, but a more fundamental concern does.” In particular, social linkages between case members are a typical and dominant manifestation of case dependence in comparative research (Della Porta and Keating 2008; George and Bennett 2005). A low degree of case dependence ensures that theoretical propositions are substantiated through evidence from across cases (Eisenhardt 1989; Gerring and Christenson 2017). In contrast, cases with a high degree of dependence might actually be desirable to address specific research questions (Della Porta and Keating 2008; George and Bennett 2005). For example, consider a study of the market-entry strategies that were adopted by several U.S.-based start-up organizations (cases) in the software industry and how these strategies influence their financial performance. The financial performance of these start-ups might be influenced not only by different organizational strategies but also by the mobility of “star” engineers between start-ups and across neighboring states.
Case dependence bears implications for internal validity and external validity of the answers being provided to research questions. As for internal validity, neglecting dependence among cases undermines the chain of evidence in support of the newly reported relationships between constructs. First, the presence of duplicates undercuts a tenet in comparative case research: Comparisons across cases strengthen the evidence of emerging relationships between concepts (Eisenhardt and Graebner 2007:25). New observations based on interconnected cases do not add as much new information as those from unrelated cases even if these observations are still relevant (King, Keohane, and Verba 1994). Second, a best practice in comparative case research is to rule out alternative explanations of the reported relationship between constructs (Eisenhardt 1989; Gibbert, Ruigrok, and Wicki 2008). In qualitative research, more broadly, “the persuasiveness of the arguments is greatly strengthened if serious attention is given to alternative explanations” (Siggelkow 2007:23). The lack of awareness of dependence among cases restricts the range of distinct cross-case evidence available to rule out alternative explanations and to bolster the evidence that supports emerging insights, which are often formalized as propositions. This restriction matters because the choice of a multiple-case study approach is often justified by the interest in gathering extensive evidence from multiple cases.
As for external validity, information about case characteristics helps to better specify the transferability of the findings (Gibbert et al. 2008; Meredith 1998). A clarification of the degree to which the cases are dependent helps researchers specify a nontrivial condition for whether the newly developed theory “accounts for phenomena not only in the setting in which they are studied, but also in other settings” (Gibbert et al. 2008:1468). By overlooking case dependence, researchers risk omitting a core case characteristic that defines the newly developed theory and its application to other similar contexts. As we show below, we can further address the concern above by asking how researchers might use (high vs. low) case dependence to advance theory.
Case Dependence: Studies on Innovation Diffusion as an Illustration
Diffusion concerns how, why, and at what rate new ideas and technologies spread (MacVaugh and Schiavone 2010; Taalbi 2017). Diffusion occurs through social influence among the members of a system (e.g., an organization or a region). Rogers (2003:293) posits that “we must understand the nature of networks if we are to comprehend the diffusion of innovations fully.” With such a focus, we identify three sources of social linkages between cases: the industry network, the diffusion process, and the organizational characteristics.
Industry networks
Members of a system are embedded in industry networks. Industry and national contexts support ties among organizations and individuals (Carney and Farashahi 2006; Hamilton and Biggart 1988). More specifically, countries and industries entail social institutions and values (Saka 2004) that are translated into regulations and industry standards that contribute to linkages among actors (Hoppmann et al. 2013; Reinstaller 2005). Industry standards not only support information diffusion but also help mediate innovation activities between industry actors (Featherston et al. 2016; Perez-Aleman 2011).
Furthermore, the members of a social system maintain ties that support relational and economic exchanges. According to Alter and Hage (1993:46), networks “constitute a basic social form that permits interorganizational interactions of exchange, concerted action and joint production.” Industry networks underpin innovation diffusion (for a discussion, see Cowan and Jonard 2004). Examples of industry actors that support industry networks include supplier associations (Dyer and Nobeoka 2000), industry regulators (Edler and Yeow 2016), certification agencies (Carney and Farashahi 2006), and trade advisory boards (Knights and Scarbrough 2010). These actors also provide “networking” opportunities for managers to exchange knowledge (Dyer and Nobeoka 2000) and enact new practices (Ferneley and Bell 2006). Prior literature further notes the role of personal networks in an industry, for example, based on graduate programs (Carney and Farashahi 2006; Hargadon and Fanelli 2002) and communities of users (Hienerth and Lettl 2011; Shluzas and Leifer 2014).
The high embeddedness of actors in industry networks promotes common values, practices, and standards that sustain linkages between members of a social system. Table 1 presents an overview of the sources of social linkages among cases.
Table 1. Innovation Diffusion: Sources of Linkages Between Potential Cases.
Diffusion process
The second source of social linkages among cases is the uncertainty of the diffusion process. Such uncertainty accrues from three factors: the technology (what), the actors (who), and the process (how). The term technology is given a broad meaning here; it refers to products, practices, ideas, and services. Technology’s features add uncertainty to innovation diffusion. For instance, technologies that display (1) a low degree of cumulativeness (e.g., telecommunications technology; Mu and Lee 2005), (2) maturity in the market (e.g., solar photovoltaic modules; Hoppmann et al. 2013), and (3) applicability of routines (e.g., management best practices; Szulanski 1996) add uncertainty about the extent to which system members adopt innovation.
The actors (which we refer to here as the entities involved in innovation), such as organizations and individuals, further influence the degree of uncertainty. They endorse values and norms that might change the process of diffusion. For instance, Toyota established a supplier association (kyohokai) that was specifically designed to promote the exchange of technical information between Toyota and its suppliers (Dyer and Nobeoka 2000). The adoption by specific actors (Saka 2004), the extent to which they craft solutions that shape ideas about the technology (Robertson, Swan, and Newell 1996), their preferences (Reinstaller 2005), and their reactions to conflicting institutional pressures introduce uncertainty about linkages between members (Yoshikawa, Tsui-Auch, and McGuire 2007). The factors concerning uncertainty in the diffusion of innovations make it difficult for researchers to anticipate which system members share social linkages (Table 1).
Organizational characteristics
The last source of social linkages between members of a system concerns organizations’ structural and operational characteristics. Structural characteristics include organizational design features. For example, the degree of control-related work systems in the source company influences diffusion to, and adoption by, the recipient company (Keidel 1981). Other structural characteristics include work systems between parent companies and headquarters (Saka 2004; Zhao, Anand, and Mitchell 2005), resources and incentives shared between actors (Schaefer 2007; Woiceshyn 2000), and formalized interactions between actors (Bresman 2013).
Operational characteristics support linkages between actors. A canonical example is an organization’s set of buyer–supplier relationships. Other operational characteristics include the mobility of skilled engineers within and across organizations (Mu and Lee 2005), joint teams between a focal organization and its suppliers (Dyer and Nobeoka 2000), employees’ career trajectories (Carney and Farashahi 2006), talent recruitment strategies (Hargadon and Fanelli 2002), and co-ownership agreements (Hoppmann et al. 2013). Overall, the structural and operational characteristics of organizations define the extent to which members of a system are interconnected.
As George and Bennett (2005:33) first conjectured, innovation diffusion is elucidative of the challenges that researchers face when accessing social linkages among case members. One may assume that information flows from the cases to the researchers without barriers. In this ideal position, it is plausible to form a realistic assessment of the degree of case dependence.
One may also assume that researchers have the issue of the degree of case dependence in mind while designing and conducting their studies. The researcher could select, for example, theoretically distinct cases to serve as independent experiments. Thus, the researcher must explicitly consider dependence to be information worth gathering and meaningful for one’s research.
Case Dependence as a Researcher’s Bounded Rationality Problem
Innovation diffusion is a case in point of the challenges that researchers face when building theory from multiple cases and, more broadly, in comparative case research. We submit that the issue of case dependence is best conceptualized as a bounded rationality problem. Drawing on behavioral science (Kahneman 2003; Simon 1979, 1990), researchers’ bounded rationality refers to the limits of their ability to identify optimally, sometimes even satisfactorily, the degree of case dependence in complex empirical settings. Researchers might intend to set up a comparative case design where cases share no meaningful linkages, but those linkages are often unknown to the researcher.
We are concerned with the extent to which researchers’ assumptions about case dependence fail to fit the empirical setting; we are not concerned with researchers’ preference for any particular approach to comparative case design. We focus on researchers’ challenge of identifying the degree of case dependence against their own aspirations and the misfit between researchers’ aspirations and the actual degree of case dependence. In the widespread multiple-case research that foregrounds case comparison, this misfit presents threats to internal validity and external validity. 4
A conceptualization of the degree of case dependence as a bounded rationality problem contributes to a better understanding of social research methods. First, the notion of bounded rationality shifts the attention from an abstract debate about quantitative versus qualitative research, or the preference for a specific version of comparative case design, to the challenges that researchers face when building theory. Second, by focusing on bounded rationality, the discussion moves forward by devising heuristics that future researchers can use to align their research design with their declared rationale for using comparative case research and with their research goals.
Our call is not to standardize the research process. Rather, we aim to keep creativity as an essential element of discovery in social research. The central issue is the following: If we were to formulate a comparative case design in which high versus low degrees of case dependence were acknowledged, how would we revise the research process and evaluate newly developed theory against commonly held standards in qualitative research?
Model Development
Past literature aided us in identifying simulation parameters and value ranges for these parameters, thus lending realism to our model (Bruch and Atwell 2015). The simulation model captured key features of innovation diffusion in the context of a number of cases (e.g., organizations). This purpose broadly fits what Edmonds et al. (2019) define as an illustration purpose. More specifically, we use the information gathered from the simulation model to learn about multiple-case research methods when it comes to the assumptions on dependence of cases. As we elaborate below, we use the model to examine whether, and when, case dependence could influence theoretical insights based on multiple-case research. Hence, the simulation model has a second-order (indirect), or meta-theoretical, purpose in that its results are informative to (methods) choices that might impact theory development efforts. 5
Comparative Case Research About Innovation Diffusion
We first conducted a systematic review of the literature on innovation diffusion that uses comparative case research. This review aimed to identify current practices related to the degree of case dependence and to guide the design of a computational simulation.
Review scope
We used a set of criteria to define the review scope. First, we selected the top-tier journals in general management and leading journals on innovation. 6 Second, we defined a set of search words used in the EBSCO Business Elite database. We searched for “diffusion” OR “adoption” in the abstract and for “case” in the full text. We used the broad term “case” because various labels have been used to refer to comparative case design (e.g., “multiple-case study” and “comparative case”). We manually removed single case studies. Finally, we placed no restriction on the start date; the first result in our search was Eilon and Elmaleh (1970). The end date was December 2019. In total, we found 393 articles.
Coding procedures
We assessed the relevance of articles against two criteria: They used any form of comparative case research, and they studied diffusion or adoption. Two independent coders assessed the relevance of the articles. Each coder received a coding booklet. This booklet included instructions for the coders, supported by examples. The coders compared their coding and settled any disagreements. We found very strong intercoder reliability (.94; Cohen 1960). In total, we identified 40 relevant articles.
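The reliability statistic associated with Cohen (1960) is commonly computed as Cohen's kappa, which corrects observed agreement for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below assumes two coders assigning one categorical label per article; the function name and the ratings are hypothetical and do not reproduce the review data or the reported value of .94.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(coder_a: Sequence[str], coder_b: Sequence[str]) -> float:
    """Chance-corrected agreement between two coders over the same items."""
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("Both coders must rate the same non-empty set of items.")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(x == y for x, y in zip(coder_a, coder_b)) / n
    # Expected agreement: product of each coder's marginal label proportions.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical relevance codes for ten articles (one disagreement, item 5).
ratings_a = ["relevant", "relevant", "irrelevant", "irrelevant", "relevant",
             "irrelevant", "irrelevant", "irrelevant", "relevant", "irrelevant"]
ratings_b = ["relevant", "relevant", "irrelevant", "irrelevant", "irrelevant",
             "irrelevant", "irrelevant", "irrelevant", "relevant", "irrelevant"]

kappa = cohens_kappa(ratings_a, ratings_b)
print(round(kappa, 2))
```

In this toy example, raw agreement is 90 percent, but kappa is lower because two coders who mostly assign "irrelevant" would agree fairly often by chance alone.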
Case dependence in past research
A few insights are worth noting with regard to the study of innovation diffusion using comparative case research. Table 2 presents an overview. First, we found sustained growth of the use of comparative case research to study innovation diffusion. Of 40 articles, 18 were published between 2009 and 2019. Table 2 shows that comparative case research has been used to study innovation diffusion across a wide range of national, industry, and organizational contexts. Second, researchers have studied various types of cases, such as firms (D’Ippolito, Miozzo, and Consoli 2014), products (Frattini et al. 2014), and teams (Bresman 2013). Finally, researchers have used six cases on average (mean = 5.98; standard deviation = 5.70). The most frequent number of cases is two (mode = 2); half of the studies use up to three cases (median = 3).
The use of comparative case research is overwhelmingly justified by the advantage of within- and cross-case comparisons to develop theoretical propositions (e.g., Bresman 2013; Langley and Truax 1994; Valente 2012). This suggests that prior research has largely followed Eisenhardt’s contribution (1989:542) on the role of cross-case comparisons of distinct cases (“experiments”) to build robust and generalizable theory.
We found that studies provided limited information about the degree of case dependence. As Table 2 shows, 10 studies (25.0 percent) provided no information. Our review shows that 17 studies (42.5 percent) report information suggesting independence of the cases (e.g., based on the selection of cases from different countries; Valente 2012), while 13 studies (32.5 percent) reported information about case dependence. The dominant manifestation of case dependence was social linkages between case members (e.g., membership of local industry clusters; Perez-Aleman 2011; joint team membership; Bresman 2013). However, the role of case dependence was largely left open to interpretation. For example, it was unclear whether linkages across cases were of any relevance to the newly developed theory. It was also unclear whether the reported dependence played a part in the boundary conditions of the newly developed theory. Perhaps the dearth of information is a by-product of articles citing methods from different intellectual traditions with little integration of ideas (Gehman et al. 2018:284; for a broader discussion about transparency in qualitative research, see Aguinis, Ramani, and Alabduljader 2018; Pratt, Kaplan, and Whittington 2020). 7
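The shares reported above follow directly from the counts. The short check below uses only the counts stated in the text; the category labels follow the coding scheme described in the note to Table 2, and the variable names are illustrative.

```python
# Counts of reviewed studies by reported degree of case dependence (from the text).
counts = {"unclear": 10, "independent": 17, "dependent": 13}

total = sum(counts.values())
# Multiply before dividing so the percentages are computed from whole numbers.
shares = {label: 100 * count / total for label, count in counts.items()}

for label, share in shares.items():
    print(f"{label}: {counts[label]} of {total} studies ({share:.1f} percent)")
```

The three categories sum to the 40 relevant articles, so every reviewed study is assigned to exactly one category.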
The insights from our literature review provided us with further assurance of the pertinence of examining the role of case dependence in theory building. In our review, we found two archetypes of comparative case research: high versus low dependence between cases. Dependence is primarily understood in terms of social linkages because it refers to the source of case dependence alluded to in prior methods discussions (George and Bennett 2005) and it dominates in prior comparative case research, as shown in our review.
Table 2. Use of Comparative Case Research to Study Innovation Diffusion.
Note: Mean = 5.98; standard deviation = 5.70; mode = 2; median = 3; N = 40. n/a = not applicable.
a The information reported in the articles was insufficient for an assessment of the degree of case dependence. We coded the articles using the following categories: independent (if the article provided some information about the independence of cases), dependent (if the article provided some information about the dependence of cases), and unclear (if the article provided insufficient information).
The complete research design involves a high degree of case dependence. Figure 1 (left-hand side) illustrates the complete research design. By complete, we mean a high degree of social relationships across cases. The notion of a complete research design is consistent with a “collective case study” (Dodgson et al. 2008) and an “embedded case” (Gil, Miozzo, and Massini 2012). Our notion of a “complete research design” specifically emphasizes the extent to which cases are connected through social linkages. In a study of architecture projects conducted by Gehry Partners, social relationships among individuals across projects influence practices of innovation diffusion (Boland, Lyytinen, and Yoo 2007). Other examples from our literature review include firms in Italy’s aerospace industry (Sammarra and Biggiero 2008), radical innovation in the aerospace industry (Majchrzak et al. 2004), and computer games (cases) owned by the same firm (Flowers 2008).

Two archetypes of comparative case research.
Ego research design features a low degree of dependence between cases. When cases (e.g., departments and organizations) under study do not share social linkages, we refer to them as using an ego research design. We borrow our terminology from sociological studies of networks. Similar to an ego network based on all ties by an “ego” (Marsden 1990), our notion of an ego research design describes a set of cases that are not connected through social ties.
Figure 1 (right-hand side) illustrates the ego research design; it entails a low degree of case dependence. The lines represent social ties, and the nodes represent entities (e.g., individuals) within the case (e.g., the organization). Low case dependence occurs when cases are from different industries (Schaefer 2007) or distinct divisions in the organization (Ferlie et al. 2005). In general terms, “ego” research designs resemble conventional multiple-case research (Eisenhardt 1989; Eisenhardt and Graebner 2007; Ozcan, Han, and Graebner 2017). In an “ego” network, the nodes other than the ego are called alters, and they usually have a direct connection with the ego. All other connections of the network (that is, outer connections of other alters with the ego’s alters) are excluded.
ABM
Having identified two archetypes of comparative case designs, we faced the challenge of examining why and under which conditions each research design could influence understanding and analysis. As previously noted, we use the agent-based simulation in an indirect way. The purpose of our ABM is threefold: (a) to explore the interaction between case dependence and emerging theoretical insights, (b) to manipulate experimentally the degree of case dependence, and (c) to explore why and under which conditions each research design could influence emerging theoretical insights. The use of computer simulation has been widely advocated as an aid to theory building (Cowley 2016; Smaldino, Calanchini, and Pickett 2015) when
– researchers aim to analyze boundary conditions,
– the landscape of potential alternatives is partially unknown, and
– the impact of the construct under analysis is difficult to assess solely through the literature.
The study of the degree of case dependence meets all three of these conditions. For this reason, we used an agent-based computational simulation to study experimentally the degree of case dependence. 8
Model specifications
The simulation model entails a nexus of agents that, in this case, represent organizations. We used our review of past research to develop realistic parameters in our simulation (Bruch and Atwell 2015) and align it to the standard mechanisms as usually coded in threshold models of diffusion (for a review, see Kiesling et al. 2012). Below, each model parameter is described (see Table 3 for a summary).
In our computational model, 9 some agents are selected as case agents. Other agents are connected via links (i.e., social linkages) to each other and to the case agents. The other agents are part of the industry. The number of cases ranges between 2 and 50, while the other agents (labeled else) are kept constant at 300 (Table 3). This figure of 50 cases is much higher than the average number of cases found in our literature review (i.e., six cases). We purposefully added a maximum of 50 cases to test the conditions of our model when pushed beyond the limits found in the literature (the so-called analysis of counterfactuals; Squazzoni 2012:26, Chapter 1). We also opted for a relatively large population of potential other organizations (i.e., 300) that might be suitable to develop theory on a specific issue. 10 Ultimately, the relevance and number of cases depend on the research question (Gerring and Cojocaru 2016).
Figure 2 depicts the simulation environment, developed in NetLogo Version 5.2 (then updated to Version 6.1.0), an agent-based simulation software. Each agent is randomly located in a 3D space where the links among agents become visible. These links are established based on the parameter range, which defines the Euclidean distance of r pixels around each agent; hence, it specifies interaction based on the proximity of other agents falling within the specified range. This is a typical feature of ABMs; it represents the idea that each agent (organization) has a limited reach and can only connect to those other agents (organizations) that fall within its range of possible interactions (whether social, economic, cultural, or geographical). To some extent, the parameter range attributes uniqueness to each agent (organization) by helping define its interactions. In practice, even if the range is the same for all agents in the environment, the actual agent-based network takes a form that depends on the actual number and characteristics (see below) of other agents at a distance that is defined by this parameter. In other words, range represents the diverse extent to which organizations interact with other organizations in the environment. Therefore, it represents industries where companies interact more or less frequently; for example, a relatively low-interaction industry is that of dairy products and a relatively high-interaction industry is that of smart phones. When this parameter is set to 1, each agent screens the eight possible positions in the grid immediately around itself; when set to 2, there are 24 positions; when set to 3, there are 48 positions, and so on. The higher the range of interaction, the greater the opportunity for agents to establish links with other agents because the number of positions in the surroundings grows quadratically with the range. The range values are set to 3, 6, 9, and 12 (Table 3).
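The position counts quoted for small range values can be checked with a short calculation. The sketch below is only illustrative: it assumes a square neighborhood on a 2D grid, which reproduces the 8 and 24 positions given for range 1 and 2, and the function name is hypothetical rather than part of the NetLogo model.

```python
def neighborhood_positions(r: int) -> int:
    """Number of grid cells within r steps of a cell, excluding the cell itself.

    Assumes a 2D square neighborhood: a (2r + 1) x (2r + 1) block of cells
    centered on the agent, minus the agent's own cell.
    """
    if r < 0:
        raise ValueError("range must be non-negative")
    return (2 * r + 1) ** 2 - 1


# Ranges 1 and 2 from the worked example, plus the four experimental values
for r in (1, 2, 3, 6, 9, 12):
    print(r, neighborhood_positions(r))
```

Under this assumption, the number of positions an agent can screen grows with the square of the range.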
Links between agents are established on the basis of a random number that has the range as its upper limit. For example, if range = 6, then each agent connects to other agents that are approximately located at a distance given by a randomly selected number between 0 and 6. This parameter provides each network of organizations with different configurations. It further allows for changes in innovation diffusion on every run of the simulation, hence capturing the basic uncertainty of the diffusion process (Lee, Song, and Yang 2016).
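One possible reading of this link-formation step is sketched below. It is an assumption-laden illustration, not the model code: it supposes that each agent draws its own radius uniformly between 0 and range and links to every agent within that Euclidean distance. The function name, the world size of 33 units, and the random seed are all hypothetical.

```python
import math
import random


def build_links(positions, max_range, rng):
    """Return a set of undirected links (i, j) with i < j.

    Assumption: each agent draws a radius uniformly between 0 and max_range
    and links to every other agent within that Euclidean distance.
    """
    links = set()
    for i, p in enumerate(positions):
        radius = rng.uniform(0, max_range)
        for j, q in enumerate(positions):
            if i != j and math.dist(p, q) <= radius:
                links.add((min(i, j), max(i, j)))
    return links


rng = random.Random(42)
# 300 noncase agents placed at random in a hypothetical 33 x 33 x 33 world
positions = [tuple(rng.uniform(0, 33) for _ in range(3)) for _ in range(300)]
links = build_links(positions, max_range=6, rng=rng)
print(len(links))
```

Because the radius is redrawn for every agent, two runs with different seeds yield different network configurations even when range is held constant, which is the source of run-to-run variation described above.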

Screenshot of the diffusion model on NetLogo 3D Version 5.2 (color exhibit). Note: “Case” organizations are white colored and are larger while “other” (else or noncase) organizations are yellow; when the organization adopts an innovation, the node shows red.
Simulation Parameters: Linking Past Research and Parameter Values.
Note: We aimed to develop a parsimonious model of innovation diffusion. Our approach does not preclude future researchers from adding parameters that suit their own interests in innovation diffusion. Our typology of sources of cross-case ties (Table 1) provides a starting point to tailor this model to specific research questions.
Each agent is allocated a value of innovation; this value represents the extent to which the agent is inclined toward an innovation. This represents a general attitude (Ajzen 2005) toward innovation behaviors that an entrepreneur or a combination of decision makers in every firm has. Since our model represents an abstract diffusion of innovation without adhering to a particular function derived from existing data, values are initially assigned to agents using a random normal distribution ≈ N (0, 1) (for a review, see Kiesling et al. 2012).
Our simulation features an overall precondition and two rules in which agents can adopt an innovation. In order to qualify as potential adopters, agents should be positively disposed toward it. This is given by the general innovation attitude that exceeds the 84th percentile of the distribution—this is the value 1, that is, at setup, equal to mean + standard deviation (see below). At the beginning of the simulation, these values represent Rogers’s (2003) early adopters in the way specified below by the affiliation rule.
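The correspondence between the 84th percentile and the value 1 can be verified for a standard normal distribution. The snippet below is a minimal check using Python's standard library; the variable names are illustrative only.

```python
from statistics import NormalDist

# Innovation attitudes at setup are drawn from N(0, 1)
attitude = NormalDist(mu=0.0, sigma=1.0)

# Cutoff at mean + one standard deviation, as described in the text
cutoff = attitude.mean + attitude.stdev

# Share of agents expected to fall below the cutoff (roughly 0.84)
share_below = attitude.cdf(cutoff)

# Exact 84th percentile of N(0, 1), which lies just below 1
p84 = attitude.inv_cdf(0.84)

print(round(share_below, 4), round(p84, 4))
```

In other words, roughly 16 percent of agents qualify as potential adopters at setup.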
When the precondition above is met, then each agent considers the network of close peers around it (as defined by the parameter range) and compares their attitudes toward innovation to its own. If innovation exceeds the 84th percentile of innovation in the local network, then the agent adopts the innovation. This affiliation rule is designed to make adoption conditional on the dynamics relative to the local network, so that it represents the likelihood that each firm-agent is exposed to a proportion of peers as opposed to the entirety of agent-firms in the environment.
Given the affiliation rule is particularly strict and it serves to define early adopters, the simulation includes a second rule based on agents’ threshold levels (Abrahamson and Rosenkopf 1997), distributed according to a random uniform distribution with [0, 1). The threshold is a value such that when the number of locally available adopters (better, its percentage) is higher than the threshold value of the given agent, an innovation is adopted. A threshold is each agent’s level of sensitivity to innovation. We conceived the level of sensitivity to innovation to be socially constructed through social linkages (e.g., Secchi and Gullekson 2016). The number of adopters considered by each agent are those around the range, as it is specified below. The affiliation rule works together with the threshold rule, in the sense that adoption may occur when the latter is satisfied, when the former is, or when both are.
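The precondition and the two adoption rules can be combined into a single decision function. The sketch below rests on two assumptions that the text does not confirm: the local 84th percentile is approximated by the local mean plus one standard deviation, and the precondition gates both rules. All names are hypothetical.

```python
from statistics import mean, pstdev


def adopts(own_attitude, own_threshold, peer_attitudes, peer_adopted, global_cutoff=1.0):
    """Decide whether one agent adopts in the current step.

    Assumptions (not confirmed by the source): the local 84th percentile is
    approximated by local mean + standard deviation, and the precondition
    applies to both the affiliation rule and the threshold rule.
    """
    if own_attitude <= global_cutoff or not peer_attitudes:
        return False  # precondition not met, or no local network to compare with

    # Affiliation rule: own attitude exceeds the local cutoff
    local_cutoff = mean(peer_attitudes) + pstdev(peer_attitudes)
    affiliation = own_attitude > local_cutoff

    # Threshold rule: share of adopting peers exceeds the agent's threshold
    share_adopted = sum(peer_adopted) / len(peer_adopted)
    threshold = share_adopted > own_threshold

    # Adoption occurs when either rule, or both, is satisfied
    return affiliation or threshold


# An agent well above its peers adopts through the affiliation rule
print(adopts(1.5, 0.9, [0.0, 0.2, -0.1], [False, False, False]))
# An agent among stronger peers adopts only if enough of them have adopted
print(adopts(1.2, 0.3, [2.0, 2.5, 3.0], [True, True, False]))
```

The second call illustrates why the threshold rule matters: the affiliation rule alone would leave an agent surrounded by more innovative peers permanently unable to adopt.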
Our simulation allows agents to be more likely to align their innovation attitudes to those of the case members. This is set to represent a dynamic adjustment of decision makers’ attitudes that gets progressively close to those of the firm of reference. A case member is considered the equivalent of the focal firm in a business ecosystem (Adner and Kapoor 2010) or a supply chain (Dyer and Nobeoka 2000). In other words, we assume that the multiple case researcher selects firms to be included in their study, partly because of their relatively influential role in the network. At every step of the simulation, noncase firms increase or decrease their innovation attitudes on the basis of:
where InnC is the innovation of the case to which the noncase firm is connected, InnO is the innovation attitude of the noncase firm (“own” innovation), and a discount factor D is distributed uniformly at random between [0, 1). This factor D is necessary to allow for some noise in the adjustment, so that it is not excessively automatic. The pseudocode above operates to converge the inner level of the network around the case toward the attitudes of the case.
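The adjustment expression itself is not reproduced in this text. One form that would be consistent with the description, offered purely as an assumption, moves the noncase firm's attitude a random fraction D of the way toward the case's attitude: InnO ← InnO + D × (InnC − InnO). The sketch below implements that assumed form; the function name and starting values are hypothetical.

```python
import random


def adjust_attitude(inn_own, inn_case, rng):
    """Move a noncase firm's attitude part of the way toward its case's attitude.

    Assumed update (the source's own expression is not reproduced here):
    InnO <- InnO + D * (InnC - InnO), with D drawn uniformly from [0, 1).
    """
    d = rng.random()  # discount factor D in [0, 1)
    return inn_own + d * (inn_case - inn_own)


rng = random.Random(0)
inn_own, inn_case = -0.5, 1.2
for step in range(5):
    inn_own = adjust_attitude(inn_own, inn_case, rng)
    print(step, round(inn_own, 3))
```

Under this assumed form, the attitude rises when the case is more innovative than the firm and falls when it is less so, and repeated steps draw the inner network toward the case's attitude without ever overshooting it.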
The parameter mode refers to the two types of comparative case research designs: the ego mode and the com or complete mode. Under the com mode, the number of cases is randomly allocated in a 3D space and connects to other members who are close depending on range settings. Members are free to connect solely depending on their location. They may end up being connected to more than one case, hence serving as connectors or bridges between different cases. Under the ego mode, the noncase organizations tend to converge over the cases that are the closest. Thus, dependency among cases is less likely to occur because the network is more concentrated around cases.
Finally, our simulation represents a multiple-case study designed by a researcher to study the diffusion and adoption of innovation as the main outcome (Table 3). We thus study the extent to which the degree of case dependence has a bearing on the outcome of the researcher’s interest. Due to bounded rationality, the researcher makes assumptions and selects cases on a possible ego or com mode but does not have complete knowledge of the world; hence, there is uncertainty about the cases that are actually ego or com dependent. To do so, our model runs with different configurations of parameters. We also use two measures of adoption. The first measure takes the case as a central point in the analysis; hence, we exclude the influences from members who simultaneously integrate two networks (under the ego mode). The second measure takes into account all members contributing to the diffusion of innovation, regardless of whether they are part of more than one network (under the complete mode).
Simulation Procedures
The configuration of parameters produced a factorial design of 4 × 5 × 2 for a total of 40, where range assumed four values, the number of case agents varied through five values, and there were two modes. We computed innovation for the ego mode and the com mode. This approach was desirable to capture the researcher’s assumption of the degree of case dependence versus the actual degree of case dependence. We investigated whether the ego mode assumption made by a researcher was more suitable when studying networks that were structured in a way that seemed to be more in line with the ego mode and, vice versa, whether the com mode provided more robust results when the com mode was assumed by the researcher.
Time was modeled explicitly by the so-called tick or steps (Fioretti 2013; Secchi 2015). A good approximation in the case under analysis was that of days; thus, every tick represented a day. After a sensitivity analysis, ticks were set at 300 to cover approximately one year. In comparative case research, a full year is also suitable—it allows enough time for the diffusion to occur—and feasible—it allows the tractability of the data analysis. Multiple-case research often deals with multiyear time windows.
Before launching the main experimental design, we engaged in a series of manipulation checks 11 to ensure that the assumptions were coded properly in our ABM simulation. They included an analysis of the characteristics of the network under “ego” and “com” modes, performed with measures of social network analysis (SNA). Namely, we used the mean path length between cases (to measure the length between network nodes using Euclidean distance) and the clustering coefficient (to capture how much alter nodes are actually connected in relation to how much they could potentially be connected to other nodes around them). The “ego” and “com” modes differed significantly. We also did a preliminary study of graphical and analytical effects of parameters on the outcome variable. An interesting outcome of the analysis was that, at the upper bound of range (i.e., 12), it is very difficult to establish whether a network is formed out of an “ego” or a “com” mode. However, this also makes that condition particularly interesting to observe; hence, we decided not to exclude this element.
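To make the second of these measures concrete, the sketch below computes a clustering coefficient on a toy network in plain Python. It is not the code used for the manipulation checks; it only illustrates the idea of comparing the links that exist among a node's neighbors with the links that could exist. The toy network and function names are hypothetical.

```python
from itertools import combinations


def local_clustering(adjacency, node):
    """Share of possible links among a node's neighbors that actually exist."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pair could be linked
    realized = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
    return realized / (k * (k - 1) / 2)


def average_clustering(adjacency):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adjacency, n) for n in adjacency) / len(adjacency)


# Toy undirected network: a triangle (0, 1, 2) with a pendant node 3 attached to 2
adjacency = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(average_clustering(adjacency))
```

A network built around isolated cases would be expected to score differently on this measure than one in which noncase organizations bridge several cases, which is the contrast the manipulation checks were designed to detect.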
Once settled on a number of conditions for the simulation and in following Snijders and Steglich’s (2015) encouragement to combine inferential statistics and ABMs, we used statistical power analysis to estimate the number of runs for each configuration of parameters (Secchi and Seri 2017; Seri and Secchi 2017). Assuming that the effect size was small (ES = 0.1; Cohen 1992) and setting the standard for the two statistical error types α and β at .01 and .05, respectively (i.e., power = .95), we used the equation of Secchi and Seri (2017) and found that 130 runs reduced the potential incidence of Type II errors. In total, we had 5,200 runs.
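The size of the experimental design follows from simple enumeration. In the sketch below, the range values, the two modes, and the 130 runs come from the text; the specific set of five case counts is an assumption, since only the bounds (2 and 50) and some intermediate values are mentioned explicitly.

```python
from itertools import product

ranges = (3, 6, 9, 12)            # from the text
modes = ("ego", "com")            # from the text
case_counts = (2, 5, 10, 20, 50)  # assumed set of five values between the stated bounds
runs_per_configuration = 130      # from the text

# Every combination of range, number of cases, and mode is one configuration
configurations = list(product(ranges, case_counts, modes))
total_runs = len(configurations) * runs_per_configuration

print(len(configurations), total_runs)
```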
Results and Interpretation
Results
In the simulation, we explored the degree of case dependence by examining the effect of the model parameters on the diffusion of innovation, measured by the number of case organizations that adopted the innovation. Cases in the ego mode are largely independent of each other (low case dependence), while cases in the com mode feature high case dependence.
Figures 3 and 4 show an overview of the simulation findings. The y-axis of each plot shows the proportion of case organizations that adopted the innovation minus the proportion of other organizations in the system that adopted it. When the proportion of innovative case organizations is the same as that of noncase organizations, the value is 0 (zero); when case organizations innovate at a higher rate than the other organizations, the value is above 0; in the opposite situation, it is below 0. This measure is computed at every step of the simulation. The lines in each graph show mean values over the 130 runs for each configuration of parameters (Seri and Secchi 2017). Figure 3 presents data on ego and com modes when there is low industry interconnectedness (i.e., range 3 and 6), while Figure 4 shows data when industry interconnectedness is high (i.e., range 9 and 12).
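The plotted measure can be written as a small function. The sketch below follows the sign convention stated above (positive when case organizations adopt at a higher rate); the function name and the snapshot data are hypothetical.

```python
def adoption_gap(case_adopted, noncase_adopted):
    """Proportion of adopting case organizations minus that of noncase organizations."""
    case_share = sum(case_adopted) / len(case_adopted)
    noncase_share = sum(noncase_adopted) / len(noncase_adopted)
    return case_share - noncase_share


# Hypothetical snapshot at one tick: 2 cases and 300 other organizations
case_adopted = [True, False]
noncase_adopted = [True] * 120 + [False] * 180
print(adoption_gap(case_adopted, noncase_adopted))
```

With only two cases, the case share can take just three values (0, 0.5, or 1), so the gap is inherently coarse in small designs; averaging over many runs smooths this out.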

Difference in adoption rates between case and noncase organizations over time (range ≤ 6; color exhibit). (A) Ego mode. (B) Com mode.

Difference in adoption rates between case and noncase organizations over time (range ≥ 9; color exhibit). (A) Ego mode. (B) Com mode. Note: “Time” is represented in days; that is, each “tick” represents a day in the simulation.
The difference between the ego and com modes for low industry interconnectedness is remarkable. While there is a wide difference in adoption rates (≈0.17) in the ego mode (Figure 3A), values in the com mode converge to between −0.010 and 0.005. Configurations where there is a wide difference are those with a limited number of cases (<10). In the com mode, if one excludes early oscillations (step < 100) for 2 and 5 cases, the data pattern converges.
Figure 4 shows an important insight. In highly interconnected industries (range ≥ 9), case and noncase organizations display similar patterns of technology adoption. However, there is a meaningful difference when the researchers draw on two cases, with a difference of approximately 0.030 in the ego mode and approximately 0.020 in the com mode.
The dependence between case organizations is a function of interconnectedness in the industry in which organizations operate. Although informative, Figures 3 and 4 only provide a general overview of findings. To delve deeper into the results, we perform random coefficient panel regressions. This analysis aids us in comparing com and ego case organizations with their immediate and more distant network structures.
Tables 4 and 5 show the results of four models. In these models, the range is 3, 6, 9, and 12 per each mode when the case organizations are 2, 10, and 20, respectively (four range values × three case values). The outcome under analysis is the number of case organizations that adopt an innovation. The independent variables are the mean and standard deviation of innovation levels for case organizations and the number of other organizations that adopted in the inner network (direct link with a case organization, labeled OF1), in the second indirect order (linked through another organization, OF2), and in the third (there are two organizations between this one and a case organization, OF3—this third-level network is not present for case organizations in the ego mode because the case organizations are unconnected as a distinctive feature of the ego research design). By including inner and outer circles, these regressions show the extent to which case adoption is a function of the actual network configuration and not just the other case organizations the researcher includes in the study.
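The three network orders can be derived from hop distances in the network. The sketch below counts adopting noncase organizations at one, two, and three hops from the nearest case using a breadth-first search. Measuring distance to the nearest case (rather than to each case separately) is an assumption, and the toy network and function name are hypothetical.

```python
from collections import deque


def adopters_by_order(adjacency, cases, adopted, max_order=3):
    """Count adopting noncase organizations at 1, 2, and 3 hops from the nearest case."""
    distance = {c: 0 for c in cases}
    queue = deque(cases)
    while queue:
        node = queue.popleft()
        if distance[node] == max_order:
            continue  # do not expand beyond the outermost circle
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    counts = {f"OF{k}": 0 for k in range(1, max_order + 1)}
    for node, hops in distance.items():
        if hops > 0 and node in adopted:  # hops == 0 are the cases themselves
            counts[f"OF{hops}"] += 1
    return counts


# Toy chain: case A - b - c - d - e, where b, d, and e have adopted
adjacency = {"A": {"b"}, "b": {"A", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
print(adopters_by_order(adjacency, cases=["A"], adopted={"b", "d", "e"}))
```

In the toy chain, organization e has adopted but sits four hops from the case, so it falls outside all three circles and is not counted.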
In Table 4 (com mode) and Table 5 (ego mode), a low number of cases (i.e., number of cases, nc = 2) always makes changes in the outcome variable more difficult to explain. In fact, in this condition the R² is always the lowest for all regression models. This result is understandable because, with the number of the outer (noncase) organizations kept constant, as the number of cases increases, the opportunity for contact and interaction also grows. The estimated β coefficients in Table 4 have satisfactorily low p values (p < .001), and, on average, what happens in the inner network (OF1) almost always has a larger effect on the outcome variable than the outer circles. There are exceptions in Model 1(c) and in Model 2(c), where the second-order network has a slightly more relevant impact (β = .137 and β = .131).
Multiple Random Effects Panel Regressions in the Com Mode (Cases = {2, 10, 20}; Range = {3, 6, 9, 12}; DV: Number of Case Organizations Adopting the Innovation).
Note: nc = number of cases in the simulation model.
a All F statistics significant at the p < .001.
*p < .05. **p < .01. ***p < .001.
Multiple Random Effects Panel Regressions in the Ego Mode (Cases = {2, 10, 20}; Range = {3, 6, 9, 12}; DV: Number of Case Organizations Adopting the Innovation).
Note: nc = number of cases in the simulation model.
a All F statistics significant at the p < .001.
*p < .05. **p < .01. ***p < .001.
These regression models, when considered together, present results that may seem puzzling, at least at first. In fact, in both modes, adoption is a function of adoption in the system independent of whether these are considered ego or com. However, when a small number of case organizations are selected (e.g., nc = 2, see Tables 4 and 5), the explanatory power of the regression models usually declines significantly. In the com mode, there is a gradual leap from models (a) to models (c). In the ego mode, this leap is consistent as one moves from models (a) to (b). In spite of low dependence, when cases increase five times, systemic interactions between the case organizations and the other organizations in the ecosystem drive most of case adoptions. When organizations are more closely connected (range = 12), we find that the difference between assumptions of dependence or independence of case organizations is marginal.
Interpretation
We use the simulation results to reflect on the degree of case dependence for theory building using comparative case research (Eisenhardt 1989; Glaser 1965; Ragin 1992). Our findings suggest that dependence is not secondary when looking at the diffusion of innovation. Arguably, prior research details additional mechanisms that may affect the way in which an innovation propagates in a network (e.g., Young 2009) but, if a network between firms is in place, dependence seems to be an essentially useful research frame. To help interpret the results clearly and to connect them more directly to comparative case research, we use a fictitious situation in which a qualitative researcher approaches a study of 2–50 cases.
As a fictional expedient to help our discussion, let us assume a researcher (e.g., scholar, consultant) is conducting a study of the diffusion of innovation management techniques in startup organizations (i.e., the cases). In one scenario, the researcher has no interest in the degree of case dependence from a methodological or a conceptual viewpoint; thus, knowledge about linkages between case agents is irrelevant. Alternatively, the researcher might be interested in case dependence as a central aspect of building general theory. The researcher may have made assumptions about the degree of case dependence beforehand and selected the cases accordingly. The researcher accesses the range and innovation diffusion rates of the case members (as shown in Figures 3 and 4), together with the number of cases in the study. However, because of her or his bounded rationality, the researcher is not fully aware of other agents in the local network that may affect the case agents, especially in the com mode. The second scenario represents a typical instance where the researcher has limited ability to identify optimally, or sometimes even satisfactorily, the degree of case dependence in a complex empirical setting. There are four alternatives:
– Alternative 1: The researcher assumes high case dependence, and the cases relate to each other (com mode).
– Alternative 2: The researcher assumes low case dependence, and the cases do not relate to each other (ego mode).
– Alternative 3: The researcher assumes high case dependence, and the cases do not relate to each other (ego mode).
– Alternative 4: The researcher assumes low case dependence, and the cases relate to each other (com mode).
We first discuss alternatives 1 and 2. These alternatives refer to instances in which the researcher’s assumption matches the social linkages between cases. Hence, the main issue concerns the interpretation of findings. The researcher might identify propositions that take account of the high degree of case dependence. In a study of two start-up organizations with moderate network extension (range = 6) in the com mode, the researcher might observe the following:
This finding is consistent with the sociological explanation of innovation diffusion as a process based on ties among members (e.g., Abrahamson and Rosenkopf 1997). From our results above, we know that interconnectedness is all the more relevant as the entire industry leans on these connections (i.e., if the range of interactions increases). Taking advantage of in-depth analyses, the researcher would then further examine how innovation actually diffused in these two start-ups.
In contrast, in the ego mode (under alternative 2 above), the researcher may interpret the following:
Unlike Proposition 1a, the claim in Proposition 1b stresses a contingency perspective, specifically how institutional pressure influences innovation diffusion (Abrahamson and Rosenkopf 1993). An in-depth comparative case study has the advantage of specifying how environmental pressures operate.
In the com mode, the variation in the innovation average depends on the two connected case start-ups. Using the regression findings above (Tables 4 and 5) as a basis, Figure 5B focuses on exactly this point by specifying that the innovation levels of the two case start-ups increase as the diffusion of innovation of the other organizations in their networks also increases. The regression lines are similar to those in the ego mode. Hence, in this hypothetical theorizing, either proposition displays low accuracy since innovation derives from the synergic effects of the inner and outer circles that jointly affect the innovation levels of the case start-ups. However, while in the com mode, these networks connect the two case start-ups, in the ego mode, the two case start-ups are unrelated because they share no social linkages.

Linear regressions for case-organization adopters in relation to their own innovation levels and of innovation of other organizations in the inner and outer circle of the network (range = 6, cases = 2). (A) Ego mode. (B) Com mode.

Linear regressions for case-organization adopters in relation to their own innovation levels and of innovation of other organizations in the inner and outer circle of the network (range = 12, cases = 20). (A) Ego mode. (B) Com mode.
Assuming low dependence (i.e., few or no ties) between case start-ups, the increasing levels of innovation accrue from pressure that other start-ups in the network exert on the case start-ups. As shown in Figure 5A, the adoption of innovation by case start-ups is likely to be determined by the mean innovation levels of the case studies. Moreover, the inner and outer network circles seem to relate to the way the case start-ups adopt the innovation.
We then expand the example above (where range = 6 and cases = 2; Figure 5) to comparative case research with a low number of cases. The first general implication of our simulation for comparative case research is as follows:
We now turn to alternatives 3 and 4 from the fictitious example above. These alternatives represent instances of a mismatch between the initial assumption made by the researcher and the social linkages among start-ups. Under alternative 3, a researcher would derive a conclusion similar to Proposition 1a, while under alternative 4, the researcher might propose a conclusion that is similar to Proposition 1b. However, either proposition would be based on an erroneous assumption about case dependence. For the sake of illustration, we take a potential comparative case study, this time with range = 12 and cases = 20. Still building on the simulation results, Figure 6B shows regression lines for innovation levels and their impact on diffusion in the com mode, while Figure 6A provides the same information for the ego mode. In the latter, the slope of the lines is mild, where the innovation level of the case start-ups is unrelated to the outer and inner network circles. The case start-ups are actually disconnected from the network, as the combined innovation levels do not influence innovation significantly. In the com mode, the slope is positive and steeper than in the ego mode. Moreover, the direction of the outer and inner network circles’ innovation is consistent with that of the case start-ups. However, the slope is particularly small, suggesting a minor impact.
Based on our experimental findings, we argue that the social linkages between cases are essential for researchers to understand and analyze them as part of the process of building theory from multiple cases. The way in which an innovation is adopted is a function of the inner and outer networks that may (directly or indirectly) link cases together. Hence, whenever a theoretical proposition is developed from cross-case comparisons, the question of access to information about linkages between cases emerges. Still, there are instances where limited knowledge about case dependence is less problematic. The second implication of our study is as follows:
The experimental results show that when the researcher addresses a limited number of cases, the com mode should be the preferred choice regardless of how low or high interactions among cases are. If adoption depends on the average level of innovation in other cases, then high case dependence is more appropriate when either the com mode or the ego mode is assumed by the researcher (e.g., see the size of β coefficients in Model 1, Table 5, showing how strong the network effects are in those cases). Instead, (a) the ego mode assumption should be the preferred choice when the range of interactions is high and research is conducted with a relatively high number of cases. The relatively flat lines depicted in Figures 5 and 6 (when nc ≥ 10) suggest that the innovation levels of other cases do not influence the percentage of adopters.
Another implication, a corollary of Implication 2 above and perhaps obvious to well-versed case study researchers, is that 20 cases already provide enough information on the dynamics in the industry (see the high R² from the regression models in Tables 4 and 5). In practice, researchers often have access to only a few cases, but it is precisely when researchers draw on a few cases that information about case dependence is most valuable. Incomplete information about the degree of case dependence might undercut the specification and reporting of newly developed theory.
Given the above evidence, pertinent questions arise: What are the implications of these results for theory building from multiple cases? How can researchers address the degree of dependence in future studies?
Discussion
The comparative case design is instrumental in helping researchers build theory, but the role of case dependence in theory building appears to be shrouded in confusion in social science methods. To address this shortcoming, we study whether social linkages between cases might influence the veracity and transferability of findings when novel insights are built through comparison and contrast of cases. We develop an agent-based simulation based on the survey of comparative case research on innovation diffusion. At a general level, we find that the number of cases and industry characteristics are important factors to take into account when making claims about newly developed theory from comparative case research.
The focus of our analysis is the extent to which researchers’ assumptions about case dependence match their own methodological requirements. Taking innovation diffusion as a canonical example where social linkages are a leading source of case dependence (George and Bennett 2005), we argue and show that the neglect of the degree of case dependence in theory building is problematic. As we elaborate below, first, it poses threats to internal validity by undercutting within- and cross-case comparisons used to support theoretical propositions. Second, it hampers transparency in research about whether the degree of case dependence is a critical factor for the generalization or transferability of newly developed theory. Our contributions concern the design and reporting of comparative case studies. Whether this study’s findings based on a general computational model of innovation diffusion are relevant to other topics in the social sciences presents a fruitful direction for future research. The central issue about the researchers’ awareness of the degree of case dependence, of varied types, remains a valid concern for those interested in theory building using comparative case study.
Innovation Diffusion: Why and When Does Case Dependence Matter?
One of our article’s main contributions is a clarification, through a novel use of an agent-based model (ABM), about whether case dependence affects new theory about innovation diffusion against commonly held standards of internal validity in social sciences. Extending discussions in social sciences (George and Bennett 2005; Gerring and Christenson 2017), we show that the degree of case dependence may undercut the internal validity of findings about innovation diffusion. Failure to acknowledge the degree of case dependence might lead to overstating the empirical evidence in support of or against a newly developed theoretical proposition. The issue at stake is the validity of claims, not a prescription aimed at restricting the diversity of ways in which theory is built from multiple cases. We heed the advice from past research that building theory from cases benefits greatly from embracing diverse research approaches (Gehman et al. 2018; Orlikowski 2010).
We augment the literature by examining under which conditions the degree of case dependence, when unattended by the researcher, undercuts internal validity in innovation diffusion studies. We show that specific features of the comparative case design call for great attention by the researcher. Specifically, when the researcher addresses a limited number of cases (<10), the researcher’s assumption of low case dependence is preferred to build theory about innovation diffusion regardless of the range of interactions among the cases. However, the researcher’s assumption of high case dependence is preferred when the range of interactions is high and research addresses a relatively large number of cases.
The influence of case dependence on understanding innovation diffusion is particularly salient in industries where organizations are known to be highly interconnected. Therefore, and making a direct link to our simulation (i.e., affiliation rule and threshold rule), researchers might learn about the implications of case dependence by, for example, taking advantage of in-depth interviews to identify (i) the basis of individuals’ attitudes toward innovation in case firms, (ii) examples of imitation (or isomorphism) by organizations in the environment, and (iii) organizations operating within a proximity radius of the case firm.
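To make the link between interconnectedness and diffusion concrete, the following is a minimal sketch of a generic threshold rule combined with a proximity radius. It is not the simulation used in this study: the function names (`neighbors_within`, `diffuse`), the synchronous update, the adoption criterion (share of adopting neighbors), and the toy layout of five firms on a line are all assumptions made for illustration.

```python
import math


def neighbors_within(positions, i, radius):
    """Return indices of firms located within `radius` of firm i.

    Hypothetical proximity rule: Euclidean distance on a 2D plane.
    """
    xi, yi = positions[i]
    return [
        j
        for j, (x, y) in enumerate(positions)
        if j != i and math.hypot(x - xi, y - yi) <= radius
    ]


def diffuse(positions, adopted, threshold, radius, steps=10):
    """Generic threshold rule (assumed form, synchronous updates).

    A firm adopts once the share of adopters among its neighbors
    reaches `threshold`. Firms without neighbors never adopt.
    """
    adopted = list(adopted)
    for _ in range(steps):
        nxt = list(adopted)
        for i in range(len(positions)):
            if adopted[i]:
                continue
            nbrs = neighbors_within(positions, i, radius)
            if nbrs and sum(adopted[j] for j in nbrs) / len(nbrs) >= threshold:
                nxt[i] = True
        if nxt == adopted:  # no change: diffusion has stopped
            break
        adopted = nxt
    return adopted


# Toy setting: five firms on a line, one unit apart; firm 0 is the initial adopter.
positions = [(float(k), 0.0) for k in range(5)]
seed = [True, False, False, False, False]

# Small radius: no firm has neighbors, so cases behave as independent.
isolated = diffuse(positions, seed, threshold=0.5, radius=0.5)
# Larger radius: adjacent firms are linked, so adoption spreads along the line.
connected = diffuse(positions, seed, threshold=0.5, radius=1.0)
```

In this toy setting the same seed and threshold yield a single adopter when firms are isolated and full adoption when neighbors are linked, which is the kind of contrast a researcher would miss by assuming independent cases in an interconnected industry.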
Implications for Comparative Case Research and Its Variants in Social Sciences
We put forward an original proposition that the issues of case dependence are better understood as a problem of researchers’ bounded rationality in complex organizational phenomena (Simon 1979, 1990). Researchers’ bounded rationality is not a problem in its own right. Rather, the concern is the possibility that the researcher’s bounded rationality about case dependence distorts her or his own assessment of the cross-case evidence in support of a theoretical proposition. It involves a mismatch between the reported versus actual states of innovation diffusion.
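One simple way to express this mismatch is as the share of sampled cases whose reported adoption state differs from the actual one. The sketch below is illustrative only: the function name `mismatch_rate`, the dictionary representation, and the example firms are assumptions, not measures or data taken from this study.

```python
def mismatch_rate(reported, actual, sampled):
    """Share of sampled cases whose reported state differs from the actual state.

    `reported` and `actual` map a case label to True (adopted) or False.
    Hypothetical measure for illustration only.
    """
    if not sampled:
        raise ValueError("at least one sampled case is required")
    differing = sum(reported[case] != actual[case] for case in sampled)
    return differing / len(sampled)


# Made-up example: the researcher assumes independent cases and records
# firm A as a non-adopter, while linkages to firm C have led A to adopt.
reported = {"A": False, "B": False, "C": True}
actual = {"A": True, "B": False, "C": True}

rate = mismatch_rate(reported, actual, sampled=["A", "B"])
```

Here one of the two sampled cases is misreported, so the rate is 0.5. With only a few cases, a single overlooked linkage moves the measure substantially.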
We identify two archetypes of comparative case research based on past research: the complete research design and the ego research design. The complete research design denotes a high degree of case dependence, while the ego research design entails a low degree of case dependence. The archetype of comparative case research used by the researcher should be made explicit in future studies. Beyond adding new terms, we argue that the discussion of the degree of case dependence is informative about issues of internal validity and external validity.
The exact implications of the researcher’s bounded rationality vary according to the variants of comparative case research. Figure 7 shows when researchers’ bounded rationality about the degree of case dependence poses a threat to validity in inductive research. Our study specifically focuses on the mismatch between using a complete case design (high case dependence) while evoking an experiment-like analysis across cases to support new theoretical propositions. An experiment-like approach has been described as following computational reasoning in social research (Mantere and Ketokivi 2013). However, the discovery of data patterns through eliminative logic might be problematic because possible duplicates across cases are overlooked by the researcher (Eisenhardt 1989). These duplicates stem from meaningful cross-case dependence. By meaningful, we mean social linkages that have a bearing on the research question and the subsequent answer.

Figure 7. Analysis of threats to validity. Note: This study focuses on the mismatch between the use of a complete case design while evoking a computational logic to build theory about innovation diffusion (upper left-hand corner). Under this approach to theory building, the researchers’ bounded rationality about the degree of case dependence may pose a threat to the validity of newly developed theory about innovation diffusion and, more broadly, about organizational phenomena closely related to diffusion processes (e.g., adoption of new practices and spread of wrongdoing practices). We use ideal types to enhance the communication of the key insights. Our representation is illustrative. Inductive research remains a process of discovery where creativity and transparency are essential.
However, as shown in Figure 7, other authors have suggested that “we don’t discover theory, we create it!” (Mintzberg 2005:347). This approach to inductive research celebrates the role of the researcher in theory building. The warrants of internal validity are based on researchers’ ability to uncover novel findings, as opposed to “experimental” comparisons between cases (Mantere and Ketokivi 2013). Inductive research is viewed as cooking without a recipe (Graebner, Martin, and Roundy 2012), a process of discovery in which creativity is preferred (Gehman et al. 2018). Still, showing the chain of evidence is desirable in qualitative analysis so that the reader can fully appreciate the quality of the data and conclusions put forward (Elman et al. 2016).
Our study has implications for the external validity, or transferability, of findings. The degree of case dependence often entails a boundary condition that, if acknowledged, can actually strengthen the newly built theory. As shown in our review of the literature, readers of comparative case research often have difficulty accessing information about whether a low versus high degree of case dependence influences the extent to which the reported findings can be transferred to other settings. Clarity about the constructs, their relationships, and the boundary conditions is an essential feature of a theoretical contribution (Whetten 1989).
Our study further extends the current debate on transparency in comparative case research (Aguinis et al. 2018; Pratt et al. 2020). We caution against practices of declaration. That is, the use of statements such as “we follow the procedures in X paper” is no substitute for an explanation of the analytical procedures that the researcher followed to draw conclusions from comparative case analysis. Furthermore, we present guidelines for dealing with the degree of case dependence across the main stages of building theory in comparative research design. Our guidelines supplement existing best practices in qualitative research with specific suggestions that reflect the findings of our computational simulation (e.g., industry characteristics). (Online Appendix A provides detailed suggestions.)
When not blindly applied, these guidelines do not preclude creativity in building theory from cases or neglect the role of the researcher’s interpretation during the research process. In fact, researchers’ cognitive processes during theory building encompass the discovery of new theoretical insights that would otherwise be overlooked (Ketokivi and Choi 2014). The values of veracity and transparency have long been at the core of comparative case research in social sciences.
Limitations and Further Research
As with any study, ours has limitations. Theory building follows many processes that cannot be fully captured in a computational simulation. However, our novel use of such a simulation had the advantage of allowing a “virtual experiment” (Bruch and Atwell 2015; Burton and Obel 2011). A virtual experiment was needed to manipulate the degree of case dependence, using a few generic mechanisms of innovation diffusion, which is not possible in the empirical settings of qualitative research. More generally, we envisage opportunities to use agent-based simulations in innovative ways to study important questions in sociological methods and research.
We examined a specific variant of the comparative case design: the multiple-case study. Our approach was desirable because the role of cross-case analysis, where cases are treated as “experiments,” has been articulated in past research (Eisenhardt and Graebner 2007; Ozcan et al. 2017). Thus, our suggestions apply primarily to multiple-case study research that uses case comparison as a prime basis for induction. Future research should explore the issue of case dependence for other variants of this research design. The impact of case dependence might depend on whether researchers use comparative case studies for theory building versus theory elaboration purposes (Ketokivi and Choi 2014). We also call for future studies that examine researchers’ common practices in the research process, thus adding insights into the richness of practices of scientific knowledge production in comparative case design.
While analyzing the extent to which ego and complete network structures affect diffusion processes, we found that social network analysis (SNA) provides a wealth of information about characteristics of the networks linking and surrounding the case organizations. Future research may further explore these differences, with particular attention to mechanisms of diffusion other than those based on the threshold rule and affiliation rule. In doing so, future research will extend the debate about social linkages, which we have started to address, as well as other sources of case dependence in the context of comparative case research.
Furthermore, we envision opportunities for research that specifies types of innovation (e.g., product innovation vs. process innovation) and empirical settings (e.g., public sector vs. private sector). Our simulation treats case firms as homogeneous entities, so the exact importance of case dependence might vary according to, for example, the type of innovation and the sector of each firm. There are bountiful opportunities to extend our research about case dependence and theory building.
Conclusion
Researchers often justify the use of multiple cases based on comparisons between “experiments” that, in turn, enhance internal and external validity of the new theory. However, researchers face limited ability to identify optimally, or even satisfactorily, the degree of case dependence in complex empirical settings. Drawing on research on innovation diffusion, we created a computational simulation to learn about the implications of the degree of case dependence in building new theory. Our experimental results show that the misfit between meaningful social linkages across cases and researchers’ assumptions about such linkages undercuts theory. Such concern is more salient when research uses a few cases (often two cases) to develop theory about highly interconnected settings. Then, as illustrated in our introductory quote, many cases that appear to be distinct might just be duplicates of the same original. The central issue is not whether Galton’s challenge is relevant, but when it is relevant and how it should be accounted for when building theory from cases in social sciences.
Supplemental Material
Supplemental Material, sj-pdf-1-smr-10.1177_0049124120986201 for Theory Building, Case Dependence, and Researchers’ Bounded Rationality: An Illustration From Studies of Innovation Diffusion by Nuno Oliveira and Davide Secchi in Sociological Methods & Research
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
