Abstract
Outliers are promising candidates for theory building because they defy expected cause-and-effect relationships. Nonetheless, researchers often treat them as a nuisance and exclude them from further study. In fact, our analysis found only two articles that used outliers for theory development among all quantitative articles published from 1993 to 2012 in six major management journals, and fewer than 5% even mentioned them (mostly to relay reasons for deleting them). To rectify this, we provide a roadmap for empirical researchers interested in theory building.
Most management and organization researchers are familiar with the recurring feeling of unease that comes with finding outliers, defined as “data points that deviate markedly from others” (Aguinis et al., 2013, p. 270). Upon discovery, we often treat them as a nuisance and typically exclude them from further study in an effort to tie up “loose ends” and improve statistical power (Kendall & Wolf, 1949; Pearce, 2002). This common practice needs rectifying since using model misfit in a constructive manner (by rigorous, in-depth analysis of outliers) improves our theoretical understanding of empirical realities (Gibbert et al., 2008), and forgoing such regularly occurring opportunities is counterproductive. Notwithstanding influential calls in organizational sciences for the “detective work” (Freedman, 2008) needed to understand what is special about an outlier (King et al., 1994), there is unfortunately little guidance available on how to “make doubt generative” (K. Locke et al., 2008, p. 907), specifically, how to use outliers for theory building.
This short essay aims to change this. It is not a call to conjure up (new) theory at any cost in the presence of any outlier. Instead, it clarifies which outliers lend themselves best to be used constructively for theory building by outlining a roadmap with four concrete steps. What little previous work there is on outliers has either looked at them from a purely managerial perspective (Valikangas et al., 2016) or provided valuable methodologies for defining and controlling outliers in statistical analyses without addressing how outliers can be used for theory building (Aguinis et al., 2013; Lewin, 1992; Nair & Gibbert, 2016; Pearce, 2002). In a recent article, for example, Aguinis et al. (2013, pp. 287-288) listed distinct examples of “handling interesting outliers” without establishing a systematic technique for pursuing deeper theoretical understanding of such interesting outliers.
Our roadmap (in Figure 1) is based on an analysis of all quantitative empirical articles published from 1993 to 2012 in Administrative Science Quarterly, Academy of Management Journal, Journal of Management, Strategic Management Journal, Organization Science, and Management Science (Valikangas et al., 2016). This search resulted in 308 articles that in some way reported outliers (less than 5% of all quantitative articles in that period). Of these, four articles stood out: Two articles acknowledged outliers explicitly in the discussion section without analyzing them (Carney et al., 2011; Ferlie et al., 2005), and two not only mentioned the outliers but also examined their theoretical implications (Gittell, 2001; Pisano, 1994). These figures suggest that in our discipline, there is a lack of interest in (or, we like to think, a lack of salient tools for) more closely inspecting outliers as an impetus for theory building. While a full review is beyond the scope of this short essay, we would like to emphasize that outlier analysis does in fact constitute a widely utilized theory building strategy in disciplines as diverse as biology (Hagstrum, 2013), comparative politics (Lieberman, 2005), law (Gordon, 1947), medicine (Couzin-Frankel, 2016), and criminology (Sullivan, 2011).

Figure 1. Roadmap for theory building using outliers.
Roadmap for Using Outliers to Develop Theory
Step 1: Examine the Theory-Building Potential of the Outlier
First, “error” outliers need to be differentiated from nonerror outliers. Aguinis et al. (2013) provided a useful definition of error outliers: Data points that lie at a distance from other data points because they are the result of inaccuracies. More specifically, error outliers include outlying observations that are caused by not being part of the population of interest (i.e., an error in the sampling procedure), lying outside the possible range of values, errors in observation, errors in recording, errors in preparing data, errors in computation, errors in coding, or errors in data manipulation. (p. 275)
Error outliers can be removed from the sample as long as this process is transparently relayed. For example, Worren et al. (2002) deleted one outlier and reported that in a post hoc telephone interview, they “called up the respondent submitting these data, who said that he had misunderstood some of the questions in that he had considered, e.g. component sharing between product lines rather than within product lines when filling out the questionnaire” (p. 1132).
Nonerror outliers fall into two fundamentally different categories depending on whether they are characterized only by extreme values on the studied variables or (also) by large model residuals (Aguinis et al., 2013). The former category of outliers is likely to offer only limited theory building potential because they align with the other cases illustrating the relationships between focal variables. In a hypothetical study linking, for instance, intelligence to job performance, these outliers would consist of individuals who are extremely intelligent and perform above average on their job or those who have a very low IQ and show poor job performance. Analysis of these cases is thus unlikely to provide insights that go beyond what a theory based on the less extreme cases would generally predict. In any event, deleting or retaining them should be accompanied by performing robustness checks to inform readers of these outliers’ impact on the results (Aguinis et al., 2013).
Outliers in the second category, namely, the ones that exhibit large model residuals (labeled “prediction outliers” by Aguinis et al., 2013), are particularly valuable for theory building because they are characterized by model misfit and are off the regression line. These outliers are commonly referred to as “deviant cases” precisely because they deviate from theoretical expectations and affect parameter estimates (e.g., slope and/or intercept coefficients). In the multivariate methods that are dominant in organization research, such as linear regression, multilevel modeling, and structural equation modeling (Aguinis et al., 2009), prediction outliers can be identified by calculating indicators like DFFITS, Cook’s D, or DFBETAS (for a technical summary of identifying such outliers, see Aguinis et al., 2013).
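To make the identification step concrete, the following minimal sketch computes Cook’s D for a simple linear regression using only the Python standard library. It is an illustration under stated assumptions, not a procedure taken from the sources cited above: the IQ and performance values are hypothetical, the function name cooks_distance is our own, and the 4/n cutoff is a common rule of thumb rather than a fixed standard. DFFITS and DFBETAS are not computed here.

```python
from statistics import mean


def cooks_distance(x, y):
    """Return Cook's D for each case in a simple linear regression of y on x."""
    n = len(x)
    p = 2  # estimated parameters: intercept and slope
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    # Leverage of each case in a one-predictor model
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [
        (e ** 2 / (p * mse)) * (h / (1 - h) ** 2)
        for e, h in zip(residuals, leverage)
    ]


# Hypothetical data (illustrative only): IQ scores and job performance ratings.
# The last case combines a high IQ with low performance.
iq = [85, 90, 95, 100, 105, 110, 115, 120, 125, 130]
performance = [3.1, 3.4, 3.6, 4.0, 4.2, 4.5, 4.7, 5.0, 5.2, 2.0]

distances = cooks_distance(iq, performance)
threshold = 4 / len(iq)  # rule-of-thumb cutoff, an assumption of this sketch
flagged = [i for i, d in enumerate(distances) if d > threshold]
print(flagged)
```

In this hypothetical sample, only the high-IQ, low-performance case exceeds the cutoff. The first case has equally high leverage but lies close to the regression line and is not flagged, which mirrors the distinction between extreme-value outliers and prediction outliers.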
In our hypothetical study, prediction outliers would be individuals who, despite high IQs, score low on job performance. Conversely, there might be cases of low IQ with high job performance. Both of these scenarios defy the original predictions as well as the distribution of the main body of observations. Inspecting these data in depth might reveal that employees with below-average intelligence still demonstrate above-average job performance when they score very highly on social skills (and vice versa), suggesting that social skills can compensate for below-average intelligence. Thus, the analysis of such cases may shed new light on inconsistencies in emerging theory and potentially reconcile theoretical predictions with real-world observations (Lieberson, 1992). Model-data conflicts may also point to variables that were not considered in the original analysis but would improve the correspondence between theory and data. Deviant cases do not necessarily invalidate theories outright but may reveal boundary conditions and contingencies (Gerring, 2007).
Pisano (1994) is one of the two articles using outliers for theory building in our sample. Given limited opportunity for learning before doing in biotechnology due to the higher level of novelty prevalent in this area, Pisano expected that investing many project hours in research efforts would be unlikely to shorten time to market. Although the general pattern of results supported these assumptions, three biotech projects deviated from the others: Two outlying cases showed short time to market together with many project hours, and the third outlying case showed a long time to market with few project hours invested in research. Upon investigation and after taking additional variables into consideration, he noted that “all three outlying projects were undertaken by organizations with relatively more biotechnology process development experience than the others” (Pisano, 1994, p. 98), which points to the potential moderating role of this experience. The new proposition he derived was that experienced firms have accumulated deeper technical knowledge that can be tapped through research. A firm with little experience may be forced to “learn by-doing” [as compared to learning before doing in experienced firms] until it accumulates enough understanding of the underlying technical parameters and interactions. (Pisano, 1994, p. 98)
Step 2: Determine Analytical Strategy
Determining the most appropriate analytical strategy depends on the number of outliers available. Because there are typically few deviant cases, a qualitative methodology would appear to be most appropriate for revealing hitherto neglected but theoretically consequential variables in a quantitative data set. This allows for inductively refining the predictive power of a theory (Eisenhardt, 1989; E. A. Locke, 2007) while expanding the population of theoretically well-understood phenomena. Sample size and number of deviant cases permitting, a quantitative analysis using statistical methods may be feasible as well. Regardless of whether we employ qualitative or quantitative techniques, the next step in theory development is to compare the prediction outlier(s) with the main body of observations. However, the specific approaches for identifying potential causes underlying a prediction outlier differ between methods that require many prediction outliers and methods that can be performed on only few or even a single prediction outlier. We summarize these options in Table 1.
Table 1. Analytical Strategies for Closer Inspection of Outliers for Theory Building.
One or few outliers
The research design and analytic method called for when there is only one outlier often starts with a qualitative, in-depth analysis. In the single outlier analysis, investigators ask what factors might lead to the deviant outcome or which preconditions are necessary and sufficient to make a specific kind of outcome possible (Gerring, 2007). They search for causal conditions that are individually necessary and, in combination with other causal conditions, sufficient to explain the outcome. Blatter and Haverland (2012) suggested that this kind of research is Y-oriented; that is, it works backwards from the observed phenomenon to find explanatory factors leading to the deviation. The analysis of a single deviant case can be instrumental in identifying the boundary conditions (e.g., moderating effects) and process explanations (e.g., mediating effects) of a theory that before the detection of the deviant case, remained unchallenged. A weakness of the single outlier analysis is the lack of generalizable conclusions. Glaser and Strauss (1967) pointed out: “saturation can never be attained by studying one incident in one group. What is gained by studying one group is at most the discovery of some basic categories and a few of their properties” (p. 62). In this case, analysis of a single outlier might nevertheless provide the starting point for other authors to follow up on the observed idiosyncrasies in an effort to establish their generality.
A second research strategy is the comparative analysis of not just one but several deviant cases, following “the logic of treating a series of cases as a series of experiments” (Eisenhardt, 1989, p. 542). The objective here is to qualitatively or quantitatively confirm or extend the observed pattern of relationships among variables, with the objective of increasing internal and external validity (Gibbert et al., 2008). Ideally, such an analysis involves theoretical replication involving cases from both ends of a covariational spectrum of possible outcomes, namely, high as well as low values on both dependent and independent variables (Hoorani et al., 2019; Yin, 1994).
Many outliers
If the number of available prediction outliers is sufficient for quantitative analysis, there are many ways for researchers to look for patterns. What constitutes a sufficient number of cases may differ substantially depending on the requirements of the chosen statistical method and the desired level of statistical power (J. Cohen, 1988; Hair et al., 2006). Even though the methods differ when it comes to contrasting prediction outliers with the nonoutlying cases, the common aim is to identify specific differences that may account for the observed deviance through intergroup comparisons (prediction outliers vs. nonoutlying cases). In such intergroup comparisons, we can investigate potential moderator variables (which can be categorical or continuous variables). This normally happens through calculating and comparing the relationship between the independent and the outcome variable, depending on the level (i.e., high vs. low values) or category of the moderating variable (P. Cohen et al., 2003), routinely via the examination of interaction terms and simple slopes (Dawson, 2014). With regard to the latter, the different slopes resulting for varying values of the moderator variable can be probed in detail through methods like slope difference tests or calculating the regions of significance of the focal relationship depending on the values of the continuous moderator (Dawson, 2014). This search for potential moderators can occur in a deductive way (i.e., based on theoretical considerations after the outlying cases have been identified but prior to the mean difference tests) if a theory suggesting such moderators already exists.
This search can also happen through an inductive approach (i.e., the researcher conducts a series of mean difference tests on the available variables that are not based on specific theoretical considerations), which would be useful if no theory suggesting such moderators exists or the aforementioned deductive search for moderators did not yield meaningful findings.
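As an illustration of a single mean difference test between prediction outliers and nonoutlying cases, the sketch below computes Welch’s t statistic and its approximate degrees of freedom with the Python standard library. The choice of Welch’s test (which does not assume equal variances), the function name welch_t, and the social-skills scores are assumptions made for illustration. In an inductive search across many variables, one such test would be run per candidate variable.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = variance(group_a), variance(group_b)  # sample variances
    se_sq = var_a / n_a + var_b / n_b
    t = (mean(group_a) - mean(group_b)) / sqrt(se_sq)
    df = se_sq ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical social-skills scores (illustrative values, not real data)
prediction_outliers = [6.8, 7.1, 6.5, 7.4, 6.9]
nonoutlying_cases = [4.9, 5.2, 5.0, 4.6, 5.5, 5.1, 4.8, 5.3]

t_stat, df = welch_t(prediction_outliers, nonoutlying_cases)
print(round(t_stat, 2), round(df, 2))
```

A large positive t statistic here would suggest that the outlying cases score higher on the candidate variable, which marks it as a potential moderator worth examining further.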
Beyond examining differences between outlying and nonoutlying cases in single variables using a variable-centered approach, there is another option enabling researchers to look for specific configurational patterns in variables that might differentiate outlying and nonoutlying cases. The idea of such case-centered methods (also labeled person-centered approaches in microlevel research) is that the cases in a sample might comprise different subpopulations that possess specific profiles spanning multiple variables (Wang & Hanges, 2011). Woo et al. (2018) gave a useful overview of such case-centered methods. When using outliers for theory building, this approach is particularly helpful to identify the peculiar characteristics of the outlying subpopulation based on the profile of a system of variables. This allows the researcher to identify a more complex interplay of variables rather than focusing on a single candidate variable. Even though less common in organizational sciences, a conventional method of the case-centered approach is cluster analysis (Ketchen & Shook, 1996). Overall, while many procedures exist for carrying out cluster analyses, this method generally involves a more inductive search for profiles underlying the outlying cases (Wang & Hanges, 2011), even though the variables used for the cluster analysis need to be predefined. Depending on the specific focus of the outlier analysis, researchers may choose from a number of clustering algorithms and approaches to determine the number of clusters (Hair et al., 2006; Ketchen & Shook, 1996).
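The case-centered logic can be sketched with a small k-means clustering written against the Python standard library. This is one possible clustering algorithm among the many mentioned above, chosen here only for brevity. The two profile variables, their (already standardized) values, the outlier flags, and the starting centroids are all hypothetical. Fixed starting centroids are used so that the result is reproducible.

```python
from math import dist
from statistics import mean


def k_means(points, centroids, max_iterations=20):
    """Cluster points by repeatedly assigning them to the nearest centroid."""
    labels = []
    for _ in range(max_iterations):
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            # Keep the previous centroid if a cluster ends up empty.
            updated.append(tuple(map(mean, zip(*members))) if members else old)
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return labels, centroids


# Hypothetical standardized profiles: (social skills, tenure). Illustrative only.
profiles = [
    (1.8, 1.5), (2.0, 1.2), (1.6, 1.7),               # flagged as prediction outliers
    (-0.3, 0.1), (0.2, -0.4), (-0.5, -0.2), (0.1, 0.3), (-0.2, -0.6),
]
is_prediction_outlier = [True, True, True, False, False, False, False, False]

labels, centroids = k_means(profiles, [profiles[0], profiles[3]])

# Cross-tabulate cluster membership against outlier status.
outliers_per_cluster = {
    k: sum(1 for lab, flag in zip(labels, is_prediction_outlier) if lab == k and flag)
    for k in sorted(set(labels))
}
print(labels, outliers_per_cluster)
```

If the prediction outliers concentrate in one cluster, as they do in this constructed example, the centroid of that cluster describes the profile that may distinguish the outlying subpopulation.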
Model-based approaches of cluster analysis that allow for both confirmatory and exploratory applications are latent class procedures such as latent class cluster analysis (Lawrence & Zyphur, 2011; Wang & Hanges, 2011). Even though the confirmatory procedures among them (i.e., confirmatory latent class analysis) have not yet found wide application in organizational sciences, they allow researchers to confirm their theoretical expectations about specific profiles underlying the outlying cases, for example, in the way explained and demonstrated by Schmiege et al. (2018). At the same time, latent class procedures enable the exploratory analysis of cases in the data set. This helps identify specific outlier profiles through estimating multiple models and selecting the best-fit model when we can make no a priori theoretical assumption for such profiles or if we reject this assumption via confirmative tests (Wang & Hanges, 2011).
Bridging qualitative and quantitative analysis
Another method for analyzing multiple deviant cases is qualitative comparative analysis (QCA; Greckhamer et al., 2008; Rihoux & Ragin, 2008), which integrates features of qualitative and quantitative approaches (Ragin, 1987). The basic intent of a researcher using QCA is to provide exhaustive explanations of a phenomenon of interest without dismissing exceptions (Greckhamer et al., 2008; Nair & Gibbert, 2016; Rihoux & Ragin, 2008). When using QCA for outlier analysis, researchers first identify the causal conditions, which could explain each individual deviant case (Rihoux & Ragin, 2008). After examining all deviant cases individually, the researchers will compile every possible combination of causal conditions (a.k.a., “causal recipes”) thus identified. These causal recipes are compared with each other and simplified (often using QCA freeware packages).
Simultaneously, the researchers examine how these causal recipes apply across multiple deviant cases (Greckhamer et al., 2008; Schneider & Wagemann, 2010). The extent to which a causal recipe is applicable might vary from case to case. Researchers then reflect on whether the causal recipe identified is indeed at play in all the different deviant cases. If it is not, they revise the recipe and conduct a new round of QCA until a causal recipe that explains every case is identified (Legewie, 2017).
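The bookkeeping behind this procedure can be illustrated with a crisp-set truth table, sketched below in plain Python. The case names, the three binary conditions, and the codings are hypothetical, and the inclusion of typical (nondeviant) cases alongside deviant ones is an assumption made so that the outcome varies. The sketch only groups cases by configuration and reports how consistently each configuration coincides with deviance. The Boolean simplification of recipes is left to dedicated QCA software.

```python
from collections import defaultdict

# Hypothetical crisp-set codings (1 = present, 0 = absent). Illustrative only.
# "deviant" marks whether the case is a deviant case (1) or a typical case (0).
cases = {
    "case_a": {"experience": 1, "slack": 1, "small_team": 0, "deviant": 1},
    "case_b": {"experience": 1, "slack": 1, "small_team": 0, "deviant": 1},
    "case_c": {"experience": 1, "slack": 0, "small_team": 0, "deviant": 1},
    "case_d": {"experience": 0, "slack": 1, "small_team": 1, "deviant": 0},
    "case_e": {"experience": 0, "slack": 1, "small_team": 1, "deviant": 1},
    "case_f": {"experience": 0, "slack": 0, "small_team": 1, "deviant": 0},
}
conditions = ["experience", "slack", "small_team"]


def truth_table(cases, conditions, outcome="deviant"):
    """Map each observed configuration to (number of cases, consistency)."""
    rows = defaultdict(list)
    for coding in cases.values():
        configuration = tuple(coding[c] for c in conditions)
        rows[configuration].append(coding[outcome])
    # Consistency: share of cases with this configuration that show the outcome.
    return {
        configuration: (len(outcomes), sum(outcomes) / len(outcomes))
        for configuration, outcomes in rows.items()
    }


table = truth_table(cases, conditions)

# Configurations that coincide with deviance in every observed case.
consistent_recipes = sorted(
    configuration
    for configuration, (_, consistency) in table.items()
    if consistency == 1.0
)
print(consistent_recipes)
```

A configuration with a consistency below 1.0, such as the one shared by case_d and case_e in this example, signals a recipe that does not hold for every case and would prompt the kind of revision and renewed analysis described above.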
Step 3: Determine if Necessary Data Are Available
Within any of the analytical strategies chosen, the outliers’ potential for theory development cannot always be fully realized without further analyzing existing data and possibly collecting additional (i.e., new) data. Analyzing existing data means revisiting the raw data from a theoretical perspective that was not included in the original analysis. If the data needed for the chosen strategy were not collected as part of the original research design, new data must be collected. Outliers thus represent the initial spark for a new cycle of iterative theorizing (Eisenhardt, 1989; Gibbert et al., 2008).
The collection of new data may serve two primary objectives. First, it may involve theoretical sampling of additional cases with the purpose of replicating or complementing the deviant case(s). This strategy is most appropriate if the number of available prediction outliers is too low for sophisticated qualitative or quantitative analyses (see the section on single case analysis for the potential weaknesses of this approach). Second, collecting new data may provide researchers with further information on the cases already contained in the data set and help explain why they differ so much from the other cases. The outlying cases should be probed in depth to develop informed expectations about the potential drivers of deviance so that appropriate new variables (i.e., those that are likely to account for the observed deviance) can be systematically included before gathering additional data for all cases.
As an example of revisiting the raw data already obtained to acquire further information, consider the case of Pisano (1994), discussed earlier. To identify factors that could differ between the outliers and the main observations, he examined the existing data from a new theoretical perspective (i.e., the prior biotechnology process experience of the three outlying projects), noting similarities among the outlying cases, which seemed to corroborate the emerging new theory (Pisano, 1994, p. 98).
Gittell (2001) is the second study we found that effectively used outliers for theory building (apart from Pisano, 1994), and it provides a useful exemplar of collecting new data following the detection of outliers. The author initially tested existing theories on supervisory spans and group processes by conducting a regression analysis of cross-functional groups. The results showed that small supervisory spans improve performance through their positive effects on group processes and broad supervisory spans decrease performance. However, there were two outliers. Narrow spans and low performance characterized the first, whereas broad spans and high levels of performance characterized the second. In a post hoc analysis, Gittell collected additional qualitative data, which pointed to supervisory span being a necessary but not sufficient condition. From analyzing the first outlier, she inferred that while small supervisory spans are normally beneficial for performance, supervisors can also use these small spans in a negative way, thereby hampering performance (Gittell, 2001). Analyzing the second outlier, she concluded that groups with broad spans can achieve strong group process without much supervisory input, at least in the short run, with the help of supporting practices like performance measures that focus on cross-functional accountability, and the selection of group members for team orientation. (Gittell, 2001, p. 479)
Step 4: Develop New Propositions
Theory development using outliers requires researchers to transparently report their sensemaking procedures, including comparisons made with existing theories that confirm or refute the findings and the sequence of analytical steps taken in the research process. This allows the audience to better understand and interpret the findings that led to the development of the new theory. In fact, transparently including the use of outliers as an impetus for theory development constitutes an antidote to problems associated with theorizing after results are known (Bosco et al., 2016). In summary, it is crucial to clearly state that the revised theoretical models imply new propositions that require confirmation with a separate sample.
Discussion
The roadmap we offer for using outliers as theory-building tools starts from the simple fact that prediction outliers are particularly valuable even if not all outliers are theoretically consequential. Depending on their number, prediction outliers accommodate different analytical methods with varying degrees of additional effort and theoretical gains. Our discussion of the different types of outliers and analysis strategies for deviant cases provides a systematic procedure for investigating atypical results to achieve theoretical gains in organization research.
Please note that the disproportionally small numbers of deviant cases (compared to the complete sample) might lead researchers to underestimate their theoretical relevance. Although the number of outlying observations is typically small, the new or refined theory does not only apply to these few cases. Rather, the outliers make the hidden phenomenon most blatantly visible, and it is for this reason that they point researchers toward the various avenues for theory development outlined here.
There may be instances precluding the meaningful incorporation of outlier-based theorizing strategies into an article, even when deviant cases of high theory-building potential are discovered. This may happen, for example, when there are constraints on article length that prevent adding further analyses, researchers are not familiar with the additional methods necessary, or there are insufficient data available to probe the deviant case(s) in detail. Thus, a key question is whether to explore the theory-building potential in the article pertaining to the data set where the outlier occurs or whether it can be “expelled from the current paper to the exclusive challenge of future research” (Shepherd & Suddaby, 2017, p. 22). Empiricists believe that the detector of the outlier also has the right (and perhaps obligation) to offer a first explanation (e.g., Hambrick, 2007). An alternative approach is to acknowledge the existence of outliers and reflect on their theoretical implications (whether in the results section, the limitations section, or as a kind of disclaimer) without actually analyzing them in the article where they first appear (Aguinis et al., 2013; Brutus et al., 2013). This practice effectively delegates their exploration to future researchers. For instance, two articles in our sample transparently identified outliers and acknowledged their value for (future) theory building endeavors (Carney et al., 2011; Ferlie et al., 2005). Either way, outliers must not be ignored.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Swiss National Science Foundation Grant # 100018_134523 awarded to Michael Gibbert.
