Abstract
Organizational researchers have increasingly noted the problems associated with nonnormal dependent variable distributions. Most of this scholarship focuses on variables with positive values and long tails, such as employee performance, capital expenses, and assets. However, scholars frequently test organizational theories using dependent variables that include negative values, which is perhaps most prominently the case as it relates to measures of firm performance. Over the course of two studies, we investigate the implications of such nonnormally distributed dependent variables in organizational research. In Study 1, we examine the nonnormality of firm performance measures and uncover extreme levels of skewness and kurtosis that vary substantially across measures, samples, and years. We also illustrate that many transformations scholars use to address nonnormality are ineffective. In Study 2, we create simulations that seek to mirror these distributions, and we find that such extreme nonnormality reduces efficiency and increases Type II errors with most statistical approaches. Our analyses also reveal the effectiveness of quantile regression when modeling dependent variables that exhibit the nonnormal distributions often found in organizational research.
Introduction
Over the past decade, organizational researchers have highlighted the nonnormal distributions of several prominent variables and the corresponding impact on statistical models (Beck et al., 2014; Becker et al., 2019; O’Boyle & Aguinis, 2012). By and large, this research underscores that the distribution of the dependent variable influences the normality of the error term in a regression, such that nonnormally distributed dependent variables might necessitate unconventional estimation approaches (Cohen et al., 2003; Wooldridge, 2020). As Wooldridge (2020, p. 174) describes, “if the errors are random draws from some other distribution than the normal … this is a potentially serious problem because our inference hinges on being able to obtain critical values or p-values.” Organizational scholars often promote variable transformations (e.g., log, square root) as the primary means to resolve these issues (Crawford et al., 2015; Rönkkö et al., 2022). For the most part, this research on nonnormality focuses on variables with distributions that are right skewed and bounded by zero, such as employee performance, total assets, and revenues (e.g., Becker et al., 2019; Joo et al., 2017).
There exists considerably less knowledge, however, about the effects of nonnormally distributed variables that include negative values. Such nonnormally distributed variables preclude the use of widely used transformations and require approaches that either ignore the distribution of the variable or employ problematic adjustments to eliminate negative values (Becker et al., 2019; Rönkkö et al., 2022). At the same time, organizational scholars often test theories using dependent variables that can take negative values, such as stock market reactions (e.g., Graffin et al., 2016), strategic change (e.g., Zhu et al., 2020), initial public offering underpricing (e.g., Filatotchev & Bishop, 2002), and competitive repertoires (e.g., Connelly et al., 2017).
Although the lack of clarity about nonnormal variables with negative values applies to all content domains, it may have the most profound impact on firm performance, the preeminent dependent variable in strategy research (Henderson et al., 2012). As Gans and Ryall (2017) describe, “Understanding persistent heterogeneity in firm performance is, perhaps, the central objective in the field of strategy.” Uncovering potentially problematic distributional properties may therefore have dramatic implications for an array of theories (e.g., agency theory, resource-based view, and upper echelons) and contexts (e.g., acquisitions, alliances, and boards of directors).
Owing to the ambiguity in the literature about the potentially problematic nature of dependent variables distributed like firm performance, the structure and motivation of our research center on two broad research questions. First, just how nonnormal is firm performance across different measures and time? To investigate this question, in Study 1 we explore the shape of several performance variables over time and across samples. Research indicates that performance measures are associated with skewness and kurtosis (Henderson et al., 2012), but our results extend this knowledge to demonstrate that this nonnormality is dramatic and varies substantively across different measures, samples, and years. Moreover, we find that many transformations used by scholars (e.g., log, winsorizing) to resolve nonnormality are largely ineffective for variables with distributions like firm performance measures.
With this challenge in mind, the results of Study 1 inform our second research question: How might distributions such as those associated with firm performance influence statistical results? In Study 2, we design simulations whereby we generate dependent variables to mimic the distributions of the variables in Study 1. We examine the efficiency of ordinary least squares (OLS) against techniques designed to account for nonnormally distributed dependent variables—OLS with log transformations, winsorizing, inverse hyperbolic sine (IHS) transformations, robust standard errors, and bootstrapping. Ultimately, we find that all of these oft-adopted techniques to address nonnormality produce inefficient estimates and subsequent Type II errors. We also compare these results to those of quantile regression, which some researchers posit may help alleviate potential issues with nonnormal distributions (Li, 2015). Our simulations reveal that quantile regression represents an attractive analytical technique when analyzing distributions associated with firm performance.
Our research offers several contributions. First, we confirm that the distributions of performance measures are indeed nonnormal, but we find they are nonnormal in different ways. We uncover that skewness and kurtosis vary dramatically across measures, subsamples, transformations, and timeframes. We report a wide array of skewness values, including approximately −67 (ROA), 46 (ROE), and 124 (EPS). The values for kurtosis, which receives far less attention than skewness in strategy research, are even more drastic and typically exceed 1,000 (or even 10,000), a far cry from the kurtosis value of 3 for normally distributed variables. We also note that a vast majority of the variance in our variables is within firms over time and not between firms.
Second, we show that many of the popular transformations used by scholars to normalize distributions are ineffective in creating normal distributions, upholding efficient parameter estimation, and eliminating Type II errors. Specifically, in Study 1, we illustrate that the performance measures remain nonnormally distributed even after transformations like logging or winsorizing the variables. Further, in Study 2, we demonstrate that these transformations do not produce notably more efficient estimates, particularly compared against quantile regression. Taken together, our results suggest that the techniques currently used by scholars to resolve nonnormality issues may prove less effective than commonly thought.
Third, we extend prior work on variable distributions by demonstrating how different types of nonnormal distributions undermine causal inference. Our simulations illustrate that dependent variables with nonnormal distributions induce substantial Type II errors (not rejecting the null hypothesis when it should be rejected) when using linear models, which vary based on the type and extent of nonnormality. These challenges remain even when using widespread techniques such as variable transformations, robust standard errors, and bootstrapping. In contrast, we find that quantile regression outperforms all of these remedies in terms of model efficiency and Type II errors.
Nonnormal Distributions of Continuous Variables
Organizational research has historically focused on variables with normal distributions (Crawford et al., 2015). Simply stated, a normal distribution involves observations that (a) cluster around a stable group mean and (b) disperse out into symmetrical tails with an identifiable variance (O’Boyle & Aguinis, 2012). In recent years, though, scholars have highlighted the importance of nonnormal distributions, which are characterized by departures from a stable mean and finite variance. Indeed, management researchers have examined the nonnormality of job performance in various contexts, including employees, firms, politicians, and other stakeholders more broadly (Beck et al., 2014; O’Boyle & Aguinis, 2012). For instance, Joo, Aguinis, and Bradley (2017) study the performance of individuals, providing a taxonomy of these nonnormal distributions. Similarly, Andriani and McKelvey (2009) offer an overview of other social and organizational phenomena that are nonnormally distributed. Together, these studies illustrate that variables used to study organizations are often not normally distributed.
Measuring (Non)Normality: Skewness and Kurtosis
Statisticians use the idea of moments to describe the normality of distributions (Cox, 2010). Management researchers commonly reference the first two moments, the central tendency (i.e., mean) and variability (i.e., standard deviation), of each variable when summarizing statistical results. Skewness and kurtosis, which are based on the third and fourth moments of a variable's distribution, have received relatively less attention. To gain a sense of how skewness and kurtosis are calculated, it is first important to understand that the rth moment of a variable y for a sample of size n can be calculated as:
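In its conventional form, which we assume corresponds to Equation 1 (following the presentation in Cox, 2010), the rth sample moment about the mean can be written as:

```latex
m_r = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^{r},
\qquad \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i
```

Under this definition, the second moment (r = 2) is the variance computed with a divisor of n, and the third and fourth moments (r = 3 and r = 4) feed the skewness and kurtosis statistics, respectively.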
Skewness refers to the symmetry or asymmetry of a distribution (i.e., whether the distribution has a longer tail on one side or the other). Although distributional asymmetry is somewhat of “a vague concept” (Cox, 2010, p. 483), positive (negative) skew occurs when the right (left) side of the distribution has a longer tail. A normal distribution is associated with zero skew, which indicates symmetrical tails on both sides of the distribution. Using some of the most popular measures in psychology, Bishara and Hittner (2017) summarized datasets and estimated that the skewness of these measures ranged from approximately −3 to 3. Adapting Equation 1 for the third moment provides a measure of skewness known as g1:
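Assuming the standard moment-based definitions, with m_r denoting the rth sample moment about the mean, skewness (g1) and the corresponding kurtosis statistic (labeled b2 here as an illustrative symbol) can be sketched as:

```latex
g_1 = \frac{m_3}{m_2^{3/2}},
\qquad
b_2 = \frac{m_4}{m_2^{2}}
```

Both statistics are scale free because each moment is standardized by a power of the second moment. For a normal distribution, g1 equals 0 and b2 equals 3, which matches the benchmark kurtosis value of 3 for normally distributed variables.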
Consequences of Nonnormality
Statisticians note that the distribution of dependent variables is particularly relevant for linear estimators, such as OLS regression, and the numerous techniques that build on its estimation approach (Wooldridge, 2020). One primary assumption of OLS regression—and more sophisticated analytical techniques with similar assumptions, such as two-stage modeling, multilevel models, and panel data models—involves the requirement that the error term follows a normal distribution (Wooldridge, 2020). If the error term is not normally distributed, p-values are not interpretable in the conventional sense because the parameter estimates do not follow a normal sampling distribution. In other words, t-statistics and p-values are intended to provide some insight on the likelihood of a relationship existing in a population but researchers cannot interpret this likelihood if the error term is not normally distributed (Kennedy, 2008; Wooldridge, 2020).
For the most part, research on the impact of nonnormally distributed residuals focuses on the implications for standard errors rather than coefficients. This is because, when sample sizes are sufficiently large, linear modeling is able to estimate relatively unbiased coefficients despite nonnormal disturbances (Wooldridge, 2020). Standard errors, though, are far less accurate and fluctuate from their true values depending on the nature of the disturbance (Kennedy, 2008). Nonnormal residuals induce Type II errors due to inflated standard errors, a concept referred to as inefficiency (Kennedy, 2008; Wooldridge, 2020). Researchers in a variety of disciplines recognize this fact and tailor their estimation procedures accordingly. For instance, work in sociology (Osgood et al., 1988), political science (Katz & King, 1999), psychology (e.g., Cohen et al., 2003), economics (e.g., Leamer, 1983), and several other disciplines has accounted for the fact that residuals are not normally distributed. Scholars usually do this by examining the distribution of their dependent variables, reasoning that if the dependent variable is sufficiently nonnormally distributed, the residuals estimated using linear modeling likely follow suit.
Accounting for Nonnormal Distributions
Given the potential efficiency problems (inflation of standard errors) associated with nonnormal residuals, O’Boyle and Aguinis (2012, p. 105) suggest that “deviations from normality are seen as ‘data problems’ that must be ‘fixed.’” One fix that organizational researchers often employ involves transforming variables to reduce the impact of nonnormality in linear models. These transformations induce “inconstant changes in the units or scale of a variable … with the intent of creating a new distribution that is more normal” (Becker et al., 2019, p. 831).
In their review of such transformations in top organizational journals, Becker et al. (2019) found that approximately 40% of all studies employed at least one transformation. Specifically, they determined that the most popular transformation was the log transformation (comprising almost 90% of all transformations), but they also reported the use of other techniques such as winsorizing, square root transformations, and dropping outliers. Similarly, Rönkkö et al. (2022) reviewed articles published in top-tier journals and found that 66% of empirical articles using regression-type models used at least one transformation, with the log transformation being the most pervasive. Rönkkö et al. (2022) highlighted that scholars may avoid such transformations by using generalized linear models (GLMs) with nonlinear link functions.
In addition to these two recent studies (i.e., Becker et al., 2019; Rönkkö et al., 2022), we examined published research in two top management journals (i.e., Academy of Management Journal or AMJ, and Strategic Management Journal or SMJ) over the course of the past decade (i.e., 2010, 2015, and 2020) to help better understand how scholars tend to address potential issues stemming from nonnormality. The results in Table 1 confirm these trends over time. We observe that a remarkable portion of the empirical articles published in these journals over the past decade rely on the dependent variable or model transformations that we emphasize in this manuscript. Indeed, 70% (over 60%) of the studies published in AMJ (SMJ) account for nonnormality by transforming the model or dependent variables themselves. Similarly, the usage of transformations appears to have increased dramatically over time. To this point, the extent to which scholars have employed transformations in AMJ (SMJ) in 2020 was nearly triple (double) that of 2010, with all but two empirical studies employing transformations in AMJ in 2020.
Content Overview of Nonnormality in Published Research.
Note: We analyze empirical articles that feature at least one variable that could potentially assume a nonnormal distribution. Accordingly, the counts depicted in the rows “Empirical articles in our analyses” do not include qualitative research, meta-analyses, theory-exclusive manuscripts, or editorials. “Articles with transformations” depicts whether an article featured at least one dependent variable or model transformation of the ones we examine in our research, not the total number of transformations. To this end, it is possible for the number of different specific types of transformations to exceed the total number of articles with transformations, as any given article could have featured more than one.
Table 1 also reveals several different techniques scholars have employed to account for nonnormal distributions over the past decade—that is, robust standard errors, bootstrapping, logging, winsorization, quantile regression, and inverse hyperbolic sine. All these procedures are well documented, so our aim here is to provide a brief overview.
Despite the popularity of winsorizing, it is fallible because it replaces the values of observations that may be empirically problematic but conceptually salient. As Aguinis et al. (2013, p. 271) contend, “outliers can also be of substantive interest and studied as unique phenomena that may lead to novel theoretical insights.” To this point, the three most profitable firms in Execucomp in 2020—Apple, Microsoft, and Berkshire Hathaway—had net incomes of US$57.4 billion, US$44.3 billion, and US$42.5 billion, respectively. Winsorizing transforms all these values (as well as those of every other firm in the top 1%) to the net income of the 99th percentile, which is US$7.5 billion. While we use net income in this example, this transformation dilutes top (and bottom) performers regardless of the measure. Although this dilution may not always present problems, this example illustrates the potential challenges that may result from winsorizing.
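The dilution described above can be illustrated with a minimal sketch of winsorizing at the 1st and 99th percentiles. The helper names (`percentile`, `winsorize`), the linear-interpolation percentile rule, and the 297 "ordinary" firm values are hypothetical choices made for illustration; only the three largest values echo the net income figures (in US$ billions) mentioned in the text.

```python
def percentile(sorted_vals, p):
    """Percentile (p in [0, 100]) of a pre-sorted list, using linear interpolation."""
    k = (len(sorted_vals) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def winsorize(values, lower=1.0, upper=99.0):
    """Clamp every value to the [lower, upper] percentile range of the sample."""
    ordered = sorted(values)
    lo_cut = percentile(ordered, lower)
    hi_cut = percentile(ordered, upper)
    return [min(max(v, lo_cut), hi_cut) for v in values]


# Hypothetical sample: 297 ordinary firms (net income from -1.00 to 1.96)
# plus three very large firms, all in US$ billions.
ordinary = [i / 100.0 for i in range(-100, 197)]
giants = [57.4, 44.3, 42.5]
net_income = ordinary + giants

winsorized = winsorize(net_income)
top_three_after = winsorized[-3:]  # the three giants after winsorizing
print(top_three_after)
```

In this toy sample, the three largest firms end up with the same winsorized value, so any differences among the top performers are no longer visible to the estimator.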
Cameron and Trivedi (2010) highlighted several advantages of quantile regression. First, as compared to OLS, quantile regression is less sensitive to outliers, in part because it was created to help account for skewed distributions (Li, 2015). Second, whereas OLS requires a normally distributed error term, quantile regression avoids assumptions about the parametric distribution of regression error terms. Third, while quantile regression permits the examination of the effects of regressors at different points along the distribution of the dependent variable, it also estimates parameters analogous to OLS except reflecting the median effect instead of the mean.
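The intuition behind these advantages can be sketched with the check (pinball) loss that quantile regression minimizes. The example below is an intercept-only illustration on hypothetical data, not a full quantile regression routine; `check_loss` and `best_constant` are names introduced here for illustration.

```python
def check_loss(residual, tau):
    """Check (pinball) loss for one residual at quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def best_constant(values, tau):
    """Observed value that minimizes total check loss, i.e., a sample tau-quantile."""
    return min(values, key=lambda c: sum(check_loss(v - c, tau) for v in values))


# Hypothetical data with a single extreme observation.
y = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 500.0]

median_fit = best_constant(y, 0.5)  # minimizes absolute deviations
mean_fit = sum(y) / len(y)          # minimizes squared deviations (the OLS analogue)

print("median-type fit:", median_fit)
print("mean-type fit:", round(mean_fit, 2))
```

Because the check loss grows linearly rather than quadratically in the residual, the single extreme value pulls the mean far from the bulk of the data while leaving the median-type fit among the typical observations. Setting `tau` to other values (e.g., 0.1 or 0.9) targets other points along the distribution.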
Summary
Almost without exception, research on nonnormality in organizational contexts focuses on variables that either range from zero to positive infinity or require potentially problematic scaling to create positive variables (e.g., Becker et al., 2019; Joo et al., 2017). At the same time, we lack an understanding of the effectiveness of the techniques summarized in this section when nonnormal distributions include negative values. In the following two studies, we therefore investigate nonnormality in the context of firm performance, which is one of the most prominent variables in strategy research and one that scholars have suggested follows a nonnormal distribution (Henderson et al., 2012). In the first study, we examine the nonnormality of various performance measures, and in the second study, we leverage our findings from Study 1 to investigate the efficacy of techniques to address problems with nonnormal distributions.
Study 1: The Shape and Normality of Firm Performance
Strategy research seeks to answer the questions: “What drives the performance of an organization? Why do some organizations succeed while others fail? And what, if anything, can managers actually do about it” (Makadok et al., 2018, p. 1530). Accordingly, a vast body of empirical scholarship examines the antecedents of firm performance, with as much as three-fourths of all published strategy articles featuring it as the dependent variable (Hamann et al., 2013; Richard et al., 2009). Scholars typically operationalize firm performance using accounting (e.g., ROA, ROE, ROS, and margins) and/or stock market (e.g., shareholder returns, market-to-book ratio, and analyst evaluations) indicators (Dalton et al., 1998; Hamann et al., 2013). Researchers have also recently expanded this to include social approval assets and external perceptions of the firm as another measure of success (Gamache & McNamara, 2019; Zavyalova et al., 2016).
Despite the salience of firm performance in strategy research, there remains mixed evidence regarding how firm activities influence subsequent performance. To this point, Makadok et al. (2018, p. 1530) argue that “the strategic management field has proven incredibly inconclusive” about what actually drives performance heterogeneity across research domains. Representing one potential reason for this conflict, Miller et al. (2013) conducted a meta-analysis and reported “remarkably weak” correlations among various measures of firm performance, particularly across categories (i.e., stock market and accounting measures).
In Study 1, we thus examine how the distributional properties of performance variables may help explain these inconsistencies. To this end, we explore how skewness and kurtosis can quantify the nonnormality of a variety of commonly used performance measures, as well as the efficacy of transformations to normalize firm performance variables. Taken together, in Study 1, we examine the (non)normality of firm performance by asking:
RQ1: How do the skewness and kurtosis of firm performance variables vary across measures, subsamples, and transformations?
Methodology of Study 1
Results of Study 1
Table 2, which details our outcomes for the variables included in our sample, illustrates a number of findings that warrant attention. Notably, skewness and kurtosis varied considerably across measures. For example, ROA was negatively skewed, while EPS and M/B were positively skewed. Similarly, the kurtosis for EPS (18,760.59) was nearly 40 times the kurtosis for NI (489.37). Overall, Table 2 illustrates the incredible divergences in skewness and kurtosis across several variables that are intended to reflect the same construct of firm performance.
Distribution Parameters of Performance Measures.
Note: The sample sizes in Table 2 reflect all firm-year observations covered in both Compustat and Execucomp databases for which at least one performance measure was available and are not missing values for the “SPcode” indicator in Execucomp. The reason the sample sizes diverge slightly across the different measures is due to missing data for each respective indicator.
Complementing these statistics, in Figure 1, we present kernel density plots to illustrate the distributions. The horizontal axis for each plot represents the actual range of values in the sample for each variable, and the vertical axis represents the density of observations featuring the value. Figure 1 reveals that ROA and Tobin's Q are negatively skewed and have high peaks, whereas ROE, EPS, and M/B have high peaks and are positively skewed. This figure also shows that NI has the least skewness and one of the lowest levels of kurtosis. In fact, the extreme nonnormality makes the central regions of the distributions in these plots difficult to visualize.

Performance measure distributions: full sample.
We also examined the distributions of the performance variables across different index subsamples—S&P 500, MidCap 400, and SmallCap 600. We found substantial variation in both skewness and kurtosis for the same variable across samples. The distributions of ROE for each subsample are shown in Figure 2. As depicted, the magnitude of kurtosis for ROE ranged from 1,533.83 for MidCap firms to 9,295.37 for the full sample. In addition to magnitude, skewness also varied in direction across samples. ROE skewness was negative (−73.90) for the SmallCap subsample and positive (46.44) for the S&P 500 subsample.

Trends in skewness & kurtosis of ROE across subsamples.
We also show in Table 2 that transforming the variables does not appear to result in normal distributions. For instance, the winsorized values (1st & 99th percentiles) of net income were still associated with nonnormality in terms of both skewness (4.63) and kurtosis (28.14). The same holds true for ROA, whose kurtosis actually increased from 9,476.61 to 37,579.74 after logging the variable and adding the most negative number. These general trends were fairly consistent across all measures of firm performance. Similarly, in unreported supplemental analyses, we found that these patterns persisted even when adjusting the measures for industry performance or standardizing based on industry membership.
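A before-and-after comparison of this kind can be sketched as follows, using moment-based skewness and kurtosis. The data are a hypothetical contaminated-normal sample with negative values, not the Compustat or Execucomp data. The shift constant used before logging (minimum value plus one) is an assumption, because the exact adjustment is not spelled out here.

```python
import math
import random


def central_moment(values, r):
    """rth sample moment about the mean (divisor n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** r for v in values) / len(values)


def skewness(values):
    """Moment-based skewness: m3 / m2^(3/2)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Moment-based kurtosis: m4 / m2^2 (equals 3 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


# Hypothetical sample: mostly N(0, 1) with occasional large, mostly negative shocks.
rng = random.Random(0)
raw = [rng.gauss(0, 1) if rng.random() < 0.98 else rng.gauss(-10, 20)
       for _ in range(5000)]

shift = 1.0 - min(raw)  # assumed constant that makes every logged value >= 1
versions = {
    "raw": raw,
    "shifted log": [math.log(v + shift) for v in raw],
    "IHS": [math.asinh(v) for v in raw],  # inverse hyperbolic sine
}
for name, vals in versions.items():
    print(f"{name:12s} skewness={skewness(vals):8.2f}  kurtosis={kurtosis(vals):10.2f}")
```

Comparing the printed statistics against the normal benchmarks (skewness of 0, kurtosis of 3) gives a quick check on whether a transformation has moved the variable toward normality.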
We also examined skewness and kurtosis for the different performance measures over time. Figure 3 shows how skewness and kurtosis of ROA varied in both magnitude and direction over time. For instance, skewness (kurtosis) for ROA in the full sample was approximately 45 (2,050) in 2007 and −20 (515) in the following year. These findings are particularly compelling when coupled with the intraclass correlation coefficients (ICCs) associated with firms in our study (we do not report these ICCs for the sake of brevity). The ICCs for all of the variables reveal that an overwhelming majority of the variance in each measure exists within firms over time and not between firms. In fact, the average amount of between-firm variance for the variables in our study is approximately 7%, meaning almost all (i.e., about 93%) of the variation we observe is within firms over time.
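The split between between-firm and within-firm variation can be sketched with a simple sum-of-squares decomposition. This is an illustrative stand-in rather than the ICC estimator used in the study, which is not specified here; the function name `between_share` and the toy panel are hypothetical.

```python
def between_share(panel):
    """Share of the total sum of squares that lies between firms (0 to 1).

    `panel` maps each firm identifier to a list of its yearly values.
    """
    all_values = [v for obs in panel.values() for v in obs]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(
        len(obs) * (sum(obs) / len(obs) - grand_mean) ** 2
        for obs in panel.values()
    )
    return ss_between / ss_total


# Hypothetical two-firm, two-year panel.
toy_panel = {"firm_a": [1.0, 3.0], "firm_b": [2.0, 4.0]}
share = between_share(toy_panel)
print(f"between-firm share: {share:.2f}, within-firm share: {1 - share:.2f}")
```

In the toy panel, one fifth of the variation lies between firms and the remainder lies within firms over time, which mirrors (in far milder form) the pattern reported above.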

Trends in skewness & kurtosis of ROA in the full sample across years.
Discussion of Study 1
Research suggests that firm performance variables are not always normally distributed (Henderson et al., 2012; Makino & Chan, 2017), but in Study 1 we extend this scholarship in two crucial ways. First, our findings highlight that there is no simple dichotomy that distinguishes normal from nonnormal distributions. Instead, our results indicate that skewness and kurtosis vary considerably across different performance measures, subsamples, transformations, and time. In fact, some variables are skewed positively while others are skewed negatively. In contrast to the skewness range of approximately −3 to 3 of some of the most prominent psychology measures (Bishara & Hittner, 2017), we found that the absolute value of skewness often exceeded 50 for our performance variables. Similarly, while Bishara and Hittner (2017) note that kurtosis in psychology measures ranged from −1 to 40, we found that it often exceeded 1,000 (and commonly exceeded 10,000). Our results thus suggest that performance measures are not just nonnormal, but exceedingly nonnormal compared to measures in other related fields.
Second, and perhaps more importantly, our findings suggest that many performance measures remain nonnormally distributed even after different transformations that are popular in organizational research. In particular, while winsorizing the variables slightly reduced nonnormality, the values of both skewness and kurtosis for the winsorized versions of each measure still indicate extreme nonnormality (for both ratio and nonratio measures). Moreover, log transformations at times increased the degree of the nonnormality, and some variables even changed the overall direction of their skewness after applying the transformation. These findings illustrate that the transformations organizational scholars routinely employ do not meaningfully address the incredibly nonnormal distributions endemic to popular firm performance measures.
Taken together, the outcomes from Study 1 indicate that there is no single way to describe the nonnormality of the distributions of performance measures, making it challenging to offer specific recommendations or diagnoses about techniques to account for a particular variable. Accordingly, we do not know what our results might mean for researchers who use such nonnormally distributed variables as dependent variables in statistical models. With that in mind, in Study 2 we created simulations to better understand the effectiveness of popular techniques to address nonnormality—OLS, winsorization, log and IHS transformations, robust standard errors, bootstrapping, and quantile regression—in empirical estimation when variables follow distributions like those we uncovered in this study.
Study 2: Modeling Dependent Variables with Extremely Nonnormal Distributions
The results of Study 1 reveal that measures of firm performance are associated with extreme levels of skewness and kurtosis. As we described previously, research in statistics (Cohen et al., 2003) and econometrics (Wooldridge, 2020) has consistently documented that nonnormal disturbances, which stem from nonnormally distributed dependent variables, tend to inflate standard errors and induce inefficiency. Our focus in Study 2, then, involves examining whether this remains the case (or is especially true) with the remarkably nonnormal distributions we uncovered in Study 1, as well as understanding the effectiveness of the techniques organizational scholars typically adopt to address this issue. Stated plainly, we investigate the following research question:
RQ2: How do dependent variables with high levels of skewness and/or kurtosis, as well as techniques to account for nonnormal distributions, influence parameter estimation?
Methodology of Study 2
We designed simulations to help examine the relative efficacy of techniques to account for nonnormally distributed error terms. Our goal with these simulations is to compare parameter estimates—paying careful attention to the standard errors, which provide insight into the efficiency of the estimator—across the transformations, adjustments, and modeling techniques we described previously within the same type of nonnormality.
The purpose of this simulation was to generate a dependent variable, y, that followed a nonnormal distribution and was associated with an independent variable of interest. As we describe later, we then substantively varied the distribution of y to help mimic the nature of nonnormality in the dependent variables from Study 1. Creating y first involved generating a random independent variable, x, which followed a normal distribution with a mean of 0 and a variance of 1. We then used Equation 4 to create y as a function of our independent variable, x, as well as an error term, e, which was generated to reflect different nonnormal distributions:
Intuitively, the key to understanding this data generation process is that the distribution of y is largely a function of the distribution of e, which is consistent with other research on the topic that has employed different distributions to emulate nonnormality (de Winter et al., 2016; Westfall, 2014; Wright & Herrington, 2011).
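The data generation process can be sketched in a few lines. Several pieces are assumptions made for illustration: the functional form y = βx + e with β = 0.05 is inferred from the coefficient values discussed in the results, the sample size and number of iterations are hypothetical, the contaminated-normal error is only a stand-in for the skewed generalized t draws, and the share of iterations with |t| > 1.96 is one plausible reading of a "percent significant" metric.

```python
import math
import random


def ols_slope(x, y):
    """OLS slope and its classical standard error (model includes an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, math.sqrt(sse / (n - 2) / sxx)


def simulate(draw_error, beta=0.05, n=500, reps=200, seed=2024):
    """Return the mean slope estimate and the share of iterations with |t| > 1.96."""
    rng = random.Random(seed)
    slopes, significant = [], 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]       # x ~ N(0, 1)
        y = [beta * xi + draw_error(rng) for xi in x]  # assumed form of the model
        slope, se = ols_slope(x, y)
        slopes.append(slope)
        significant += abs(slope / se) > 1.96          # normal approximation to the t test
    return sum(slopes) / reps, significant / reps


def normal_error(rng):
    return rng.gauss(0, 1)


def heavy_tailed_error(rng):
    # Hypothetical stand-in: mostly N(0, 1) with rare, large, mostly negative shocks.
    return rng.gauss(0, 1) if rng.random() < 0.98 else rng.gauss(-10, 20)


normal_mean, normal_share = simulate(normal_error)
heavy_mean, heavy_share = simulate(heavy_tailed_error)
print(f"normal errors:       mean slope={normal_mean:.3f}, share significant={normal_share:.2f}")
print(f"heavy-tailed errors: mean slope={heavy_mean:.3f}, share significant={heavy_share:.2f}")
```

Because x is held to the same distribution in both runs, any difference in the share of significant estimates reflects the error distribution alone, which is the logic behind comparing estimators within a given condition.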
To generate our nonnormal error terms, we specified several conditions based on the skewed generalized t distribution using the sgt package in R (Davis, 2015). This package is attractive for our purposes because it allows us to simulate distributions with negative and positive values while also varying both skewness and kurtosis (McDonald & Michelfelder, 2017). The command in R includes five parameters: mu (i.e., mean), sigma (i.e., variance), lambda (i.e., skewness), and two parameters (p and q) to denote kurtosis. Because the values for the parameters do not correspond exactly with skewness and kurtosis statistics, we employed an iterative process to identify our final conditions that mirror the distributions from Study 1.
We created six conditions to better understand how dependent variables with different (both normal and nonnormal) distributions might influence OLS regression. In Table 3, we display each condition and its corresponding simulation code. Condition 1 is our baseline, which represents a normal distribution; Condition 2 represents moderate (positive) skew and moderate kurtosis; Condition 3 represents moderate (negative) skew and moderate kurtosis; Condition 4 represents no skew and high kurtosis; Condition 5 represents high (positive) skew and high kurtosis; Condition 6 represents high (negative) skew and high kurtosis. Table 4 displays the descriptive statistics and kernel density plots across all 10,000 iterations for each condition.
Simulation Conditions.
Note: We used the seed number 12345 to establish the starting point for the pseudo-random number generation process.
Summary Statistics and Simulation Results of Ordinary Least Squares (OLS) and Quantile Regression Analysis.
Results of Study 2
Table 4 displays the results of our simulations. We observe a relatively stable B at or near values of 0.050 across Conditions 1–6, which is consistent with the idea that nonnormal residuals do not bias coefficient estimates. The slight exceptions to this involve the log and IHS transformations, which fundamentally change the values of the variables and confer different interpretations of the coefficients. Consequently, the B values deviated from the true value in the log and IHS models across the conditions.
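To see why transformations change the scale and interpretation of a coefficient, consider the two transformations that remain defined for negative values. The Python sketch below implements the inverse hyperbolic sine (IHS) and a simple winsorization. The function names and percentile cutoffs are hypothetical and are not taken from the study. The natural log is omitted because it is undefined for zero and negative values, and the sketch does not guess how such cases were handled.

```python
import math


def ihs(y):
    """Inverse hyperbolic sine, ln(y + sqrt(y^2 + 1)); defined for negative y."""
    return math.asinh(y)


def winsorize(values, lower_pct, upper_pct):
    """Clip values to the given lower and upper percentiles (integers, 0-100)."""
    ordered = sorted(values)
    last = len(ordered) - 1
    low = ordered[(lower_pct * last) // 100]       # floor of the lower rank
    high = ordered[-((-upper_pct * last) // 100)]  # ceiling of the upper rank
    return [min(max(v, low), high) for v in values]
```

IHS is approximately linear near zero and approximately logarithmic for large absolute values, so a slope estimated on the transformed scale is not directly comparable to one estimated on the original scale.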
We assessed efficiency, which may indeed fluctuate due to nonnormality, via SE and PerSig. Because the error term changes in each condition, it is important to compare these efficiency metrics across the models within, and not between, each condition. Our findings for Condition 1, which represents a normal distribution, indicate that estimates from OLS regression in its Original form and virtually all of the other parametric models (i.e., all the transformations and bootstrapping) are approximately equally efficient. Indeed, the PerSig values for each of these models are almost identical at about 60%. By contrast, quantile regression is less efficient, with PerSig ranging between approximately 28% (10th and 90th percentiles) and 43% (50th percentile). Taken together, these results reinforce that OLS is the most efficient estimator when all of its assumptions are upheld (Kennedy, 2008) and that quantile regression underperforms OLS in terms of efficiency under those conditions.
In Conditions 2–6, we introduced error terms with various nonnormal distributions, which are summarized in Table 4. In each of these conditions, we observe inefficiencies associated with the six variants of OLS modeling (Original, Winsorized, Log, IHS, Robust standard errors, and Bootstrap) compared to quantile regression. Specifically, the PerSig values for all nonquantile models are notably lower than their counterparts at various percentiles in quantile regression. For example, in Condition 6, the PerSig of the Original model was 15.22%, whereas nearly all the percentiles in the quantile regression (except the 10th percentile) outpaced the OLS estimation with no transformations or adjustments.
On average across Conditions 2–6, the six different OLS specifications estimated statistically significant parameters in about 25% of the iterations. In contrast, across the same conditions, the three different points of quantile regression produced statistically significant parameters in about 44% of the iterations on average. Given that the B outcomes were consistent across the models and the conditions, this means that researchers seeking to test hypotheses would fail to reject a false null hypothesis (i.e., commit a Type II error) more often with OLS specifications than with quantile regression.
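The PerSig metric can be read as an empirical power estimate: the share of simulated samples in which the slope is statistically significant. The Python sketch below computes such a share for a bivariate OLS model. It is a minimal illustration and not the study's code. The sample size, iteration count, error generators, and use of the normal critical value 1.96 are all assumptions.

```python
import math
import random


def ols_slope_and_se(xs, ys):
    """Bivariate OLS slope and its conventional standard error."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return slope, math.sqrt(sse / (n - 2) / sxx)


def normal_error(rng):
    return rng.gauss(0.0, 1.0)


def skewed_error(rng):
    """Stand-in right-skewed error: centered exponential (mean 0, variance 1)."""
    return rng.expovariate(1.0) - 1.0


def per_sig(n, beta, iterations, draw_error, seed=12345, critical=1.96):
    """Share of iterations in which |slope / SE| exceeds the critical value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(iterations):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [beta * x + draw_error(rng) for x in xs]
        slope, se = ols_slope_and_se(xs, ys)
        if abs(slope / se) > critical:
            hits += 1
    return hits / iterations
```

With a true nonzero slope, one minus this share approximates the Type II error rate for the model and condition being simulated.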
Discussion of Study 2
Nonnormally distributed error terms reduce the efficiency of OLS (and thus other linear models), and our results help to understand the relative effectiveness of techniques used to mitigate this problem. Not surprisingly, OLS performs the most efficiently when the dependent variable follows a normal distribution, such as in our baseline Condition 1. In this condition, transformations did not improve the efficiency of OLS in any meaningful way because the error terms were normally distributed. And although the log and IHS models produced different coefficient estimates owing to the data transformation, the efficiency remained almost identical. At the same time, quantile regression was less efficient than OLS and its related approaches when the residuals adhered to the strict assumptions of normality.
The relative efficacy of OLS (including its several variants that use popular techniques to address nonnormality) and quantile regression inverts when the dependent variable displays different levels of skewness and kurtosis. In Conditions 2 and 3, which feature moderate levels of skewness and kurtosis, quantile regression was more efficient than the several OLS specifications. In Conditions 5 and 6, which feature high levels of skewness and kurtosis, quantile regression was again substantially more efficient than the OLS specifications. Even our bootstrapping models estimated parameters and displayed levels of efficiency much more in line with the OLS models than with quantile regression. In other words, although bootstrapping represented the only estimation technique that did not transform the data or estimates in some way, it did not enhance efficiency to the same degree as did quantile regression.
Of particular note is Condition 4, in which we isolate the effects of kurtosis, a characteristic of nonnormality that receives far less attention in the literature than skewness. In this case, nearly all of the techniques to address the nonnormality outperformed OLS in terms of efficiency, but quantile regression was particularly effective. For the first time to our knowledge, our simulation reveals that nonparametric estimation like quantile regression enhances efficiency in the absence of skewness but in the presence of kurtosis. This finding is particularly important considering that Li's (2015) article on quantile regression references skewness (or a close variant) 23 times but does not mention kurtosis. Moreover, given the extreme kurtosis values uncovered in Study 1, we think this is an important contribution to research examining firm performance as a dependent variable.
Figure 4 illustrates the intuition underlying quantile regression by displaying the standard errors estimated by quantile regression at different percentiles of the dependent variable (across our conditions). In particular, Figure 4 underscores the association between the nature of skewness and corresponding standard errors. For instance, in Condition 2, where the distribution of y has a long tail that extends to the right (Table 4), the standard error increases toward the right tail and decreases on the opposite side of the distribution, where the probability mass is highest. The opposite pattern holds true for Condition 3, where the distribution of y has a long tail that extends to the left. We observe similar patterns for Conditions 5 and 6, which also include positive and negative skewness. Taken together, the lines in Figure 4 demonstrate that quantile regression is most efficient when estimating relationships near the median (or the highest probability mass) value for the dependent variable.
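The link between probability mass and precision can be illustrated with unconditional quantiles, which is a simpler setting than the regressions behind Figure 4. The Python sketch below bootstraps the standard error of sample quantiles from a right-skewed sample. The exponential data, sample size, number of resamples, and function names are assumptions chosen for illustration.

```python
import random
import statistics


def sample_quantile(values, tau):
    """Simple order-statistic estimate of the tau-th quantile."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(tau * len(ordered)))
    return ordered[index]


def bootstrap_quantile_se(values, tau, resamples, rng):
    """Standard deviation of the tau-th quantile across bootstrap resamples."""
    n = len(values)
    estimates = []
    for _ in range(resamples):
        resample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(sample_quantile(resample, tau))
    return statistics.stdev(estimates)


if __name__ == "__main__":
    rng = random.Random(12345)
    data = [rng.expovariate(1.0) for _ in range(500)]  # long right tail
    for tau in (0.1, 0.5, 0.9):
        print(tau, bootstrap_quantile_se(data, tau, 200, rng))
```

In a sample like this, observations are dense near the lower quantiles and sparse in the right tail, so quantile estimates in the tail rest on fewer nearby observations and are less precise.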

Standard errors estimated by quantile regression across percentiles and conditions.
Discussion and Conclusion
Our work offers several contributions to research examining the impact of nonnormality on causal inference as well as empirical research examining the nonnormality of firm performance. We extend work on the distribution of firm performance (Henderson et al., 2012; Makino & Chan, 2017) by showing that both skewness and kurtosis vary dramatically across measures, samples, years, and transformations. Stated simply, there is no single way to describe the nonnormality of firm performance—an idea that to our knowledge has not been discussed in strategy research. Moreover, our results reveal levels of skewness and kurtosis that are well beyond the levels discussed in other disciplines (Bishara & Hittner, 2017; Cohen et al., 2003; Osgood et al., 1988). We also document the ineffectiveness of many techniques that organizational scholars use to resolve such extreme nonnormality. Specifically, our results from Study 1 suggest that even after using log transformations and winsorization, performance variables often remain nonnormally distributed.
To understand how the types of distributions reported in Study 1 influence causal inference, in Study 2 we designed simulations to examine the effectiveness of analytical techniques that scholars use to analyze nonnormally distributed dependent variables. We find that the use of linear regression with such dependent variables results in larger standard errors, which widen confidence intervals and increase p-values and Type II errors. Moreover, these challenges remain even after employing transformations used to address nonnormality. With this in mind, we can only imagine how many manuscripts are in the proverbial “file drawer” as a result of such nonnormal variables. Our results may at least partially explain the conclusion that strategy research remains inconclusive about the drivers of performance heterogeneity (Makadok et al., 2018).
Our results also demonstrate that quantile regression is better suited than OLS to study such nonnormally distributed dependent variables. The outcomes in Study 2 highlighted that quantile regression was more efficient than OLS, winsorization, log transformations, IHS transformations, robust standard errors, and bootstrapping in all conditions in which the dependent variable had a nonnormal distribution. In addition to being more efficient, quantile regression also provides information about the effect of the independent variable at additional points of the distribution of the dependent variable, such as the 10th or 90th percentile. Moreover, researchers can enjoy the benefits of quantile regression without transforming the values, and potentially the meaning, of the dependent variable. This is especially advantageous given that firms in the top or bottom of the tails for any particular performance distribution are likely theoretically interesting.
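For readers less familiar with the estimator, quantile regression chooses the line that minimizes an asymmetric absolute (check) loss on the untransformed dependent variable. The Python sketch below finds that line by brute force for one predictor, relying on the fact that an optimal line passes through at least two observations. It is a teaching sketch with hypothetical function names, suitable only for small samples. It does not compute standard errors, and it is not the software used in Study 2.

```python
def check_loss(residual, tau):
    """Check (pinball) loss: tau * r for r >= 0 and (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def quantile_fit(xs, ys, tau):
    """Exact one-predictor quantile regression by searching all two-point lines."""
    best = None
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            if xs[i] == xs[j]:
                continue  # a vertical line is not a valid candidate
            slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
            intercept = ys[i] - slope * xs[i]
            loss = sum(check_loss(y - intercept - slope * x, tau)
                       for x, y in zip(xs, ys))
            if best is None or loss < best[0]:
                best = (loss, intercept, slope)
    return best[1], best[2]
```

Calling `quantile_fit` with tau set to 0.1, 0.5, and 0.9 returns a separate intercept and slope for each percentile, which is how the estimator describes effects at different points of the distribution. In applied work, an established statistical package would be preferable.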
Taken together, the results of our two studies present a counterintuitive perspective for researchers examining firm performance. Undoubtedly, scholars in strategy are particularly interested in uncovering common characteristics and practices of the most highly performing companies (Henderson et al., 2012). Study 2 suggests that quantile regression is well-suited for analyzing skewed data, but counterintuitively, its advantages relative to OLS are particularly pronounced when analyzing data at the end of the distribution opposite the long tail. In other words, the advantages of quantile regression relative to OLS when examining high-performing firms increase when the data are negatively skewed. Study 1 indicates this is generally the case with strategy research.
Recommendations for Addressing Nonnormal Dependent Variables
The implications from the outcomes in Studies 1 and 2 are clear: Firm performance measures, even when transformed, exhibit remarkable degrees of nonnormality (i.e., Study 1), and the techniques organizational scholars routinely employ to account for nonnormality prove impotent at best and deleterious at worst (i.e., Study 2). What remains less clear is precisely how researchers confronted with nonnormal dependent variables may proceed in their own work. In this section, we integrate research on transformations and nonparametric models with the implications from our studies to provide recommendations for researchers examining dependent variables with extreme nonnormality.
Another primary advantage of quantile regression involves understanding how the effect of an independent variable may vary along different points of the distribution of the dependent variable (Koenker & Hallock, 2001; Li, 2015). To this point, Beck et al. (2014) contend that, at least in their context of employee performance, it is valuable to account for the range of values in case seemingly problematic areas of the distribution are theoretically rich or practically salient. A logical question stemming from this recommendation, however, involves deciding which quantiles to examine and report. Although we reported the results at the 10th, 50th, and 90th percentiles, we could just as easily have examined the 1st and 99th percentiles or any other pairing. With this in mind, in Figure 4, we graph the standard errors from the 10th through the 90th percentile in each of our six conditions. As we demonstrate in Figure 4, the nature of skewness and kurtosis plays a vital role in which percentiles estimate more efficient parameters.
At the same time, we do not have specific recommendations about which particular percentiles to report from quantile regression, as the most appropriate level appears to be contingent on the nature of the data (see Figure 4). This is perhaps why there remains debate in other disciplines about best practices for reporting results from a quantile regression (Petscher & Logan, 2014). We do, however, encourage researchers to consult the standard errors across all of the percentiles, just as we illustrate in Figure 4 (e.g., Andriani & McKelvey, 2009). And as we pointed out in the previous subsection, we suggest that the median (i.e., 50th percentile) should be included in each quantile regression since it represents a natural analog to comparable approaches that focus on average effects.
Future Research
Another area of future research involves extending the data generation process we employed to account for a multilevel context. As we noted previously, we calculated ICCs for each of the performance variables in Table 2 to better understand the amount of between-firm relative to within-firm variance in each variable. These ICCs revealed that almost all of the variance in the constructs is within-firm over time rather than across firms. Future researchers might further examine this issue and determine the multilevel components of skewness and kurtosis. This would, in turn, potentially allow for simulations that investigate multilevel approaches to quantile regression.
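As a point of reference for the between-firm versus within-firm comparison, the Python sketch below computes a one-way ICC(1) for a balanced panel, in which every firm has the same number of yearly observations. The exact ICC formula used for Table 2 is not stated here, so the formula choice, the balanced-panel restriction, and the function name are assumptions.

```python
import statistics


def icc1(groups):
    """One-way ICC(1) for balanced groups (e.g., one list of yearly values per firm)."""
    k = len(groups[0])      # observations per firm
    n_groups = len(groups)  # number of firms
    grand = statistics.fmean([v for g in groups for v in g])
    means = [statistics.fmean(g) for g in groups]
    # Between-firm and within-firm mean squares from a one-way ANOVA.
    msb = k * sum((m - grand) ** 2 for m in means) / (n_groups - 1)
    msw = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g) / (
        n_groups * (k - 1)
    )
    return (msb - msw) / (msb + (k - 1) * msw)
```

Values near 1 indicate that most variance lies between firms, whereas values near or below 0 indicate that most variance lies within firms over time.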
Conclusion
The primary take-away from our research is this: When examining performance as a dependent variable, or another measure that exhibits such extreme nonnormality, research using statistical models that strategy scholars tend to favor may be doomed from the start. Given that scholars rarely empirically test relationships involving firm performance using nonparametric techniques, it is possible that the methodologies commonly employed by strategy scholars have undermined knowledge accumulation and the ability to replicate results. Our research works to right this ship by offering some insight into how to determine when nonnormality represents an issue for causal inference and how to resolve the problems with nonparametric models.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
