Abstract

The good news is clearly that Pornprasertmanit and Little (2012) consider the idea viable that directional dependence can be determined based on indicators of skewness and kurtosis. This idea goes back to Dodge and Rousson (2001; cf. Dodge & Yadegari, 2010) and was introduced to researchers in applied statistics and the social and behavioral sciences by von Eye and DeShon (2008, 2012). The latter authors also proposed new tests and a decision strategy. There is more good news. The discussants propose a more detailed alternative decision strategy that is based on excess kurtosis instead of standard kurtosis, demonstrate the deleterious effects that lurking variables can have, and present a threshold below which it may be pointless to attempt to establish directional dependence. In addition, Pornprasertmanit and Little recommend distinguishing skewness and kurtosis at the population and sample levels. From our perspective, all these are worthwhile elaborations and contributions to the development of an exciting new methodology. We are grateful that the discussants undertook the effort to develop these elaborations and to present them to a readership in the social and behavioral sciences.
In this rejoinder we discuss the decision strategies proposed in our original article and the ones proposed by the authors of the commentary. In addition, we address issues concerning lurking variables. We begin with the decision strategies.
Decision strategies
Unfortunately for the user, the authors of the two articles tend to prefer different decision strategies for the process of determining directional dependence. The strategies rest on different variants of the definition of kurtosis. Conceptually, let the nth central moment of the random variable, X, be $\mu_n = E[(X - E[X])^n]$. Standard kurtosis is then the standardized fourth moment, $\mu_4 / \mu_2^2$, and excess kurtosis is $\mu_4 / \mu_2^2 - 3$.
Taking a more general perspective, we note that the rules proposed by Pornprasertmanit and Little (2012) and von Eye and DeShon (2012) both require that the explanatory variable be more extreme in skewness or kurtosis (or both) than the outcome variable for a decision about directional dependence to be defensible. Whether the reference value for the evaluation of kurtosis as extreme is 0 or 3 is of marginal importance. Some might even argue that a reference point of 0 leads to more straightforward interpretation because the sign of the estimate then indicates the direction of the deviation, but there is a subjective element in such preferences. It is important that the kurtosis values deviate (significantly) from their reference values. This deviation is the key argument in the rules proposed by both author groups.
Naturally, when one form of kurtosis is estimated and the reference point for another is used to make a statistical decision, the decision comes with a high error probability. This is one point of caution that can be derived from the critique presented by Pornprasertmanit and Little (2012). In contrast to what Pornprasertmanit and Little conclude, however, it is not a problem to use 3 as the reference value for kurtosis. It is the use of this reference point for the concept of excess kurtosis that can result in problematic statistical decisions. Conversely, it would be a problem to use the value of 0 as a reference when the decision is based on standard kurtosis. In brief:
for kurtosis, use the value of 3 as reference;
for excess kurtosis, use the value of 0 as reference.
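The distinction between the two kurtosis variants can be illustrated with a short simulation. The following Python sketch is not taken from either article; the function names, sample sizes, and seed are illustrative assumptions. It uses simple moment-based estimators and shows that a normal sample scores close to 3 on standard kurtosis and close to 0 on excess kurtosis, whereas a uniform sample falls below both reference values.

```python
import random


def central_moment(values, order):
    """Return the sample central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def kurtosis(values):
    """Standard kurtosis: fourth central moment over squared variance."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


def excess_kurtosis(values):
    """Excess kurtosis: standard kurtosis shifted by 3."""
    return kurtosis(values) - 3.0


# Fixed seed and sample size are arbitrary choices for reproducibility.
rng = random.Random(2012)
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(50_000)]
uniform_sample = [rng.random() for _ in range(50_000)]

for name, sample in [("normal", normal_sample), ("uniform", uniform_sample)]:
    print(f"{name:8s} kurtosis={kurtosis(sample):.2f} "
          f"excess={excess_kurtosis(sample):.2f}")
```

Comparing an estimate from `kurtosis` against 0, or an estimate from `excess_kurtosis` against 3, would mix the two scales and is the kind of mismatch that leads to erroneous decisions.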
One can require that the ratio of the skewness or kurtosis measures be significantly different from their expectancies. This, however, should be self-understood. One would not base decisions as important as those about causal direction on tiny numerical differences. Pornprasertmanit and Little (2012) deserve credit for making this explicit.
There is one more element of the decision strategies discussed in the commentary that is of relevance and deserves attention. This is the element that the variables under study, X and Y, be significantly related to each other before it is discussed whether one functions as cause and the other as effect variable. It is plausible to require that X and Y be related. Therefore, we agree with the authors of the commentary and add this condition to the decision strategy proposed in our article. In fact, a significant relationship should be a precondition. If it is not fulfilled, all the following steps (calculating and comparing skewness and/or kurtosis and making a decision concerning the direction of effect) are pointless. In the blood level attention deficit hyperactivity disorder (ADHD) example in the von Eye and DeShon (2012) article, both indicators of ADHD were significantly related to blood level. Specifically, the t-value of the regression coefficient of inattentiveness onto blood level (or vice versa) is 2.198 (df = 148; p = 0.029). The corresponding value for hyperactivity is 3.978 (df = 148; p < 0.01). Similarly, the t-value of the regression coefficient of depression onto post-traumatic stress disorder (PTSD; or vice versa) is 8.071 (df = 202; p < 0.01). So, in these examples, the precondition was fulfilled. The same applies to the other examples.
An additional issue raised by Pornprasertmanit and Little concerns the possible patterns of directional dependence. Five possible patterns are:
1. X is the explanatory variable and Y is the outcome variable;
2. Y is the explanatory variable and X is the outcome variable;
3. there is reciprocal dependence;
4. directional dependence is undetermined; and
5. there is no directional dependence.
If the preconditions of lack of normality and a significant relationship between X and Y are fulfilled, at least one of the variables, X or Y, is significantly skewed or significantly extreme in (excess) kurtosis, and one of the discussed decision-making strategies is properly executed, a decision concerning outcome patterns 1 and 2 will be possible (note that tests of the difference between two measures of skewness or two measures of kurtosis still need to be introduced into the context of assessing directional dependence). Lack of power, the suspicion that lurking variables may distort the true relations, sampling problems, invalid data, or other threats can be among the reasons why directional dependence is undetermined (outcome 4). If, however, both X and Y are normally distributed, one plausible decision can be that there is no directional dependence (outcome 5). Even if both variables are skewed or extreme in kurtosis, the methods discussed here may not allow one to determine reciprocal dependence (outcome 3). This can occur, for instance, when the ratio of two measures is close to 1.0 while both measures suggest deviations from normality. Researchers should be open to each of these outcomes. In addition, without theoretical foundation, it is a high-risk enterprise to decide that two variables are in a dependence relationship to each other.
When deviations from normality are examined, there are multiple options to perform the examination, and there are even more possibilities to deviate from normality. Ito (1975, p. 199) states that “assumptions… can be violated in many more ways than they can be satisfied.” Our article as well as the articles by Dodge and Rousson (2000, 2001) and Dodge and Yadegari (2010) focused on deviations in terms of significant skewness and kurtosis because these measures allow one to derive conclusions about the direction of dependence. The general aim in all these articles was to determine whether a variable deviates from normality. Considering the large number of possibilities to deviate from normality (see, e.g., the local deviations that were discussed by von Eye and Gardiner, 2004), it is hard to understand why both skewness and kurtosis should be required to indicate a violation. These measures target different distributional characteristics that can lead to violations of the normality assumption. If just one of these two characteristics suggests significant deviation from normality, one would conclude that a variable is non-normal. Adding that the respective other characteristic also suggests non-normality (or not) will not change the verdict, unless one entertains specific hypotheses about the nature of the deviation. We therefore stand by our recommendation that the decision as to whether the data can come from a normally distributed population can be made when just one of the measures or the composite proposed by D’Agostino and Pearson (1973) suggests significant deviation. Still, one can be open to discussions of differential implications when both skewness and kurtosis indicate non-normality as compared to only one of the measures.
Clearly the information from the respective other measure provides detail. However, we assume that this detail will not change the decision concerning normality, and neither will it change the decision concerning directional dependence. Requiring that both skewness and kurtosis be used to make a decision on directional dependence is, therefore, from our perspective, not necessary.
The final point of discussion of the decision strategy for directional dependence concerns longitudinal designs. The idea behind the methods discussed here is that if X, the explanatory variable, has a non-normal distribution, and the residual term is normally distributed, then Y, the response variable, is a linear convolution of these two distributions. As we say in our article (von Eye & DeShon, 2012), the implication of this idea is that this convolution is necessarily closer to normality than the distribution of the explanatory variable, because adding the normally distributed residual to a non-normally distributed explanatory variable necessarily results in an outcome variable that is closer to the normal distribution than the non-normally distributed independent variable alone. If the explanatory variable is normally distributed to begin with, the convolution will be normal also.
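The convolution argument can be checked numerically. The following Python sketch rests on assumptions that are not part of the original article: an exponentially distributed explanatory variable, a standard normal residual, a slope of 1, and an arbitrary seed. It compares the sample skewness of X and Y and also evaluates the relation skew(Y) = ρ³ · skew(X), which follows for a linear model with a symmetric residual that is independent of X.

```python
import math
import random


def skewness(values):
    """Moment-based sample skewness (third standardized moment)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def pearson_r(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(42)   # arbitrary seed
n = 50_000                # hypothetical sample size
slope = 1.0               # hypothetical regression slope

x = [rng.expovariate(1.0) for _ in range(n)]      # skewed explanatory variable
e = [rng.gauss(0.0, 1.0) for _ in range(n)]       # normal residual
y = [slope * xi + ei for xi, ei in zip(x, e)]     # outcome = convolution

skew_x = skewness(x)
skew_y = skewness(y)
rho = pearson_r(x, y)

print(f"skewness of X: {skew_x:.3f}")
print(f"skewness of Y: {skew_y:.3f}")
print(f"rho^3 * skewness of X: {rho ** 3 * skew_x:.3f}")
```

In this setup the outcome is less skewed than the explanatory variable, which is the asymmetry that the decision strategies exploit. If the roles of X and Y were reversed in the data-generating model, the ordering of the two skewness values would reverse as well.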
Now, in longitudinal research in which reversible phenomena such as learning or drunkenness are studied, the putative outcome variable, Y, may or may not violate normality assumptions when the putative explanatory variable, X, is not active. In contrast to Pornprasertmanit and Little, we do not assume that there must be a cause for non-normality. A good number of variables are non-normal by definition. Examples include disease distribution or the simple variable “Coming late to work.” However, and here we trust Pornprasertmanit and Little agree, when the explanatory variable is active, it has the potential to change the distribution of the outcome variable. Similarly, when the explanatory variable ceases to be active, the distribution of the outcome variable will change back to its original shape (when the effect is reversible).
The idea that the explanatory variable changes the distribution of the outcome variables is not new. It has been discussed in the context of what is known as Granger Causality (Granger, 1969; Seth, 2007). Consider the two measures X1 and X2, with X2 measured at least twice. Now, according to the concept of Granger Causality, “if a signal X1 ‘Granger-causes’ (or ‘G-causes’) a signal X2, then past values of X1 should contain information that helps predict X2 above and beyond the information contained in past values of X2 alone” (Seth, 2007, p. 1667).
In our two scenarios, we propose that this information be made part of the analysis. We propose asking the question whether the convolution displays a different distribution of Y than Y without X. In other words, we add to the decision steps proposed for cross-sectional data the step that answers the question whether Y changes its distributional characteristics when the putative causal agent is active. These changes can occur when the explanatory variable becomes active as well as when it ceases to be active. We continue to believe that examining possible changes in the distribution of the putative outcome variable can provide valuable information about the effects of the explanatory variable.
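The Granger idea quoted above can be made concrete with a small simulation. The following Python sketch is only an illustration under stated assumptions: two simulated zero-mean series, a single lag, regressions without intercept, and hypothetical coefficients of 0.5 and 0.8. It compares a restricted model that predicts X2 from its own past with a full model that adds the past of X1. It does not implement the distributional comparison proposed in this rejoinder.

```python
import random


def ols_one(y, x):
    """Least-squares fit of y = b*x (no intercept); returns (b, RSS)."""
    b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    rss = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    return b, rss


def ols_two(y, x1, x2):
    """Least-squares fit of y = b1*x1 + b2*x2 via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    rss = sum((c - b1 * a - b2 * b) ** 2 for a, b, c in zip(x1, x2, y))
    return (b1, b2), rss


rng = random.Random(7)    # arbitrary seed
T = 2_000                 # hypothetical series length

# Simulated signals: past values of X1 feed into X2 (assumed coefficients).
x1 = [rng.gauss(0.0, 1.0) for _ in range(T)]
x2 = [0.0]
for t in range(1, T):
    x2.append(0.5 * x2[t - 1] + 0.8 * x1[t - 1] + rng.gauss(0.0, 1.0))

target = x2[1:]
past_x2 = x2[:-1]
past_x1 = x1[:-1]

_, rss_restricted = ols_one(target, past_x2)
(_, b_x1), rss_full = ols_two(target, past_x2, past_x1)

# F statistic for adding one regressor to the restricted model.
f_stat = (rss_restricted - rss_full) / (rss_full / (len(target) - 2))

print(f"RSS restricted: {rss_restricted:.1f}")
print(f"RSS full:       {rss_full:.1f}")
print(f"coefficient of past X1: {b_x1:.3f}, F = {f_stat:.1f}")
```

A large F statistic indicates that past values of X1 improve the prediction of X2 beyond what the past of X2 provides alone.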
Lurking variables
If a third variable correlates with both the putative explanatory and outcome variables and, thus, creates a spurious appearance of a relation between these two variables, it is called a lurking variable. In the empirical social and behavioral sciences, the dangers of lurking variables are well known and have often been discussed. One well-known example of the effects of lurking variables concerns mortality and cell phone use. World-wide, those who use cell phones enjoy a higher life expectancy than those who do not use cell phones. We would certainly not conclude that cell phone use affects life expectancy. The lurking variable here may be that medical services are better developed in countries whose inhabitants can afford cell phones.
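A minimal simulation can show how a lurking variable produces a spurious relation. In the following Python sketch, all names, the sample size, and the seed are illustrative assumptions. A variable Z drives both X and Y, which have no direct link. The raw correlation between X and Y is substantial, whereas the partial correlation that controls for Z is close to zero.

```python
import math
import random


def pearson_r(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, controlling for Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(1)    # arbitrary seed
n = 20_000                # hypothetical sample size

z = [rng.gauss(0.0, 1.0) for _ in range(n)]       # lurking variable
x = [zi + rng.gauss(0.0, 1.0) for zi in z]        # depends on Z only
y = [zi + rng.gauss(0.0, 1.0) for zi in z]        # depends on Z only

r_xy = pearson_r(x, y)
r_xy_given_z = partial_r(r_xy, pearson_r(x, z), pearson_r(y, z))

print(f"r(X, Y)     = {r_xy:.3f}")
print(f"r(X, Y | Z) = {r_xy_given_z:.3f}")
```

The partial correlation is available only because Z is observed in this simulation. With an unobserved lurking variable, the raw correlation is all that the researcher sees.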
Pornprasertmanit and Little present interesting and convincing simulation results on the possible effects of lurking variables. The conclusion is clear. Inferring causality comes with the risk of falling for spurious relations. Accuracy and the validity of inferences are in jeopardy if lurking variables are ignored.
Here, however, two questions arise. The first question concerns the historical development of a science and the development and empirical foundation of a theory. The empirical developmental sciences are young, compared to geometry or chemistry. However, there has been an accumulation of empirical results in the developmental sciences as well. Ever-better methods of sampling and data analysis are employed. Theories are reformulated in the face of new evidence. Methods of meta-analysis are being employed. New domains of data are being included. Just think of the contributions that brain imaging has made to the differential diagnosis of ADHD (Nigg et al., 2008; Swanson et al., 2007). Therefore, the probability of falling for lurking variables has decreased—and that from an already low level. The probability certainly is larger than 0; but still, we ask how large (small) it is, and whether this magnitude should prevent researchers from testing causality hypotheses.
Second, the risk of falling for lurking variables is not specific to the methods discussed by von Eye and DeShon (2012), or Dodge and collaborators (Dodge & Rousson, 2000, 2001; Dodge & Yadegari, 2010). Even without the causality context, spurious correlations are dreaded as they can jeopardize the validity of conclusions. Therefore, the second question we ask concerns the specificity of the results presented in the simulation section of the commentary. Unless we have overlooked important elements of the simulations, results seem unspecific to the context of the methods discussed by von Eye and DeShon (2012). Yes, the results are correct and impressive. However, in the context of established developmental science research, the probability of falling for spurious relations in general is low and may decrease even more as the social and behavioral sciences progress and develop.
When it comes to dealing with unobserved heterogeneity (unobserved variables), it should be noted that the notion of lurking variables can be seen as connected to the endogeneity problem as it is discussed in econometrics. This problem is characterized by the phenomenon that the explanatory variables in a model are themselves dependent on the outcome variables (cf. the reciprocity outcome, above). Variables (or parameters) are considered endogenous when a correlation exists between the parameter of the variable and the residual term. Reasons for this correlation include, among others, measurement error and unobserved (that is, lurking) variables. What is interesting about the discussion of endogeneity is that many scholars see no need to model a lurking variable. Instead, what is discussed as a strategy is cutting off the influence of the lurking variable by blocking backdoor paths (Pearl’s [2000] criterion in his graphic modeling approach), including one or more instruments to causally identify the effect of interest (cf. instrumental variable regression), and Heckman (1979) selection correction. Any of these can be done without modeling all of the potential lurking variables (for a discussion of the endogeneity problem in a developmental context, see, e.g., Duncan, Magnuson, & Ludwig, 2004).
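The instrumental variable strategy mentioned above can be sketched with simulated data. The following Python example is an illustration under strong assumptions, none of which come from the cited sources: a single instrument that is valid by construction, a hypothetical true effect of 2.0, and an unobserved variable U that affects both X and Y. It contrasts the ordinary least-squares slope with the simple ratio (Wald-type) instrumental variable estimate, cov(Z, Y) / cov(Z, X).

```python
import random


def covariance(a, b):
    """Sample covariance (divisor n) of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / n


rng = random.Random(3)    # arbitrary seed
n = 20_000                # hypothetical sample size
true_effect = 2.0         # hypothetical causal effect of X on Y

u = [rng.gauss(0.0, 1.0) for _ in range(n)]   # unobserved lurking variable
z = [rng.gauss(0.0, 1.0) for _ in range(n)]   # instrument: affects Y only via X
x = [zi + ui + rng.gauss(0.0, 1.0) for zi, ui in zip(z, u)]
y = [true_effect * xi + 3.0 * ui + rng.gauss(0.0, 1.0)
     for xi, ui in zip(x, u)]

# OLS ignores U and is biased because X correlates with the residual.
ols_slope = covariance(x, y) / covariance(x, x)

# The ratio estimator uses only the variation in X that is induced by Z.
iv_slope = covariance(z, y) / covariance(z, x)

print(f"true effect: {true_effect:.2f}")
print(f"OLS slope:   {ols_slope:.2f}")
print(f"IV slope:    {iv_slope:.2f}")
```

The lurking variable U is never modeled in the instrumental variable estimate. In empirical work, the validity of the instrument cannot be taken for granted in the way it is in this simulation.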
The last word on how to deal with unknown lurking variables may not have been spoken. Therefore, we look forward to learning about new approaches that have the potential to help researchers in the design phase, data collection, data analysis, and interpretation phases of empirical research.
What stands, as we said at the beginning of the rejoinder, is that, as far as we are able to see, Pornprasertmanit and Little and we agree on the potential of the methods discussed in our original article. In particular, in developmental research and in research that develops methods of intervention, the possibility of testing hypotheses that are compatible with directional dependence can be priceless.
Footnotes
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
