Abstract
This article revisits how the end points of plotted line segments should be selected when graphing interactions involving a continuous target predictor variable. Under the standard approach, end points are chosen at ±1 or ±2 standard deviations from the target predictor mean. However, when the target predictor and moderator are correlated or the conditional variance of the target predictor depends on the moderator variable value, these end points may reside in regions with little or no supporting data, encouraging potentially erroneous interpretations of the interaction in particular and of patterns in the data in general. Tumble graphs are introduced to minimize the likelihood of these problems. The utility of the Tumble graph over the standard approach is demonstrated with a real data example.
Statistical two-way interactions imply that the relationship between two variables varies as a function of a third variable. It is difficult to overstate the importance of interactions for theoretical and applied research. For theory development and testing, an interaction may suggest a boundary condition for a theory’s applicability; in applied settings, an interaction may qualify the generality of an accepted predictive relationship. More broadly, critical questions about the external validity or generalizability of research results are inherently questions about interactions (Cook & Campbell, 1979). This article focuses on the common Linear × Linear interaction based on a moderated multiple regression model of the form,
\[ Y_i = \beta_0 + \beta_1 X_i + \beta_2 M_i + \beta_3 X_i M_i + e_i \tag{1} \]
where Y, X, and M are an outcome, a continuous target predictor, and a continuous or categorical moderator variable, respectively, the βs are model parameters to be estimated, and \(e_i\) is a residual.
One purpose of this article is to highlight two conditions where the standard approach to constructing interaction graphs may provide misleading information. These conditions are when the target predictor and moderator are correlated, which often occurs in nonexperimental research, and when the conditional variance of the target predictor depends on the moderator variable value. When either of these conditions is present and the standard approach to constructing interaction graphs is followed, as demonstrated below, portions of the plotted line segments (i.e., conditional associations between Y and X at given values of M) may venture into regions with little or no supporting data. Although interpolation is considered reasonable in regression analysis, extrapolation is not (e.g., Tufte, 1974). Of critical importance, unwitting researchers may attempt to interpret or give emphasis to such plotted data values. In response to this concern, the second purpose of this article is to introduce a new approach for constructing interaction graphs called a Tumble graph that minimizes the likelihood of plotting extrapolated values.
This article is structured as follows: First, the standard approach for constructing interaction graphs with a continuous target predictor is briefly reviewed and the extrapolation problem under the two conditions noted above is demonstrated. The subsequent section introduces the Tumble graph and provides guidance for its construction. A real-data example follows comparing the standard interaction graph with the Tumble graph in the case of a continuous moderator variable. The concluding section provides further comments on the construction and use of Tumble graphs.
Standard Interaction Graphs and Potential Extrapolation
Graphs play an important role in scientific communication (Wilkinson, 2012); in the current context, two-dimensional graphs are commonly used to represent and communicate the nature of an interactive effect. A typical format displays the outcome variable Y on the vertical axis, the target predictor variable X on the horizontal axis, and uses distinct line segments in this space to represent the relationship between Y and X for different values of moderator variable M. For a continuous moderator M, these line segments represent slices of an implied smooth three-dimensional regression surface at specified values of M; for a categorical moderator M, these line segments represent the regression line within each moderator variable category. To obtain these line segments, consider that the regression equation relating outcome Y and target predictor X is
\[ Y_i = (\beta_0 + \beta_2 M_i) + (\beta_1 + \beta_3 M_i) X_i + e_i \tag{2} \]
for a given value of M. In Equation 2, the quantity β1 + β3Mi is referred to as the simple slope, which indicates that the relationship between Y and X changes by β3 for a unit increase in M. Importantly, simple slopes of the displayed line segments will differ across values of M only when β3 ≠ 0 (i.e., when an interactive effect is present) and graphs representing these differing simple slopes convey the nature of an interactive effect.
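The role of the simple slope can be illustrated with a minimal Python sketch. It assumes the usual moderated regression form Y = b0 + b1·X + b2·M + b3·X·M; the function names and the coefficient values are hypothetical and chosen only for illustration.

```python
def predict_y(b0, b1, b2, b3, x, m):
    """Predicted outcome from the assumed model Y = b0 + b1*X + b2*M + b3*X*M."""
    return b0 + b1 * x + b2 * m + b3 * x * m


def simple_slope(b1, b3, m):
    """Slope of Y on X at a fixed moderator value m: b1 + b3*m."""
    return b1 + b3 * m


# Hypothetical coefficients, not estimates from any data set.
b0, b1, b2, b3 = 2.0, 0.5, -0.3, -0.4

slope_low = simple_slope(b1, b3, m=-1.0)   # moderator one unit below zero
slope_high = simple_slope(b1, b3, m=1.0)   # moderator one unit above zero

# The simple slope changes by b3 per unit of M, so a two-unit gap in M
# produces a slope difference of 2*b3.
slope_gap = slope_high - slope_low
```

When `b3` is zero, `simple_slope` returns `b1` for every value of `m`, which corresponds to parallel line segments and no interactive effect.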
A practical decision when constructing an interaction graph involves the choice of X and M values that are used in either Equation 1 or 2 when computing and displaying the line segments and their simple slopes. These choices determine whether or not the plotted line segments include extrapolated values. As Aiken et al. (2012, p. 116) note, researchers should ensure that sufficient data exist to support each of the plotted line segments in an interaction graph; failing to do so may plot some line segments in sparse or empty regions of the predictor variable space.
For categorical M, any coding scheme (e.g., Cohen et al., 2003, chap. 8) for representing the categories in the moderated multiple regression could be used without concern. For continuous M, Aiken et al. (2012) echo earlier recommendations (e.g., Cohen & Cohen, 1975) to plot line segments that are not too extreme in the observed moderator variable distribution (e.g., ±1 standard deviation [SD] from the sample mean of M). Although many researchers adhere to this recommendation, other researchers have used values such as ±2 SDs from the sample mean or the minimum or maximum values in the sample for M, both of which may exacerbate potential extrapolation. Importantly, and in spite of the apparent arbitrariness in the chosen values of M, the difference in the simple slopes of the plotted line segments remains proportional to β3, which retains information on the direction of the interactive effect. For example, if line segments are plotted at ±1 SD from the mean of M, the difference in the plotted line segment slopes will be
\[ [\beta_1 + \beta_3(\bar{M} + SD_M)] - [\beta_1 + \beta_3(\bar{M} - SD_M)] = 2\beta_3 SD_M. \]
For a continuous target predictor X, researchers typically follow recommendations to cap the end points of the plotted line segments using values of X at ±1 SD from the sample mean (cf. Aiken & West, 1991; Cohen et al., 2003), presumably to minimize the likelihood that the plotted line segments venture into sparse or empty data regions. However, other researchers may use values of X such as ±2 SDs from the sample mean or at the minimum and maximum values of X (see, e.g., figure 1 of Hayes & Matthes, 2009). These latter options may exacerbate the problem of plotting extrapolated data points.

Figure 1. Scatter plots of 1,000 random standard normal variates for target predictor X and moderator variable M as a function of the correlation between X and M (left panels) and as a function of the magnitude of the heterogeneity in X given M (right panels).
In the remainder, the term standard interaction graph is used to describe an interaction graph constructed following the recommendations noted above (i.e., using values of X at ±1 SD from the mean of X and for continuous M using values of M at ±1 SD from the mean of M). Although this approach increases the likelihood of plotting line segments that on average are supported by actual data values, the end points of the plotted line segments may still venture into regions with little or no supporting data. Consider that under the standard approach, the specified values of X and continuous M are based on each variable’s univariate distribution. To ensure proper support from the data for an interaction graph, the joint distribution of X and M should be inspected or, equivalently, the conditional distribution of X given M along with the univariate distribution of M.
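One way to inspect the joint distribution is to count how many observations lie near each of the four points where the standard approach places line segment end points. The following Python sketch does this for simulated data; the simulation design, the function names, and the neighborhood radius of 0.5 are illustrative assumptions rather than part of the recommended procedure.

```python
import math
import random


def simulate_xm(n, rho, seed=0):
    """Draw n (x, m) pairs from a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        m = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        x = rho * m + math.sqrt(1.0 - rho ** 2) * noise
        pairs.append((x, m))
    return pairs


def count_near(pairs, x0, m0, radius=0.5):
    """Count observations within `radius` (Euclidean distance) of (x0, m0)."""
    return sum(1 for x, m in pairs if math.hypot(x - x0, m - m0) <= radius)


def corner_support(pairs, radius=0.5):
    """Support for the four end points placed at +/-1 SD on X and on M.

    The population SDs are 1 here, so the corners sit at (+/-1, +/-1).
    """
    return {
        (x0, m0): count_near(pairs, x0, m0, radius)
        for x0 in (-1.0, 1.0)
        for m0 in (-1.0, 1.0)
    }


uncorrelated = corner_support(simulate_xm(1000, rho=0.0))
correlated = corner_support(simulate_xm(1000, rho=0.5))
```

With a positive correlation, the corners that pair a high value on one variable with a low value on the other should receive noticeably fewer observations than the other two corners, which is the sparseness that univariate summaries cannot reveal.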
As noted in the introductory section, two features that may generate sparse bivariate data regions are when the target predictor and moderator are correlated and when the conditional variance of the target predictor depends on the moderator variable value. The scatter plots displayed in Figure 1 illustrate these features. In each plot, target predictor and moderator variable data are drawn from standard normal univariate distributions with reference lines drawn at ±1 SD about their respective means. The left panels vary the correlation between the two variables (i.e., ρ = .00 for the top panel and ρ = .50 for the bottom panel). The right panels vary the conditional variance of the target predictor, given the moderator variable value (i.e.,
Tumble Graph Design Principle and Construction
Given the tendency of researchers to interpret the plotted predicted values in isolation as well as patterns in the plotted predicted values (cf. Rosnow & Rosenthal, 1989, 1991, 1995), the likelihood of plotting extrapolated predicted outcome variable values should be minimized. The key design principle underlying Tumble graphs is that the target predictor and moderator variable values used for creating interaction graphs should not be chosen blindly using default univariate values (e.g., at ±1 SD from a sample mean), as this may place the plotted line segment end points in sparse or empty data regions. Instead, the univariate distribution of M and the conditional distribution of X given M should be inspected to ensure that the selected data values reside in denser data regions. The following discussion provides a two-step process for choosing the values of X and M to construct a Tumble graph. These chosen values are then plugged into the estimated moderated multiple regression equation (e.g., Equation 1 or 2) to yield predicted values for the outcome variable and the results are then graphed.
Step 1: Choose Moderator Variable Values
For a categorical moderator variable, there are no values to choose; simply use the values indicating category membership with the categorical variable coding scheme used for the model in Equation 1 (see, e.g., Cohen et al., 2003; Keppel & Zedeck, 1989). For a continuous moderator variable, Cohen and Cohen’s (1975) recommendation of choosing values of M at ±1 SD from its sample mean is reasonable, provided that the univariate distribution of M is reasonably symmetric; if the distribution of M is highly skewed, however, alternative choices such as the first and third quartile values may be more appropriate. For example, in a distribution with a strong positive skew, a data value at 1 SD below the mean may be less than the smallest observed value.
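Step 1 for a continuous moderator might be sketched as follows. The rule used here to detect a problematic distribution (falling back to quartiles whenever a ±1 SD value lies outside the observed range) is an assumption made for illustration; the text recommends judging skewness but does not prescribe a specific test. The function name is hypothetical.

```python
import statistics


def choose_moderator_values(m_values):
    """Return (low, high) moderator values for plotting.

    Uses mean -/+ 1 SD when both values fall inside the observed range;
    otherwise falls back to the first and third quartiles.
    """
    mean = statistics.mean(m_values)
    sd = statistics.stdev(m_values)
    low, high = mean - sd, mean + sd
    if low < min(m_values) or high > max(m_values):
        # statistics.quantiles with n=4 returns the three quartile cut points.
        q1, _, q3 = statistics.quantiles(m_values, n=4)
        return q1, q3
    return low, high
```

For a strongly positively skewed sample such as `[1, 1, 1, 2, 2, 2, 3, 3, 4, 40]`, the value 1 SD below the mean is negative and therefore below the smallest observation, so the sketch returns the quartiles instead.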
Step 2: Compute Two Target Predictor Values for Each Chosen Moderator Variable Value
For categorical moderator variables, researchers will likely have a distribution of scores on the target predictor X within each category of M, and this feature provides several options. For example, the two values of X can be selected using the same quantities computed separately within each category (e.g., using the first and third quartile values in a category or at ±1 within-category SD from the category mean for X). If the target variable variance is reasonably homogeneous across levels of the categorical moderator variable (e.g., the ratio of the largest to the smallest variance being less than 2), the SD can be based on the square root of the pooled (within-moderator level) target variable variances.
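For a categorical moderator, the ±1 within-category SD option could be sketched in Python as below. The function name, the dictionary-based input format, and the default threshold argument are illustrative assumptions; the threshold of 2 follows the variance ratio mentioned above. The sketch assumes every category has at least two observations and a nonzero variance.

```python
import math
import statistics


def within_category_endpoints(x_by_category, max_variance_ratio=2.0):
    """Return {category: (low_x, high_x)} at +/-1 SD around each category mean.

    If the largest-to-smallest variance ratio is below `max_variance_ratio`,
    a pooled within-category SD is used; otherwise each category uses its own SD.
    """
    variances = {c: statistics.variance(xs) for c, xs in x_by_category.items()}
    ratio = max(variances.values()) / min(variances.values())

    if ratio < max_variance_ratio:
        # Pooled variance: within-category sums of squares over pooled df.
        ss = sum(variances[c] * (len(xs) - 1) for c, xs in x_by_category.items())
        df = sum(len(xs) - 1 for xs in x_by_category.values())
        pooled_sd = math.sqrt(ss / df)
        sds = {c: pooled_sd for c in x_by_category}
    else:
        sds = {c: math.sqrt(v) for c, v in variances.items()}

    return {
        c: (statistics.mean(xs) - sds[c], statistics.mean(xs) + sds[c])
        for c, xs in x_by_category.items()
    }
```

Because the end points are centered on each category mean, categories that differ in their typical level of X yield line segments that cover different portions of the horizontal axis.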
For continuous moderator variables, the options are more limited, given that there may be no or few data values of X at a given value of M. If the relationship between X and M is approximately linear and the residuals of X about the regression line of X on M are reasonably symmetric and homoscedastic, then researchers may estimate quantities in the conditional distribution of X given M (e.g., at ±1 residual SD around the predicted value of X given M). In particular, regress the target predictor X on the moderator variable M to obtain a regression equation, that is,
\[ \hat{X}_i = \hat{\gamma}_0 + \hat{\gamma}_1 M_i \tag{3} \]
Use Equation 3 to predict the conditional mean values of the target predictor X for each of the moderator variable values chosen in Step 1. The square root of the mean square residual from the analysis of variance summary table for this model is an estimate of the SD of the residuals around the predicted values; this value is added to and subtracted from each predicted target variable value, resulting in two values of the target variable X for each chosen moderator variable value. The following example demonstrates this approach.
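Step 2 for a continuous moderator can be sketched in Python as follows, assuming a simple linear regression of X on M with no additional predictors. The function names and the small data set are hypothetical and serve only to show the computation.

```python
import math
import statistics


def fit_x_on_m(x_values, m_values):
    """Ordinary least squares of X on M; returns (intercept, slope, residual_sd)."""
    n = len(x_values)
    mean_x = statistics.mean(x_values)
    mean_m = statistics.mean(m_values)
    sxm = sum((m - mean_m) * (x - mean_x) for x, m in zip(x_values, m_values))
    smm = sum((m - mean_m) ** 2 for m in m_values)
    slope = sxm / smm
    intercept = mean_x - slope * mean_m
    # Residual SD = square root of the mean square residual (df = n - 2).
    ss_res = sum((x - (intercept + slope * m)) ** 2
                 for x, m in zip(x_values, m_values))
    residual_sd = math.sqrt(ss_res / (n - 2))
    return intercept, slope, residual_sd


def tumble_endpoints(x_values, m_values, chosen_m):
    """For each chosen moderator value, return the two X end points:
    the predicted X given M, minus and plus one residual SD."""
    intercept, slope, residual_sd = fit_x_on_m(x_values, m_values)
    endpoints = {}
    for m in chosen_m:
        x_hat = intercept + slope * m
        endpoints[m] = (x_hat - residual_sd, x_hat + residual_sd)
    return endpoints


# Tiny made-up data set, for illustration only.
m_obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x_obs = [1.2, 1.9, 3.2, 3.8, 5.1, 5.9]
ends = tumble_endpoints(x_obs, m_obs, chosen_m=[2.0, 5.0])
```

Every line segment has the same horizontal width (two residual SDs), but its center shifts with the predicted value of X at the chosen moderator value. The resulting pairs of X and M values would then be entered into the estimated moderated regression equation to obtain the plotted outcome values.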
Example: Continuous Target Predictor and Moderator
Bauer and Curran (2005) used a sample of N = 956 children from the 1990 assessment of the Children of the National Longitudinal Survey of Youth to demonstrate the Johnson-Neyman approach to probing Continuous × Continuous interactions in a moderated multiple regression analysis. In their example, a child’s level of hyperactive behavior significantly moderated the relationship between a child’s level of antisocial behavior and math ability, when controlling for age, gender, grade level, and minority status. These data are used to demonstrate the Tumble graph. Before doing so, however, the model in Equation 1 must be extended to include the additional demographic predictors; Table 1 provides the estimated model parameters. In computing the predicted math ability scores for the standard interaction and Tumble graphs, these additional predictors were evaluated at their sample means. Similarly, in Step 2 of the proposed Tumble graph construction procedure, the model in Equation 3 was also extended to include the additional predictors; these predictors were evaluated at their sample means when predicting the antisocial behavior ratings.
Table 1. Model Parameters From the Regression of Math Ability on Antisocial Behavior, Hyperactive Behavior, Their Interaction, and Several Demographic Predictors
Note. N = 956. Model R2 = .69, F(7, 948) = 299.28, p < .001. Predictor variables are not mean centered.
The upper panel of Figure 2 displays the standard interaction graph. The difference in the plotted slopes captures the descriptive nature of the interactive effect; the relationship between antisocial behavior and math ability is slightly positive for low levels of hyperactive behavior but smoothly becomes negative for higher levels of hyperactive behavior. Attempts to interpret or compare two of the four plotted line segment end points, however, are challenged by relatively sparse data regions due to the positive correlation between the ratings of antisocial and hyperactive behaviors, r = .51, 95% CI [.46, .55]. Figure 3 displays a jittered scatter plot for these ratings where the four reference line intersections represent the values used to create the standard interaction graph. The upper left intersection (i.e., high hyperactive behavior and low antisocial behavior) and the lower right intersection (i.e., low hyperactive behavior and high antisocial behavior) lie in relatively sparse data regions. Also plotted in Figure 3, as triangles, are the predictor variable values used for the Tumble graph discussed below. These latter values are situated in slightly denser data regions.

Figure 2. A standard interaction (upper panel) and Tumble (lower panel) graph of the moderated multiple regression of math ability on ratings of antisocial behavior, hyperactive behavior, and their interaction controlling for age, gender, grade level, and minority status.

Figure 3. Jittered scatter plot of ratings of antisocial and hyperactive behavior (with reference lines representing 1 standard deviation above and below the sample means; circles and triangles represent predictor variable input values for constructing standard interaction and Tumble graphs, respectively).
The lower panel of Figure 2 provides the Tumble graph. Note that the angle between the two plotted lines is identical in the standard interaction and Tumble graphs, which maintains this critical feature of graphs of interactions. The plotted lines in the Tumble graph, however, do not traverse into (relatively) sparse regions of the antisocial behavior rating distribution, given the chosen values in the hyperactive behavior rating distribution. Thus, a viewer’s eye is not drawn to these end point values where no (or very little) data exist. Although having little to do with the interpretation of the interaction, the Tumble graph also provides a reminder that children with higher hyperactive behavior ratings tend to have higher antisocial behavior ratings (i.e., based on a comparison of line midpoints projected onto the horizontal axis). Thus, the Tumble graph represents not only the interactive effect appropriately but also other potentially interesting or informative aspects of these data.
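The reason the angle between the lines is preserved is that each segment's slope depends only on the moderator value, not on which X end points are used. A minimal Python sketch, assuming a model of the form Y = b0 + b1·X + b2·M + b3·X·M with hypothetical coefficients and end points (not the estimates from the example), makes this concrete.

```python
def segment(b0, b1, b2, b3, m, x_low, x_high):
    """End points (x, predicted y) of the line segment for moderator value m."""
    def predict(x):
        return b0 + b1 * x + b2 * m + b3 * x * m
    return (x_low, predict(x_low)), (x_high, predict(x_high))


def slope(seg):
    """Slope of a segment given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = seg
    return (y2 - y1) / (x2 - x1)


# Hypothetical coefficients and end points, for illustration only.
coefs = dict(b0=10.0, b1=0.3, b2=-0.5, b3=-0.2)

# Standard graph: the same X end points are used for every moderator value.
standard_high_m = segment(m=3.0, x_low=0.5, x_high=2.5, **coefs)
# Tumble graph: X end points shifted toward where data exist for this M.
tumble_high_m = segment(m=3.0, x_low=1.4, x_high=3.0, **coefs)
```

Both segments have slope `b1 + b3 * m`; only their horizontal position and length differ between the two graph types.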
Concluding Remarks
Many researchers, when constructing the standard interaction graph, place tick marks on the horizontal axis, often labeled as “High on Target Predictor” and “Low on Target Predictor” (e.g., Dawson, 2014). This practice is awkward with Tumble graphs, as there will typically be more than two values for the target predictor variable used to construct the graph. A more informative practice, for both the standard interaction and Tumble graph, is to place the actual target predictor values on the horizontal axis as displayed in Figure 2. In this example, note that the midpoint of the antisocial behavior rating continuum is 3 (i.e., with possible values ranging from 0 to 6) and that most of the plotted line segments fall below this midpoint value. Thus, even in the case of “High Antisocial Behavior,” the absolute level is not very high; in the case of “Low Antisocial Behavior,” the absolute level is very low. This information would not be apparent if only the labels “High” and “Low” on the target predictor were placed on the horizontal axis. As another example, consider target predictor values based on multi-item scale scores, common in the social and behavioral sciences. Although there is no important mathematical difference between scale scores based on the mean and sum of scale item responses, the range of the response option values bounds the former and therefore provides some interpretational utility when the number of response options is identical across items (cf. McDonald, 1999, p. 49).
Extensions to graphing interactions in models that include control variables (i.e., additional predictor variables not involved in the interaction) are straightforward with a Tumble graph as demonstrated in the example. To the extent that these control variables explain incremental variance in the target predictor over and above the moderator variable, the residual SD for the regression in Step 2 of the proposed procedure will be smaller and the plotted line segments will be shorter, further minimizing potential extrapolation. Extensions to models with higher order interactions or nonlinear (e.g., quadratic) effects are straightforward in principle with a Tumble graph, but perhaps tedious in practice. With higher order interactions, a higher dimensional predictor variable space must be inspected to verify that the plotted data points are not too extreme. Nonlinear effects create no additional challenges unless the nonlinearity occurs in the association between the target predictor and moderator variables; in such cases, a nonlinear term should be added to the regression model in Step 2 of the proposed procedure.
The construction of Tumble graphs for interactions involving multilevel models and data structures (i.e., within-level and cross-level interactions) should also be straightforward. In these models, the fixed effects are the parameters of interest. Because the selection of the moderator (Step 1) and target predictor values (Step 2) is descriptive and not inferential, a simple approximation would be to ignore the nesting structure when selecting and computing these values (cf. Bauer & Curran, 2005, pp. 387–388). Future research should explore the appropriateness of this recommendation.
As Tumble graphs are a new approach for representing interactive effects, more research is needed to assess and improve their efficacy. For example, research is needed to assess whether the varying end points used in a Tumble graph influence the perception of the difference between the slopes of the plotted line segments and therefore the interpretation of the interactive effect (i.e., relative to a standard interaction graph). Methodologically, research is needed to develop alternatives to the use of the residual SD in Step 2 of the Tumble graph construction procedure for a continuous moderator when there is evidence of heteroscedastic or strongly skewed residuals in the Step 2 regression analysis.
In closing, I note that standard interaction graphs and Tumble graphs are designed to represent and communicate the nature of interactive effects assuming that the structural relationships among the variables in the statistical model are correctly specified (i.e., linear). Diagnostic analyses (e.g., Cohen et al., 2003, chap. 4) may provide information on the appropriateness of this assumption. Alternative analytical and graphical approaches (e.g., spline smoothing; Wood, 2006) that relax such assumptions are also available and may suggest more appropriate model specifications. Aiken et al. (2012, pp. 107–108, figures 5.4 and 5.5) provide an excellent example of good data analytic and graphical practices in such cases.
Author’s Note
Constructive comments of four anonymous reviewers and the editor are acknowledged.
