Abstract
Social and personality psychologists are often interested in the extent to which similarity, agreement, or matching matters. The current article describes response surface analysis (RSA), an approach designed to answer questions about how (mis)matching predictors relate to outcomes while avoiding many of the statistical limitations of alternative, often-used approaches. We explain how RSA provides comprehensive and often more valid answers to questions about (mis)matching predictors than traditional approaches provide, outline the steps for using RSA (including modifiable syntax), and demonstrate how to interpret RSA output with an example. To bolster our argument that RSA overcomes many limitations of traditional approaches (i.e., incomplete or misleading inferences), we compare results from four popular approaches (i.e., difference scores, residuals, moderated regression, and the truth and bias model) to those obtained from RSA. We discuss specific applications of RSA to social and personality psychology research.
Psychologists, practitioners, and the general public are often interested in questions about whether (mis)matches matter. Does similarity foster attraction or do opposites attract (Luo & Klohnen, 2005; Selfhout, Denissen, Branje, & Meeus, 2009)? Are positive illusions adaptive or are realistic self-perceptions the hallmark of mental health (Church et al., 2014; Dufner et al., 2012)? Do employees perform better when their values match the values espoused by their organizations (Edwards & Parry, 1993)?
Table 1 outlines example questions in psychology concerning whether (mis)matching perspectives are associated with more (or less) favorable outcomes. These questions are at the heart of theoretical issues in the field (e.g., is self-knowledge adaptive?) and have important practical implications (e.g., should people learn more about themselves?), but they present formidable analytical challenges (Cronbach & Furby, 1970; Edwards, 1994). Indeed, these analytical difficulties have left many questions about the importance of (mis)matched predictors in psychology unanswered. To advance knowledge in these areas, we describe polynomial regression and response surface analysis (RSA; Edwards, 1994; Edwards & Parry, 1993; Nestler, Grimm, & Schönbrodt, 2015), a comprehensive analytical tool specifically designed to answer questions about whether (mis)matches matter.
Table 1. Common Questions in Social and Personality Psychology That Response Surface Analysis Answers.
Why should researchers learn about RSA? RSA provides comprehensive answers to core questions in psychology, such as those listed in Table 1, and is far superior to frequently used alternative approaches that often provide incomplete or even erroneous conclusions because of their statistical limitations. To foster a better appreciation and understanding of why and how to use RSA, we explain the merits of the approach, provide instructions on how to use RSA, and interpret results using real data. We leverage free R software (R Version 3.3.2) and provide syntax that researchers can adapt to their own research questions. We also compare results from four popular approaches with those from RSA to demonstrate how these alternatives produce incomplete (at best) or misleading inferences (at worst). In sum, the current article aims to encourage the use of RSA by providing an intuitive guide on why and how to adopt this approach.
Merits of RSA
RSA has at least two major conceptual strengths. First, RSA assesses whether (mis)matches matter by modeling how all possible combinations of two predictors are associated with an outcome and does so in three-dimensional space (Edwards, 1994; Edwards & Parry, 1993; Nestler et al., 2015; Shanock, Baran, Gentry, Pattison, & Heggestad, 2010). This has important consequences for how much information RSA provides and for the validity of the results. With respect to validity, RSA models (mis)matching without using mathematical operations that conceal or distort information, such as the subtraction of one predictor from the other (i.e., difference scores; Edwards, 2002). Further, matches are operationalized in an intuitive way, specifically as the exact match between predictors. Using the example of Jordan and Taylor, the pair is matched if Jordan’s level of an attribute is the same as Taylor’s level, such as when both are 6 on a 1–7 scale. Plotting response surfaces in three-dimensional space provides a thorough visualization and facilitates researchers’ understanding of their data. In sum, RSA models matches in an intuitive, statistically valid, and comprehensive way.
Second, RSA answers more nuanced questions than traditional approaches. Like many traditional approaches, RSA tests whether matching attributes are associated with more (or less) favorable outcomes than mismatching attributes (e.g., if self-knowledge is more or less adaptive than self-deception). However, theories about the consequences of (mis)matches can—and likely often should—be more complex. RSA is designed to address these complexities. In particular, rather than stopping at the general finding that matches are overall better than mismatches (e.g., self-knowledge is better than self-deception or similarity is better than dissimilarity), a researcher can use RSA to discover whether matched attributes at one level of the predictors have different outcomes than matched attributes at another level. For instance, RSA would detect—but alternative approaches would fail to show—that Jordan and Taylor are less likely to split up if they both have high levels of agreeableness than if they both have low levels of agreeableness. Examples like this, where matches at some levels are not better than mismatches, are easy to imagine but are missed by approaches that fail to differentiate between matches at different levels of a predictor.
In addition, a researcher can use RSA to test whether one type of mismatch (e.g., an overestimate) is worse than another (e.g., an underestimate). For example, if researchers find that greater discrepancies in intelligence between partners predict lower-quality relationships, they would also want to know if some types of discrepancies are worse than others. Is Jordan less satisfied when Jordan is more intelligent than Taylor or less intelligent? Or, is self-enhancement better or worse than self-effacement? Thus, rather than limiting hypotheses to the basic question of whether a match is better or worse than a mismatch, RSA answers richer questions about how (mis)matches matter. Indeed, past research using RSA has revealed that (mis)matches are often not the same (Barranti, Carlson, & Furr, 2016; Bleidorn et al., 2016; Edwards & Rothbard, 1999).
Steps for Conducting RSA
The entire process that we outline below can generally be achieved in one step using the RSA package (Version 0.9.10) in R (Schönbrodt, 2016). However, RSA conceptually involves two steps: (a) running a polynomial regression model and (b) using effects from this model to generate a response surface and test whether and how (mis)matches matter (Box & Draper, 1987; Edwards & Parry, 1993). Thus, the interpretation of results of RSA focuses on the response surface rather than the polynomial regression effects.
To use RSA, data must meet the assumptions of multiple regression (Shanock et al., 2010). Additionally, the two predictors must be commensurate, representing the same content domain and measured on the same interval or ratio scale (Edwards, 1994, 2002). A researcher could use RSA to explore if there are costs associated with (mis)matching self- and peer perceptions of intelligence on the same Likert-type scale but could not explore costs associated with (mis)matches between self-perceptions on a Likert-type scale and measures of actual intelligence on a different scale (e.g., Wonderlic). The outcome can be measured on a different scale.
Establish the Existence of Both Matches and Mismatches
Researchers should verify that the data include both matched and mismatched observations because the results are not reliable in the absence of one or the other. The RSA package automatically generates this output (i.e., the percentage of observations where X is greater than, equal to, or less than Y) based on whether the predictors are within half a z-score unit.
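The following Python sketch illustrates the kind of check described above. It is an illustration only, not the RSA package's implementation: it assumes that each predictor is standardized on its own mean and SD and that a case counts as a match when the two z-scores differ by no more than half a unit. The function name `discrepancy_table` and the ratings are hypothetical.

```python
from statistics import mean, stdev


def discrepancy_table(x, y, cutoff=0.5):
    """Return the percentage of cases with X < Y, X matching Y, and X > Y.

    Assumption: each predictor is standardized on its own mean and SD, and
    two z-scores within `cutoff` of each other count as a match.
    """
    mx, sx = mean(x), stdev(x)
    my, sy = mean(y), stdev(y)
    counts = {"x_below_y": 0, "match": 0, "x_above_y": 0}
    for xi, yi in zip(x, y):
        diff = (xi - mx) / sx - (yi - my) / sy
        if diff > cutoff:
            counts["x_above_y"] += 1
        elif diff < -cutoff:
            counts["x_below_y"] += 1
        else:
            counts["match"] += 1
    return {label: 100.0 * count / len(x) for label, count in counts.items()}


# Hypothetical self-ratings (X) and partner impressions (Y) on a 0-4 scale.
self_ratings = [3, 4, 2, 3, 1, 4, 2, 3]
partner_ratings = [3, 2, 2, 4, 1, 3, 4, 3]
print(discrepancy_table(self_ratings, partner_ratings))
```

If one of the three percentages is close to zero, the data contain too few cases of that type to support conclusions about it.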
Center Predictors
Centering both predictors on the scale midpoint ensures that the interpretation of the results is consistent with theories of how (mis)matches relate to outcomes (i.e., as the exact match between predictors). Predictors should be unstandardized. If predictors are standardized, a one-unit change in one predictor may not have the same substantive meaning as a one-unit change in the other predictor (Edwards & Parry, 1993), precluding inferences about how (mis)matching relates to outcomes. Researchers should exercise extreme caution if they do not center predictors on the scale midpoint because doing so fundamentally changes the interpretation of a match, often to something convoluted, unintuitive, and inconsistent with theory. For example, mean centering operationalizes a match as each predictor deviating from its own mean by the same amount, which substantially complicates interpretation.
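A small Python sketch can make the difference between the two centering choices concrete. The ratings and the 0–4 response scale are hypothetical, and the helper `center` is introduced only for illustration.

```python
from statistics import mean

# Hypothetical 0-4 response scale; its midpoint is 2.
SCALE_MIN, SCALE_MAX = 0, 4
MIDPOINT = (SCALE_MIN + SCALE_MAX) / 2

# Hypothetical raw ratings; the first pair (3, 3) is an exact match.
self_ratings = [3, 4, 2, 3, 4]
partner_ratings = [3, 2, 1, 3, 2]


def center(values, pivot):
    """Subtract a common pivot from every value."""
    return [v - pivot for v in values]


# Midpoint centering: both predictors share the same pivot.
x_mid = center(self_ratings, MIDPOINT)
y_mid = center(partner_ratings, MIDPOINT)

# Mean centering: each predictor gets its own pivot.
x_mean = center(self_ratings, mean(self_ratings))
y_mean = center(partner_ratings, mean(partner_ratings))

print("midpoint centered, first pair:", x_mid[0], y_mid[0])
print("mean centered, first pair:    ", x_mean[0], y_mean[0])
```

With midpoint centering the first pair keeps identical scores, so it still counts as a match. With mean centering the same pair receives two different scores because the two sample means differ, so an exact raw match no longer sits on the line where X equals Y.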
Conduct Polynomial Regression
Regress the outcome on the main effects of X and Y, their squared terms (X² and Y²), and the interaction term (X × Y). If the polynomial regression model is significant and the inclusion of the squared terms and interaction increases R², the next step is to examine the three-dimensional response surface and the tests of its shape.
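As a minimal sketch of this step, the Python code below fits the two-main-effects model and the full polynomial model by ordinary least squares and compares their R² values. It uses NumPy rather than the RSA package, the data are simulated, and the generating coefficients in `true_b` are arbitrary values chosen for illustration. Significance tests of the model and of the change in R² are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated midpoint-centered predictors on a -2..+2 range (synthetic data).
x = rng.uniform(-2, 2, n)
y = rng.uniform(-2, 2, n)


def design_matrix(x, y):
    """Columns: intercept, X, Y, X^2, X*Y, Y^2 (b0 through b5)."""
    return np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])


# Arbitrary generating coefficients b0..b5, plus random noise.
true_b = np.array([15.0, 0.1, 1.5, -0.3, 0.8, -0.5])
z = design_matrix(x, y) @ true_b + rng.normal(0.0, 1.0, n)


def fit_ols(design, outcome):
    """Return least-squares coefficients and R^2 for the given design."""
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    residuals = outcome - design @ coef
    ss_res = np.sum(residuals**2)
    ss_tot = np.sum((outcome - outcome.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot


full = design_matrix(x, y)
coef_linear, r2_linear = fit_ols(full[:, :3], z)   # intercept, X, Y only
coef_full, r2_full = fit_ols(full, z)              # adds X^2, X*Y, Y^2

print("R^2, main effects only:", round(r2_linear, 3))
print("R^2, full polynomial:  ", round(r2_full, 3))
print("change in R^2:         ", round(r2_full - r2_linear, 3))
```

The column order in `design_matrix` follows the coefficient numbering used in the note to Table 2, so `coef_full[1]` through `coef_full[5]` correspond to b₁ through b₅.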
Generate the Response Surface
The RSA package automatically generates the response surface. A hypothetical example is shown in Figure 1. As shown, the X- and Y-axes range from negative to positive values, and 0 reflects the scale midpoint. Thus, positive values (e.g., +2) represent points above the midpoint, and negative values (e.g., −2) represent points below the midpoint. The Z-axis depicts the outcome on its own scale of measurement. This hypothetical response surface displays the expected values of the outcome at all possible combinations of the two predictors. For example, it indicates the expected Z-value when X and Y are both high (the back corner where both are +2) or low (the front corner where both are −2), when X is high while Y is low (right corner), when Y is high while X is low (left corner), and everything in between.

Figure 1. Response surface with labeled features. Predictors are centered on the midpoint of the scale. X and Y values of 0 reflect the midpoint of the scale. The line of congruence reflects cases where values of X and Y perfectly match, at all levels of the scale. The line of incongruence represents cases where values of X are the opposites of values of Y.
Figure 1 also shows the two lines that test hypotheses about (mis)matched predictors. The line of congruence reflects cases where values of X and Y perfectly match at all levels of the scale. Using a similarity example, this line indicates points where Jordan and Taylor both report being very low (−2) or both report being fairly high (+1). The line of incongruence represents cases where values of X are the opposites of Y. This line would indicate all points where, if Jordan reports being high (+2), Taylor reports being low (−2), or if Jordan reports being fairly high (+1), Taylor reports being fairly low (−1).
Interpret Tests of the Response Surface’s Shape
RSA automatically provides statistical tests for four coefficients (a₁–a₄) that answer unique questions about how (mis)matches matter. Table 2 outlines each of the four questions these coefficients answer and illustrates response surfaces for possible answers to these questions (more details on the statistical tests of the coefficients appear in Online Supplemental Materials). Rather than discuss coefficients in numerical order, we explain them in terms of the conceptual questions they test. We first discuss each of the coefficients in isolation, describing how each coefficient should be interpreted when it is significant but all other coefficients are not. We then provide an example interpretation for when more than one coefficient is significant.
Table 2. Four Response Surface Analysis Coefficients and the Questions They Answer.
Note. Coefficients are based on the unstandardized coefficients of the polynomial regression, where b₁ and b₂ are the linear effects of X and Y, b₃ is the effect of X², b₄ is the effect of X × Y, and b₅ is the effect of Y²: a₁ = b₁ + b₂; a₂ = b₃ + b₄ + b₅; a₃ = b₁ − b₂; a₄ = b₃ − b₄ + b₅. Please see the Online Supplemental Materials for more modeling details and graphing syntax.
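The formulas in the note follow from substituting Y = X (line of congruence) or Y = −X (line of incongruence) into the polynomial model. The Python sketch below computes the four coefficients and checks that identity numerically. The function names and the coefficient values in `b` are hypothetical and serve only to illustrate the arithmetic.

```python
def surface_parameters(b1, b2, b3, b4, b5):
    """Combine polynomial coefficients into the four surface coefficients."""
    return {
        "a1": b1 + b2,        # slope along the line of congruence (Y = X)
        "a2": b3 + b4 + b5,   # curvature along the line of congruence
        "a3": b1 - b2,        # slope along the line of incongruence (Y = -X)
        "a4": b3 - b4 + b5,   # curvature along the line of incongruence
    }


def predicted(b0, b1, b2, b3, b4, b5, x, y):
    """Expected outcome from the full polynomial regression model."""
    return b0 + b1 * x + b2 * y + b3 * x**2 + b4 * x * y + b5 * y**2


# Hypothetical coefficients b0..b5, chosen only for illustration.
b = (0.0, 0.5, 0.2, -0.1, 0.3, -0.2)
a = surface_parameters(*b[1:])

x = 1.5
on_congruence = predicted(*b, x, x)       # point where Y equals X
on_incongruence = predicted(*b, x, -x)    # point where Y equals -X

# Along each line the surface reduces to b0 + slope * X + curvature * X^2.
print(on_congruence, b[0] + a["a1"] * x + a["a2"] * x**2)
print(on_incongruence, b[0] + a["a3"] * x + a["a4"] * x**2)
```

Each printed pair should agree up to floating-point rounding, which is why a₁ and a₃ can be read as slopes and a₂ and a₄ as curvatures of the two lines.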
Are matches associated with higher or lower outcomes than mismatches?
The test of the curvature of the line of incongruence, the a₄ coefficient, is the critical test for whether the mismatching of predictors matters overall. It indicates if the outcome increases or decreases more sharply as predictors diverge. Thus, a₄ could reveal if, for example, self-knowledge predicts greater adjustment than self-deception or if similarity predicts more liking than dissimilarity. The bottom right panel of Table 2 shows examples of a₄ effects. As shown, it essentially tests if outcomes are higher (or lower) in the middle of the line (where X and Y are matched) compared to the ends of the line (where X and Y differ more). A positive a₄ indicates a convex (upward) curve, suggesting the outcome increases more sharply as the two predictors diverge. A negative a₄ indicates a concave (downward) curve, suggesting that the outcome decreases more sharply as the two predictors diverge.
Does the type of discrepancy matter?
RSA also reveals if the direction of mismatch matters by testing the slope of the line of incongruence, the a₃ coefficient. In the context of self-knowledge, a₃ would reveal if people are less liked when they self-enhance versus self-efface. As shown in the bottom left panel of Table 2, a positive a₃ indicates that the outcome is higher when X is greater than Y than the other way around. This would suggest that people are more liked when their self-views (X) exceed actual ratings (Y) than when their actual ratings are higher than their self-views. A negative a₃ indicates that the outcome is higher when Y exceeds X. In our example, this would suggest people are more liked when their actual ratings are higher than their self-views.
Are some matches better or worse than other matches?
The test of the slope of the line of congruence, the a₁ coefficient, reveals if the effect of a perfect match is different at higher or lower levels of the scale. Using our self-knowledge example, a₁ indicates if self-knowledge for high levels is more or less adaptive than self-knowledge for low levels of the attribute. As shown in the upper left panel of Table 2, a positive a₁ indicates that matches at higher levels are associated with higher outcomes than matches at lower levels. A negative a₁ coefficient indicates that matches at higher levels are associated with lower outcomes than matches at lower levels.
Do matches at extremes have different effects than matches at mid-levels?
The test of the curvature of the line of congruence, the a₂ coefficient, indicates if matches at extreme ends of the scale predict higher or lower standing on the outcome than matches at midrange levels. More specifically, a₂ indicates if the outcome increases or decreases more sharply as predictors match at increasingly high and low levels. A positive a₂ indicates a convex (upward) curve, meaning that matches that deviate from the scale midpoint predict higher outcomes than matches at mid-levels of the scale. Using a self-knowledge example, a positive a₂ might be observed if it is especially important for individuals to leverage their high levels of ability and also to be aware of any low levels of ability. A negative a₂ indicates a concave (downward) curve, suggesting that self-knowledge has diminishing returns at increasingly higher and lower ends of the scale.
Interpreting combinations of coefficients
Each RSA coefficient yields important information, but researchers are also interested in the combination and size of these effects. Indeed, focusing on one coefficient and ignoring the others could lead researchers astray, because the outcome is often determined by a combination of effects. Figure 2 helps demonstrate the importance of considering the combination of effects by illustrating a constant a₄ coefficient combined with a₃ coefficients of increasing size. In our analysis of real data, we provide another example of how to interpret complex response surfaces.

Figure 2. Interpreting coefficient combinations. From left to right, response surfaces reflect constant a₁ = 0.00, a₂ = 0.00, and a₄ = −0.40 coefficients, with a₃ coefficients of increasing magnitude: (A) a₃ = 0.00, (B) a₃ = 0.40, and (C) a₃ = 0.80.
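One way to see why the combination matters is to locate the highest point along the line of incongruence for the three panels. Along that line (Y = −X) the surface reduces to a quadratic in X with slope a₃ and curvature a₄, so its stationary point lies at X = −a₃ / (2a₄). The Python sketch below applies this to the coefficient values given in the caption; the function name `incongruence_peak` is hypothetical.

```python
def incongruence_peak(a3, a4):
    """X-coordinate of the stationary point of a3*X + a4*X**2 along Y = -X.

    With a negative a4 this point is the maximum of the line of incongruence.
    """
    if a4 == 0:
        raise ValueError("a4 must be nonzero for the line to have a peak")
    return -a3 / (2 * a4)


A4 = -0.40  # constant curvature across the three panels
panels = {"A": 0.00, "B": 0.40, "C": 0.80}
peaks = {label: incongruence_peak(a3, A4) for label, a3 in panels.items()}

for label, peak_x in peaks.items():
    # On the line of incongruence, Y is always the negative of X.
    print(f"Panel {label}: highest outcome at X = {peak_x:+.2f}, Y = {-peak_x:+.2f}")
```

In panel A the highest outcome sits at the exact match (X = Y = 0). As a₃ grows, the peak moves toward cases where X exceeds Y, so a negative a₄ alone no longer implies that exact matches have the best outcomes.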
Example of RSA Analysis
For our demonstration, we focus on how assumed similarity on the personality trait of conscientiousness (i.e., how much Jordan’s conscientiousness matches Jordan’s belief about Taylor’s conscientiousness) is associated with relationship quality among romantic couples. Past work suggests that assumed similarity predicts higher quality (Montoya, Horton, & Kirchner, 2008), but to our knowledge, this question has not been answered with RSA. We first show how this question can be addressed with RSA and then explain how traditional methods provide incomplete or erroneous conclusions about if and how assumed similarity matters.
Our data are a subsample of the St. Louis Personality and Aging Network study, specifically participants who nominated their romantic partner as an informant (N = 322; age M = 62.22, standard deviation [SD] = 2.72; 60% male; 81.4% Caucasian, 17.7% African American, 0.3% Latino, 0.6% Middle Eastern; see Oltmanns, Rodrigues, Weinstein, & Gleason, 2014, for study details). Partners had known each other for about 30 years (M = 32, SD = 12). A power analysis revealed that this sample provided .99 power to detect a medium-sized change (f² = .15) in R² going from a two main effects model to a polynomial model (i.e., adding the interaction and two quadratic terms) or .54 power to detect a small-sized change (f² = .02; Faul, Erdfelder, Buchner, & Lang, 2009).
Participants (Jordan) described their own and their partners’ (Taylor) personality on five-factor model traits (Costa & McCrae, 2009). Our analyses focus on conscientiousness (self-report: M = 2.86; SD = 0.57, α = .68; impression: M = 2.83; SD = 0.76, α = .86), but we report results for other traits in the Online Supplemental Material. We focus on the perception of quality from the partners making the assumed similarity judgments (i.e., Jordan). Conscientiousness was measured using a 5-point Likert-type scale ranging from 0 to 4. We centered the scores by subtracting the scale midpoint (i.e., 2). Perception of quality was measured by the 4-item version of the Dyadic Adjustment Scale (Sabourin, Valois, & Lussier, 2005; M = 16.21, SD = 3.17; α = .83).
Figure 3 shows how all combinations of Jordan’s self-perception (X) and Jordan’s impression of Taylor’s (Y) conscientiousness relate to Jordan’s satisfaction (Z). Did assumed similarity predict higher relationship satisfaction? People were overall more satisfied when they thought they were more similar to their partner than when they thought they were more dissimilar, an association revealed by a negative curvature of the line of incongruence (a₄ = −1.64; 95% confidence interval [CI] = [−2.64, −0.60]).

Figure 3. Response surface for assumed similarity of conscientiousness. The polynomial coefficients were as follows: b₀ = 15.15, 95% confidence interval (CI) [14.33, 15.97]; Jordan’s self-perception b₁ = 0.11, 95% CI [−0.94, 1.17]; Jordan’s impression of Taylor b₂ = 1.57, 95% CI [0.74, 2.40]; Jordan’s self-perception squared b₃ = −0.27, 95% CI [−0.91, 0.36]; Jordan’s self-perception and impression interaction b₄ = 0.81, 95% CI [0.12, 1.51]; Jordan’s impression of Taylor squared b₅ = −0.55, 95% CI [−0.92, −0.18].
Dissimilarity was associated with lower quality than similarity, but were all mismatches equally detrimental? No. People were particularly unsatisfied when they perceived themselves to be more conscientious than their partner, an association revealed by the negative slope of the line of incongruence (a₃ = −1.45, 95% CI [−2.48, −0.43]). Thus, there were costs to feeling dissimilar to one’s partner in terms of conscientiousness, and these costs were particularly large when people believed they were more (rather than less) conscientious than their partner.
Assumed similarity predicted higher quality than assumed dissimilarity, but was assumed similarity equally beneficial at all levels of conscientiousness? No. People were more satisfied when they thought their partner was similar to them at high versus low levels of conscientiousness, an association revealed by the positive slope of the line of congruence (a₁ = 1.68, 95% CI [0.08, 3.28]). There was no curvilinear association along the line of congruence (a₂ = −0.01, 95% CI [−0.94, 0.92]), suggesting that Jordan’s satisfaction did not increase more sharply as Jordan assumed similarity at extremely high versus low levels of conscientiousness.
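As a worked check, the Python sketch below recomputes the four surface coefficients from the rounded polynomial coefficients reported in the Figure 3 caption and evaluates the fitted surface at a few illustrative points. Because the inputs are rounded to two decimals, the recomputed a₃ and a₄ can differ from the reported values by about 0.01. The chosen points and the function name are hypothetical, and confidence intervals are not reproduced here.

```python
# Rounded unstandardized coefficients as reported in the Figure 3 caption.
b0, b1, b2, b3, b4, b5 = 15.15, 0.11, 1.57, -0.27, 0.81, -0.55

a1 = b1 + b2        # slope of the line of congruence
a2 = b3 + b4 + b5   # curvature of the line of congruence
a3 = b1 - b2        # slope of the line of incongruence
a4 = b3 - b4 + b5   # curvature of the line of incongruence


def predicted_satisfaction(x, y):
    """Expected satisfaction for midpoint-centered self (x) and impression (y)."""
    return b0 + b1 * x + b2 * y + b3 * x**2 + b4 * x * y + b5 * y**2


# Illustrative points, one unit above or below the scale midpoint.
points = {
    "match at midpoint (0, 0)": (0, 0),
    "match, both high (+1, +1)": (1, 1),
    "match, both low (-1, -1)": (-1, -1),
    "self higher (+1, -1)": (1, -1),
    "partner higher (-1, +1)": (-1, 1),
}

print(f"a1={a1:.2f}  a2={a2:.2f}  a3={a3:.2f}  a4={a4:.2f}")
for label, (x, y) in points.items():
    print(f"{label:<28}{predicted_satisfaction(x, y):.2f}")
```

With these rounded inputs, the predicted values are about 16.82 for the high match, 13.46 for the low match, 14.98 when the partner is seen as higher, and 12.06 when the self is seen as higher. This ordering mirrors the pattern described above: high matches exceed low matches, and perceiving oneself as more conscientious than one's partner is the least favorable combination.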
How Do Results of RSA Compare to Alternative Approaches?
Questions about whether (mis)matches matter have been researched extensively using approaches other than RSA. To demonstrate their limitations, we reanalyzed our data using four common approaches for testing hypotheses about matching attributes: (a) difference scores (including absolute difference scores), (b) residual scores, (c) moderated regression, and (d) the truth and bias model (West & Kenny, 2011). The results are shown in Figure 4, but please see the Online Supplemental Material for detailed information about these approaches. As we shall see, none of the alternative approaches provides the information revealed by RSA, and some alternatives even provide erroneous conclusions.

Figure 4. Traditional approaches to testing if assumed similarity relates to relationship satisfaction. For panel A (difference scores), to calculate difference scores, we subtracted Jordan’s impression of Taylor’s conscientiousness from Jordan’s conscientiousness. For absolute difference scores, we took the absolute value of the difference scores. For panel B (residual scores), we regressed Jordan’s conscientiousness on Jordan’s impression of Taylor’s conscientiousness and saved the residuals. For panel C (moderated regression), we mean centered Jordan’s conscientiousness and Jordan’s impression of Taylor’s conscientiousness and then regressed Jordan’s satisfaction on both centered variables and the interaction between them. The plot reflects the simple slopes of Jordan’s conscientiousness at +1 and −1 standard deviation (SD) of Jordan’s impression of Taylor’s conscientiousness. For panel D (truth and bias model), Jordan’s conscientiousness and Jordan’s impression of Taylor’s conscientiousness were centered on the mean of Jordan’s conscientiousness, and Jordan’s relationship satisfaction was mean centered. We regressed Jordan’s impression of Taylor’s conscientiousness on Jordan’s conscientiousness and Jordan’s satisfaction. The plot reflects the simple slopes of Jordan’s conscientiousness at +1 and −1 SD of Jordan’s satisfaction.
Difference Scores
For both the difference score and absolute difference score approaches, a score of 0 reflects a perfect match. For difference scores, positive and negative scores reflect mismatches. For absolute difference scores, positive scores reflect mismatches. The results of these analyses are shown in Figure 4. If we only examined difference scores, we would conclude that relationship quality is highest when people perceive their partner as more conscientious than themselves, moderate when people perceive their partner is equally conscientious, and lowest when people perceive their partner as less conscientious than themselves. This conclusion is erroneous—RSA revealed that when people perceive their partners as more conscientious than themselves, their relationship quality is lower than when people perceive a match (but not as low as when people perceive that they are more conscientious than their partner).
Further, if we only examined absolute difference scores, we would conclude that assumed similarity is positively related to satisfaction, compared to assumed dissimilarity. This conclusion is correct. However, we would also conclude that relationship quality is the same when individuals think they are more conscientious than their partners and when individuals think they are less conscientious than their partners. This conclusion is also erroneous—RSA revealed that the former is associated with significantly lower quality than the latter.
Combining the conclusions from difference score and absolute difference score results will still provide limited conclusions as compared to RSA. With both versions of the difference score approach, matches at all levels (high–high, moderate–moderate, or low–low) are 0. Thus, difference scores and absolute difference scores cannot reveal if assumed similarity has the same effect at all levels of conscientiousness (as RSA does with the a₁ coefficient), nor can they test nonlinear effects of matching (as RSA does with the a₂ coefficient). Therefore, both versions of the difference score approach miss the finding that assumed similarity at high levels of conscientiousness is associated with higher quality than assumed similarity at low levels. RSA detected this pattern.
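The information loss is easy to see in a short Python sketch. The pairs of midpoint-centered scores below are hypothetical and chosen only to cover the cases discussed above.

```python
# Hypothetical midpoint-centered scores: (Jordan's self-rating, impression of Taylor).
pairs = {
    "match, low-low": (-2, -2),
    "match, moderate-moderate": (0, 0),
    "match, high-high": (2, 2),
    "mismatch, self higher": (2, -2),
    "mismatch, partner higher": (-2, 2),
}

difference = {label: x - y for label, (x, y) in pairs.items()}
absolute_difference = {label: abs(d) for label, d in difference.items()}

for label in pairs:
    print(f"{label:<26} diff = {difference[label]:+d}   |diff| = {absolute_difference[label]}")
```

All three matches receive a score of 0, so neither index can distinguish high–high from low–low matches. The absolute difference additionally assigns the same score to both mismatches, which removes the direction of the discrepancy.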
Residual Scores
As is sometimes done in the literature, we computed residual scores by regressing one perspective (X; Jordan’s self-perception) onto the other (Y; Jordan’s impression of Taylor) and saved the residuals. The magnitude and direction of residuals indicate the degree to which what was predicted by one perspective tended to be above (or below) what was actually observed by the other (i.e., if Jordan was more or less conscientious than what would be predicted by Jordan’s impression of Taylor). A residual of 0 reflects perfect assumed similarity.
Results in Figure 4 suggested that there was no relationship between assumed similarity and satisfaction. Using this approach, researchers would infer that assumed similarity is unrelated to satisfaction. This conclusion is erroneous: RSA revealed that overall assumed similarity is associated with higher satisfaction than dissimilarity and that one direction of dissimilarity is worse than the other. Further, like difference scores, the residual score approach masks effects of different types of matches (i.e., RSA’s a₁ and a₂ coefficients) by assigning all matches a score of 0. Thus, the residual score approach conceals the fact that assumed similarity at high levels of conscientiousness is associated with higher satisfaction than assumed similarity at low levels.
Moderated Regression
The moderated regression approach reveals if the link between one predictor (Jordan’s self-perception) and the outcome (satisfaction) depends on the level of the other predictor (Jordan’s impression of Taylor). For example, this approach reveals if Jordan is more satisfied when Jordan’s self-perception and Jordan’s impression of Taylor are both high or both low. Results in Figure 4 suggested that assumed similarity was associated with satisfaction (i.e., the interaction was significant). Simple slope tests revealed that people were particularly satisfied when they perceived that both they and their partners were highly conscientious.
Results somewhat mirrored RSA effects, but unlike RSA, moderated regression does not provide a direct answer to the question posed in similarity research: whether matches are generally associated with higher quality than mismatches (which is revealed by a₄ in RSA). The focus in moderated regression is on specific comparisons of arbitrary levels of each predictor (typically, ±1 SD from the mean) rather than overall comparisons of matches versus mismatches. Further, moderated regression typically does not formally test if high–high matches are associated with more satisfaction than low–low matches (a₁) or if high–low matches are associated with different levels of satisfaction than low–high matches (a₃) because these data points are on different regression lines (Shanock et al., 2010). Moderated regression also cannot test the possible curvilinear nature of matching (a₂). Researchers thus miss nuanced ways in which matching predicts outcomes.
Truth and Bias Model
Researchers have adapted the truth and bias model (West & Kenny, 2011) to test hypotheses about matching. This approach essentially makes the outcome of RSA a moderator and incorporates a centering procedure that formally tests directional or mean-level biases in predictors. We regressed one predictor (e.g., Jordan’s impression of Taylor) on the other predictor (e.g., Jordan’s conscientiousness), the outcome (e.g., Jordan’s satisfaction), and their interaction. This model tests if the association between predictors depends on the “outcome,” but it also formally tests if mean-level differences depend on the outcome (see the Online Supplemental Material for modeling details).
Results in Figure 4 suggested that the degree to which people tended to assume they were similar to their partner depended on their satisfaction (i.e., the interaction was significant), but simple slopes were not significant. The main effect of satisfaction was a significant positive predictor, suggesting that when Jordan’s conscientiousness was higher than Jordan’s impression of Taylor, Jordan was less satisfied. While these effects approximate some of the effects provided by RSA, they have limitations similar to those of moderated regression outlined above. There is no direct test of whether matches are associated with higher quality than mismatches. There are also no direct comparisons of different types of mismatches (e.g., Jordan thinks Taylor is more versus less conscientious than Jordan). Finally, there are no direct comparisons between high–high and low–low matches and no test of potential curvilinearity among matches.
Summary of Comparison Between Alternative Approaches and RSA
Our demonstration suggests that alternative, often-used approaches to testing questions about similarity do not provide as much information as RSA provides about if and when (mis)matches matter. While some approaches approximated one or two of the effects provided by RSA (e.g., overall assumed similarity was beneficial), none provided all of the information RSA provided, and some approaches led to erroneous conclusions. The results that only RSA identified, such as how matches matter (a₁ and a₂), are important for both theoretical and practical reasons. Researchers may develop inadequate theories if they assume that all matches are associated with the same outcomes, when in fact some types of matches have better outcomes than others. Further, theories may be inadequate if all mismatches are assumed to be equally detrimental, when in fact some mismatches are associated with worse outcomes than others. On a practical note from our data, therapists or laypeople informed by difference score results would believe that individuals who assume their partner is similar to them will be the most satisfied, when in fact, this is only true if both are high, not low, on conscientiousness.
Additional Features of RSA
RSA can be adapted to fit a variety of research designs. For example, the RSA package includes modifications for binary outcomes (e.g., voting behavior). Many research questions involve designs that introduce dependencies in the data (e.g., modeling similarity and both partners’ satisfaction). To adapt RSA to answer questions that involve multilevel modeling, researchers center predictors on the scale midpoint, conduct polynomial regression in multilevel modeling, and use the unstandardized coefficients, standard errors, covariances, and degrees of freedom from the multilevel model to generate and test the response surface (e.g., Barranti et al., 2016; Muise, Stanton, Kim, & Impett, 2016). RSA can also be adapted to include control variables. Please see the Online Supplemental Materials for additional syntax for these more complex specifications.
Researchers might have questions that go beyond tests of the slope and curvature of the lines of congruence and incongruence. A researcher might find that self-knowledge predicts better adjustment than self-deception (i.e., a negative a₄) and wonder whether the a₄ effect applies at all levels of the attribute. Visually inspecting the graph could give a researcher some idea as to whether this is true, but tests of simple slopes are needed to draw formal conclusions about these effects (Edwards & Parry, 1993). To aid in answering these more complex questions about boundary conditions, we provide additional information in the Online Supplemental Materials (see, e.g., Ilmarinen, Lönnqvist, & Paunonen, 2016).
Finally, to provide a rule of thumb for sample sizes, we calculated the number of observations needed for 80% power to detect a change in R² when going from two main effects to the full polynomial model (five predictors). The rationale is that, if adding the interaction and squared terms does not increase the predictive power of the model, it would be inappropriate to probe for matching effects that are derived from the interaction and squared terms. As such, researchers should aim for 550 observations to detect a small effect (f² = .02), 77 to detect a medium effect (f² = .15), and 36 to detect a large effect (f² = .35; Faul et al., 2009).
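To make the quantities behind this rule of thumb concrete, the following Python sketch computes Cohen's f² for the increase in R² and the corresponding F statistic for the model comparison (two main effects versus the full five-predictor polynomial model). It uses the standard formulas f² = (R²full − R²reduced) / (1 − R²full) and F = [(R²full − R²reduced) / df₁] / [(1 − R²full) / df₂], with df₁ equal to the number of added terms and df₂ = N − k − 1. The function names and the R² values in the example are hypothetical, and the sketch does not reproduce the power calculation itself, which requires the noncentral F distribution.

```python
def cohens_f2(r2_full, r2_reduced):
    """Cohen's f^2 for the increase in R^2 from the reduced to the full model."""
    return (r2_full - r2_reduced) / (1.0 - r2_full)


def f_change(r2_full, r2_reduced, n, k_full=5, k_added=3):
    """F test for the R^2 increase when k_added terms are added.

    k_full:  number of predictors in the full polynomial model (X, Y, X^2, XY, Y^2)
    k_added: number of added terms (X^2, XY, Y^2)
    Returns (F, df1, df2).
    """
    df1 = k_added
    df2 = n - k_full - 1
    f_stat = ((r2_full - r2_reduced) / df1) / ((1.0 - r2_full) / df2)
    return f_stat, df1, df2


# Made-up R^2 values, for illustration only.
effect = cohens_f2(r2_full=0.25, r2_reduced=0.10)
f_stat, df1, df2 = f_change(r2_full=0.25, r2_reduced=0.10, n=77)
print(f"f2 = {effect:.2f}, F({df1}, {df2}) = {f_stat:.2f}")
```

If the F test for the added terms is not significant, the rationale above implies that the surface parameters built from those terms should not be probed further.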
Conclusion and Implications
At the heart of some of the most important questions in social and personality psychology is if and when (mis)matches matter. Our goal was to present a flexible and statistically rigorous tool that can advance unresolved questions about (mis)matching perspectives, especially in light of the statistical challenges of such questions. Yet it might also be useful to revisit seemingly resolved questions in the literature using RSA. An important implication of our comparative examples is that the conclusions researchers make about (mis)matches depend on which approach they use. Traditional analytical approaches mask effects (e.g., whether effects of all matches are the same), and some approaches lead to incorrect inferences due to severe problems with statistical validity (Cronbach, 1955, 1958; Cronbach & Furby, 1970; Edwards, 1994). By masking or distorting findings, traditionally used approaches ultimately undermine the validity of inferences, which has serious theoretical and practical consequences (Edwards, 1994; Edwards & Parry, 1993). Findings from entire literatures might need to be reanalyzed using RSA to better understand if and how (mis)matches matter. While we hope the current demonstration provides the necessary background for researchers to apply this tool to their own work, we also hope researchers use this tool to reexplore seemingly resolved issues in the field.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the National Institute of Mental Health (1RO1-MH077840-01; Thomas Oltmanns) and the Social Sciences and Humanities Research Council (72048195; Erika Carlson; and 435160570; Stéphane Côté).
