There Is No Empirical Evidence for Critical Positivity Ratios: Comment on Fredrickson (2013)

Abstract

Fredrickson and Losada (American Psychologist, 2005, 60, 678-686) theorized that a ratio of positive affect to negative affect (positivity ratio) of 2.9013 acts as a critical minimum for well-being. Recently, Brown, Sokal, and Friedman (American Psychologist, 2013, 68, 801-813) convincingly demonstrated that the mathematical work underlying this critical minimum positivity ratio was both flawed and misapplied. This comment addresses Fredrickson’s (American Psychologist, 2013, 68, 814-822) insistence that, regardless of the incorrect mathematical work, substantial empirical evidence exists both for critical minimum and maximum positivity ratios and, more generally, for a (unspecified) nonlinear relation between the positivity ratio and well-being, by first noting that there was a mismatch between Fredrickson and Losada’s (2005) theory and the data used to test it, then describing the methodological and statistical problems of Fredrickson and Losada’s empirical study (2005), and, finally, examining the other studies that Fredrickson (2013) cited as empirical evidence.

Keywords

across-persons, flourishing, languishing, methodology, positivity ratio, well-being, within-person

Based on the mathematical work of Losada (1999; Losada & Heaphy, 2004), Fredrickson and Losada (2005) theorized that a ratio of positive affect to negative affect (positivity ratio) of 2.9013 acts as a critical minimum for well-being; a person must meet or surpass this ratio to flourish, defined as “liv[ing] within an optimal range of human functioning, one that connotes goodness, generativity, growth, and resilience” (p. 678). A person who does not meet this ratio is doomed to languish, a state that Keyes (2002) defined as “emptiness and stagnation . . . a life of quiet despair” (p. 210). Fredrickson and Losada (2005) tested this theory with two samples of college students, reporting that flourishing and nonflourishing students had mean positivity ratios of 3.2 and 2.3, respectively, in Sample 1, and 3.4 and 2.1, respectively, in Sample 2; they considered their results to provide empirical support for their theory because the two means for each sample flanked the theorized critical minimum positivity ratio. Fredrickson and Losada (2005) also theorized, but did not test, that flourishing disintegrates at a positivity ratio of 11.6346.

Recently, Brown, Sokal, and Friedman (2013) convincingly demonstrated that the mathematical work underlying Fredrickson and Losada’s (2005) critical minimum positivity ratio was both flawed and misapplied, concluding that “Fredrickson and Losada’s claim to have demonstrated the existence of a critical minimum positivity ratio of 2.9013 is entirely unfounded” (p. 801). That being the case, Brown et al. (2013) decided not to undertake a detailed evaluation of Fredrickson and Losada’s (2005) empirical study.

Although understandable, Brown et al.’s (2013) decision was unfortunate because it allowed Fredrickson (2013), in her reply to Brown et al. (2013), to insist that, although the mathematical work might not be correct, substantial empirical evidence exists for critical positivity ratios or tipping points¹—both a lower tipping point, when a person will suddenly switch from languishing to flourishing, and an upper tipping point, signaling the end of flourishing—and, more generally, for a (unspecified) nonlinear relation between the positivity ratio and various indicators of well-being. In fact, no empirical evidence exists for either a lower tipping point or an upper tipping point or for a more general nonlinear relation between the positivity ratio and well-being, as I shall show by first noting that there was a mismatch between Fredrickson and Losada’s (2005) theory and the data used to test it, then describing in some detail their empirical study, and, finally, examining the other studies cited by Fredrickson (2013) as empirical evidence.

An abbreviated version of this comment on Fredrickson (2013) was published in the American Psychologist (Nickerson, 2014), along with four comments by other authors (Guastello, 2014; Hämäläinen, Luoma, & Saarinen, 2014; Lefebvre & Schwartz, 2014; Musau, 2014); Brown, Sokal, and Friedman’s (2014a) rejoinder to Fredrickson’s (2013) reply to their original comment (Brown et al., 2013); and Brown, Sokal, and Friedman’s (2014b) reply to the five comments on Fredrickson (2013). Unlike their comment on Fredrickson and Losada (2005), Brown et al.’s (2014a) rejoinder to Fredrickson (2013) contained some detail about the empirical study in the Fredrickson and Losada (2005) article, as well as a discussion of the additional empirical evidence that Fredrickson (2013) claimed for critical positivity ratios and nonlinearity in the relation between the positivity ratio and well-being. Where appropriate, I have acknowledged the overlap between Brown et al.’s (2014a) remarks and my own.

Theory/Data Mismatch

Fredrickson and Losada’s (2005) empirical study of the relation between the positivity ratio and well-being did not test their theory of a critical minimum positivity ratio because there was a mismatch between the theory and the data used to test it. Fredrickson and Losada’s (2005) theory (see also Fredrickson, 2009, Chapter 7) clearly described a psychological process that occurs within person across time. Testing a within-person across-time theory requires observations on two (or more) variables for one person over many time points; the analysis can be repeated for any number of persons. In a simple correlational analysis of the two variables, there would be as many correlations as there are persons.

Fredrickson and Losada’s (2005, pp. 683-684) data and analyses, however, tested a within-time (i.e., at one time point) across-persons theory. Testing a within-time across-persons theory requires observations on two (or more) variables over many persons for one time point; the analysis can be repeated for any number of time points. In a simple correlational analysis of the two variables, there would be as many correlations as there are time points. In terms of Cattell’s (1946, Chapter 5) well-known “data box,” Fredrickson and Losada (2005) performed an “R-technique” analysis although their theory required a “P-technique” analysis.

Despite the common belief that the result of a within-time across-persons analysis is some sort of an “on average” equivalent of the result of a within-person across-time analysis, there is in fact no necessary relation between the results of the two different types of analyses, even if performed on the same “persons by time points” data set. For example, any possible within-person across-time correlation (positive, zero, negative) can co-occur with any possible within-time across-persons correlation (positive, zero, negative) for the same data set (Nickerson, 1999, 2007; Nickerson & McClelland, 1989; see also Jones, n.d., and Snijders & Bosker, 1999, p. 14, for similar demonstrations in the multilevel-modeling literature). Within-person across-time correlations can differ from within-time across-persons correlations in magnitude as well as in sign; empirically, differences in magnitude are probably more likely than differences in sign. The same is true of statistical techniques other than simple correlations.
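The independence of the two kinds of correlation can be illustrated with a minimal sketch. The data below are invented for the purpose (they are not from any study cited here): each hypothetical person has a stable level on both variables, and within each person one variable rises over time while the other falls. The helper name `pearson` and the variable names are assumptions made for this illustration only.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical "persons by time points" data, invented for illustration.
# Person i has a stable level of 10 * i on both variables; within each
# person, x rises by 1 per time point while y falls by 1.
n_persons, n_times = 5, 6
x = [[10 * i + t for t in range(n_times)] for i in range(n_persons)]
y = [[10 * i - t for t in range(n_times)] for i in range(n_persons)]

# Within-person across-time (P-technique): one correlation per person.
within_person = [pearson(x[i], y[i]) for i in range(n_persons)]

# Within-time across-persons (R-technique): one correlation per time point.
within_time = [
    pearson([x[i][t] for i in range(n_persons)],
            [y[i][t] for i in range(n_persons)])
    for t in range(n_times)
]

print("within-person:", [round(r, 2) for r in within_person])
print("within-time:  ", [round(r, 2) for r in within_time])
```

In this constructed data set, every within-person correlation is strongly negative and every within-time correlation is strongly positive, even though both sets of correlations are computed from the same persons-by-time-points table.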

Because a within-time across-persons analysis can, and probably will, give a result different from that of a within-person across-time analysis, it should not be used to test a within-person across-time theory unless it can be shown that (a) persons are essentially replicates of one another or “exchangeable,” with observations that differ due to random variation only, and that (b) the structure of the across-persons variation between the variables under consideration is the same as the structure of the within-person variation, in which case the two analyses should give the same or similar results (Hammond, McClelland, & Mumpower, 1980, pp. 115-127; Molenaar, 2004; Torgerson, 1958, pp. 45-48). This situation is very unlikely in studies of most psychological phenomena, particularly those involving process, development, or change.

An appropriate test of one or the other or both of Fredrickson and Losada’s (2005) theorized critical positivity ratios requires that each person’s positivity ratio and well-being be assessed at many time points over an appropriate period that reflects theoretical notions of the time that it takes the processes posited by the critical positivity ratio theory to occur. The analysis should begin with a visual inspection for each person of the relation between the positivity ratio and well-being over these time points; this might include charting the paired values of the positivity ratio and well-being for each time point or creating various plots. Given the strong and dramatic predictions of the theory, visual inspection might be sufficient to determine whether critical minimum and maximum positivity ratios exist at all, and if so, whether they equal 2.9013 and 11.6346, respectively, or some other values (which might not be the same for each person). If the data are so noisy that it is difficult to identify critical positivity ratios via visual inspection, or if a more formal analysis is needed or desired, the statistical methods developed for identifying “change points” in regression models and time-series analysis (e.g., Cudeck & Klebe, 2002) should be useful. The results of the within-person across-time analyses can then be summarized across all persons.

Of course, this research strategy is more costly, labor intensive, and time consuming than that used in Fredrickson and Losada’s (2005) empirical study. Fredrickson (2009, p. 130) asserted that what matters most is the positivity ratio achieved not within a day but over time; hence, each respondent’s positivity ratio was computed for a time period of about one month in Fredrickson and Losada’s (2005) empirical study. Thus, positive and negative affect would need to be assessed daily for many months to enable the computation of multiple monthly positivity ratios. In addition, well-being would need to be assessed once at the beginning of each month. These assessments might not be as arduous as they first seem, however, because recent advances in technology (e.g., mobile phones, electronically activated voice recorders, Internet websites) make recording many observations over a long time period much easier than has been the case in the past (Conner, Tennen, Fleeson, & Barrett, 2009). In fact, the positive and negative affect data used by Fredrickson and Losada (2005) to compute positivity ratios were collected by having their respondents log into a secure website nightly for about one month, although the one-time measure of well-being seems to have been a standard paper-and-pencil test. A more serious problem is that some respondents—perhaps most—would never achieve tipping-point values of the positivity ratios (if they exist, and whatever they might be), or never move from one state of well-being to the other, or both, so that a very large sample of respondents would be needed for a long period of time to be able to collect enough data for an adequate test of the theory, necessitating consideration of ways to maintain respondent motivation and limit respondent attrition. It is important to realize that such practical problems do not in and of themselves justify collecting within-time across-persons data to test a within-person across-time theory.

Fredrickson (2013) herself, in her reply to Brown et al. (2013), remarked that “[m]ost valuable to the maturation of this work will be longitudinal field studies and experiments that use densely repeated measures of emotions and relevant outcomes” (p. 820). It is not clear what “maturation” means here, but it seems to suggest “improvement.” As noted earlier, the result of a within-time across-persons analysis is not some sort of an on-average equivalent of the results of multiple within-person across-time analyses, so it is not the case that the use of longitudinal (what I have called within-person across-time) studies necessarily offers improvement of the results of cross-sectional (within-time across-persons) studies. The two kinds of analyses can yield completely different results. Fredrickson’s (2013) remark also raises the question of why, if she understood that a longitudinal study is needed to test her theory, this was not acknowledged in the initial article (Fredrickson & Losada, 2005), in the book that highlights the research of that article (Fredrickson, 2009, Chapter 7), or in the correction to the original article (Fredrickson & Losada, 2005/2013), the last of which insists on the validity of the results of the within-time across-persons empirical study. Indeed, why was the appropriate study design not used in the first place?
After all, it has been pointed out many times over many years (e.g., Block, 1996; Borsboom, Mellenbergh, & van Heerden, 2003; Buck, 1980; Collins, Graham, & Flaherty, 1998; Davidson & Morrison, 1982, 1983; Epstein, 1983; Firebaugh, 1980; Jaccard, 1981; Jaccard & Dittus, 1990; Keren, 1993; Mandler, 1959; Michaela, 1990; Mitchell, 1974; Molenaar, 2004, 2005; Nesselroade, 2002; Nickerson, 1999, 2007; Nickerson & McClelland, 1988, 1989; Norman, 1967; Pagel & Davidson, 1984; Pelham, 1993; Rodgers, Cleveland, van den Oord, & Rowe, 2000; Runyan, 1983) that the data and the analyses used to test a theory should match that theory and that tests of a within-person theory nearly always require within-person data and analyses. Doing research the wrong way, while delaying doing it the right way “until later,” is not acceptable after so many years of discussions of this issue.

Fredrickson and Losada’s (2005) Empirical Study

Although Fredrickson and Losada’s (2005) critical positivity ratio theory is a within-person across-time theory that should be tested using within-person across-time data and analyses (unless the assumptions noted earlier can be shown to hold), it is the case (if the theory is valid) that, at any one time point, persons who have positivity ratios greater than or equal to 2.9013 should be flourishing, whereas persons who have positivity ratios less than 2.9013 should not. This point may have been Fredrickson and Losada’s (2005) rationale for their within-time across-persons study design, except that their independent t-test analyses, strangely and without explanation, reversed the predictor variable (positivity ratio) and the criterion variable (well-being), so that the actual hypothesis being tested was that persons who are flourishing should have positivity ratios greater than or equal to 2.9013, whereas persons who are not flourishing should have positivity ratios less than 2.9013.² (The results of an analysis with well-being dichotomized and used to predict the positivity ratio are unlikely to be the same as those of an analysis with the positivity ratio dichotomized and used to predict well-being.) Given the specific nature of the theory (positivity ratio less than 2.9013: nonflourish; positivity ratio greater than or equal to 2.9013: flourish), it is surprising that Fredrickson and Losada (2005) did not perform a 2 × 2 contingency-table analysis (in which case the reversal of the predictor variable and the criterion variable would not have mattered). In any case, none of these three analysis alternatives constitutes a test of the critical minimum positivity ratio, not the analysis performed by Fredrickson and Losada (2005), contrary to Fredrickson’s (2009, pp. 129-131) claim, and not the other two alternatives, either. I shall return to this point after first describing the details of Fredrickson and Losada’s (2005) empirical study.

Study Design

Although not explicitly mentioned by Fredrickson and Losada (2005), but indicated by Fredrickson (2009, pp. 129-131), the data used to test their hypothesis had been collected from two samples (Ns = 87 and 101) of college-student respondents for some other purpose. Well-being was assessed first using a 33-item measure of “positive psychological and social functioning” (Fredrickson & Losada, 2005, p. 683) adapted from the Mental Health Continuum–Long Form developed by Keyes (2002; available from https://www.pdffiller.com). The adapted measure consisted of six 3-item scales assessing “psychological well-being” (Keyes, 2002, p. 211) and five 3-item scales assessing “social well-being” (Keyes, 2002, p. 212). To be classified as flourishing, a respondent must have achieved a high score on 6 of the 11 scales, where “high score” was defined as being in the top half of the respondent score distribution for Sample 1 and in the top tertile of the respondent score distribution for Sample 2. Fredrickson and Losada’s (2005) choice of the top tertile as a cutpoint for Sample 2 followed Keyes (2002). A more lenient cutpoint was used for Sample 1 because respondents in that sample had been screened for depression. Respondents not classified as flourishing were classified as nonflourishing. Note that Keyes (2002) had classified his respondents as languishing, moderately mentally healthy, or flourishing. Fredrickson (2009) referred to the nonflourishing respondents in Fredrickson and Losada’s (2005) study as “languishing” (pp. 129-131), so it appears that in Fredrickson and Losada’s (2005) study, the moderately mentally healthy respondents were grouped with the languishing respondents, a point that they did not make explicit. Keyes (2002) admitted that his use of tertiles as cutpoints was arbitrary and that the classification of respondents was relative to the sample under consideration. Fredrickson and Losada (2005) made no such admissions.

To assess affect, each respondent completed an online questionnaire each evening to indicate the extent to which he or she felt each of 19 emotions—11 positive and 8 negative—during the past 24 hours. The scale for each emotion ranged from 0 to 4. Each respondent received 1 point for each positive emotion rated 2, 3, or 4, and 1 point for each negative emotion rated 1, 2, 3, or 4. Positive affect and negative affect were determined by tallying positive-emotion points and negative-emotion points, respectively, across 28 days. Then each respondent’s positivity ratio was computed by dividing his or her positive-affect tally by his or her negative-affect tally. Fredrickson and Losada (2005) did not report either the possible or the actual range of the positivity ratio across respondents; the possible range must equal 0/224 (minimum positive affect and maximum negative affect every day) to 308/0 (maximum positive affect and minimum negative affect every day). Supposing that a denominator of 0 can reasonably be replaced with a denominator of 1 (to avoid the division-by-zero problem), the possible range of the positivity ratio must equal 0 to 308.
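The scoring rule just described can be sketched as follows. This is an illustration of the rule as summarized above, not Fredrickson and Losada’s (2005) actual code; the function names are hypothetical, and replacing a zero denominator with 1 is the supposition made in the preceding paragraph rather than a documented step of the original study.

```python
N_POSITIVE, N_NEGATIVE, N_DAYS = 11, 8, 28  # emotions and days, as described above

def daily_points(pos_ratings, neg_ratings):
    """Return (positive points, negative points) for one evening's 0-4 ratings."""
    pos = sum(1 for r in pos_ratings if r >= 2)  # positive emotion counts if rated 2, 3, or 4
    neg = sum(1 for r in neg_ratings if r >= 1)  # negative emotion counts if rated 1, 2, 3, or 4
    return pos, neg

def positivity_ratio(days, zero_denominator=1):
    """Tally points across days and divide positive by negative.

    `zero_denominator` is an assumption: a negative-affect tally of 0 is
    replaced by this value to avoid division by zero.
    """
    pos_total = neg_total = 0
    for pos_ratings, neg_ratings in days:
        p, n = daily_points(pos_ratings, neg_ratings)
        pos_total += p
        neg_total += n
    return pos_total / (neg_total if neg_total else zero_denominator)

# Extreme hypothetical respondents that define the possible range.
best = [([4] * N_POSITIVE, [0] * N_NEGATIVE)] * N_DAYS   # 308 positive, 0 negative points
worst = [([0] * N_POSITIVE, [4] * N_NEGATIVE)] * N_DAYS  # 0 positive, 224 negative points

print(positivity_ratio(worst), positivity_ratio(best))
```

The asymmetric thresholds (2 or higher for positive emotions, 1 or higher for negative emotions) are part of the rule as reported; the sketch simply makes the resulting bounds of 0 and 308 explicit.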

To test their hypothesis that “positivity ratios at or above 2.9 [rounded from 2.9013] also characterize nonpatient samples in flourishing mental health” (p. 683), Fredrickson and Losada (2005) used the independent t test to compare the mean positivity ratio of respondents classified as flourishing with the mean positivity ratio of respondents classified as nonflourishing for each of the two samples, reporting that

[f]or Sample 1, the mean [positivity] ratio for flourishing individuals was 3.2. For the remaining individuals, it was 2.3, t(85) = 2.32, p = .01 (one-tailed), ω² = .05. For Sample 2, the mean [positivity] ratios were 3.4 and 2.1, respectively, t(99) = 1.62, p = .05 (one-tailed), ω² = .02. (p. 684)

No measure of variability for the positivity ratios in the two groups of respondents was reported for either sample.

Study Problems

Fredrickson and Losada’s (2005) empirical study has a number of serious methodological and statistical problems, including (a) the use of a ratio variable in the t-test analyses, (b) drawing conclusions from the independent t-test analyses as if the data were collected in an experimental rather than in a nonexperimental study, (c) the choice of a one-tailed significance test and the misreporting of statistical significance, (d) the neglect of the effect size, and (e) the dichotomization of the well-being variable.

Ratio Variable

Ratio variables are simple to compute but difficult to analyze. If there is no relation between the numerator and the denominator, the computation of the ratio will create one. If there is a relation between the numerator and the denominator, use of the ratio in statistical analysis is generally only appropriate if the graph of the denominator (on the y axis) against the numerator (on the x axis) forms a straight line through the origin. If this condition is not satisfied, the ratio will misrepresent the true relation between the numerator and the denominator. A between-groups analysis (such as an independent t test) may fail to find a difference between groups that exists, or may find a difference between groups that does not exist (Curran-Everett, 2013). Fredrickson and Losada (2005) did not provide any evidence that their positivity ratio satisfied the required condition.

Nonexperimental Study

The independent t test can be used to analyze data collected in either an experimental or a nonexperimental study; the type of data analyzed determines the conclusions that can be drawn from the results of the test. The independent t test and other across-persons statistical techniques assume that persons in an analysis group are exchangeable. In an independent t test, this assumption applies to the persons whose data appear in each of the two groups created by manipulating the levels of the predictor (independent) variable. In an analysis of variance with, say, three levels of one independent variable crossed with four levels of a second independent variable, this assumption applies to the persons whose data appear in each of the 12 groups, and so on. In an experiment, persons are randomized into groups, so that systematic differences between them are assumed to average out. Any remaining differences between persons within a group are considered to be random error. The data aggregated within each group, usually reported as the mean, can then be regarded as though they are representative of a single ideal person who is not affected by any variable not controlled in the study (Sidman, 1960, p. 162). Moreover, the relation between the predictor (independent) variable and the criterion (dependent) variable can be assumed to be causal. These assumptions are not appropriate when persons are not randomized into groups. An unknown “third variable” may be systematically affecting the apparent relation between the predictor variable and the criterion variable.

Fredrickson and Losada’s (2005) study was not an experiment, so their implicit assumptions that the mean of the positivity ratios for each of the two well-being groups in each of the two samples can be assumed to be representative of each person in that group, and that the relation between the positivity ratio and well-being is causal, are not valid. The validity of the conclusions that can be drawn from an independent t test depends on whether there was randomization of persons into groups and not on the fact that this test was used. Fredrickson and Losada (2005) reported only mean positivity ratios without any measures of variability, but the values of t and ω² suggest that there was considerable variability around the mean positivity ratio in each group for each sample; given how close the mean positivity ratios for the nonflourishing group (2.3 and 2.1) and the flourishing group (3.2 and 3.4) in each sample were to the (rounded) tipping-point value of 2.9, it is likely that some of the positivity ratios for the respondents in the nonflourishing group were greater than or equal to the tipping-point value and that some of the positivity ratios for the respondents in the flourishing group were less than the tipping-point value. This may have reflected random error, as suggested by Fredrickson (“impurit[ies]” and “imprecision,” 2009, p. 129), or may have reflected systematic error related to the fact that respondents were not randomized into groups. Unfortunately, the frequencies of respondents with positivity ratios below and above the tipping point in each group were not reported for either sample.

But Waugh and Fredrickson (2006, p. 100) provided, for what appears to be Fredrickson and Losada’s (2005) Sample 2, the frequencies of respondents with positivity ratios less than 2.9 (75/101) and greater than or equal to 2.9 (26/101). Using these frequencies as the row marginals, and the frequencies of nonflourishing (92/101) and flourishing (9/101) respondents reported by Fredrickson and Losada (2005) as the column marginals, of a contingency table allows one to determine the degree of mismatch between the positivity-ratio classification and the well-being classification. For the best and the worst case, the numbers of mismatches equal 17/101 (17%) and 35/101 (35%), respectively. The number of nonflourishing respondents with positivity ratios greater than or equal to 2.9 ranges from 17/92 (18%) to 26/92 (28%). In a clever reverse-engineering of the results of the t tests reported by Fredrickson and Losada (2005), Brown et al. (2014a) estimated that 36% of the nonflourishing respondents in each sample had positivity ratios greater than or equal to 2.9, higher than the maximum of the range estimated with my contingency-table analysis. The reason why their estimate was higher is not clear. It may be that Sample 2 in Fredrickson and Losada’s (2005) empirical study was not in fact the sample used by Waugh and Fredrickson (2006),³ or it may be that Brown et al.’s (2014a) assumptions of equal variances and normal distributions were not valid. In any case, the degree of mismatch between the positivity ratio classification and the well-being classification from either analysis does not constitute impressive support for Fredrickson and Losada’s (2005) deterministic theory. It is unlikely that the degree of mismatch is due only to random error.
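The best-case and worst-case figures in the contingency-table analysis follow from the marginals alone, and the arithmetic can be checked with a short sketch. The function and key names are hypothetical; the marginal frequencies are those quoted above (26 of 101 respondents at or above 2.9, 9 of 101 flourishing).

```python
def mismatch_bounds(n_high_ratio, n_flourishing, n_total):
    """Bound the cells of a 2 x 2 table when only its marginals are known."""
    # a = number of respondents who have a high ratio AND are flourishing.
    a_min = max(0, n_high_ratio + n_flourishing - n_total)
    a_max = min(n_high_ratio, n_flourishing)
    bounds = {}
    for case, a in (("best", a_max), ("worst", a_min)):
        high_but_nonflourishing = n_high_ratio - a
        low_but_flourishing = n_flourishing - a
        bounds[case] = {
            "high_but_nonflourishing": high_but_nonflourishing,
            "low_but_flourishing": low_but_flourishing,
            "mismatches": high_but_nonflourishing + low_but_flourishing,
        }
    return bounds

bounds = mismatch_bounds(n_high_ratio=26, n_flourishing=9, n_total=101)
n_nonflourishing = 101 - 9
for case, cells in bounds.items():
    print(case, cells["mismatches"], "of 101 mismatched;",
          cells["high_but_nonflourishing"], "of", n_nonflourishing,
          "nonflourishing at or above 2.9")
```

The best case places all 9 flourishing respondents in the high-ratio row, leaving 17 mismatches; the worst case places none of them there, giving 26 + 9 = 35 mismatches.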

Significance Tests

For both samples, Fredrickson and Losada (2005) used one-tailed rather than two-tailed statistical-significance tests on their computed t values without explanation or justification and did not indicate whether the decision to use one-tailed tests was made before the analyses were performed, as should have been the case. Presumably, one-tailed tests were used because of the directional nature of the hypothesis. Doing so requires that Fredrickson and Losada (2005) considered whether they would have viewed a difference between the means in the unexpected direction no differently from a difference between the means in the expected direction that was not strong enough to justify rejection of the null hypothesis, which seems unlikely. It is worth noting that the t value for Sample 1 would have been significant with either a one-tailed or a two-tailed test, but the t value for Sample 2 would not have been significant with a two-tailed test. Indeed, despite Fredrickson and Losada’s (2005) claim to the contrary, the t value for Sample 2 was not in fact significant even with a one-tailed test; the p value reported as equaling .05 actually equaled .0542 and did not meet the traditional standard of p < .05 for rejecting the null hypothesis. Elsewhere Fredrickson has implied or asserted that the results of the analyses for both samples were statistically significant (e.g., in her book: Fredrickson, 2009, p. 253; in her presentation to the Danielson Institute, Center for the Study of Religion and Psychology: Fredrickson, 2010, at about 00:29:00; in her reply to Brown et al., 2013: Fredrickson, 2013, pp. 817-818; in her “Correction to Fredrickson and Losada (2005)”: Fredrickson & Losada, 2005/2013, p. 882, Item d; in her letter to the editor of the Chronicle of Higher Education: Fredrickson, August 30, 2013), which was not the case. Of course, this p < .05 standard, although traditional, is arbitrary; too much emphasis can be placed on p values (Rosnow & Rosenthal, 1989). 
The problem here is that Fredrickson and Losada (2005) accepted the arbitrary standard, did not quite meet it for Sample 2, but did not in any way acknowledge not meeting it, either in their original article or in subsequent publications and presentations, but simply asserted that the results of their analyses were significant. Given the unacknowledged not-quite-significant results for Sample 2, one must wonder whether Fredrickson and Losada (2005) chose to use one-tailed tests simply because significance would not have been obtained with a two-tailed test for Sample 2. Brown et al. (2014a) also questioned Fredrickson and Losada’s (2005) use of one-tailed tests⁴ and noted that the p value for Sample 2 equaled .0542, not .05 as reported.
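The p values discussed in this section can be recomputed from the reported t values and degrees of freedom. The sketch below avoids any external statistics library by integrating the Student t density numerically with Simpson's rule; the function names and the number of integration steps are choices made for this illustration, and a statistics package would normally be used instead.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def upper_tail_p(t, df, steps=20000):
    """One-tailed p value P(T > t) for t >= 0, via Simpson's rule on [0, t]."""
    h = t / steps  # `steps` must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 - total * h / 3  # half the mass lies above 0 by symmetry

# Reported test statistics: t(85) = 2.32 for Sample 1, t(99) = 1.62 for Sample 2.
for label, t, df in (("Sample 1", 2.32, 85), ("Sample 2", 1.62, 99)):
    one_tailed = upper_tail_p(t, df)
    print(label, "one-tailed p =", round(one_tailed, 4),
          "two-tailed p =", round(2 * one_tailed, 4))
```

Doubling the one-tailed value gives the two-tailed p value because the t distribution is symmetric about zero.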

Effect Size

Fredrickson and Losada (2005) reported an effect size for the t-test analysis of each of their two samples, although after reporting them, they completely ignored them. Given the strength of Fredrickson and Losada’s (2005) assertion that “individuals . . . must meet or surpass a specific positivity ratio to flourish” (p. 681), these effect sizes—ω² of .05 and .02 for Sample 1 and Sample 2, respectively—are surprisingly low by standard rules of thumb. These values indicate that 5% and 2%, respectively, of the variability in positivity ratios can be accounted for by the two-group well-being classification. (This interpretation does not make much sense because the predictor variable and the criterion variable have been reversed.) It is certainly the case that small effect sizes can have important real-world effects. For example, in the influential study examining the role of aspirin in the primary prevention of cardiovascular disease, the correlation between aspirin use and myocardial infarction (heart attack) was only .034 (computed from data reported by the Steering Committee of the Physicians’ Health Study Research Group, 1988, p. 262, Table 1). Ideally, the substantive significance of a result, as reflected by an effect size, should be interpreted in a meaningful context rather than by rules of thumb.
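The reported ω² values can be approximately reproduced from the reported t values and sample sizes. The sketch assumes the common two-group estimate ω² = (t² − 1) / (t² + N − 1), where N is the total sample size; Fredrickson and Losada (2005) did not state which formula they used, so this is an assumption, and the function name is hypothetical.

```python
def omega_squared(t, n_total):
    """Estimate omega-squared for an independent two-group t test.

    Assumes the common estimate (t^2 - 1) / (t^2 + N - 1), with N the
    total number of respondents across both groups.
    """
    t2 = t * t
    return (t2 - 1) / (t2 + n_total - 1)

# Reported values: Sample 1, t = 2.32, N = 87; Sample 2, t = 1.62, N = 101.
for label, t, n in (("Sample 1", 2.32, 87), ("Sample 2", 1.62, 101)):
    print(label, "omega-squared =", round(omega_squared(t, n), 3))
```

Under this assumed formula, the estimates round to the reported .05 and .02.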

The important point here is that the implications of these ω² values were not discussed at all by Fredrickson and Losada (2005) and not even reported in the other publications and presentations describing their empirical study. Rather, these other publications and presentations described the results of Fredrickson and Losada’s (2005) empirical study as if they were deterministic: a positivity ratio less than 2.9 (or 3.0 in some citations) results in nonflourishing (or languishing), a positivity ratio greater than or equal to 2.9 (or 3.0) results in flourishing, period. Apparently, Fredrickson and Losada (2005) considered any deviation from this pattern to be due only to difficulties in quantifying affect (“computed positivity ratios invariably reflect the conceptual and temporal resolution of the underlying affect-measurement instruments,” p. 685), and presumably, although not mentioned, to similar difficulties in assessing well-being, that should not be taken into account.

Dichotomization of Well-Being

In commenting on the results of their empirical study, Fredrickson and Losada (2005) stated that “[m]ore critical to our hypothesis, however, in each sample, these mean [positivity] ratios flanked the 2.9 ratio” (p. 684). Fredrickson (2013), in her reply to Brown et al. (2013), wrote that

[a]lthough the [mean positivity] ratios obtained in each of the two samples closely flank the critical positivity ratio pinpointed by Losada’s mathematical work, to the extent that Losada’s mathematical work may have been flawed, inappropriately applied, or both, the apparent empirical support for Losada’s critical “tipping point” ratio offered by these data may have reflected chance, albeit chance striking twice. (pp. 817-818)

Fredrickson (2013) seemed to be questioning Brown et al.’s (2013) critique, implying that it would be unusual to obtain mean positivity ratios flanking the theorized critical minimum positivity ratio for not just one, but two, samples of respondents if there were no such critical minimum positivity ratio.

Fredrickson’s (2013) reasoning was flawed. Even if the mathematical work were accurate and there existed a critical minimum positivity ratio equal to 2.9013, the fact that the two mean positivity ratios obtained in each of the two samples closely flanked the theorized critical minimum positivity ratio would not have provided empirical support for a tipping point between nonflourishing and flourishing. Indeed, the design of Fredrickson and Losada’s (2005) empirical study made it impossible to test for the existence of the theorized tipping point. That would be so regardless of which of the three alternative analyses described earlier were used.

The reason for this impossibility is that Fredrickson and Losada (2005) used the continuous scores from the scales of the measure of well-being to create two groups of respondents, those who were flourishing and those who were not. Dichotomizing a continuous predictor variable (or both a continuous predictor variable and a continuous criterion variable) completely obscures the existence of a tipping point or any other nonlinearity; two points determine a line. The results of Fredrickson and Losada’s (2005) empirical study were compatible with either a linear or a nonlinear relation between well-being and the positivity ratio. Brown et al. (2014a) made the same point.
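A small simulation may make this point concrete. In the sketch below, well-being is generated from the positivity ratio under two different models, one strictly linear and one with a step at 2.9, and is then dichotomized at a cutoff. All names and parameter values (ratios uniform on 1 to 5, a cutoff of 3.0, unit-variance noise) are assumptions chosen for illustration and do not come from any of the studies discussed.

```python
import random
from statistics import mean

random.seed(0)
CUTOFF = 3.0    # hypothetical cutoff used to dichotomize well-being
TIPPING = 2.9   # location of the step in the tipping-point model


def group_means(wellbeing_model, n=5000):
    """Return mean positivity ratios of the (low, high) well-being groups."""
    low, high = [], []
    for _ in range(n):
        ratio = random.uniform(1.0, 5.0)
        wellbeing = wellbeing_model(ratio) + random.gauss(0.0, 1.0)
        (high if wellbeing > CUTOFF else low).append(ratio)
    return mean(low), mean(high)


def linear(ratio):
    """Well-being rises steadily with the ratio; no tipping point."""
    return ratio


def step(ratio):
    """Well-being jumps at the tipping point and is flat elsewhere."""
    return 4.0 if ratio >= TIPPING else 2.0


linear_means = group_means(linear)
step_means = group_means(step)
print("linear model:", linear_means)
print("step model:  ", step_means)
```

Under these assumptions, both models produce a lower group mean below 2.9 and a higher group mean above it, so a pair of flanking means cannot by itself distinguish a linear relation from a tipping point.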

It is puzzling that Fredrickson and Losada (2005) unnecessarily dichotomized well-being, thereby making it impossible to detect the phenomenon that was the focus of their theory and their article. In general, dichotomization of a continuous variable, whether at the median or at some other cutpoint, is problematical both conceptually and statistically. Fredrickson and Losada (2005) did not provide an explicit reason for their dichotomization, but their theory (positivity ratio less than 2.9013: nonflourish; positivity ratio greater than or equal to 2.9013: flourish) suggests that they believed that there actually existed two distinct classes of respondents (one not flourishing, one flourishing), that assignment of respondents to these classes better represented well-being than would the continuous well-being variable, and that therefore an analysis using the classification rather than the continuous variable was more appropriate. Conceptually, the question arises as to whether these two groups of respondents are really distinct groups or whether they are simply arbitrary groups defined only by partitioning the sample of respondents according to their scores on the scales of the measure of well-being. Constructing groups by partitioning on the basis of obtained scores does not necessarily result in groups that are psychologically meaningful or valid. Whether a psychological variable is more appropriately regarded as a continuum or as a classification is an empirical question and should be determined using a method such as cluster analysis, mixture modeling, latent class analysis, or various taxometric procedures. 
Statistically, dichotomizing a continuous variable discards information, often resulting in lessened statistical power, decreased effect size, lessened reliability, the inability to detect anything other than a linear relation between the variables under consideration, and the inaccurate characterization of respondents⁵ (MacCallum, Zhang, Preacher, & Rucker, 2002). Although there may be some unusual situations that justify the dichotomization of the predictor variable, the criterion variable, or both, Fredrickson and Losada’s (2005) empirical study does not seem to be one of them.

Note that, had the correct within-person across-time study design been adopted instead of the incorrect within-time across-persons design, dichotomization of the well-being variable would not have resulted in it being impossible to detect a tipping point between nonflourishing and flourishing (assuming that the positivity ratio and well-being were assessed at multiple time points; two time points are not adequate for studies of developmental processes or change, Rogosa, 1995). This fact should not be construed as a recommendation for dichotomization, however.

Other Studies as Empirical Evidence

In her reply to Brown et al. (2013), Fredrickson (2013; see also Anthony, January 18, 2014) insisted that substantial empirical evidence exists both for critical minimum and maximum positivity ratios and, more generally, for a (unspecified) nonlinear relation between the positivity ratio and well-being, describing several studies that she believed provided such evidence (Diehl, Hay, & Berg, 2011; Gottman, 1994; Larsen & Prizmic, 2008; Rego, Sousa, Marques, & Pina e Cunha, 2012; Schwartz et al., 2002; Shrira et al., 2011; Waugh & Fredrickson, 2006).

Despite Fredrickson’s (2013) claim to the contrary, none of these studies provided empirical support for the existence of a critical minimum and/or maximum positivity ratio or for a more general nonlinear relation between the positivity ratio and well-being. As noted earlier, Fredrickson and Losada’s (2005) theory of critical positivity ratios clearly described a psychological process that occurs within a person across time and should be tested using within-person across-time data and analyses, unless it can be demonstrated that respondents are exchangeable and that the structure of the across-persons variation between the positivity ratio and well-being is the same as the structure of the within-person variation. All seven of these studies used within-time across-persons data and analyses, and for that reason alone, did not provide empirical support for the existence of a critical minimum and/or maximum positivity ratio or for a more general nonlinear relation between the positivity ratio and well-being. None of these seven studies demonstrated that their respondents were exchangeable; indeed, the standard deviations or standard errors reported for the positivity ratios in the studies that used the positivity ratio as a criterion variable predicted by two or three well-being groups (Diehl et al., 2011; Schwartz et al., 2002; Waugh & Fredrickson, 2006, pp. 101-102) indicate that there was substantial variability of the positivity ratio around the means of the well-being groups. Because respondents were not randomized into these groups, there is no basis for assuming that this variability was purely random. None of these studies showed any awareness of the exchangeability assumption or of the assumption that the structure of the across-persons variation between the positivity ratio and well-being must be the same as the structure of the within-person variation. 
Some articles (Diehl et al., 2011; Rego et al., 2012; Shrira et al., 2011) did list their cross-sectional design as a limitation of their research and recommend the use of a longitudinal design, but for a reason or reasons other than the need for the data and the analyses used to test a theory to match that theory (e.g., for determining the causal direction of the relation between the positivity ratio and well-being or for examining age and cohort effects).

The seven studies used a variety of within-time across-persons analyses to examine the relation between the positivity ratio and well-being. Two of these studies (Gottman, 1994; Schwartz et al., 2002) predated the Fredrickson and Losada (2005) article and so did not examine the theorized critical positivity ratios per se. Gottman’s (1994, pp. 93-95, 118, Chapter 10) book developed his balance theory of marriage, finding that husbands, wives, and couples in “regulated” and “nonregulated” marriages had positivity ratios of about 5 and less than 1, respectively; regulated marriages were defined as those for which the cumulative difference between emotionally positive and emotionally negative utterances increased for both spouses over the course of a conversation about a marital problem, and all other marriages were classified as nonregulated. Fredrickson and Losada (2005, p. 683) asserted that these ratios provided empirical support for their theorized critical minimum positivity ratio because they flanked 2.9. But in a second study, Gottman (1994, Chapter 13) altered his measure of affect to include more negative items and then found that husbands and wives in regulated marriages no longer had positivity ratios of about 5 but instead of about 2.4 and 0.63, respectively, a point not acknowledged by Fredrickson and Losada (2005) or Fredrickson (2009, 2013). Schwartz’s (1997) balanced states of mind model proposed a number of cognitive–affective setpoints linking the ratio of positive affect to total affect to pathological, normal, and optimal psychological functioning. The setpoints for normal and optimal functioning were determined to be .72, with a range of .67 to .77, and .81, with a range of .78 to .84, respectively. In an empirical study of men who had undergone either cognitive–behavioral therapy or psychotherapy for depression and entered remission, Schwartz et al. (2002) found that those who had achieved optimal functioning according to clinical criteria had affect balances that averaged .81, whereas those who had achieved normal functioning had affect balances that averaged .70. Fredrickson and Losada (2005, p. 681) and Fredrickson (2009, pp. 133, 253) reported these ratios to be about 4.3 and 2.3, respectively, stating that they had algebraically transformed Schwartz et al.’s (2002) ratios of positive affect to total affect into their preferred ratios of positive affect to negative affect, and again asserted that these ratios provided empirical support for their theorized critical minimum positivity ratio because they flanked 2.9. It is not clear how this transformation was accomplished. In addition to reporting mean affect–balance ratios, Schwartz et al. (2002) reported the means of positive affect and negative affect. If Fredrickson and Losada (2005) computed their preferred positivity ratios by dividing mean positive affect by mean negative affect, this would not be correct; the ratio of the means is not mathematically equivalent to the mean of the ratios. In any case, as noted earlier with regard to Fredrickson and Losada’s (2005) empirical study, even if the mean positivity ratios did flank 2.9, neither the results of Gottman (1994) nor those of Schwartz et al. (2002) can be considered empirical evidence of a tipping point between nonflourishing and flourishing or of a nonlinear relation between well-being and the positivity ratio because the two-group predictor variables obscured any nonlinearity that might exist.
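The distinction between the ratio of the means and the mean of the ratios is easy to verify numerically. The following sketch uses three hypothetical respondents; the affect scores are invented for illustration and are not data from Schwartz et al. (2002) or any other study.

```python
from statistics import mean

# Hypothetical positive- and negative-affect scores for three respondents.
positive = [6.0, 9.0, 12.0]
negative = [1.0, 3.0, 6.0]

# Ratio of the group means: one division, after averaging.
ratio_of_means = mean(positive) / mean(negative)

# Mean of the individual ratios: one division per respondent, then average.
mean_of_ratios = mean(p / n for p, n in zip(positive, negative))

print(round(ratio_of_means, 2), round(mean_of_ratios, 2))
```

With these invented scores the two quantities are about 2.7 and 3.67, so a group-level positivity ratio obtained by dividing mean positive affect by mean negative affect need not match the average of the respondents' own ratios.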

Two other studies also adopted the strategy of sorting respondents into groups and then comparing the mean positivity ratios of those groups (Diehl et al., 2011; Waugh & Fredrickson, 2006, pp. 101-102). Waugh and Fredrickson (2006, pp. 101-102) split their respondents into two groups, those with high (0.5 standard deviation above the mean) and those with low (0.5 standard deviation below the mean) “relationship building.” Respondents with relationship building between these cutpoints were omitted from the analysis. The high-relationship-building group and the low-relationship-building group had mean positivity ratios of 2.91 and 2.13, respectively. No independent t test was reported. Like Fredrickson and Losada (2005), Waugh and Fredrickson (2006, pp. 101-102) asserted that the fact that the two mean positivity ratios flanked the critical minimum positivity ratio supported their theory. But, as noted earlier, dichotomizing the predictor variable completely obscures the existence of a tipping point or any other nonlinearity. Also, the closeness of the mean positivity ratio for the high-relationship-building group (2.91) to the critical minimum positivity ratio (2.9), the omission of some respondents from the analysis, and the lack of an independent t test make one wonder whether the groups were constructed specifically to obtain a mean positivity ratio greater than or equal to 2.9 for the high-relationship-building group. Diehl et al.’s (2011) study had three (languishing, moderately mentally healthy, and flourishing) rather than two well-being groups for each of three age groups (young adults, middle-aged adults, and older adults). For the middle-aged and the older adults, the mean positivity ratios were greater than or equal to 2.9 for all three well-being groups. For the young adults, the mean positivity ratios equaled 1.6, 3.0, and 4.3 for the languishing, moderately mentally healthy, and flourishing groups, respectively. Diehl et al. (2011) commented that “in our sample, the value of 2.9 sat exactly on the cusp between languishing and flourishing mental health among young adults, with moderately mentally healthy young adults having a mean positivity ratio of 3” (p. 890). Fredrickson (2013) then asserted that “the data for young adults in Diehl and colleagues’ (2011) sample mapped well onto the prediction . . . that the critical positivity ratio that sets flourishers apart from others is about 3:1” (p. 819). But note that the mean positivity ratios for the three groups (1.6, 3.0, 4.3) were nearly equally spaced. Although Diehl et al.’s (2011) use of a three-group classification of well-being as the predictor variable eliminated the possibility of doing a formal trend analysis following their multivariate analysis of variance to demonstrate a quadratic trend,⁶ these mean positivity ratios provided no hint that the relation between well-being and the positivity ratio was anything other than linear for the young adults. Thus, the results of the studies by Waugh and Fredrickson (2006) and Diehl et al. (2011) provided no empirical support for a tipping point between nonflourishing and flourishing or for a nonlinear relation between well-being and the positivity ratio.
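The near-equal spacing of the three young-adult means can be checked with simple arithmetic. The sketch below is purely descriptive and is not a formal trend analysis; it assumes, for illustration only, that the three well-being groups can be treated as equally spaced ordered levels.

```python
# Mean positivity ratios for young adults as quoted in the text above.
means = {
    "languishing": 1.6,
    "moderately mentally healthy": 3.0,
    "flourishing": 4.3,
}
low, mid, high = means.values()

# Successive differences between adjacent groups.
step_low_to_mid = round(mid - low, 2)
step_mid_to_high = round(high - mid, 2)

# Second difference: zero for perfectly equal spacing (assumes equally
# spaced group levels, which is an illustrative simplification).
second_difference = round(low - 2 * mid + high, 2)

print(step_low_to_mid, step_mid_to_high, second_difference)
```

The two steps are 1.4 and 1.3, and the second difference is -0.1, which is small relative to the overall difference of 2.7 between the lowest and highest groups.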

The articles by Waugh and Fredrickson (2006, pp. 100-102) and Diehl et al. (2011) also included analyses that correctly used the positivity ratio as the predictor variable and some measure of well-being as the criterion variable. Waugh and Fredrickson (2006, pp. 100-101) dichotomized respondents according to the 2.9 tipping point theorized by Fredrickson and Losada (2005) and then used independent t tests to predict two continuous well-being variables (residual change from Time 1 to Time 2 in “self-other overlap” and in “complex understanding”). They followed each independent t test with two single-sample t tests, one for the group of respondents with positivity ratios greater than or equal to 2.9 and the other for the group of respondents with positivity ratios less than 2.9. The single-sample t tests showed a significant mean increase in each criterion variable from Time 1 to Time 2 for the group of respondents with positivity ratios greater than or equal to 2.9 but not for the group of respondents with positivity ratios less than 2.9.⁷ Fredrickson (2013, pp. 817-818) considered this pattern to be empirical support for a “critical change point” (apparently equated with a tipping point). But, as noted earlier, dichotomizing a continuous predictor variable completely obscures the existence of a tipping point or any other nonlinearity in the relation between the positivity ratio and well-being. It would have been more appropriate to have used both the positivity ratio and well-being as continuous variables in a segmented regression analysis and shown that (a) the first segment had essentially a flat (zero or nonsignificant positive) slope, (b) the second segment had a significant positive slope, and (c) the breakpoint occurred at a positivity ratio of about 2.9.
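One simple way to carry out a segmented regression of this kind is to fit a continuous two-segment line for each of a grid of candidate breakpoints and keep the breakpoint with the smallest residual sum of squares. The sketch below does this with NumPy on synthetic data. The data, the function name, and the grid are all assumptions made for illustration, and the sketch omits the standard errors and significance tests that criteria (a) and (b) would require in a real analysis.

```python
import numpy as np


def fit_segmented(x, y, candidates):
    """Grid-search one breakpoint for a continuous two-segment line."""
    best = None
    for c in candidates:
        # Columns: intercept, slope of segment 1, change in slope after c.
        design = np.column_stack([np.ones_like(x), x, np.maximum(0.0, x - c)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((y - design @ coef) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(c), coef)
    _, break_at, (_intercept, slope1, delta) = best
    return break_at, float(slope1), float(slope1 + delta)


# Synthetic data with a built-in breakpoint at 2.9 (illustration only).
rng = np.random.default_rng(0)
ratio = np.linspace(1.0, 5.0, 81)
wellbeing = (
    2.0
    + 1.5 * np.maximum(0.0, ratio - 2.9)
    + rng.normal(0.0, 0.1, ratio.size)
)

break_at, slope_before, slope_after = fit_segmented(
    ratio, wellbeing, np.arange(1.5, 4.5, 0.05)
)
print(break_at, slope_before, slope_after)
```

Because the synthetic data were generated with a flat first segment and a breakpoint at 2.9, the fit should recover a near-zero first slope, a clearly positive second slope, and a breakpoint near 2.9; real data supporting the theory would have to show the same pattern.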

Diehl et al. (2011) used discriminant function analysis to predict the mental health status (well-being) of their respondents from years of education, self-rated health, life satisfaction, physical symptoms, and the positivity ratio. This analysis was not capable of providing evidence of a tipping point between nonflourishing and flourishing or of a nonlinear relation between the positivity ratio and well-being.

Diehl et al. (2011) also performed a contingency-table analysis (positivity ratio: less than 2.9, greater than or equal to 2.9, by well-being group: languishing, moderately mentally healthy, flourishing). If one collapses the moderately mentally healthy group into the languishing group to create a nonflourishing group, as did Fredrickson and Losada (2005), then 88/239 (37%) of the respondents in the entire sample had a mismatch between their positivity-ratio classification (less than 2.9, greater than or equal to 2.9) and their well-being classification (nonflourishing, flourishing). Of those respondents in the nonflourishing group, 78/178 (44%) had positivity ratios greater than or equal to 2.9. For the young adults, 25/81 (31%) of the respondents had a mismatch between their positivity-ratio classification and their well-being classification. Of those respondents in the nonflourishing group, 18/65 (28%) had positivity ratios greater than or equal to 2.9 (data kindly provided by Manfred Diehl, February 19, 2014). The results of these analyses are similar to those of the reverse-engineered analyses of Brown et al. (2014a) of Fredrickson and Losada’s (2005) two samples and my contingency-table analysis of Fredrickson and Losada’s (2005) Sample 2. Although my contingency-table analyses of Diehl et al.’s (2011) data yield significant chi-square values, the high rates of mismatch between the positivity ratio classification and the well-being classification do not provide convincing support for Fredrickson and Losada’s (2005) deterministic theory.
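The percentages reported above follow directly from the counts. The short sketch below recomputes them; the counts are those quoted in the preceding paragraph, and the dictionary labels are descriptive names introduced here for illustration.

```python
# (count, total) pairs as quoted in the text.
counts = {
    "entire sample, mismatched": (88, 239),
    "entire sample, nonflourishing with ratio >= 2.9": (78, 178),
    "young adults, mismatched": (25, 81),
    "young adults, nonflourishing with ratio >= 2.9": (18, 65),
}

# Convert each pair to a whole-number percentage.
percentages = {
    label: round(100 * count / total)
    for label, (count, total) in counts.items()
}

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```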

The remaining studies (Larsen & Prizmic, 2008; Rego et al., 2012; Shrira et al., 2011) used both the positivity ratio and a well-being variable(s) as continuous variables in correlation and regression analyses with the positivity ratio as the predictor variable. Larsen and Prizmic (2008) reviewed a study by Larsen (2002) that concluded that the average person must have three good days (positive affect exceeds negative affect) to one bad day (negative affect exceeds positive affect) to maintain average well-being and that the standardized weight for negative affect was about three times that for positive affect in a regression analysis predicting well-being from positive affect and negative affect. Neither statistic is mathematically equivalent to a positivity ratio of 2.9 or 3.0 as computed by Fredrickson and Losada (2005) and Fredrickson (2009, Chapter 7). Moreover, average well-being cannot be considered flourishing.

Rego et al. (2012) found a concave curvilinear relation (inverted-U curve) between the positivity ratio and creativity for retail-shop assistants. The turning point of the curve equaled 3.6, quite different from Fredrickson and Losada’s (2005) theorized critical maximum positivity ratio of 11.6346. Fredrickson (2013), apparently equating the turning point of a curvilinear function⁸ with a discontinuous tipping point, suggested that this difference may have resulted from Rego et al. (2012) having restricted the range of their positivity ratio to 1/7 to 7/1. Be that as it may, examination of the data in Rego et al.’s (2012) scatterplot (p. 259, Figure 2) of the relation between the positivity ratio and creativity showed no evidence at all for a tipping point and very little visual evidence for concave curvilinearity, conclusions also drawn by Brown et al. (2014a). The overall pattern of the data points suggests that the sample may have consisted of two different kinds of respondents, those who had the ability to be very creative and those who did not. A nonhomogeneous sample can result in an apparent but spurious curvilinear relation between the predictor variable and the criterion variable. In any case, no evidence that the respondents were exchangeable and that the structure of the across-persons variation between the positivity ratio and creativity could be expected to be the same as the structure of the within-person variation was provided, so the results of Rego et al.’s (2012) analysis cannot be considered empirical support for Fredrickson and Losada’s (2005) within-person across-time theory. Rego et al. (2012) themselves commented that “being carried out at a single moment, the study does not capture the dynamics that occur over the course of time involving changes in affective states and their effects on . . . creativity” (pp. 262-263).

Shrira et al. (2011) predicted general psychological distress and illness cognitions (helplessness, acceptance, and disease benefits) from the positivity ratio for gastric-cancer outpatients. The relations between the positivity ratio and general psychological distress, helplessness, acceptance, and disease benefits were convex curvilinear (U curve), linear, concave curvilinear (inverted-U curve), and concave curvilinear, respectively. The turning point of the curve equaled about 3.0 for each of the criterion variables having a curvilinear relation with the positivity ratio. Shrira et al. (2011) also predicted traumatic distress and its components (intrusive thoughts, hypervigilance, and avoidance) from the positivity ratio for hospital personnel exposed to missile attacks during the Second Lebanon War. The relations between the positivity ratio and traumatic distress, intrusive thoughts, and hypervigilance were all convex curvilinear; the relation between the positivity ratio and avoidance (apparently) was linear. The turning point of the curve equaled about 3.0 for the relation between the positivity ratio and traumatic distress but was not reported for the other curvilinear relations. The turning points of 3.0 for the two samples were quite a bit lower than Fredrickson and Losada’s (2005) theorized critical maximum (or minimum, for a U curve) positivity ratio of 11.6346. Fredrickson (2013) attributed this difference to Shrira et al. (2011) having restricted the range of their positivity ratio to 1/4 to 4/1, concluding from the results of the studies of Rego et al. (2012) and Shrira et al. (2011) that “an inverted-U inflection point exists. Where precisely it falls remains an important target for future research” (p. 818). In fact, however, floor and ceiling effects can make relations between variables that are truly linear appear to be curvilinear. Unlike Rego et al. (2012), Shrira et al. (2011) graphed only the form of the relation between the positivity ratio and each criterion variable, and not the data points themselves, so that it is not possible to ascertain the reason for the obtained curvilinear relations and the strangely consistent values of the turning points. Suffice it to say that the evidence in Shrira et al.’s (2011) study for curvilinearity was weak, with the increases in percentage-of-variance-accounted-for (R²) achieved by fitting a curve instead of a line to the data being quite small, as admitted by Shrira et al. (2011) and also noted by Brown et al. (2014a). Again, no evidence that the respondents were exchangeable and that the structure of the across-persons variation between the positivity ratio and the criterion variables could be expected to be the same as the structure of the within-person variation was provided, so the results of Shrira et al.’s (2011) analyses also cannot be considered empirical support for Fredrickson and Losada’s (2005) within-person across-time theory.
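The way in which a ceiling effect can produce apparent curvilinearity can be shown with a small sketch. Here a criterion variable that is a strictly linear function of the predictor is recorded on a scale that cannot exceed 5, and a quadratic curve is then fitted to both the unbounded and the bounded versions. All values are synthetic and are assumptions made for illustration; they are not data from Shrira et al. (2011) or Rego et al. (2012).

```python
import numpy as np

# Hypothetical predictor values (for example, a positivity ratio).
x = np.linspace(0.0, 10.0, 101)

# A truly linear criterion, which would run from 1 to 6 if unbounded.
latent = 1.0 + 0.5 * x

# The same criterion as recorded on a scale bounded at 1 and 5.
observed = np.clip(latent, 1.0, 5.0)

# Leading (quadratic) coefficient of a second-degree polynomial fit.
quad_latent = np.polyfit(x, latent, 2)[0]
quad_observed = np.polyfit(x, observed, 2)[0]

print(quad_latent, quad_observed)
```

The quadratic coefficient is essentially zero for the unbounded variable but negative for the bounded one, so a fitted curve bends downward even though the underlying relation is linear.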

With regard to the analyses in the Rego et al. (2012) and Shrira et al. (2011) articles, Brown et al. (2014a) remarked that

the regression curve of “degree of flourishing” cannot possibly be a perfectly linear function of the positivity ratio, for the simple reason that the “degree of flourishing” is bounded (for instance, Rego et al.’s measure of “creativity” runs from 1 to 5) whereas the positivity ratio can in principle become arbitrarily large. (p. 631)

This statement is not correct. The positivity ratio, as computed by Rego et al. (2012) and Shrira et al. (2011), had an upper bound of 7 and 4, respectively. (The positivity ratio, as computed by Fredrickson and Losada, 2005, Waugh and Fredrickson, 2006, and Diehl et al., 2011, however, could in principle become arbitrarily large.) Brown et al. (2014a) then pointed out that the curvilinearity in the relation between the positivity ratio and well-being in the articles by Rego et al. (2012) and Shrira et al. (2011) could be reduced substantially by replacing the positivity ratio P/N by the positivity proportion P/(P + N), which contains the same information and eliminates any division-by-zero problem (which could not have occurred in the Rego et al., 2012, and Shrira et al., 2011, articles). Although this is the case, the use of P/(P + N) rather than P/N does not eliminate the difficulties of using a ratio variable in statistical analyses discussed earlier.
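The sense in which P/(P + N) contains the same information as P/N can be seen from the algebra: if r = P/N, then P/(P + N) = r/(1 + r), and conversely r = p/(1 − p) for a proportion p less than 1. The sketch below implements both directions; the function names are introduced here for illustration only.

```python
def ratio_to_proportion(ratio: float) -> float:
    """Convert a positivity ratio P/N into the proportion P/(P + N)."""
    return ratio / (1.0 + ratio)


def proportion_to_ratio(proportion: float) -> float:
    """Convert P/(P + N) back into P/N; undefined when N is zero."""
    if proportion >= 1.0:
        raise ValueError("P/N is undefined when the proportion equals 1")
    return proportion / (1.0 - proportion)


# The upper bounds of 7 and 4 mentioned above, expressed as proportions.
print(ratio_to_proportion(7.0), ratio_to_proportion(4.0))
```

The proportion is confined to the interval from 0 to 1 however large the ratio becomes, which is why it avoids the division-by-zero problem.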

For the most part, Fredrickson (2013) seems to have accepted without question the results of the empirical studies that she viewed as providing evidence for Fredrickson and Losada’s (2005) proposed tipping-point values (Diehl et al., 2011, young adults; Gottman, 1994; Larsen & Prizmic, 2008; Schwartz et al., 2002; Waugh & Fredrickson, 2006), and questioned only the results of those empirical studies with results disconfirming those values (Rego et al., 2012; Shrira et al., 2011). As noted earlier, the basis of her criticism was that the disconfirming empirical studies did not use the same method of assessing positive and negative affect and constructing the positivity ratio that Fredrickson and Losada (2005) had used. More specifically, she criticized the use of rating scales that restricted the maximum value of the positivity ratio to 7 (Rego et al., 2012) or 4 (Shrira et al., 2011). But Fredrickson and Losada’s (2005) method of assessing positive and negative affect and constructing a positivity ratio was unusual.

Typically, an instrument assessing affect contains positive-affect items and negative-affect items to which a respondent replies using a rating scale. Then the value of positive affect and of negative affect is computed by taking the mean of those replies across the positive-affect items and the negative-affect items, respectively. Thus, the maximum value of either type of affect is the maximum value of the rating scale. The positivity ratio would be computed by dividing mean positive affect by mean negative affect. As explained earlier, in Fredrickson and Losada’s (2005) empirical study, each respondent completed an online questionnaire to indicate the extent to which he or she felt each of 11 positive and 8 negative emotions on a scale of 0 to 4. Each respondent received 1 point for each positive emotion rated 2, 3, or 4, and 1 point for each negative emotion rated 1, 2, 3, or 4. Positive affect and negative affect were determined by tallying positive-emotion points and negative-emotion points, respectively, across 28 days. On a daily basis, the maximum value of either type of affect would be the number of items assessing that affect; it would not be the maximum value of the rating scale.
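The tallying procedure just described can be sketched in a few lines. The thresholds (2 or above for positive emotions, 1 or above for negative emotions) are those stated above; the function names and the two days of ratings are hypothetical and serve only to illustrate the computation.

```python
POSITIVE_THRESHOLD = 2  # a positive emotion earns a point if rated 2, 3, or 4
NEGATIVE_THRESHOLD = 1  # a negative emotion earns a point if rated 1 to 4


def daily_tally(positive_ratings, negative_ratings):
    """Count the emotions that meet the threshold on a single day."""
    pos = sum(r >= POSITIVE_THRESHOLD for r in positive_ratings)
    neg = sum(r >= NEGATIVE_THRESHOLD for r in negative_ratings)
    return pos, neg


def positivity_ratio(days):
    """Divide the summed positive tally by the summed negative tally."""
    tallies = [daily_tally(pos, neg) for pos, neg in days]
    total_pos = sum(p for p, _ in tallies)
    total_neg = sum(n for _, n in tallies)
    if total_neg == 0:
        return None  # ratio undefined without any negative points
    return total_pos / total_neg


# Two hypothetical days: 11 positive and 8 negative ratings on a 0-4 scale.
days = [
    ([4, 3, 2, 1, 0, 2, 2, 0, 1, 3, 4], [0, 1, 0, 0, 2, 0, 0, 0]),
    ([2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0, 0]),
]
print(positivity_ratio(days))
```

In this invented example the positive tallies are 7 and 5, the negative tallies are 2 and 1, and the resulting positivity ratio is 12/3 = 4.0. Note that the intensity of a rating above its threshold has no effect on the tally.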

Moreover, in Fredrickson and Losada’s (2005) method, the possible range of the positivity ratio depends on the number of days for which information about positive and negative affect is collected. If information is collected for just 1 day, then the tallies for positive affect and negative affect can range from 0 to 11 and 0 to 8, respectively, so that the maximum positivity ratio (assuming at least 1 count for negative affect) equals 11. If the information is collected for 2 days, then the tallies for positive affect and negative affect can range from 0 to 22 and 0 to 16, respectively, so that the maximum positivity ratio equals 22, and so on. In short, the possible range of the positivity ratio is heavily dependent on the number of days for which the affect information is collected. But the values of the lower and the upper tipping point are fixed at 2.9013 and 11.6346, respectively, regardless of the number of days for which the affect information is collected. This is questionable.
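The dependence of the attainable range on the number of days can be made explicit. The sketch below computes the largest possible positivity ratio for a given number of days, assuming, as above, 11 positive items and a negative tally of at least 1, and compares it with the theorized upper tipping point. The function and constant names are introduced here for illustration.

```python
POSITIVE_ITEMS = 11
UPPER_TIPPING_POINT = 11.6346


def max_positivity_ratio(days: int) -> float:
    """Largest attainable ratio when the negative tally is at least 1."""
    max_positive_tally = POSITIVE_ITEMS * days
    min_negative_tally = 1
    return max_positive_tally / min_negative_tally


for days in (1, 2, 28):
    largest = max_positivity_ratio(days)
    reachable = largest >= UPPER_TIPPING_POINT
    print(days, largest, reachable)
```

For 1, 2, and 28 days the maximum is 11, 22, and 308, respectively. With a single day of data, the largest attainable ratio of 11 falls below 11.6346, so the theorized upper tipping point could not even be reached.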

Clearly, the method of assessing positive and negative affect and of computing the positivity ratio will affect the value of the positivity ratio. But the mathematical model proposed by Losada (1999), Losada and Heaphy (2004), and Fredrickson and Losada (2005) has no free parameters⁹ to accommodate the arbitrary choices made in assessing positive and negative affect and constructing the positivity ratio. On that basis alone, it was absurd for Fredrickson and Losada (2005) to claim that there existed a universally applicable critical minimum positivity ratio that distinguishes between individuals (or marriages or work groups) who are flourishing and those who are not. Fredrickson (2013) may have realized this in remarking on the need for “future research that computes positivity ratios independently of scaling parameters” (p. 818).¹⁰ This is a tall order.

Conclusions

In response to Brown et al.’s (2013) demonstration that the mathematical work underlying Fredrickson and Losada’s (2005) theorized critical minimum and maximum positivity ratios was both flawed and misapplied, Fredrickson (2013) insisted that, regardless of the incorrect mathematical work, substantial empirical evidence exists both for critical minimum and maximum positivity ratios and, more generally, for a nonlinear relation between the positivity ratio and well-being. This article has shown that Fredrickson’s (2013) insistence is not warranted, by first noting that there was a mismatch between Fredrickson and Losada’s (2005) theory and the data used to test it, then describing the methodological and statistical problems of Fredrickson and Losada’s (2005) empirical study that invalidate its results, and, finally, demonstrating that Fredrickson (2013) has not interpreted correctly the results of the seven other studies that she cited as providing substantial empirical evidence.

Apparently undaunted by Brown et al.’s (2013) critique, Fredrickson (2013) repeatedly urged continued investigation into the possible existence and values of critical minimum and maximum positivity ratios:

Whether the Lorenz equations—the nonlinear dynamic model we’d adopted—and the model estimation technique that Losada utilized can be fruitfully applied to understanding the impact of particular positivity ratios merits renewed and rigorous inquiry. (p. 814)

The question newly raised by Brown et al.’s (2013) critique is whether positivity ratios obey one or more critical tipping points, and if so, whether those critical tipping points coincide with the ones identified by Losada’s mathematical work for all individuals, samples, and subgroups. Clearly these questions merit further test. (p. 818)

Whether the outcomes associated with positivity ratios show discontinuity and obey one or more change points, however, merits further test. (p. 819)

The point of this comment is not that there exists no tipping point between nonflourishing and flourishing, or no nonlinear relation between the positivity ratio and well-being, but, rather, that no convincing evidence has yet been presented for such. This lack of evidence suggests that continued investigation into the possible existence and values of critical minimum and maximum positivity ratios, as urged by Fredrickson (2013), is likely to prove a fool’s errand.

Acknowledgements

Appreciation is due Nick Brown, Alan Sokal, and Harris Friedman for engaging discussions.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Author Biography

Carol A. Nickerson, PhD, is a quantitative psychologist with a strong interest in social psychology and related areas.

References

Anthony (2014, January 18). The British amateur who debunked the mathematics of happiness. Observer. Retrieved from http://www.theguardian.com/science/2014/jan/19/mathematics-of-happiness-debunked-nick-brown

Block (1996). Some jangly remarks on Baumeister and Heatherton. Psychological Inquiry, 7, 28-32. doi:10.1207/s15327965pli0701_5

Borsboom, Mellenbergh, G. J., & van Heerden (2003). The theoretical status of latent variables. Psychological Review, 110, 203-219. doi:10.1037/0033-295X.110.2.203

Brown, N. J. L., Sokal, A. D., & Friedman, H. L. (2013). The complex dynamics of wishful thinking: The critical positivity ratio. American Psychologist, 68, 801-813. doi:10.1037/a0032850

Brown, N. J. L., Sokal, A. D., & Friedman, H. L. (2014a). The persistence of wishful thinking. American Psychologist, 69, 629-632. doi:10.1037/a0037050

Brown, N. J. L., Sokal, A. D., & Friedman, H. L. (2014b). Positive psychology and romantic scientism. American Psychologist, 69, 636-637. doi:10.1037/a0037390

Buck (1980). Nonverbal behavior and the theory of emotion: The facial feedback hypothesis. Journal of Personality and Social Psychology, 38, 811-824. doi:10.1037/0022-3514.38.5.811

Cattell, R. B. (1946). Description and measurement of personality. Yonkers-on-Hudson, NY: World Book Company.

Collins, L. M., Graham, J. J., & Flaherty, B. P. (1998). An alternative framework for defining mediation. Multivariate Behavioral Research, 33, 295-312. doi:10.1207/s15327906mbr3302_5

Conner, T. S., Tennen, Fleeson, & Barrett, L. F. (2009). Experience sampling methods: A modern idiographic approach to personality research. Social and Personality Psychology Compass, 3, 292-313. doi:10.1111/j.1751-9004.2009.00170.x

Cudeck & Klebe, K. J. (2002). Multiphase mixed-effects models for repeated measures data. Psychological Methods, 7, 41-63. doi:10.1037/1082-989X.7.1.41

Curran-Everett (2013). Explorations in statistics: The analysis of ratios and normalized data. Advances in Physiology Education, 37, 213-219. doi:10.1152/advan.00053.2013

Davidson, A. R., & Morrison, D. M. (1982). Social psychological models of decision making. Research in Marketing, Supplement 1, 91-112.

Davidson, A. R., & Morrison, D. M. (1983). Predicting contraceptive behavior from attitudes: A comparison of within- versus across-subjects procedures. Journal of Personality and Social Psychology, 45, 997-1009. doi:10.1037/0022-3514.45.5.997

Diehl, Hay, E. L., & Berg, K. M. (2011). The ratio between positive and negative affect and flourishing mental health across adulthood. Aging & Mental Health, 15, 882-893. doi:10.1080/13607863.2011.569488

Epstein (1983). Aggregation and beyond: Some basic issues on the prediction of behavior. Journal of Personality, 51, 360-392. doi:10.1111/j.1467-6494.1983.tb00338.x

Firebaugh (1980). Cross-national versus historical regression models: Conditions of equivalence in comparative analysis. Comparative Social Research, 3, 333-344.

Fredrickson, B. L. (2009). Positivity: Groundbreaking research reveals how to embrace the hidden strength of positive emotions, overcome negativity, and thrive. New York, NY: Crown.

Fredrickson, B. L. (2010, March 20). The dynamics of positive opposites [Video file]. Presentation to the Danielson Institute, Center for the Study of Religion and Psychology, Boston University. Retrieved from http://www.youtube.com/watch?v=jvPHF3u5zL8

Fredrickson, B. L. (2013, August 30). Recalculating a positivity ratio, and finding a metaphor [Letter to the Editor]. Chronicle of Higher Education. Retrieved from http://www.chronicle.com/blogs/letters/recalculating-a-positivity-ratio-and-finding-a-metaphor/

Fredrickson, B. L. (2013). Updated thinking on positivity ratios. American Psychologist, 68, 814-822. doi:10.1037/a0033584

Fredrickson, B. L., & Losada, M. F. (2005). Positive affect and the complex dynamics of human flourishing. American Psychologist, 60, 678-686. doi:10.1037/0003-066X.60.7.678 (Correction published 2013, American Psychologist, 68, p. 822. doi:10.1037/a0034435)

Gottman, J. M. (1994). What predicts divorce? The relationship between marital processes and marital outcomes. Hillsdale, NJ: Erlbaum.

Guastello, S. J. (2014). Nonlinear dynamical models in psychology are widespread and testable. American Psychologist, 69, 628-629. doi:10.1037/a0036980

Hämäläinen, R. P., Luoma, & Saarinen (2014). Mathematical modeling is more than fitting equations. American Psychologist, 69, 633-634. doi:10.1037/a0037048

Hammond, K. R., McClelland, G. H., & Mumpower (1980). Human judgment and decision making: Theories, methods, and procedures. New York, NY: Praeger.

Jaccard (1981). Attitudes and behavior: Implications of attitudes toward behavioral alternatives. Journal of Experimental Social Psychology, 17, 286-307. doi:10.1016/0022-1031(81)90029-9

Jaccard & Dittus (1990). Idiographic and nomothetic perspectives on research methods and data analysis. In Hendrick & M. S. Clark (Eds.), Research methods in personality and social psychology (pp. 312-351). Newbury Park, CA: Sage.

Jones (n.d.). Do multilevel models ever give different results? Retrieved from http://www.bristol.ac.uk/media-library/sites/cmm/migrated/documents/different-results.pdf

Keren (1993). Between- or within-subjects design: A methodological dilemma. In Keren & Lewis (Eds.), A handbook for data analysis in the behavioral sciences: Vol. 2. Methodological issues (pp. 257-272). Hillsdale, NJ: Erlbaum.

Keyes, C. L. M. (2002). The mental health continuum: From languishing to flourishing in life. Journal of Health and Social Behavior, 43, 207-222. doi:10.2307/3090197

Larsen, R. J. (2002). Differential contributions of positive and negative affect to subjective well being. In J. A. Da Silva, E. H. Matsushima, & N. P. Ribeiro-Filho (Eds.), Proceedings of the Eighteenth Annual Meeting of the International Society for Psychophysics: In a new continent, for a new psychophysics (pp. 186-190). Rio de Janeiro, Brazil: International Society for Psychophysics.

Larsen, R. J., & Prizmic (2008). Regulation of emotional well-being: Overcoming the hedonic treadmill. In Eid & R. J. Larsen (Eds.), The science of subjective well-being (pp. 258-289). New York, NY: Guilford Press.

Lefebvre, V. A., & Schwartz, R. M. (2014). An empirical ratio in search of a theory. American Psychologist, 69, 634-635. doi:10.1037/a0036949 (Correction published 2014, American Psychologist, 69, p. 935. doi:10.1037/a0038276)

Losada (1999). The complex dynamics of high performance teams. Mathematical and Computer Modelling, 30, 179-192. doi:10.1016/S0895-7177(99)00189-2

Losada & Heaphy (2004). The role of positivity and connectivity in the performance of business teams: A nonlinear dynamics model. American Behavioral Scientist, 47, 740-765. doi:10.1177/0002764203260208

Mandler (1959). Stimulus variables and subject variables: A caution. Psychological Review, 66, 145-149. doi:10.1037/h0043276

MacCallum, R. C., Zhang, Preacher, K. J., & Rucker, D. D. (2002). On the practice of dichotomization of quantitative variables. Psychological Methods, 7, 19-40. doi:10.1037/1082-989X.7.1.19

Michela, J. L. (1990). Within-person correlational design and analysis. In Hendrick & M. S. Clark (Eds.), Research methods in personality and social psychology (pp. 279-311). Newbury Park, CA: Sage.

Mitchell, T. R. (1974). Expectancy models of job satisfaction, occupational preference, and effort: A theoretical, methodological, and empirical appraisal. Psychological Bulletin, 81, 1053-1077. doi:10.1037/h0037495

Molenaar, P. C. M. (2004). A manifesto on psychology as idiographic science: Bringing the person back into scientific psychology, this time forever. Measurement: Interdisciplinary Research and Perspectives, 2, 201-218. doi:10.1207/s15366359mea0204_1

Molenaar, P. C. M. (2005). Rejoinder to Rogosa’s commentary on “A manifesto on psychology as idiographic science.” Measurement: Interdisciplinary Research and Perspectives, 3, 116-119. doi:10.1207/s15366359mea0302_4

Musau (2014). The place of mathematical models in psychology and the social sciences. American Psychologist, 69, 632-633. doi:10.1037/a0037068

Nesselroade, J. R. (2002). Elaborating the differential in differential psychology. Multivariate Behavioral Research, 37, 543-561. doi:10.1207/S15327906MBR3704_06

Nickerson (2007). Theory/analysis mismatch: Comment on Fredrickson and Joiner’s (2002) test of the broaden-and-build theory of positive emotions. Journal of Happiness Studies, 8, 537-561. doi:10.1007/s10902-006-9030-5 (Erratum published 2007, Journal of Happiness Studies, 8, p. 563. doi:10.1007/s10902-007-9071-4)

Nickerson, C. A. (2014). No empirical evidence for critical positivity ratios. American Psychologist, 69, 626-628. doi:10.1037/a0036961

Nickerson, C. A., & McClelland, G. H. (1988). Extended axiomatic conjoint measurement: A solution to a methodological problem in studying fertility-related behaviors. Applied Psychological Measurement, 12, 129-153. doi:10.1177/014662168801200203

Nickerson, C. A., & McClelland, G. H. (1989). Across-persons versus within-persons tests of expectancy-value models: A methodological note. Journal of Behavioral Decision Making, 2, 261-270. doi:10.1002/bdm.3960020405

Nickerson, C. A. E. (1999). Assessing convergent validity of health-state utilities obtained using different scaling methods. Medical Decision Making, 19, 487-496. doi:10.1177/0272989X9901900417

Norman, W. T. (1967). On estimating psychological relationships: Social desirability and self-report. Psychological Bulletin, 67, 273-293. doi:10.1037/h0024414

Pagel, M. D., & Davidson, A. R. (1984). A comparison of three social-psychological models of attitude and behavioral plan: Prediction of contraceptive behavior. Journal of Personality and Social Psychology, 47, 517-533. doi:10.1037/0022-3514.47.3.517

Pelham, B. W. (1993). The idiographic nature of human personality: Examples of the idiographic self-concept. Journal of Personality and Social Psychology, 64, 665-677. doi:10.1037/0022-3514.64.4.665

Rego, Sousa, Marques, & Pina e Cunha (2012). Optimism predicting employees’ creativity: The mediating role of positive affect and the positivity ratio. European Journal of Work & Organizational Psychology, 21, 244-270. doi:10.1080/1359432X.2010.550679

Rodgers, J. L., Cleveland, H. H., van den Oord, & Rowe, D. C. (2000). Resolving the debate over birth order, family size, and intelligence. American Psychologist, 55, 599-612. doi:10.1037/0003-066X.55.6.599

Rogosa (1995). Myths and methods: “Myths about longitudinal research” plus supplemental questions. In J. M. Gottman (Ed.), The analysis of change (pp. 3-66). Mahwah, NJ: Erlbaum.

Rosnow, R. L., & Rosenthal (1989). Statistical procedures and the justification of knowledge in psychological science. American Psychologist, 44, 1276-1284. doi:10.1037/0003-066X.44.10.1276

Runyan, W. M. (1983). Idiographic goals and methods in the study of lives. Journal of Personality, 51, 413-437. doi:10.1111/j.1467-6494.1983.tb00339.x

Schwartz, R. M. (1997). Consider the simple screw: Cognitive science, quality improvement, and psychotherapy. Journal of Consulting and Clinical Psychology, 65, 970-983. doi:10.1037/0022-006X.65.6.970

Schwartz, R. M., Reynolds, C. F., III, Thase, M. E., Frank, Fasiczka, A. L., & Haaga, D. A. F. (2002). Optimal and normal affect balance in psychotherapy of major depression: Evaluation of the balanced states of mind model. Behavioural and Cognitive Psychotherapy, 30, 439-450. doi:10.1017/S1352465802004058

Shrira, Palgi, Wolf, J. J., Haber, Goldray, Shacham-Shmueli, & Ben-Ezra (2011). The positivity ratio and functioning under stress. Stress & Health, 27, 265-271. doi:10.1002/smi.1349

Sidman (1960). Tactics of scientific research: Evaluating experimental data in psychology. New York, NY: Basic Books.

Snijders, T. A. B., & Bosker, R. J. (1999). Multilevel analysis: An introduction to basic and advanced multilevel modeling. London, UK: Sage.

Steering Committee of the Physicians’ Health Study Research Group. (1988). Preliminary report: Findings from the aspirin component of the ongoing Physicians’ Health Study. New England Journal of Medicine, 318, 262-264. doi:10.1056/NEJM198801283180431

Torgerson, W. S. (1958). Theory and methods of scaling. New York, NY: Wiley.

Waugh, C. E., & Fredrickson, B. L. (2006). Nice to know you: Positive emotions, self-other overlap, and complex understanding in the formation of a new relationship. Journal of Positive Psychology, 1, 93-106. doi:10.1080/17439760500510569