Reliability Generalization as a Seal of Quality of Substantive Meta-Analyses: The Case of the VIA Inventory of Strengths (VIA-IS) and Their Relationships to Life Satisfaction

Abstract

Reliable test scores are essential to interpret the results of statistical analyses correctly. In this study, we used the Values in Action Inventory of Strengths (VIA-IS) as an example of a widely applied assessment instrument and analyzed its metric quality through a reliability generalization (RG) meta-analysis. In addition, we conducted a meta-analysis of the correlations between character strengths and life satisfaction to examine the potential relationship between the reliability of test scores and the intensity of these correlations. The overall variability of alpha coefficients supports the argument that reliability is sample dependent. Indeed, there were statistically significant mean reliability differences for scores across the 24 scales, with the highest level of reliability observed for Creativity and the lowest for Self-regulation. Significant moderators such as the standard deviation of the scores and the sample type help to explain the high variability observed in the reliability estimates. The second meta-analysis showed that Zest, Hope, Gratitude, Curiosity, and Love were the character strengths most strongly related to life satisfaction, while Modesty and Prudence were less related to it. Furthermore, the high heterogeneity between samples might indicate a relationship between the variability in the reliability of character strengths' scores and the intensity of their correlations with life satisfaction. Character strengths with high RG potential were either clearly related or clearly unrelated to life satisfaction, whereas character strengths with lower RG potential showed unstable correlation patterns. The results of both studies highlight the relationship between the reliability of test scores and the findings of substantive studies, such as meta-analyses of Pearson's correlations.

Keywords

Character strengths, VIA-IS, life satisfaction, meta-analysis, reliability generalization, Pearson's correlation

Introduction

“What you measure affects what you do. If you don't measure the right thing, you don't do the right thing.” (Joseph Stiglitz)

Psychological tests are the main tools used by researchers and clinical professionals to quantify an individual's level on a psychological construct. Although both validity and reliability are psychometric concepts that help to assess the quality of a psychological instrument (Kimberlin & Winterstein, 2008), the importance of reliability has often been overlooked (Vacha-Haase & Thompson, 2011). Reliability refers to the extent to which a test or any measuring procedure yields the same result on repeated trials (Carmines & Zeller, 1979). It is considered to be a necessary but not sufficient condition for validity (Sullivan, 2011; Warne, 2008).

There are important reasons why reliability should be reported in each study. For instance, reliable instrument scores are essential to interpret the results of a statistical analysis correctly. Conversely, unreliable test scores would call into question the use of General Linear Model analyses (e.g., regression, analysis of variance, etc.; Vacha-Haase & Thompson, 2011). In other words, unreliable scores may undermine the validity of the statistical conclusions obtained. Therefore, the first step in any quantitative research study should be the assessment of the reliability of the instrument scores (Sullivan, 2011; Vacha-Haase & Thompson, 2011; Wilkinson & APA Task Force, 1999).

Considering the significance and the possible negative consequences of unreliable test scores, Vacha-Haase (1998) proposed the reliability generalization (RG) meta-analysis. The RG meta-analysis was based on the validity generalization meta-analytic method developed by Schmidt and Hunter (1977). By systematically exploring the reliability estimates reported for a specific test, an RG meta-analysis contributes to a better understanding of the factors involved in the variability of reliability and of the role they play in study results. Moreover, this type of study can be very helpful to applied researchers, test administrators, and decision makers, whether groups or individuals (Vacha-Haase, Henson, & Caruso, 2002), since RG studies make it possible (a) to determine to what extent the accuracy of a measure is generalizable across samples and studies, (b) to establish the sources of variability (methodological vs. substantive sources) that may affect the reliability estimation, and (c) to define the potential generalizability of reliability by taking into account different sources of error (measurement error vs. estimation error) that might be confounded when interpreting a substantive result (Vacha-Haase, 1998).

It is appropriate to apply an RG only if the test under study has been widely applied and a reasonable number of empirical studies have estimated the reliability of its scores (Henson & Thompson, 2002). In the current study, we considered the Values in Action Inventory of Strengths (VIA-IS), a measure of character strengths (Peterson & Seligman, 2004), since it meets these requirements. A single application of the VIA-IS (or any test) to a specific sample is not enough to establish the psychometric properties and quality of the measure, since the results pertain to the scores from that particular application, which may vary in a second application even with the same sample (Botella, Suero, & Gambara, 2010).

The VIA-IS

The model of strengths and virtues known as the VIA Classification of Character Strengths and Virtues was designed after reviewing several documents related to historical, literary, moral, and religious traditions (Dahlsgaard, Peterson, & Seligman, 2005; Peterson, 2006; Peterson & Seligman, 2004). As a result, Peterson and Seligman (2004) proposed 6 cross-cultural virtues and 24 character strengths (see Supplementary Table 1 for a description of each character strength). Character strengths are defined as positively valued trait-like individual differences with demonstrable generality across different situations and stability across time (Peterson & Seligman, 2004). They are fulfilling and intrinsically valuable in an ethical sense (gifts, skills, aptitudes, and expertise can be squandered, but character strengths and virtues cannot). They are nonrivalrous and are not the opposite of a desirable trait (a counterexample is steadfast and flexible, which are opposites but are both commonly seen as desirable). In addition, character strengths are trait-like (habitual patterns that are relatively stable over time), are personified (at least in the popular imagination) by people made famous through history, and are not a combination of other character strengths. Finally, they are nurtured by societal norms and institutions, which explains why they may be absent in some individuals (Peterson & Seligman, 2004). Character strengths were thus conceptualized as more personal prosocial trait variables compared with other positive traits such as talents (McGrath, 2016).

This elaboration served as the foundation for the creation of the VIA-IS. This self-assessment instrument is composed of 240 items that measure behaviors representative of the different character strengths (Peterson & Park, 2006, 2009; Peterson, Park, & Seligman, 2005; VIA Institute, 2012). There are 10 items for each of the 24 strengths in the VIA classification, and a 5-point Likert scale (1: Not at all like me to 5: Very much like me) is used to rate them. The VIA-IS was initially created as a universal measure of what Peterson and Seligman (2004) called the good character. The inclusion of cross-cultural character strengths allowed the translation of the instrument into several languages such as Spanish, Croatian, Chinese, Dutch, Hebrew, German, and Hindi (Azañedo, Fernández-Abascal, & Barraca, 2014; Brdar & Kashdan, 2010; Duan et al., 2011; Jónsdóttir, 2010; Littman-Ovadia & Lavi, 2012a; Ruch et al., 2010; Singh & Choubisa, 2009, 2010). Its wide application to different populations, settings, and countries in psychological research supports the objective of examining whether its psychometric properties, and in particular the reliability of its scores, can be generalized across the studies that have used this inventory.

The study of character strengths occupies a central role in Positive Psychology research because the good character contributes to variables such as pleasure, flow, well-being, life satisfaction, and other positive experiences (Niemiec, 2013; Park & Peterson, 2009; Proyer, Gander, Wellenzohn, & Ruch, 2013). Regarding the relationship between character strengths and life satisfaction, Hope, Zest, Gratitude, Curiosity, and Love have shown a consistent relationship with life satisfaction (Brdar & Kashdan, 2010; Ruch, Huber, Beermann, & Proyer, 2007; Shimai, Otake, Park, Peterson, & Seligman, 2006). In contrast, the associations between life satisfaction and Modesty, Creativity, Appreciation of beauty and excellence, Judgment, and Love of learning (Park, Peterson, & Seligman, 2004a, 2004b) are lower or even nonsignificant. However, the substantive interpretation of the correlations between variables might be affected if the reliability estimates are highly heterogeneous (Vacha-Haase, 1998). In other words, variability in the reliability of the test scores increases the probability of Type II error (that is, it decreases statistical power) and directly attenuates the correlations.
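The attenuating effect of unreliability can be made concrete with the attenuation relation from classical test theory. This is a standard textbook result and is not a formula stated in this article. Here $\rho_{xy}$ denotes the correlation between true scores, $r_{xy}$ the expected observed correlation, and $r_{xx}$ and $r_{yy}$ the score reliabilities of the two measures:

$$r_{xy} = \rho_{xy}\sqrt{r_{xx}\, r_{yy}}$$

As a purely illustrative case with assumed values $\rho_{xy} = .50$, $r_{xx} = .71$, and $r_{yy} = .85$, the expected observed correlation would be $.50 \times \sqrt{.71 \times .85} \approx .39$. If the reliability of one measure varies from sample to sample, the observed correlations will vary with it even when the true-score correlation is constant.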

The present study

The aim of the present study is twofold. First, it investigates the reliability of scores obtained with the VIA-IS across a diverse set of contexts. Second, after examining the heterogeneity of the reliability estimates (RG), we aim to determine how this heterogeneity influences the relationship between character strengths and life satisfaction. Therefore, this article aims to illustrate the need to perform an RG study before performing a meta-analytical study, since different sources of error (measurement error vs. estimation error) may be confounded when interpreting the correlation patterns. For example, measurement error is associated with a lack of precision of the instrument, and this might increase the estimation errors of the correlations between two variables (Vacha-Haase, 1998). To achieve these aims, we present two studies: the RG (Study I) and the correlations meta-analyses (Study II).

Study I: RG meta-analysis

Method

Literature search

We retrieved articles from the PsycINFO, PsycARTICLES, Psicodoc, PubMed, and Psychological Abstracts databases. The literature search encompassed articles published between January 2004 and December 2016. The search combined the following two groups of search terms and keywords: Group 1: Values In Action Inventory of Strengths; VIA-IS; Character strengths; VIA; Group 2: Psychometric properties, reliability, internal consistency, alpha, Cronbach's alpha, coefficient alpha. The truncation symbol was added to the most basic word stem of each keyword to ensure that all associated terms were included in the search.

To mitigate the file drawer problem related to missing values, we contacted the VIA Institute on Character and seven investigators in the field, two of whom provided information about missing values that was then included in the current study. Authors were contacted if they had at least one publication that met the inclusion criteria (and none of the exclusion criteria) but did not report all data on the reliability of character strengths in the published article. We sent each author an e-mail explaining the aim of our research and asking for unpublished data. The e-mail content was the same in all cases.

All articles were reviewed by three referees, and discrepancies were resolved through discussion (see Supplementary Table 2 for a description of the inclusion and exclusion criteria, and Supplementary Table 3 for the studies included in the current meta-analysis). Those discrepancies mostly concerned the number of items in each scale and reliabilities that were not reported in the selected studies. Interrater reliability for Study I was .92. The flow diagram for Study I is shown in Figure 1.

Figure 1.

Flow chart for Study I.

Data analysis

To carry out the RG study, the Cronbach's alpha of each character strength was collected from each sample. The pooled estimate of the internal consistency of each trait was computed using Bonett's transformation (Bonett, 2010), weighting each estimate by the inverse of its sampling variance (Sánchez-Meca & López-Pina, 2008) and assuming a random-effects model. The Q test was used to determine whether there were any significant violations of homogeneity in the effect size distributions. Because the Q test has low power when the number of studies is small (Higgins & Thompson, 2002), I² was also calculated. I² values of 25%, 50%, and 75% can tentatively be labeled as low, moderate, and high heterogeneity, respectively (Huedo-Medina, Sánchez-Meca, Marín-Martínez, & Botella, 2006). Trim-and-fill procedures (Duval, 2005; Sterne et al., 2011) were used to account for potential publication bias (see Supplementary Table 5 for more details). This method estimates the number of studies missing from a meta-analysis due to the suppression of the most extreme results on one side of the funnel plot.
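The pooling steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis, which was run in R with metafor: the (alpha, sample size) pairs are hypothetical, the between-study variance is estimated with the DerSimonian-Laird method (the article does not say which estimator was used), and the variance of the transformed alpha uses the large-sample approximation 2k / ((k - 1)(n - 2)) for a k-item scale.

```python
import math


def bonett_transform(alpha, n, k):
    """Return (ln(1 - alpha), approximate sampling variance) for a k-item scale."""
    y = math.log(1.0 - alpha)
    v = 2.0 * k / ((k - 1.0) * (n - 2.0))
    return y, v


def random_effects_pool(ys, vs):
    """Inverse-variance random-effects pooling (DerSimonian-Laird, an assumption).

    Returns the pooled mean, its standard error, Q, I2 (in percent) and tau2.
    """
    w = [1.0 / v for v in vs]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ys))
    df = len(ys) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in vs]
    mean = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return mean, se, q, i2, tau2


# Hypothetical (alpha, sample size) pairs for one 10-item VIA-IS scale.
samples = [(0.71, 250), (0.80, 400), (0.66, 120), (0.85, 900)]
k_items = 10

pairs = [bonett_transform(a, n, k_items) for a, n in samples]
ys = [p[0] for p in pairs]
vs = [p[1] for p in pairs]
mean, se, q, i2, tau2 = random_effects_pool(ys, vs)

# Back-transform to the alpha metric. The CI bounds swap because alpha
# decreases as ln(1 - alpha) increases.
pooled_alpha = 1.0 - math.exp(mean)
ci_low = 1.0 - math.exp(mean + 1.96 * se)
ci_high = 1.0 - math.exp(mean - 1.96 * se)
print(round(pooled_alpha, 3), round(ci_low, 3), round(ci_high, 3),
      round(q, 2), round(i2, 1))
```

The I² value printed at the end can then be read against the 25%, 50%, and 75% benchmarks mentioned above.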

The effect of the moderating variables (standard deviation, test version, and sample type) on the variability of the reliability estimates was evaluated by means of analyses of variance for the categorical variables and a regression model for the continuous variable, assuming a mixed-effects model. The data were analyzed using R software (R Core Development Team, 2018) and the metafor package (Viechtbauer, 2010).
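For the continuous moderator, the idea behind the mixed-effects regression can be illustrated with a weighted least squares fit in which each sample is weighted by the inverse of its sampling variance plus the between-study variance. The sketch below is an assumption-laden simplification: the data are hypothetical, `tau2` is supplied as a fixed illustrative value rather than estimated, and only one predictor (the scale's standard deviation) is included. Because the outcome is ln(1 - alpha), a negative slope means that reliability rises with the standard deviation.

```python
import math


def weighted_regression(x, y, v, tau2=0.0):
    """Weighted least squares of y on x with weights 1 / (v_i + tau2).

    Returns (intercept, slope, standard error of the slope).
    """
    w = [1.0 / (vi + tau2) for vi in v]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1.0 / sxx)
    return intercept, slope, se_slope


# Hypothetical samples for one 10-item scale: score SD, alpha, sample size.
sds = [0.45, 0.52, 0.60, 0.68]
alphas = [0.70, 0.75, 0.80, 0.84]
ns = [200, 350, 150, 500]
k_items = 10

ys = [math.log(1.0 - a) for a in alphas]                     # Bonett-transformed alphas
vs = [2.0 * k_items / ((k_items - 1.0) * (n - 2.0)) for n in ns]

# tau2 = 0.01 is an illustrative value, not an estimate from real data.
intercept, slope, se_slope = weighted_regression(sds, ys, vs, tau2=0.01)
print(round(intercept, 3), round(slope, 3), round(se_slope, 3))
```

A full analysis would estimate the residual between-study variance from the data and would add the categorical moderators as dummy-coded predictors.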

Results

Estimate of average reliability

After applying the inclusion and exclusion criteria, we found 24 samples for the RG meta-analysis. Supplementary Table 4 presents the studies included and the sample characteristics. Table 1 shows the main summary statistics for the coefficients alpha of each VIA-IS scale. After back-transformation, the (weighted) mean coefficient alpha varies from .71 (Self-regulation) to .86 (Creativity). Coefficients alpha for each character strength showed statistically significant heterogeneity, with I² values ranging from 87.91% (Social intelligence) to 98.51% (Spirituality). Almost all I² values exceed 90%, suggesting that several moderator variables might play a role in the variability of reliability.

Table 1.

Pooled estimates of character strengths' scores reliability coefficients.

	Weighted analyses				Homogeneity
	N	$\bar{\alpha}$	95% CI lower	95% CI upper	Q	I² (%)
Creativity	23	.86	.78	.91	441.50	96.78
Curiosity	23	.80	.75	.85	208.70	89.58
Judgment	23	.81	.71	.88	1126.67	96.66
Love of learning	23	.81	.75	.85	166.29	90.85
Perspective	23	.78	.72	.83	117.46	91.04
Bravery	23	.75	.64	.83	416.53	95.29
Persistence	23	.84	.71	.81	1380.92	98.14
Honesty	24	.76	.56	.87	2855.64	98.33
Zest	24	.78	.66	.86	1027.31	96.58
Love	23	.75	.68	.81	306.99	90.90
Kindness	23	.75	.67	.82	327.38	92.77
Social intelligence	23	.76	.70	.81	231.37	87.91
Teamwork	23	.76	.68	.81	178.76	91.32
Fairness	23	.78	.73	.83	133.99	88.53
Leadership	22	.77	.65	.85	200.28	96.42
Forgiveness	24	.81	.67	.89	1047.78	97.99
Modesty	24	.75	.63	.84	496.25	96.08
Prudence	23	.74	.65	.81	138.06	93.19
Self-regulation	22	.71	.59	.79	304.71	94.35
Appreciation of beauty	24	.80	.69	.87	452.67	96.70
Gratitude	24	.80	.72	.85	363.25	93.78
Hope	24	.81	.71	.87	575.07	96.37
Humor	24	.85	.75	.91	755.85	97.38
Spirituality	24	.84	.69	.92	1254.41	98.51

Note: Alpha back-transformed from Bonett's procedure. All Q indices were significant (p < .001). N: samples included.

Moderator analyses

Characteristics such as the standard deviation of VIA-IS scores, the test version (English vs. translated versions of the inventory), and the sample type (college-general population) were analyzed to explain the variability of the reliability. Detailed moderator analyses are presented as supplementary material (see Supplementary Tables 6 to 8).

As Table 2 shows, the variability of the test scores (the standard deviation of each character strength) affected the reliability estimates. Of all the moderators tested, it accounted for the highest proportion of explained variance. For the strengths in which only sample type or test version were significant moderators, the explained variance was lower.

Table 2.

Moderator analyses model summary.

	SD	Test version	Sample type	Intercept	$R_{adj}^{2}$
Creativity	1.26	.20	.26*	0.98	.60
Curiosity	2.03**	.11*	–	0.51	.79
Judgment	1.35***	–	.17*	0.89***	.70
Love of learning	–	–	.15**	1.54***	.30
Bravery	–	.36*	–	0.44	.44
Zest	2.01**	.14	.08	0.32	.78
Love	1.64*	.10	–	0.14	.73
Fairness	1.63*	–	–	0.72	.33
Forgiveness	3.00**	–	–	−0.08	.44
Modesty	1.79**	–	.26***	0.27	.78
Prudence	2.29***	–	–	0.09	.77
Self-regulation	1.33*	–	.17**	0.36	.85
Appreciation of beauty	–	.26*	–	1.60***	.19
Gratitude	1.13	.18*	.08	0.89**	.67
Hope	0.50	.13	.24**	1.25**	.54
Humor	−0.09	–	.40***	1.72*	.52
Spirituality	1.61*	–	–	0.50	.18

Note. Only character strengths with significant moderators are included. SD: standard deviation. Predictor's raw weight is included. $R_{adj}^{2}$ = proportion of variance explained; Test version: English (1), translated (0). Sample type: College (0), general population (1).

*p<.05. **p<.01. ***p<.001.

Study II: Applied meta-analysis

Method

Literature search

To identify studies for the correlational meta-analysis, a literature search was carried out in the electronic databases PsycINFO, PsycARTICLES, Psicodoc, PubMed, and Psychological Abstracts to find empirical studies that computed Pearson's correlations between character strengths' scores and the Satisfaction With Life Scale (SWLS; Diener, Emmons, Larsen, & Griffin, 1985). The following keywords and search terms were combined in the electronic search for the period between January 2002 and December 2016: Group 1: Values In Action Inventory of Strengths; VIA-IS; Character strengths; VIA; Group 2: Satisfaction with life; Life Satisfaction; Satisfaction With Life Scale (SWLS).

Following the same procedure as in Study I, and to mitigate the file drawer problem associated with unreported correlations, we contacted the VIA Institute on Character and investigators in the field using the same criteria as in Study I, except that the data requested were the correlations between character strengths and the SWLS. Three of the authors then sent information about correlations that had been computed but were not included in the published article.

All articles were analyzed by three independent reviewers, and discrepancies were resolved through discussion (see Supplementary Table 3 for more details). Interrater reliability was .94 (see Supplementary Table 9 for a description of the inclusion and exclusion criteria and Supplementary Table 10 for the studies included in Study II). Figure 2 shows the flow diagram for Study II.

Figure 2.

Flow chart for Study II.

Statistical analyses

Pearson's correlation coefficients (r) were extracted from the included studies. Following Hafdahl and Williams' (2009) recommendations, correlation coefficients were transformed using Fisher's Zr. A random-effects model was used because this type of model accounts for the heterogeneity of studies through a statistical parameter representing the interstudy variation. Furthermore, each correlation was weighted by its inverse variance when computing the pooled estimate. The resulting Zr values were transformed back into meta-analytic Pearson's coefficients for reporting. Homogeneity was assessed using the Q statistic (Hunter & Smith, 1990) and the I² index (Higgins & Thompson, 2002). Trim-and-fill procedures (Duval, 2005; Sterne et al., 2011) were used to account for publication bias (see Supplementary Table 11 for more details). The analysis was carried out using R software (R Core Development Team, 2018) and the meta (Schwarzer, 2014) and metafor (Viechtbauer, 2014) packages.
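The transformation, weighting, and back-transformation steps just described can be sketched as follows. This is an illustrative sketch only: the correlations and sample sizes are hypothetical, the sampling variance of Fisher's Zr is taken as 1 / (n - 3), and the between-study variance is estimated with the DerSimonian-Laird method, which is an assumption because the article does not name the estimator.

```python
import math


def pool_correlations(rs, ns):
    """Random-effects pooling of Pearson correlations via Fisher's z.

    Returns (pooled r, (CI lower, CI upper), Q, I2 in percent).
    The DerSimonian-Laird estimator of tau2 is an assumption of this sketch.
    """
    zs = [math.atanh(r) for r in rs]          # Fisher's z transformation
    vs = [1.0 / (n - 3.0) for n in ns]        # sampling variance of each z
    w = [1.0 / v for v in vs]
    sw = sum(w)
    fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sw
    q = sum(wi * (zi - fixed) ** 2 for wi, zi in zip(w, zs))
    df = len(zs) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in vs]
    mean_z = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    ci = (math.tanh(mean_z - 1.96 * se), math.tanh(mean_z + 1.96 * se))
    return math.tanh(mean_z), ci, q, i2       # back-transform z to r


# Hypothetical correlations between one strength and the SWLS.
rs = [0.52, 0.38, 0.61, 0.45]
ns = [120, 300, 85, 210]
pooled_r, ci, q, i2 = pool_correlations(rs, ns)
print(round(pooled_r, 3), tuple(round(b, 3) for b in ci), round(q, 2), round(i2, 1))
```

Pooling on the z scale and back-transforming keeps the pooled estimate and its confidence limits inside the admissible range of a correlation.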

Results

After applying the inclusion and exclusion criteria, we found 30 samples for this meta-analysis. The results of the meta-analysis of the correlations between character strengths and life satisfaction are provided in Table 3. As a rule of thumb, a correlation of .10 is a small effect, .30 a medium effect, and .50 a large effect (Cohen, 1992). The character strengths that showed large effects were Hope and Zest, while Gratitude, Love, Curiosity, Perspective, and Persistence displayed medium effects. In contrast, Modesty, Prudence, and Self-regulation were less related to life satisfaction. The homogeneity analysis showed significant, high heterogeneity (I² > 75%) except for Love of learning, for which moderate heterogeneity was found (26% < I² < 74%).

Table 3.

Overall effect sizes and homogeneity: Correlations between character strengths and SWLS Scores.

		Metacorrelations		Homogeneity analysis
	N	r	95% CI	Q	I² (%)
Creativity	28	.19***	[.13, .24]	160.48***	91.3
Curiosity	28	.40***	[.38, .42]	27.89*	49.8
Judgment	28	.18***	[.14, .22]	86.66***	83.8
Love of learning	28	.18***	[.13, .23]	126.09***	88.9
Perspective	29	.35***	[.32, .38]	47.52***	70.5
Bravery	28	.27***	[.24, .30]	55.24***	74.7
Persistence	28	.32***	[.26, .37]	204.71***	93.2
Honesty	28	.24***	[.21, .27]	63.99***	78.1
Zest	30	.52***	[.44, .59]	710.14***	98.0
Love	29	.44***	[.38, .50]	325.82***	95.7
Kindness	28	.21***	[.19, .23]	30.07**	53.4
Social intelligence	29	.29***	[.26, .32]	56.47***	75.2
Teamwork	28	.23***	[.19, .27]	94.63***	85.2
Fairness	28	.16***	[.13, .19]	47.73***	70.7
Leadership	29	.24***	[.20, .29]	141.46***	90.1
Forgiveness	30	.22***	[.17, .27]	132.3***	89.4
Modesty	28	.06**	[.02, .10]	66.98***	79.1
Prudence	28	.18***	[.14, .21]	63.29***	77.9
Self-regulation	28	.27***	[.23, .32]	116.06***	87.9
Appreciation of beauty	30	.15***	[.12, .17]	29.16**	52.0
Gratitude	30	.44***	[.39, .48]	147.38***	90.5
Hope	29	.56***	[.48, .64]	790.81***	98.2
Humor	29	.28***	[.26, .30]	24.31*	42.4
Spirituality	29	.29***	[.28, .30]	232.81***	95.7

Note: SWLS: Satisfaction With Life Scale.

*p<.05. **p<.01. ***p<.001.

Discussion

The aims of these two meta-analyses were, first, to identify the scales most affected by high heterogeneity in the reliability estimates and, second, to establish how this heterogeneity might influence the relationship between character strengths and life satisfaction. The results show that heterogeneity of reliability might play an important role in evaluating the quality of substantive studies such as meta-analyses of Pearson's correlations. This research thus aimed to show the need to perform an RG before performing a meta-analytical study. An RG makes it possible to distinguish between different sources of error, such as measurement error (inherent to the precision of the test scores) and estimation error (related to the statistical analyses), when interpreting the correlation patterns. The VIA-IS was chosen as an example of a widely used inventory.

The RG study focused on 25 internal consistency reliability estimates (alpha coefficients) obtained from 23 articles. The internal consistency of the character strengths varies from .71 (Self-regulation) to .86 (Creativity). Values in this range are generally considered acceptable for exploratory research purposes, and several scales also reach the lower limit of .80 recommended for general research purposes (Nunnally & Bernstein, 1994). However, the reliability estimates show high heterogeneity (22 scales obtained I² values higher than 90%), suggesting that there is some systematic variance that cannot be explained by sampling error. Therefore, the reliability values are not directly generalizable across all applications. Different characteristics of the studies, such as the standard deviation of total scores, the sample type, and the test version, explain part of the heterogeneity of the character strengths' reliability.

To obtain an overview of the impact of the moderating variables, a model that incorporated all the relevant moderators was generated. The moderator that exhibited the strongest relationship with reliability was the standard deviation of total scores. This moderator, which has a statistical origin (Crocker & Algina, 1986), affected most of the VIA-IS scales, with $R_{adj}^{2}$ ranging from .18 (Spirituality) to .77 (Prudence). On the other hand, the test version had a strong moderating effect on the reliability of the scores of scales such as Creativity, Curiosity, Bravery, Zest, Love, Forgiveness, Appreciation of Beauty, Gratitude, and Spirituality, accounting for 17% to 61% of the reliability variance. These results suggest that this source of variability might be influenced by the process of translating the VIA-IS items into each specific language. Adapting a psychological instrument to a different cultural context involves more than translating its items: the adaptation has to ensure that the items measure the same construct and that irrelevant variance is not introduced (Messick, 1995).

Although the percentage of variance explained is high in most cases, it is important to note that the generalizability of reliability is not measured by the $R_{adj}^{2}$ obtained, but by the I² and Q indices. If known sources of variation explain a high percentage of the variance ($R_{adj}^{2}$), it is possible to distinguish to what extent the heterogeneity captured by the meta-analytical indexes (Q and I²) is systematic or is associated with unidentified sources or sampling error. Applying this criterion together with Ferguson's (2009) cutpoints for a strong effect ($R_{adj}^{2} = .64$), a moderate effect ($R_{adj}^{2} = .25$), and the recommended minimum effect size representing a practically significant effect (RMPE; $R_{adj}^{2} = .04$), three potential levels of generalizability arise:

Character strengths that have high percentage of variance of reliability explained by the moderators considered in this study: Self-regulation, Curiosity, Zest, Modesty, Prudence, Love, Judgment, and Gratitude.

Character strengths with moderate percentage of variance explained by the moderators analyzed in this study: Creativity, Hope, Humor, Bravery, Forgiveness, Fairness, and Love of learning.

Character strengths with RMPE: Appreciation of beauty and Spirituality.
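The three levels above follow mechanically from the $R_{adj}^{2}$ column of Table 2 and the Ferguson (2009) cutpoints. The short sketch below reproduces the grouping. The values are copied from Table 2, while the function name and the handling of values falling exactly on a cutpoint (treated as belonging to the higher level) are assumptions made for illustration.

```python
STRONG_CUTPOINT = 0.64    # Ferguson (2009): strong effect
MODERATE_CUTPOINT = 0.25  # Ferguson (2009): moderate effect


def rg_potential(r2_adj):
    """Classify an adjusted R-squared into a level of RG potential.

    Boundary values are assigned to the higher level (an assumption).
    """
    if r2_adj >= STRONG_CUTPOINT:
        return "strong"
    if r2_adj >= MODERATE_CUTPOINT:
        return "moderate"
    return "low"


# Adjusted R-squared values as reported in Table 2.
table2_r2 = {
    "Creativity": 0.60, "Curiosity": 0.79, "Judgment": 0.70,
    "Love of learning": 0.30, "Bravery": 0.44, "Zest": 0.78,
    "Love": 0.73, "Fairness": 0.33, "Forgiveness": 0.44,
    "Modesty": 0.78, "Prudence": 0.77, "Self-regulation": 0.85,
    "Appreciation of beauty": 0.19, "Gratitude": 0.67, "Hope": 0.54,
    "Humor": 0.52, "Spirituality": 0.18,
}

levels = {"strong": [], "moderate": [], "low": []}
for strength, r2 in table2_r2.items():
    levels[rg_potential(r2)].append(strength)

for level, strengths in levels.items():
    print(level, strengths)
```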

In the second meta-analysis, it is shown that character strengths in the VIA classification were associated with life satisfaction (Peterson & Seligman, 2004; Wood, Linley, Maltby, Kashdan, & Hurling, 2011). The higher a given character strength, the more life satisfaction was reported. Nevertheless, some of these traits appeared to be more correlated to life satisfaction than others. As Park et al. (2004a, 2004b) showed in their initial studies, Curiosity, Zest, Gratitude, and Hope were among the strengths that were most related to life satisfaction. Other character strengths such as Modesty and Prudence were less associated with life satisfaction.

In this study, no moderator analyses could be conducted because of the sample size of the primary studies and the presence of unreported data. However, in order to relate RG potential to its role in substantive studies, a classification was designed that takes into account the relationship between RG potential and the intensity of the correlations. Character strengths with high RG potential were found to be either strongly related to life satisfaction or, on the contrary, clearly unrelated to it (see Table 4). Character strengths with lower RG potential displayed correlation patterns of unstable intensity (the intensity of the relationship varies across studies).

Table 4.

Summary: Reliability generalization and correlations.

	Potential RG
Correlations	Strong ($R_{adj}^{2} > .64$)	Moderate ($.25 < R_{adj}^{2} < .63$)	Low ($R_{adj}^{2} < .25$)
High (r > .50)	Curiosity, Zest, Love, Gratitude	Hope	–
Moderate (.30 < r < .49)	–	Creativity, Bravery, Humor, Love of learning	Perspective, Persistence, Spirituality
Low (r < .29)	Judgment, Modesty, Prudence, Self-regulation	Fairness, Forgiveness	Honesty, Kindness, Social intelligence, Teamwork, Leadership, Appreciation of beauty

Note: Cutpoints for correlations are based on Cohen (1992) and for $R_{adj}^{2}$ on Ferguson (2009). Low includes RMPE and nonsignificant cutpoints.

RG: reliability generalization; RMPE: recommended minimum effect size representing a practically significant effect.

Failure to report reliability estimates for the test scores may lead to inappropriate interpretations and conclusions (Cousin & Henson, 2000). Moreover, imprecise test scores may lead to inconclusive results and may generate variability associated with measurement error that is added to the estimation error. Our study leads us to conclude that RG studies are a prerequisite for conducting a meta-analysis on other variables, since the results of Study I and Study II taken together highlight the relationship between the reliability of test scores and the findings of substantive studies, such as meta-analyses of Pearson's correlations.

Finally, these results might be of interest to researchers using the VIA-IS. For example, the reliability of the scores on some scales has shown limited generalizability across populations (e.g., Appreciation of beauty, Honesty, Kindness, Social intelligence, Leadership, and Teamwork). However, other dimensions such as Curiosity, Zest, Love, and Gratitude have shown greater generalizability and a more consistent relationship with life satisfaction. In other words, our results show that some character strengths measured by the VIA-IS should be used with caution, since their reliability coefficients vary across samples for reasons that remain unidentified. Consequently, there may be differences in the intensity of the correlations that are related not solely to the defining elements of each strength but also to the low metric quality with which the construct is measured.

Limitations and directions for future research

The main limitation is related to the selection of primary studies for these meta-analyses. Although the VIA-IS is one of the most popular self-report inventories for the assessment of positive traits, few studies consider its psychometric properties when assessing different populations, which reduces the number of eligible studies for both meta-analyses. In addition, the procedure applied for collecting relevant data is rather limited, since numerous publications that used the VIA-IS did not report detailed internal consistencies for all scales and were therefore not considered (i.e., authors of these studies were not contacted and asked for further information). Furthermore, some studies were excluded because they only computed the internal consistency of the whole VIA-IS or relied on reliability induction. These (incorrect) practices did not allow the accuracy of measurement of each scale to be studied. In addition, two studies did not include the reliability of Self-regulation, and another did not include the reliability of Zest, and details about those omissions were not provided. This may explain why the confidence intervals of the reliability of the scores for these character strengths were wider. These issues might be related to the results of the trim-and-fill analyses. In Study I, the reliability estimates changed only slightly for the character strengths that had studies imputed by the trim-and-fill procedure. This implies that the risk of publication bias in this study is low. However, in Study II, the results suggest that the correlations of certain character strengths with life satisfaction might be affected by publication bias or even by the presence of missing values.

It was also not possible to use the same sample of studies for the RG and for the second meta-analysis. Some studies included in Study II did not report the reliability of character strengths scores (see Table 5). The percentage of overlapping studies used in the RG meta-analysis and the meta-analysis of correlations is 45%. Moreover, none of the substantive studies reported the reliability of the SWLS, which prevented the use of the correction for attenuation. In any case, Nimon, Zientek, and Henson (2012) showed that an RG should be carried out first to identify the different sources of error and, therefore, to decide whether it is appropriate to apply this type of correction. In our study, we found that the source of heterogeneity of certain character strengths is unknown; therefore, performing this type of correction would lead to serious biases.
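As a point of reference for the correction mentioned above, the classical correction for attenuation divides an observed correlation by the square root of the product of the two score reliabilities. The following Python sketch is illustrative only: the function name `disattenuate` and all numeric values are hypothetical and are not taken from the studies analyzed here. It shows how one observed correlation yields different corrected values depending on which reliability estimate is assumed.

```python
import math


def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Classical (Spearman) correction for attenuation.

    r_xy  : observed correlation between scores on X and Y
    rel_x : reliability of the X scores (e.g., coefficient alpha)
    rel_y : reliability of the Y scores
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical values: one observed correlation and two plausible alphas
# for the same strength scale, as might be found in different samples.
observed_r = 0.40
alpha_swls = 0.85  # assumed; the primary studies did not report it
for alpha_strength in (0.60, 0.85):
    corrected = disattenuate(observed_r, alpha_strength, alpha_swls)
    print(f"alpha={alpha_strength:.2f} -> corrected r={corrected:.3f}")
```

With these made-up inputs, the lower reliability produces the larger corrected correlation, so choosing an alpha that does not represent the sample at hand directly distorts the corrected estimate.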

Table 5.

Studies Included in Study I and Study II.

Reference	Study I	Study II
Azañedo et al. (2014)	X	X
Boström (2015) ^a		X
Brdar, Anić, & Rijavec (2011) ^a		X
Brdar & Kashdan (2010)	X
Buschor, Proyer, & Ruch (2013) ^a	X	X
Duan et al. (2011)	X
Hanks, Rapport, Waldron-Perrine, & Millis (2014)		X
Hool (2011) ^a		X
Huta & Hawley (2010)	X	X
Jónsdóttir (2010)	X
Karris (2007)	X	X
Khumalo, Wissing, & Temane (2008)	X	X
Lavy & Littman-Ovadia (2011)	X	X
Lee, Foo, Adams, Morgan, & Frewen (2015)	X	X
Linley et al. (2007)	X
Littman-Ovadia & Lavy (2012a)	X	X
Littman-Ovadia & Lavy (2012b)	X
Macdonald, Bore, & Munro (2008)	X
Martínez-Martí & Ruch (2016)		X
Ovejero (2014)	X	X
Park et al. (2004a) ^a		X
Peterson, Ruch, Beermann, Park, & Seligman (2007) ^a	X	X
Proyer, Gander, Wyss, & Ruch (2011)	X	X
Proyer & Ruch (2009)	X
Ramírez, Ortega, & Martos (2015)		X
Ruch et al. (2007) ^a		X
Ruch et al. (2010)	X	X
Shimai et al. (2006)	X	X
Singh & Choubisa (2009)	X	X

^a Studies with more than one sample (see details in supplementary material).

Regarding the second meta-analysis, it is important to mention that causation cannot be inferred from these correlations. More studies are needed to determine whether character strengths truly promote life satisfaction, or whether certain levels of satisfaction with life enhance character strengths. Furthermore, the data meta-analyzed in the present study are based strictly on self-report measures, which might inflate the correlations because of shared method variance.

The results of the present study emphasize the importance of reporting reliability in primary studies to increase and guarantee the validity, generalization, and quality of research results (Bannigan & Watson, 2009; Wilkinson & APA Task Force, 1999). It is also important to take into account the heterogeneity of reliability estimates as an additional source of variation that may affect more substantive meta-analyses (Dimitrov, 2002), such as meta-analyses of correlations.

References

*Azañedo, C. M., Fernández-Abascal, Barraca (2014) Character strengths in Spain: Validation of the Values in Action Inventory of Strengths (VIA-IS) in a Spanish sample. Clínica y Salud 25: 123–130. doi:10.1016/j.clysa.2014.06.002.

Bannigan, Watson (2009) Reliability and validity in a nutshell. Journal of Clinical Nursing 18: 3237–3243. doi:10.1111/j.1365-2702.2009.02939.x.

Bonett, D. G. (2010) Varying coefficient meta-analytic methods for alpha reliability. Psychological Methods 15: 368–385. doi:10.1037/a0020142.

*Boström (2015) Using your strengths. Strengths use and it relation to stress in Sweden (Bachelor project). School for Learning and Environment Psychology, Kristianstad University, Sweden.

Botella, Suero, Gambara (2010) Psychometric inferences from a meta-analysis of reliability and internal consistency coefficients. Psychological Methods 15: 386–397. doi:10.1037/a0019626.

*Brdar, Anić, Rijavec (2011) Character strengths and well-being: Are there gender differences? In: Brdar (ed.) The human pursuit of well-being, New York, NY: Springer, pp. 145–156. doi:10.1007/978-94-007-1375-8_13.

*Brdar, Kashdan, T. B. (2010) Character strengths and well-being in Croatia: An empirical investigation of structure and correlates. Journal of Research in Personality 44: 151–154. doi:10.1016/j.jrp.2009.12.001.

*Buschor, Proyer, R. T., Ruch (2013) Self- and peer-rated character strengths: How do they relate to satisfaction with life and orientations to happiness? Journal of Positive Psychology 8: 116–127. doi:10.1080/17439760.2012.758305.

Carmines, E. G., Zeller, R. A. (1979) Reliability and validity assessment, London, England: SAGE.

Cohen (1992) A power primer. Psychological Bulletin 112: 155–159. doi:10.1037//0033-2909.112.1.155.

Cousin, S. L., & Henson, R. K. (2000, January 27–29). What is reliability generalization and why is important? Paper presented at the Annual Meeting of the Southwest Educational Research Association, Dallas, TX.

Crocker, L. M., Algina (1986) Introduction to classical and modern test theory, New York, NY: Holt, Rinehart and Winston.

Dahlsgaard, Peterson, Seligman, M. E. P. (2005) Shared virtue: The convergence of valued human strengths across culture and history. Review of General Psychology 9: 203–213. doi:10.1037/1089-2680.9.3.203.

Diener, Emmons, Larsen, R. J., Griffin (1985) The Satisfaction with Life Scale. Journal of Personality Assessment 49: 71–75. doi:10.1207/s15327752jpa4901_13.

Dimitrov, D. M. (2002) Reliability: Arguments for multiple perspectives and potential problems with generalization across studies. Educational and Psychological Measurement 62: 783–801. doi:10.1177/001316402236878.

*Duan, Bai, Zhang, Tang, Wang, Tingting (2011) 优势行动价值问卷 (VIA-IS) 在中国大学生中的适用性研究 [Values in Action Inventory of Strengths in college students: Reliability and validity]. Chinese Journal of Clinical Psychology 19: 473–475.

Duval, S. J. (2005) The trim and fill method. In: Rothstein, H. R., Sutton, A. J., Borenstein (eds) Publication bias in meta-analysis: Prevention, assessment, and adjustments, Chichester, England: Wiley, pp. 127–144.

Ferguson, C. J. (2009) An effect size primer: A guide for clinicians and researchers. Professional Psychology: Research and Practice 40(5): 532–538. doi:10.1037/14805-020.

Hafdahl, A. R., Williams, M. A. (2009) Meta-analysis of correlations revisited: Attempted replication and extension of Field's (2001) simulation studies. Psychological Methods 14: 24–42. doi:10.1037/a0014697.

*Hanks, R. A., Rapport, L. J., Waldron-Perrine, Millis, S. R. (2014) Role of character strengths in outcome after mild complicated to severe traumatic brain injury: A positive psychology study. Archives of Physical Medicine and Rehabilitation 95: 2096–2102. doi:10.1016/j.apmr.2014.06.017.

Henson, R. K., Thompson (2002) Characterizing measurement error in scores across studies: Some recommendations for conducting “reliability generalization” studies. Measurement and Evaluation in Counseling and Development 35: 113–126.

Higgins, J. P. T., Thompson, S. G. (2002) Quantifying heterogeneity in a meta-analysis. Statistics in Medicine 21: 1539–1558. doi:10.1002/sim.1186.

*Hool, K. (2011). Character strengths, life satisfaction and orientations to happiness – A study of the Nordic countries (Master's thesis). University of Bergen, Norway.

Huedo-Medina, T. B., Sánchez-Meca, Marín-Martínez, Botella (2006) Assessing heterogeneity in meta-analysis: Q statistic or I² index? Psychological Methods 11: 193–206. doi:10.1037/1082-989x.11.2.193.

Hunter, J. E., Schmidt, F. L. (1990) Methods of meta-analysis: Correcting error and bias in research findings, Newbury Park, CA: Sage.

*Huta, Hawley (2010) Psychological strengths and cognitive vulnerabilities: Are they two ends of the same continuum or do they have independent relationships with well-being and ill-being? Journal of Happiness Studies 11: 71–93. doi:10.1007/s10902-008-9123-4.

*Jónsdóttir, H. (2010). The VIA Inventory of Strengths (VIA-IS). Psychometric properties of a Dutch translation (Doctoral dissertation). Amsterdam University, the Netherlands.

*Karris, M. A. (2007). Character strengths and well-being in a college sample (Doctoral dissertation). University of Colorado, Boulder, CO.

*Khumalo, I. P., Wissing, M. P., Temane, Q. M. (2008) Exploring the validity of the Values-In-Action Inventory of Strengths in an African context. Journal of Psychology in Africa 18: 133–142.

Kimberlin, C. L., Winterstein, A. G. (2008) Validity and reliability of measurement instruments used in research. American Journal of Health-System Pharmacy 65: 2276–2284. doi:10.2146/ajhp070364.

*Lavy, Littman-Ovadia (2011) All you need is love? Strengths mediate the negative associations between attachment orientations and life satisfaction. Personality and Individual Differences 50: 1050–1055. doi:10.1037/e537902012-011.

*Lee, J. N. T., Foo, K. H., Adams, Morgan, Frewen (2015) Strengths of character, orientations to happiness, life satisfaction and purpose in Singapore. Journal of Tropical Psychology 5: 1–21. doi:10.1017/jtp.2015.2.

*Linley, Maltby, Wood, A. M., Joseph, Harrington, Peterson, Seligman, M. E. P. (2007) Character strengths in the United Kingdom: The VIA Inventory of Strengths. Personality and Individual Differences 43: 341–351. doi:10.1016/j.paid.2006.12.004.

*Littman-Ovadia, Lavy (2012a) Character strengths in Israel. Hebrew adaptation of the VIA Inventory of Strengths. European Journal of Psychological Assessment 28: 41–50. doi:10.1027/1015-5759/a000089.

*Littman-Ovadia, Lavy (2012b) Differential ratings and associations with well-being of character strengths in two communities. Health Sociology Review 21: 299–312.

*Macdonald, Bore, Munro (2008) Values in action scale and the big 5: An empirical indication of structure. Journal of Research in Personality 42: 787–799. doi:10.1016/j.jrp.2007.10.003.

*Martínez-Martí, M. L., Ruch (2016) Character strengths predict resilience over and above positive affect, self-efficacy, optimism, social support, self-esteem, and life satisfaction. The Journal of Positive Psychology 12: 110–119. doi:10.1080/17439760.2016.1163403.

McGrath (2016) Measurement invariance in translations of the VIA Inventory of Strengths. European Journal of Psychological Assessment 32: 187–194. doi:10.1027/1015-5759/a000248.

Messick (1995) Validity of psychological assessment. Validation of inferences from person's responses and performance as scientific inquiry into score meaning. American Psychologist 50: 741–749. doi:10.1037/0003-066x.50.9.741.

Niemiec, R. M. (2013) VIA character strengths: Research and practice (the first 10 years). In: Knoop, H. H., Delle Fave (eds) Well-being and cultures: Perspectives on positive psychology, New York, NY: Springer, pp. 11–30.

Nimon, Zientek, L. R., Henson, R. K. (2012) The assumption of a reliable instrument and other pitfalls to avoid when considering the reliability of data. Frontiers in Psychology 3: 1–13. doi:10.3389/fpsyg.2012.00102.

Nunnally, J. C., Bernstein, I. H. (1994) Psychometric theory, 3rd ed. New York, NY: McGraw-Hill.

*Ovejero, M. (2014). Evaluación de fortalezas humanas en estudiantes de la Universidad Complutense de Madrid y diferencias de sexo: Relación con salud, resiliencia y rendimiento académico [Assessment of human strengths in students of the Complutense University of Madrid and sex differences: Relationship with health, resilience and academic performance] (Doctoral dissertation). Complutense University of Madrid, Spain.

Park, Peterson (2009) Character strengths: Research and practice. Journal of College and Character 10(4): 1–10. doi:10.2202/1940-1639.1042.

*Park, Peterson, Seligman, M. E. P. (2004a) Strengths of character and well-being. Journal of Social and Clinical Psychology 23: 603–619. doi:10.1521/jscp.23.5.603.50748.

Park, Peterson, Seligman, M. E. P. (2004b) Strengths of character and well-being: A closer look at hope and modesty. Journal of Social and Clinical Psychology 23: 628–634. doi:10.1521/jscp.23.5.628.50749.

Peterson (2006) The Values In Action (VIA) classification of strengths. In: Csikszentmihalyi, Csikszentmihalyi, I. S. (eds) A life worth living. Contributions to positive psychology, New York, NY: Oxford University Press, pp. 29–48.

Peterson, Park (2006) Character strengths in organizations. Journal of Organizational Behavior 27: 1149–1154. doi:10.1002/job.398.

Peterson, Park (2009) El estudio científico de las fortalezas humanas [The scientific study of human strengths]. In: Vázquez, Hervás (eds) La ciencia del bienestar. Fundamentos de una psicología positiva [The science of well-being. Fundamentals of a positive psychology], Madrid, Spain: Alianza Editorial, pp. 181–207.

Peterson, Park, Seligman, M. E. P. (2005) Assessment of character strengths. In: Koocher, G. P., Norcross, J. C., Hill, S. S., III (eds) Psychologists' desk reference Vol 3, 2nd ed. New York, NY: Oxford University Press, pp. 93–98.

*Peterson, Ruch, Beermann, Park, Seligman, M. E. P. (2007) Strengths of character, orientations to happiness, and life satisfaction. Journal of Positive Psychology 2: 149–156. doi:10.1080/17439760701228938.

Peterson, Seligman, M. E. P. (2004) Character strengths and virtues. A handbook and classification, New York, NY: Oxford University Press.

Proyer, R. T., Gander, Wellenzohn, Ruch (2013) What good are character strengths beyond subjective well-being? The contribution of the good character on self-reported health-oriented behavior, physical fitness, and the subjective health status. The Journal of Positive Psychology 8: 222–232. doi:10.1080/17439760.2013.777767.

*Proyer, R. T., Gander, Wyss, Ruch (2011) The relation of character strengths to past, present, and future life satisfaction among German-speaking women. Applied Psychology: Health and Well-Being 3: 370–384. doi:10.1111/j.1758-0854.2011.01060.x.

*Proyer, R. T., Ruch (2009) How virtuous are gelotophobes? Self-and-peer reported character strengths among those who fear being laughed at. International Journal of Humor Research 22: 145–163. doi:10.1515/humr.2009.007.

R Core Development Team (2018) R: A language and environment for statistical computing, Vienna, Austria: R Foundation for Statistical Computing.

*Ramírez, Ortega, A. R., Martos (2015) Las fortalezas en personas mayores como factor que aumenta el bienestar [Strengths in the elderly as a factor that increases wellbeing]. European Journal of Investigation in Health, Psychology and Education 5: 187–195.

*Ruch, Huber, Beermann, Proyer, R. T. (2007) Character strengths as predictors of the “good life” in Austria, Germany and Switzerland. In Romanian Academy, “George Barit” Institute of History, Department of Social Research (Eds.), Studies and researches in social sciences Vol 16, Cluj-Napoca, Romania: Argonaut Press, pp. 123–131.

*Ruch, Proyer, R. T., Harzer, Park, Peterson, Seligman, M. E. P. (2010) Values in action inventory of strengths (VIA-IS). Adaptation and validation of the German version and the development of a peer-rating form. Journal of Individual Differences 31: 138–149. doi:10.1027/1614-0001/a000022.

Sánchez-Meca, López-Pina, J. A. (2008) El enfoque meta-analítico de generalización de la fiabilidad [The meta-analytical approach to reliability generalization]. Acción Psicológica 5: 37–64.

Schmidt, F. L., Hunter, J. E. (1977) Development of a general solution to the problem of validity generalization. Journal of Applied Psychology 62: 529–540. doi:10.1037/0021-9010.62.5.529.

Schwarzer, G. (2014). Meta-analysis with R [statistical software]. CRAN repository. Retrieved from http://cran.r-project.org/.

*Shimai, Otake, Park, Peterson, Seligman, M. E. P. (2006) Convergence of character strengths in American and Japanese young adults. Journal of Happiness Studies 7: 311–322. doi:10.1007/s10902-005-3647-7.

*Singh, Choubisa (2009) Psychometric properties of Hindi translated version of Values in Action Inventory of Strengths (VIA-IS). Journal of Indian Health Psychology 4: 65–76.

*Singh, Choubisa (2010) Empirical validation of Values in Action-Inventory of Strengths in Indian context. Psychological Studies 55: 151–158. doi:10.1007/s12646-010-0015-4.

Sterne, J. A. C., Sutton, A. J., Ioannidis, J. P. A., Terrin, Jones, D. R., Lau, Higgins, J. P. T. (2011) Recommendations for examining and interpreting funnel plot asymmetry in meta-analyses of randomised controlled trials. BMJ 343: d4002. doi:10.1136/bmj.d4002.

Sullivan, G. M. (2011) A primer on the validity of assessment instruments. Journal of Graduate Medical Education 3: 119–120. doi:10.4300/jgme-d-11-00075.1.

Vacha-Haase (1998) Reliability generalization: Exploring variance in measurement error affecting score reliability across studies. Educational and Psychological Measurement 58: 6–20.

Vacha-Haase, Henson, R. K., Caruso, J. C. (2002) Reliability generalization: Moving toward improved understanding and use of score reliability. Educational and Psychological Measurement 62: 562–569. doi:10.1177/0013164402062004002.

Vacha-Haase, Thompson (2011) Score reliability: A retrospective look back at 12 years of reliability generalization studies. Measurement and Evaluation in Counseling and Development 44: 159–168. doi:10.1177/0748175611409845.

VIA Institute. (2012). VIA Inventory of Strengths. Retrieved from www.viasurvey.org.

Viechtbauer (2010) Conducting meta-analyses in R with the metafor package. Journal of Statistical Software 36: 1–48.

Viechtbauer, W. (2014). Meta-analysis package for R [statistical software]. CRAN repository. Retrieved from http://cran.r-project.org/.

Warne, R. M. (2008) Applied statistics: From bivariate through multivariate techniques, Thousand Oaks, CA: Sage.

Wilkinson, APA Task Force on Statistical Inference (1999) Statistical methods in psychology journals: Guidelines and explanations. American Psychologist 54: 594–604. doi:10.1037/0003-066X.54.8.594.

Wood, A. M., Linley, Maltby, Kashdan, T. B., Hurling (2011) Using personal and psychological strengths leads to increases in well-being over time: A longitudinal study and the development of the strengths use questionnaire. Personality and Individual Differences 50: 15–19. doi:10.1016/j.paid.2010.08.004.