Abstract
We know that there are cross-cultural differences in psychological variables, such as individualism/collectivism. But it has not been clear which of these variables show relatively the greatest differences. The Survey of World Views project operated from the premise that such issues are best addressed in a diverse sampling of countries representing a majority of the world’s population, with a very large range of item-content. Data were collected online from 8,883 individuals (almost entirely college students based on local publicizing efforts) in 33 countries that constitute more than two thirds of the world’s population, using items drawn from measures of nearly 50 variables. This report focuses on the broadest patterns evident in item data. The largest differences were not in those contents most frequently emphasized in cross-cultural psychology (e.g., values, social axioms, cultural tightness), but instead in contents involving religion, regularity-norm behaviors, family roles and living arrangements, and ethnonationalism. Content not often studied cross-culturally (e.g., materialism, Machiavellianism, isms dimensions, moral foundations) demonstrated moderate-magnitude differences. Further studies are needed to refine such conclusions, but indications are that cross-cultural psychology may benefit from casting a wider net in terms of the psychological variables of focus.
We know that psychological variables sometimes show cross-cultural differences. Indeed, large bodies of scientific literature have arisen around those variables presumed to best represent important cross-cultural differences. Such variables include dimensions such as individualism and collectivism (Oyserman, Coon, & Kemmelmeier, 2002; Triandis & Gelfand, 1998), tightness and looseness (Gelfand et al., 2011; Triandis, 1989), and multidimensional research domains such as values (Schwartz & Bilsky, 1990), social axioms (Leung & Bond, 2004), and normative practices (House, Hanges, Javidan, Dorfman, & Gupta, 2004). The focus of these variables corresponds to consensual definitions of culture in terms of shared beliefs, values, and norms. Investigations of such key cultural variables are a major focus of cross-cultural psychology.
But which of these variables show the greatest magnitude of differences between populations? Do all of them show sizable differences? Do variables other than these show even greater differences, and thus need more attention? Answers to these questions remain unclear, for at least two reasons. First, there have been few studies that compare many of these variables with respect to magnitude of cross-cultural differences. Second, there have not been clearly framed comparisons to distinguish large magnitudes of difference from smaller ones.
Among the studies relevant to addressing these questions, there have been significant limitations. One is the modest range of variables: It has been typical for each study to include at most roughly a dozen psychological variables. Moreover, the representation of countries typically gives an unfaithful picture of relative contributions to total world population: Western countries along with, in some cases, East Asian countries tend to be over-represented, with the rest of the world under-represented; even where the range of countries is unusually wide, one often sees a much denser sampling of European nations. For example, about 69% of the national happiness data found in the World Database of Happiness (Veenhoven, 2010) were collected from European and North American samples (Tov & Au, 2013). These tendencies might derive from how cross-cultural psychology projects are often organized, with participants found mainly in those countries with the largest concentrations of psychologists. The outcome is understandable, but leads to our having only a patchwork representation of populations as well as variables.
The Survey of World Views project was designed with the aim of overcoming limitations in addressing the crucial questions identified above: In what variables does one find greater or lesser between-population differences, and how does this correspond to common variable-selection preferences in cross-cultural psychology? To provide an incrementally clearer answer than previous studies have provided, this project differed from them in two crucial ways. It included a very large range of item-content: several dozen variables which might be compared with respect to degrees of similarity and difference across samples. And it featured a stronger proportional representation of the “global south” (Africa, south/southeast Asia, Latin America) than has been typical in these studies, so as to represent a majority of the world’s population by its selection of countries. Items predominantly referred to beliefs, some of which could be considered values and statements as to what are or should be norms. The categories of content are heterogeneous, and difficult to summarize in a concise label for the data set. Because these categories all involve “views” held by individuals, the label “Survey of World Views” seemed to fit: The project elicited a diverse range of views about the world, as reflected in many psychological variables, as experienced by respondents from many parts of the world.
Method
Participants
Data were collected online: 8,883 individuals in 33 countries provided some input data for this Survey of World Views in 2012. About 90% of them provided sufficient responses to be usable for the analyses here, despite whatever challenges they faced with computer hardware, internet access, and the various distractions that lead respondents to abandon a survey.
The aim was to sample students from institutions of higher education, from some diversity of fields of study. College students were sought so as to enable recruitment within a short time frame and to enable standardized online administration. Recruiting students minimized between-population differences in level of education; fully representative populations from each country would have large between-population differences in education level. For some research purposes, it might be useful to remove a potential confound of country with education level (perhaps related to reading level, thus having an effect on understanding of the questionnaire items). Students with ample secondary education may show less response-bias (acquiescence) variance than do representative samples (Rammstedt, Kemper, & Borg, 2013). Diener, Diener, and Diener (1995) found, with reference to subjective well-being, that college-student samples give moderately accurate estimates of the between-country differences one finds with more representative surveys that are more difficult to obtain.
In every country, the method of recruitment was the same, and carried out locally, not over the worldwide web. Cooperating instructional faculty distributed flyers to students in as wide a variety of classes and fields of study as they could arrange. Each student received one flyer, containing information about the study, the compensation to be earned by participating, the online address of the data-collection website, and a login code unique to that flyer. Recipients could then elect whether to participate on their own time, through any internet access point available to them. In countries where it was practical to order from Amazon.com or an affiliate, participants were issued an Amazon gift certificate upon completion of the survey; this gift certificate was approximately US $20 in value, except that in a few of the relatively affluent countries the value was set slightly higher to provide the needed incentive. In countries where an Amazon gift certificate was not practical, participants were sent $20 via Western Union money transfer.
For each country, a separate data-collection portal was constructed. In most cases, the survey and all materials appeared in the major national language, following translation (using back-translation checks) conducted under the auspices of the project. The languages used included English (Kenya, India, Singapore, United Kingdom, Ireland, Australia, Canada, United States), Spanish (Spain, Mexico, Peru, Argentina), Chinese (China, Taiwan), Arabic (Morocco, Egypt), plus the following languages that were country-specific for this project: Kiswahili (Tanzania), Amharic (Ethiopia), Turkish (Turkey), Bengali (Bangladesh), Nepali (Nepal), Malay (Malaysia), Filipino/Tagalog (Philippines), Thai (Thailand), Korean (Korea), Japanese (Japan), Russian (Russia), Ukrainian (Ukraine), Polish (Poland), Greek (Greece), German (Germany), Dutch (Netherlands), and Portuguese (Brazil).
Where feasible, to contribute to diversity in sampling, participants were recruited from more than one site (i.e., educational institution) in each country. Moreover, we sought to have participating students within any one institution come from diverse fields of study (e.g., business, education, humanities, social science). A large proportion (nearly 60%) of the cooperating faculty were not in psychology departments.
A population-sampling strategy was set out in advance. Countries were included in an attempt to represent the world, both in terms of demographic footprint and of economic impact. The first and second authors created a ranking of the countries of the world based on their relative contributions to world population and to the aggregated world gross domestic product (GDP). The larger the average of these ranked contributions, the higher the priority placed on identifying and coordinating with instructional faculty within the country. We did not succeed in locating cooperating faculty in some high-priority countries (e.g., Iran, Pakistan, Nigeria), but in most cases potentially cooperating faculty could be identified.
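The prioritization rule described above can be illustrated with a minimal sketch. It rests on assumptions not stated in the text: each country receives a rank score on each criterion (1 for the smallest contribution, n for the largest), the two scores are averaged, and ties are broken by population share. The function names, country labels, and share values are all hypothetical.

```python
def rank_scores(values):
    """Map each country to a rank score: 1 for the smallest value, n for the largest."""
    ordered = sorted(values, key=values.get)
    return {name: position for position, name in enumerate(ordered, start=1)}


def priority_order(population_share, gdp_share):
    """Order countries from highest to lowest recruitment priority."""
    pop_rank = rank_scores(population_share)
    gdp_rank = rank_scores(gdp_share)
    mean_rank = {c: (pop_rank[c] + gdp_rank[c]) / 2 for c in population_share}
    # Assumption: ties in the averaged rank are broken by population share.
    return sorted(
        mean_rank,
        key=lambda c: (mean_rank[c], population_share[c]),
        reverse=True,
    )


# Hypothetical shares of world population and world GDP (not real data).
population_share = {"A": 0.19, "B": 0.04, "C": 0.03, "D": 0.01}
gdp_share = {"A": 0.10, "B": 0.25, "C": 0.02, "D": 0.01}

order = priority_order(population_share, gdp_share)
print(order)
```

Averaging rank positions rather than raw shares keeps one very large economy or population from dominating the ordering; whether the project averaged ranks or shares is not specified in the text.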
The 33 countries sampled have aggregated populations amounting to some 67.3% (4.7 billion) of the world’s population; and when the GDP of these 33 countries is aggregated, the total makes up some 76.2% of the aggregate gross domestic product of all countries in the world (Central Intelligence Agency, 2012). It is fair to say that young people from most of the world (whether in terms of demographic footprint or economic impact) are represented in the 33 countries in this project. Henrich, Heine, and Norenzayan (2010) have pointed to the predominance of “WEIRD” samples (from Western, Educated, Industrialized, Rich Democracies) in psychological and even much cross-cultural research. The countries sampled here are clearly not biased in favor of Western industrialized democracies, though the method of online survey administration did make it more practical to recruit rather educated samples that are probably richer than average for their country.
The recruitment goal was roughly 300 participants (plus or minus 100) per country; a sample size sufficient for multivariate analyses within each country was desirable. Variation in sample size by country arose due to local, practical factors (e.g., numbers of flyers distributed).
Table 1 displays demographic characteristics for samples from the 33 countries, grouped into geographic regions for easier understanding of how world populations were sampled. Table 1 includes sample size, gender, mean age, and mean-percent-missing data. There is substantial variation between countries in degree of missing data, much of this clearly due to country-specific challenges in our online surveys. In some countries, the data-collection portal sometimes operated more fitfully due to slow connection speeds, and at times the U.S. computer server was slowed when too much data were arriving all at once. Thus, mean-percent-missing should not be interpreted as a substantive country or cultural characteristic.
Table 1. Demographic Characteristics for 33 Countries, Grouped by Region.
Note. Mean-%-missing is the mean percent of missing responses across 281 prime survey items.
Materials
The survey used items from measures of nearly 50 variables, drawn from 17 distinct sources, each involving in some way beliefs, values, and norms that might be shared across persons (thus fitting a rather consensual definition of “cultural”). The goal was to be as comprehensive as possible within a moderate-length questionnaire. Cross-cultural psychology has no unified theory to guide selection of variables, and a heterogeneous selection, representing diverse theoretical approaches, might provide the most fuel for future theoretical development. These sources are described here in brief summary form.
GLOBE normative practices
There were 43 items drawn from House et al. (2004), indexing dimensions of performance orientation, future orientation, humane orientation, gender egalitarianism, assertiveness, collectivism, power distance, and uncertainty avoidance (the last four substantially related to scales of Hofstede, 2001). Items referred to as-is societal practices (not values) with a “referent-shift” format—respondents described characteristics of people in their country rather than of themselves, with items usually beginning “In this society . . . ”
Cultural tightness–looseness
Six items, drawn from Gelfand et al. (2011), also had a referent-shift format, all beginning “In this country . . . ” Except as indicated, for all other sources described below, item presentation did not involve a referent-shift format.
Social axioms
Thirty core items (defined based on most consistent univocal associations with the intended dimension) were drawn from Leung et al. (2002). Social-axiom dimensions include cynicism, fate control, religiosity, social complexity, and reward for application.
Individualism and collectivism (idiocentrism and allocentrism)
The 16 items of Triandis and Gelfand (1998) were included: 4 items each for vertical (hierarchical) individualism, vertical collectivism, horizontal (egalitarian) collectivism, and horizontal individualism.
Values
The full 10 items of the Short Schwartz Values Survey (SSVS; Lindeman & Verkasalo, 2005) were included. Each of the 10 values-clusters proposed by Schwartz (Schwartz & Bilsky, 1990) is represented on this brief form by one item.
Family values
There were eight items from a measure of family values (Georgas, 1989), with four selected (based on van de Vijver, Mylonas, Pavlopoulos, & Georgas, 2006, Table 7.8) for each of two dimensions: hierarchy (focused on gender roles) and relationships (i.e., cohesiveness, reputation, obligations).
Isms dimensions
Forty-six items represented factors (Saucier, 2000, 2013) defined from the domain of dictionary terms ending in –ism: Tradition-Oriented Religiousness, Subjective Spirituality, Unmitigated Self-Interest, Communal Rationalism, and Inequality-Aversion.
Moral foundations
Included was the 22-item short form of the Moral Foundations Questionnaire (Graham et al., 2011), which assesses five major criteria used to distinguish right from wrong: Harm/Care, Justice/Fairness, Loyalty, Authority, and Purity/Divinity.
Religiousness and devout behaviors
The five items of the Duke Religion Index (DRI; Koenig, Patterson, & Meador, 1997) reference not only the value accorded to religion, but also religious experiences, practices, and meeting-attendance.
Materialism
Four items drawn from a synthesis of the empirical literature on materialist values (Shen-Miller, Saucier, & Pan, 2013) were included.
Machiavellianism
Five items drawn from studies of core content in measures of Machiavellianism (Saucier, Chen, & Bettenhausen, 2014) were included.
Nationalism
There were four items capturing ethnonationalism as in the theory of Anthony D. Smith, and two items capturing a multiculturalist civic nationalism (Saucier, 2014).
Extremist thinking patterns
Seventeen items came from a very brief overall measure of extremist thinking styles (from Saucier, Akers, Shen-Miller, Stankov, & Knezevic, 2009; Stankov, Saucier, & Knezevic, 2010).
Proneness to aggress
Three items from previous work (Henry, 2009) captured readiness to aggress vengefully to insults or slights to honor (within a “culture of honor” syndrome).
Amoralism
Fourteen items were based on the construct as defined by Stankov and Knezevic (2005).
Personality
Although it does reflect norms, values, and beliefs to some degree, personality was not expected to generate strong population differences (Poortinga, van de Vijver, & van Hemert, 2002), making it an interesting domain for comparison. As benchmarks for this domain, there were 40 items referencing individual behavioral dispositions in the Big Six model (Saucier, 2009), including all of the items in the 36QB6 measure (Thalmayer, Saucier, & Eigenhuis, 2011). The Big Six model is akin to the HEXACO model of personality structure but is based on a broader range of studies of personality-language.
Regularity-norm behaviors
The selection of variables detailed above tends to omit content oriented to a type of social norms that sociologists describe, distinct from the more restrictive norms about what one ought or ought not to do. These norms involve “behavioral regularities that generate social expectations without any moral obligations,” although deviations from common practice can still lead to costs being imposed (Hechter & Opp, 2001, p. xiii). The literature in anthropology (e.g., Levinson & Malone, 1980) and cultural psychology (e.g., Heine, 2008) suggested six kinds of regularity-oriented social norms differing across populations (involving alcohol, sex, sleeping arrangements, and beliefs about ancestors, spirit-possession, and sorcery and witchcraft); these were represented in referent-shift items beginning “In this society . . . ”
Analyses
The present analyses are conducted entirely at the item level, treating each item as a variable on its own (as in Funder, Furr, & Colvin, 2000; Westen & Shedler, 2007). This approach allows for a look at big-picture patterns in the data without delving into the relative strengths and weaknesses of various scales that are composed of aggregated items. Thus, these analyses leave aside the issue of whether various collections of items function similarly together as scales measuring one or another intended construct; indeed, it would be impossible to report all required analyses around this issue (for nearly 50 intended scales) in a reasonably sized paper. The present analyses also leave aside the issue of whether there are between-population differences in response biases, operating on the assumption that these differences are no more than moderate in size and tend to operate rather similarly across a large range of content. Given the likelihood that studies of response biases and of specific scales in these data will provide some refinements to exact estimates provided here, the focus here is only on those relatively large and dramatic effects least likely to be affected by such refinements.
To assess how well individual items discriminate between different country samples, we report eta-squared coefficients, which reflect the “proportion of the variation in Y that is associated with membership of the different groups defined by X” (Richardson, 2011, p. 136). In other words, eta-squared reflects how well a particular item differentiates between respondents from different countries, referencing the proportion of variance that occurs between groups/samples. Similarly, we generated intraclass correlations (ICC[1]) as the ratio of the intercept variance to the total (intercept + residual) variance via Restricted Maximum Likelihood (REML) estimation (of covariance parameters) under the Mixed procedure in SPSS. Although eta-squared is generally thought to include some upward bias not present in ICC estimates, particularly when group/sample sizes are small (Bliese & Halverson, 1998), some recent work indicates that it may be less biased at low effect sizes (Shieh, 2012). Both indices (eta-squared and ICC) reward variables for having either low variation within groups or high variation between groups; a variable with large variation (i.e., disagreement) within groups can only achieve a large index value by showing extraordinarily large variation between groups. Because the focus is on directionless effect-size measures (i.e., the raw magnitude of effect), squared coefficients are reported. The caution of Matsumoto, Grissom, and Dinnel (2001) is pertinent: Squared coefficients give an illusory smallness; an eta-squared of .10 is not a small effect.
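As an illustration of the two indices, the following minimal sketch computes eta-squared and ICC(1) for a single item from one-way sums of squares. Note the substitution: ICC(1) is obtained here with the classical one-way ANOVA estimator, not the REML estimation in SPSS described above, so the two approaches need not give identical values, especially with unequal sample sizes. The function name, the data layout, and the response values are hypothetical.

```python
from statistics import mean


def eta_squared_and_icc1(groups):
    """groups: dict mapping a country label to a list of responses on one item."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    n_total = len(all_values)
    n_groups = len(groups)
    group_means = {g: mean(v) for g, v in groups.items()}

    # One-way decomposition of the total sum of squares.
    ss_between = sum(
        len(v) * (group_means[g] - grand_mean) ** 2 for g, v in groups.items()
    )
    ss_within = sum(
        (x - group_means[g]) ** 2 for g, v in groups.items() for x in v
    )
    eta_sq = ss_between / (ss_between + ss_within)

    # ANOVA estimator of ICC(1); n0 is the group size adjusted for unequal n.
    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    n0 = (n_total - sum(len(v) ** 2 for v in groups.values()) / n_total) / (
        n_groups - 1
    )
    icc1 = (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)
    return eta_sq, icc1


# Made-up responses from three hypothetical country samples.
demo = {"X": [1, 2, 3], "Y": [2, 3, 4], "Z": [4, 5, 6]}
eta_sq, icc1 = eta_squared_and_icc1(demo)
print(round(eta_sq, 3), round(icc1, 3))  # 0.7 0.667
```

In this toy example eta-squared (.70) exceeds ICC(1) (.67), in line with the upward bias of eta-squared in small groups noted above.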
The aforementioned coefficients were generated for 281 items, using the maximum sample sizes available for each item. To observe results under conditions in which response biases are summarily removed, the same analyses were repeated with data that had been ipsatized (standardized within-subject to eliminate individual differences in use of response scales) after rescaling all response scales to the same 1-to-6 range. Some analyses grouped the 281 items in the total item-pool into 18 categories based on their provenance (i.e., the various sources described above).
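The rescaling and ipsatization steps can be sketched as follows. This is a minimal illustration under assumptions not given in the text: rescaling is a linear map of each scale's endpoints onto 1 and 6, the within-person standard deviation is the population form, missing responses are marked with None and skipped, and a respondent with no response variance is assigned zeros. All names and values are hypothetical.

```python
from statistics import mean, pstdev


def rescale(value, low, high, new_low=1.0, new_high=6.0):
    """Linearly map a response from the range [low, high] onto [new_low, new_high]."""
    return new_low + (value - low) * (new_high - new_low) / (high - low)


def ipsatize(responses):
    """Standardize one respondent's rescaled responses; None marks a missing item."""
    observed = [r for r in responses if r is not None]
    person_mean = mean(observed)
    person_sd = pstdev(observed)
    if person_sd == 0:
        # Assumption: a respondent with no variance gets zeros throughout.
        return [None if r is None else 0.0 for r in responses]
    return [None if r is None else (r - person_mean) / person_sd for r in responses]


# One hypothetical respondent answering three items on a 1-to-5 scale.
raw = [1, 3, 5]
rescaled = [rescale(r, low=1, high=5) for r in raw]  # [1.0, 3.5, 6.0]
z_scores = ipsatize(rescaled)
print(rescaled, z_scores)
```

After this step every respondent has a mean of 0 and a standard deviation of 1 across items, which is why ipsatization removes differences in acquiescent, middle, or extreme responding along with any valid differences in response level.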
In the present analyses, data from only 30 countries were utilized. The de-selection of small-sample data from Australia, Ireland, and the Netherlands removed only 1.5% of the participants (i.e., 130 cases), but lessened the tendency toward an over-representation of European-origin populations. It reduced the proportion of countries from Europe (plus United States, Canada, and Australia) to one third of the total set of countries, rather than nearly 40% if these three countries had been included. For reference, some 16% of the global human population resides in Europe, United States, Canada, and Australia combined; the countries comprised therein have more than a 50% share of globally aggregated GDP (Central Intelligence Agency, 2012).
The analyses reported here excluded all of those relatively few participants who indicated they were not a student at any higher education institution.
Results
Unsurprisingly, given the sample size, all items had statistically significant country effects.
Table 2 presents the items showing the largest cross-population differences across 30 countries. The 42 items with the largest eta-squared values are shown, and include all items that had either an eta-squared or an ICC value of at least .20 (a large effect for country-of-origin). Eta-squared and ICC tended to be quite similar, although eta-squared was more often the higher of the two. Ipsatizing lowered the coefficients more markedly, most often by .04 to .07 in Table 2 (whether eta-squared or ICC). For each item, the source (among the 17 described above) is indicated in the table by a single-letter code.
Table 2. Items Showing the Largest Cross-Population Differences Across 30 Countries.
Note. n ranges from 7,268 to 7,871 depending on the item. η² and ICC indicate the proportion of between-individual variance in the item accounted for by between-country differences. Letters in parentheses indicate the item-pool source: (a) social axioms, (c) collectivism, (d) Duke Religion Index, (e) extremist thinking patterns, (f) family values, (g) GLOBE normative practices, (h) Machiavellianism, (i) isms, (m) moral foundations, (n) ethnonationalism, (p) proneness to aggress (culture of honor), (r) added “regularity norm” items derived from anthropological literature. See supplementary document for similar analyses conducted across all 33 nations, and for other supplementary notes. η² = eta-squared; ICC = intraclass correlation (ICC[1]); ips = in ipsatized data.
We believe that the 30-country selection has the strongest rationale. But results were indistinguishable whether one used the selected 30 countries, all 33 countries, or just those 27 countries that had sample sizes above 150. Across the 281 items, the eta-squared values from these three varying selections of countries correlated .9995 or higher with each other, and no eta-squared value differed by more than .007 across these three ways of computing the values.
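The robustness check just described compares per-item eta-squared values computed under different country selections. A minimal sketch of that comparison follows, using a plain Pearson correlation and the largest absolute difference; the function names and the five eta-squared values per selection are hypothetical, not the study's data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(var_x * var_y)


def max_abs_difference(x, y):
    """Largest absolute item-wise difference between two sequences."""
    return max(abs(a - b) for a, b in zip(x, y))


# Hypothetical eta-squared values for five items under two country selections.
eta_30_countries = [0.08, 0.12, 0.21, 0.35, 0.10]
eta_33_countries = [0.08, 0.13, 0.21, 0.34, 0.10]

r = pearson_r(eta_30_countries, eta_33_countries)
largest_gap = max_abs_difference(eta_30_countries, eta_33_countries)
print(round(r, 4), round(largest_gap, 3))
```

Reporting both statistics is useful because a high correlation alone could mask a uniform shift in all values, whereas the maximum difference bounds how far any single item moved.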
Violating expectation, the largest differences were not on those contents most frequently emphasized in cross-cultural psychology (e.g., social axioms, cultural tightness, individualism and collectivism, Schwartz values). None of the items from these sources had eta-squared values exceeding .20. For normative practices (from GLOBE), large differences arose for only three items referencing how children are reared or relate to their parents. Overall, for contents from any source, country-of-origin typically accounted for only about 10% of variance in the item. According to conventional standards (J. Cohen, 1992), these are medium-sized effects.
The larger differences were in contents involving religion, regularity-norms, and ethnonationalism. These were large effects: Country-of-origin accounted for 20% to 40% of the variance in the item. Four of the five items with the highest eta-squared were from the DRI; the remaining DRI item had the 16th highest; and the 3rd highest eta-squared was for a non-DRI item involving the importance of religion. All four of the ethnonationalism items had coefficients near to or above .20, and five of the six regularity-norm items did. No other source had more than one third of its items beyond this .20 threshold.
Table 3 provides the mean of eta-squared values for the items derived from each source. It documents the correspondence between item-content and cross-cultural differences in a summary way, ranking the sources based on average eta-squared value of items in their part of the item pool. Consistent with the portrayal just offered based on individual items, the DRI, ethnonationalism, and the regularity-norm items showed markedly more between-population differences than did items from any other source.
Table 3. Comparison of Item-Pool Sources: Average Eta-Squared Values Across Items From Each Source.
Note. Means computed across eta-squared values derived from analyses with N ranging from 7,268 to 7,871. “No. Items” refers to the number of items in each source, across which the respective mean is computed.
Not shown in Table 3 are the average eta-squared values for dimensions within each item-pool source, three of which deserve a brief mention. First, within the “isms” source, eight items are intended to measure beliefs associated with Tradition-Oriented Religiousness, four of which appear in Table 2; the average eta-squared across the eight items was .21 (.20 with ipsatized data). This provides further support for religious behaviors/beliefs as a key location for cross-cultural differences. Second, four of the “family values” items measure hierarchy, that is, traditional gender roles. These four items had an average eta-squared of .20 (although only .14 with ipsatized data). Third, four of the GLOBE normative-practices items measure in-group (or family) collectivism, and the four together had an average eta-squared of .20 (again, .14 with ipsatized data). Their content concerns closeness between generations within the family, as did that of two regularity-norm items (those involving parent–child sleeping arrangements and the impact of ancestors). This family-oriented collectivism (Vandello & Cohen, 1999), as distinct from the collectivism captured in survey measures of Triandis and Gelfand (1998), is another possible content area with substantial cross-cultural differences.
In sum, the largest cross-cultural differences were found to reflect four kinds of content: behaviors and beliefs indicating devotion to religion, ethnonationalism, hierarchical family values, and aspects of family-oriented collectivism. The other sources of items showed average eta-squared values in the vicinity of .10. In Table 3, the content most often studied cross-culturally (GLOBE normative practices, social axioms, Triandis individualism and collectivism, Schwartz values, and tightness–looseness) is found intermixed with content such as personality, isms, and moral foundations that has attracted far less interest for capturing differences between populations. This is in line with cultural effect-size estimates for values and personality in previous studies (see, for example, Fischer & Schwartz, 2011, Table 1; van Hemert, 2011, Table 5.1; Fischer and Schwartz likewise observed elevated effect sizes for values related to religiosity).
Discussion
Major Implications
The central message of these findings is quite clear. If a cross-cultural psychologist wishes to focus on variables that generate strong differences between populations, one good strategy is to focus on beliefs connected to religion (or the metaphysical), and especially on practices and behaviors that reflect the everyday impact of religion on persons. The central message here resonates with recent arguments by others (Tarakeshwar, Stanton, & Pargament, 2003; also Fischer & Schwartz, 2011; Georgas, van de Vijver, & Berry, 2004). Religiousness tends to have high within-country variation (see, for example, Fischer & Schwartz, 2011) but the between-country variation is so great as to yield high ICC values nonetheless.
Such a psychologist should include in the high-priority list “regularity-norms” (Hechter & Opp, 2001): ways of doing things that are widespread, conventional, only partly moralized, distinct to one culture versus another, and less systematized than those associated with religion. Culture might be conceived as mainly a rather loose association of multitudinous conventions (Poortinga, 2011). The contrast between religious- and regularity-norms is potentially quite strong, as the former involves explicit and the latter more implicit cultural models. These may be two quite different levels of culture of nearly equal importance.
In addition, ethnonationalist sentiments should make that priority list. But this may be due to their quasi-religious character. Anthony D. Smith, on whose work the present ethnonationalism items are based (Saucier, 2014), has characterized it as a “political religion” or a “surrogate religion” (Smith, 2001, p. 35). Ethnonationalism has appeal and endurance based on “deep-rooted, enduring religious beliefs and sentiments, and a powerful sense of the sacred” requiring “absolute loyalty” (Smith, 2003, p. vii). Ethnonationalism is important beyond cross-cultural psychology, as it seems to play a large and creative role in the formation of independent nation-states while also creating some risk for conflict and violence (e.g., ethnic cleansing).
Values related to family roles and regularity-norms for family living arrangements—those aspects most associated with tradition (e.g., tendencies toward three generations in one household, and toward families with institutionalized father-dominance, that is, “patriarchal”)—should also have a higher profile in cross-cultural psychology. These tend to show between-population differences above what is typical for psychological variables. Cultural contexts in which parents and children are more likely to live—and sleep—together appear to be those in which transmitted culture takes place proportionally more across-generations rather than peer-to-peer—what Margaret Mead (1970) called postfigurative (rather than cofigurative) cultures.
These results might provide a spur to theoretical development. What theory of culture can best make sense of the high profile of religion, regularity-norms, and ethnonationalism, and perhaps also traditional hierarchical family values, in how populations differ? Such a theory might have more power than many of the theoretical frames current in cross-cultural psychology.
These results argue against insularity. They highlight the overlap of cultural psychology with the psychology of religion, political psychology, and family sociology. These disciplines may be artificially compartmentalized and separated from one another. Such an observation has been made before. Renshon (2002) argued that political psychology rests on cultural foundations. A. B. Cohen (2009; see also Geertz, 1973) argued that definitions of culture and of religion are interrelated, and both involve shared beliefs and values that are transmitted across generations. Durkheim (1982, p. 129) postulated that a “religion is a unified system of beliefs and practices,” which would be shared and thus partially cultural in nature.
Possible Rival Hypotheses
The conclusions just presented derive from rather substantial differences in effect size: what emerges when emphasis is placed on large as contrasted with medium effects. However, there are various potential objections. These might offer important qualifications or nuance to the basic conclusions just reviewed. Even if not resolved here, they provide a stimulus to further research. The possible objections are as follows:
Poor translations could affect effect-size estimates. Standard back-translation procedures were employed, so we do not consider this a likely story. But interested readers can judge for themselves after reviewing the translations (http://psychometriglossia.uoregon.edu/).
It may be that various kinds of content are differentially easy to translate; those easier to translate might yield items and data with less measurement error and higher effect sizes. Thus, a plausible (though we think unlikely) rival hypothesis deserves some consideration: Items regarding religious practices and family roles and living arrangements are particularly easy to translate, whereas those regarding values, social axioms, and so on, are less easy to translate.
Table 2 provides coefficients both for original and ipsatized ratings. If results are about the same for ipsatized as for original ratings, it indicates that individual differences in use of the rating scale (acquiescent, middle, or extreme responding) are probably not affecting results in a major way. Ipsatization usually overshoots the mark in correcting for response bias: It forces all individuals to have the same response mean and variance, even though some portion of the variation in response means and variances is probably valid—reflecting that some people have naturally more or less to agree with in a selection of survey items, and some have naturally more versus less intensity in this agreement. Here, coefficients with and without ipsatization typically differed little, suggesting that findings cannot be attributed to response-bias differences between populations. But further studies are needed to confirm and evaluate this conclusion. Response biases clearly contribute to statistical differences between populations (van Hemert, 2011). Precise estimates of effect sizes will be impossible until the response-bias component is isolated.
Reference-group effects (Heine, Lehman, Peng, & Greenholtz, 2002), which arise from subjective standards in use of rating scales that differ across populations, can wash out real effects. This would provide a reasonable account of findings presented here if high-cultural-difference items (e.g., religious behaviors and beliefs) were more behaviorally concrete or used less subjective rating scales than the lower-difference items. Indeed, here, the only two items involving a behavior-count (never, once a week, etc.) were those two DRI items that showed the very largest differences between populations. However, other DRI items had large differences while referring to beliefs regarding and valuing of religion rather than concrete behaviors, and while using rather subjective rating scales (how true, on a 5-point scale). Generally, the items on the survey differed very little in how subjective the rating scale was, yet had wide variance in size of effects. It seems that concrete-behavior reference and a non-subjective rating scale both contribute to larger cross-population differences, but content is also a powerful contributor. A more comprehensive research design, systematically varying concreteness and rating scale (as well as standard vs. referent-shift format) for each kind of content, would be needed to draw conclusions as to the relative power of these contributors to difference.
The present study drew participants from institutions of higher education. This no doubt has some impact on effect-size estimates. Arguably, college students are relatively cosmopolitan, and are located within institutions patterned on Western academic models, and so would tend to be similar across countries, to a greater degree than general populations would be; this might attenuate country differences on some or all variables. Another argument would be that college students in societies with lower average levels of education are a high-status elite, unlike in countries with more-educated people (see, for example, Bourdieu & Wacquant, 1999); this argument does not seem to account well for differences in religiousness (here “elite” students from less-educated countries scored as much more religious generally than the presumably more plebeian students from more-educated countries), but it could conceivably account for unexpectedly small country differences in individualism/collectivism. A third possibility: Students might be prone to vary more than general populations do on some variables (e.g., religious practices and beliefs?) and vary less than general populations do on other variables (e.g., individualism/collectivism?). This would make sense if general populations are generally quite religious (and/or prone to vary highly on collectivism/individualism), but contrastingly student populations in some though not all countries are distinctly non-religious (and/or if student populations were all about equally individualistic). While plausible, this scenario is not one supported by Study 3 of Fischer and Schwartz (2011), in which a 62-nation set of representative samples gave generally similar results to those reported here (e.g., ICC of .30 for an item referencing the importance of God in one’s life, lower ICC for other types of content).
These are difficult issues: “It is almost impossible to select a subgroup in one cultural population so that it will precisely match a subgroup in another culture . . . matching on one variable almost inevitably leads to mismatching on other variables” (Berry, Poortinga, Breugelmans, Chasiotis, & Sam, 2011, p. 22). The best remedy may be replication across studies using varying selection rationales.
Perhaps certain kinds of content are very easy to measure in a survey format, whereas others are not, and those that are easy to measure generate more apparent cross-population differences. Ease of measurement would be reflected in higher internal consistency in groups of items scored together, but a better index is probably retest stability because it can be investigated at the single-item level, and for truly easy-to-measure variables a single item might be a sufficient measure. The data used here had no retest component. It is possible that, for reasons beyond accuracy of translation or response biases, religious behaviors and beliefs are uniquely easy to measure (perhaps because people have more easily retrievable schemas for them). This possibility cannot be evaluated with the present data, but deserves attention in future research.
Perhaps conventional cross-cultural psychology variables reflect where important differences were found a generation or two ago, but such things are fluid: Now, clearly the biggest differences are in religious behaviors and beliefs, etc., even if this was not the case at the founding of cross-cultural psychology. By this account, the present results are just a snapshot of 2012, and may not generalize to other periods. This account would seem quite strong if our standard for comparison were the last 100 years: Many countries now relatively indifferent to religion were more highly religious a century ago. Religion has been declining in some countries while remaining strong in others (Inglehart & Baker, 2000). Cross-cultural psychology is not quite that old, but historical change may account for some portion of the findings presented here, given historical trends that have seen religion decline in some but not all locations on the globe.
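To make concrete the ipsatization discussed above in connection with Table 2, the following is a minimal sketch of within-person standardization, in which each respondent's ratings are rescaled to have a mean of 0 and a standard deviation of 1 across items. It is an illustration only, not the procedure used to produce Table 2: the function name `ipsatize`, the toy ratings, the use of the population standard deviation, and the handling of respondents who give the same rating to every item are all assumptions made for the example.

```python
from statistics import fmean, pstdev


def ipsatize(ratings):
    """Standardize each respondent's ratings within person.

    `ratings` is a list of rows, one row of item ratings per respondent.
    After ipsatization every row has mean 0 and standard deviation 1,
    which removes individual differences in overall agreement level
    (acquiescence) and in spread of responses (extremity).
    """
    result = []
    for row in ratings:
        mean = fmean(row)
        sd = pstdev(row)  # assumption: population SD across the items
        if sd == 0:
            # Assumption: a respondent with identical ratings on every
            # item carries no within-person information, so map to zeros.
            result.append([0.0 for _ in row])
        else:
            result.append([(x - mean) / sd for x in row])
    return result


# Hypothetical ratings on a 5-point scale: an acquiescent respondent,
# a respondent who mostly disagrees, and an extreme responder.
raw = [
    [5, 5, 4, 5],
    [2, 1, 1, 2],
    [3, 5, 1, 3],
]
ipsatized = ipsatize(raw)
```

Because every row is forced to the same mean and variance, any valid between-person variation in overall agreement or intensity is removed along with response bias, which is the overcorrection noted above.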
Suggestions for Future Studies
As noted earlier, the populations sampled in this study were not predominantly from Western industrialized democracies as in most psychology studies, but were more educated (and probably richer) than nationally representative samples would have been. It would be useful to repeat this approach in data sets that have more representative samples, particularly where the variable selection is very wide as here. It would also be useful to extend this approach to a truly wide diversity of human cultures, such as those represented in the often much smaller-scale societies sampled in the Human Relations Area Files (Ember, 1997).
Analyses here focused entirely on the item level, and so are most directly relevant to a particular situation, not atypical in large-N surveys where participant time is expensive and precious: one in which a researcher wishes to capture large-magnitude cross-cultural differences with a few items. For that situation, Tables 2 and 3 give information about what to expect if differing kinds of content are selected. The tables do not enable inferences regarding the measurement properties of any scales from which the items come; these would require a different, larger set of analyses.
Moreover, present analyses focused entirely on the individual level, using country only as an independent grouping variable to demarcate differing populations. It would be useful to examine the isomorphism of these individual-level results with what might be found at the country level. This would be in keeping with lines of research (e.g., Minkov, 2012) that regard culture as mainly a collective-level phenomenon, not optimally approached with individual-level data. One must acknowledge of course the limitations of letting “nation” stand for “culture.”
Specifically, further studies should address the degree to which the items studied here can be meaningfully grouped into the originally intended scales, preferably demonstrating partial or full measurement invariance. Non-invariance might affect the estimates presented here, which are preliminary and broad-brush. Studies of measurement invariance will allow more precise estimates and interpretations, and identify areas in which assessment tools need improvement.
The largest cross-cultural differences were found to reflect four kinds of content (behaviors and beliefs relating to religion, ethnonationalism, hierarchical family values, and aspects of family-oriented collectivism). It would be useful to examine the structure of these kinds of content and their degree of intercorrelation, both between and within populations. There are indications of some common threads among them: According to Inglehart, Norris, and Welzel (2002), traditional attitudes including religion are associated with inegalitarian gender roles.
Finally, this study emphasized the relative effect sizes of cross-cultural difference. Ideally, the field would “strike a balance between similarities and differences in such a way that we can interpret differences against a background of similarities (or the other way around)” (van de Vijver, Chasiotis, & Breugelmans, 2011, p. 15). Studies of cross-cultural similarities are potentially complementary to the approach taken here, which emphasized differences.
Conclusion
The Survey of World Views data are unique in their combination of diverse sampling of countries with extensive sampling of variables. Analyses of the data seem to point quite clearly to particular paths (roads usually not taken) with respect to research and theory in the field. Cross-cultural psychology would do well to cast a wider net in terms of the psychological variables of focus. Consistent with some other studies, we found that the most popular variables in cross-cultural psychology show only medium-sized effects for nation/culture; there are clearly numerous other psychological variables showing effects of this magnitude. But with a large effect size, cultures differ in religious/supernatural beliefs and especially in religious behaviors. They differ in the intensity of ethnonationalist sentiments (which are, arguably, quasi-religious). They differ in what sociologists call regularity norms, including some involving family living arrangements. And they differ in their views of appropriate family roles, especially perhaps as related to gender. These findings suggest that the now-standard compartmentalization, by which the psychology of culture is separated from psychology of religion, family sociology, and political psychology, may hinder both empirical discovery and theoretical integration.
Footnotes
Acknowledgements
For help and advice, thanks to Lazar Stankov, Michele J. Gelfand, Kateryna Maltseva, David S. Miller, Eman Gaad, Deniz Tahiroglu, Suhasini Sanyal, Crystal Shackleford, Pinit Ratanakul, Viren Swami, Reinout De Vries, Fabio Iglesias, and Surafel Gelgelo.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project benefited from support from grant FA-9550-09-1-0398, Air Force Office of Scientific Research.
