Abstract
This study explored motivation and engagement among North American (the United States and Canada; n = 1,540), U.K. (n = 1,558), Australian (n = 2,283), and Chinese (n = 3,753) secondary school students. Motivation and engagement were assessed via students’ responses to the Motivation and Engagement Scale–High School (MES–HS). Confirmatory factor analysis using Mplus found good fit for each of the four samples. Multi-group invariance tests demonstrated comparable factor structure, reliability, distributional properties, and correlations with a set of validational factors across the four groups. Results hold implications for international assessment of motivation and engagement, research, and data analysis.
With the rise of international educational assessment exercises (e.g., Programme for International Student Assessment: PISA; Trends in International Mathematics and Science Study: TIMSS; Progress in International Reading Literacy Study: PIRLS) and the accompanying emphasis on international comparisons, it has become important to develop and test instrumentation that has multi-nation validity and applicability. Unless it can be demonstrated that measurement tools are invariant across international contexts, any comparisons of regions and nations are not valid (Meredith, 1993). Motivation and engagement instrumentation is no exception. Researchers have emphasized the importance of studying motivation and engagement in different cultural contexts (e.g., Elliott & Bempechat, 2002; Hau & Ho, 2008; Martin & Hau, 2010; Martin, Yu, & Hau, 2013). The present study assesses motivation and engagement using a multi-dimensional instrument (the Motivation and Engagement Scale [MES]; Martin, 2007, 2009, 2010) in four international contexts: North America (the United States and Canada), the United Kingdom, Australia, and China. We suggest findings hold implications for future research efforts into motivation and engagement in diverse national and regional contexts.
Cross-Cultural Motivation and Engagement
Segall, Lonner, and Berry (1998) reported that important aims for cross-cultural research are to transport and test our current psychological knowledge and perspectives by using them in other cultures; to explore and discover new aspects of the phenomenon being studied in local cultural terms; and to integrate what has been learned from these first two approaches in order to generate more nearly universal psychology, one that has panhuman validity. (p. 1102)
Strong tests of the cross-cultural generalizability of constructs and their instrumentation are possible when researchers collect responses to the same instrument in different cultures, regions, or countries (Leung, Ginns, & Kember, 2008; Marsh, Hau, Artelt, Baumert, & Peschar, 2006; Martin & Hau, 2010; Martin et al., 2013).
The present study extends two earlier studies of cross-cultural motivation and engagement. Both studies investigated Australian and Chinese (Hong Kong and Beijing) students, finding no difference in measurement properties, but some mean-level differences suggesting Australian students provide higher self-reports of motivation and engagement than Chinese students (Martin & Hau, 2010; Martin et al., 2013). The extension in this study is to include a new sample of Chinese students (beyond Beijing schools to schools in other mainland China provinces) and two large samples from North America (the United States and Canada) and the United Kingdom. Although not representative of all major regions in the world, they do constitute well-established sites of international student testing (in the case of the United States, Canada, the United Kingdom, and Australia) and sites that have more recently entered international testing programs (in the case of China).
Measurement Properties Across Contexts
The present study assesses multi-dimensional motivation and engagement across major “Western” (North America, the United Kingdom, Australia) and “Eastern” (China) contexts. Specifically, it examines (a) instrument psychometric properties as a function of international context, (b) invariance in factor structure between the four contexts, and (c) correlations between motivation and engagement and a set of validational factors (educational intentions, class participation, enjoyment of school, academic buoyancy) as a function of context. If there is measurement and correlational invariance across the four international contexts, we can conclude that the factor structure and inter-factor relationships for major motivation and engagement constructs are broadly congruent across North American, United Kingdom, Australian, and Chinese samples.
Motivation and Engagement in the Present Study
Calls for integrative approaches to research and theorizing in motivation and engagement (Murphy & Alexander, 2000; Pintrich, 2003) led to the development of a multi-dimensional conceptual framework in the form of the Motivation and Engagement Wheel (Figure 1) and its accompanying measurement instrument, the MES (Liem & Martin, 2012; Martin, 2007, 2009, 2010). Aligning with each of the dimensions in the Wheel, the MES comprises 11 first order factors subsumed under four clusters: adaptive cognition (or, sometimes referred to as “adaptive motivation”: self-efficacy, valuing school, mastery orientation), adaptive behavior (“adaptive engagement”: planning, task management, persistence), maladaptive cognition (“maladaptive motivation”: anxiety, failure avoidance, uncertain control), and maladaptive behavior (“maladaptive engagement”: self-handicapping, disengagement). The MES has been validated in Australian- and Chinese-only samples (e.g., Martin, 2007, 2009; Nagabhushan, 2013; Plenty & Heubeck, 2011, 2013) and in dual Australian and Chinese contexts (Martin & Hau, 2010; Martin et al., 2013). The present study extends this to a multi-region assessment involving students from North America (the United States and Canada), the United Kingdom, Australia, and China.

Figure 1. Motivation and Engagement Wheel.
To further understand the properties of the MES across contexts and following prior cross-cultural work (Martin & Hau, 2010; Martin et al., 2013), we include four validational correlates that are conceptually cognate to many factors in the MES (and hence should be empirically aligned with the MES). The first three validational measures reflect the tripartite engagement model proposed by Fredricks, Blumenfeld, and Paris (2004). This tripartite approach proposed three engagement dimensions: cognitive, behavioral, and emotional. Consistent with prior cross-cultural studies (Martin & Hau, 2010; Martin et al., 2013), the three engagement measures selected for this purpose are positive academic intentions (cognitive), class participation (behavioral), and enjoyment of school (emotional). To these three factors, we add a fourth validational factor: academic buoyancy. Academic buoyancy refers to students’ capacity to successfully deal with academic setback and difficulty and has been associated with a variety of motivation and engagement factors (Martin & Marsh, 2009). In line with prior cross-cultural findings (Martin & Hau, 2010; Martin et al., 2013), adaptive and maladaptive dimensions of the MES should correlate positively and negatively, respectively, with these four validational engagement and buoyancy factors.
Method
Sample and Procedure
Students in secondary school in North America (the United States and Canada), the United Kingdom, Australia, and China were administered the MES and four validational measures.
North America
The North American sample (n = 1,540) comprised students from the United States (n = 978 students; three schools) and Canada (n = 562 students; three schools). The U.S. and Canadian schools were pooled due to the relatively small number of schools representing each national context. Notably, fit indices for the MES in each country were acceptable: the United States, χ2 = 3,775.31, df = 847, comparative fit index (CFI) = .96, root mean square error of approximation (RMSEA) = .06, standardized root mean square residual (SRMR) = .048; Canada, χ2 = 4,028.41, df = 847, CFI = .95, RMSEA = .08, SRMR = .055. In addition, using criteria recommended by Chen (2007) and Cheung and Rensvold (2002) described in detail in “Statistical Analyses” below, there was evidence of scalar invariance when factor loadings and item intercepts were held invariant across the two countries, χ2 = 8,034.64, df = 1749, CFI = .95, RMSEA = .068, SRMR = .052; residual invariance when loadings, intercepts, and uniquenesses were held invariant, χ2 = 7,933.97, df = 1793, CFI = .95, RMSEA = .067, SRMR = .053; structural invariance when loadings, intercepts, uniquenesses, and correlations were held invariant, χ2 = 8,164.14, df = 1848, CFI = .95, RMSEA = .067, SRMR = .058; and stronger structural invariance when loadings, intercepts, uniquenesses, correlations, and means were held invariant, χ2 = 8,215.50, df = 1859, CFI = .95, RMSEA = .067, SRMR = .061. Together, these findings provide support for pooling data from the two nations. Schools were drawn from a larger project involving boarding schools and so represent a somewhat higher socio-economic and achievement profile. Students ranged from 10 to 19 years (M = 16.04; SD = 1.50 years). Just under half (45%) of the participants were female. Students completed the survey online.
The United Kingdom
The U.K. sample comprised 1,558 students from six schools in England. As with the North American sample, U.K. schools were drawn from a larger project involving boarding schools and so represent a somewhat higher socio-economic and achievement profile. Students ranged from 10 to 19 years (M = 14.58; SD = 1.90 years). More than half (59%) of the participants were female. Students completed the survey online.
Australia
The Australian sample was randomly drawn from an archive data set of 33,778 students (compiled over a 10-year period, 2001-2012, across numerous research projects; see Liem & Martin, 2012 for a review). So as not to have disproportionate impact on results, it was considered appropriate to randomly select a smaller sample of Australian students from the large archive. The sample size selected represented the average of the North American, U.K., and Chinese samples. This random sample comprised 2,283 students from 54 schools, ranging from 12 to 18 years (M = 14.33; SD = 1.59 years). Just under half (45%) of the participants were female.
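The sample-size rule described above reduces to simple arithmetic. The following is a minimal sketch, assuming the average was truncated to a whole number and that the draw was a simple random sample without replacement (the text specifies neither). The integer IDs and the seed are hypothetical stand-ins for the archive records.

```python
import random

# Sample sizes reported in the text for the other three contexts.
other_samples = {"North America": 1540, "United Kingdom": 1558, "China": 3753}

# Average of the three samples; truncating to a whole number is an assumption.
target_n = sum(other_samples.values()) // len(other_samples)

# Hypothetical stand-in for the archive: one integer ID per archived student.
ARCHIVE_SIZE = 33778
archive_ids = range(ARCHIVE_SIZE)

# Simple random sample without replacement. The seed is arbitrary and only
# makes this illustration repeatable.
rng = random.Random(0)
selected_ids = rng.sample(archive_ids, target_n)

print(target_n)
```

Under these assumptions the computed target matches the 2,283 Australian students reported.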
China
In China, 3,753 students participated in the study, ranging from 11 to 19 years (M = 13.80; SD = 0.89 years). They were from 10 schools in four major areas in China: Beijing, Chongqing, Shandong, and Shanxi. Participating schools were of mixed ability. Approximately half (52%) of the participants were female. Consistent with previous Chinese MES research (Martin et al., 2013), the instrument was translated by two native Chinese speakers. Then it was back-translated into English by the second (native Chinese) author. Minor discrepancies were discussed and resolved in the final stages of translation. Considerations on test adaptation for diverse cultural contexts (International Test Commission [ITC], 2000) guided translation, including using native Chinese speakers to navigate cultural and linguistic differences and seeking to generate findings confirming that the language in items and instructions was appropriate to Chinese students.
Materials
MES
The MES comprises a suite of three instruments each tailored to elementary school, high school, and university/college students. In this study of high school students, we used the Motivation and Engagement Scale–High School (MES–HS; Martin, 2010). The scale comprises 11 first order factors, with four items in each factor (thus, a 44-item instrument). For information about the history and development of this scale, see Liem and Martin (2012). Adaptive cognition (or, adaptive motivation) comprises self-efficacy (e.g., “If I try hard, I believe I can do my schoolwork well”), mastery orientation (e.g., “I feel very pleased with myself when I really understand what I’m taught at school”), and valuing (e.g., “Learning at school is important to me”). Adaptive behavior (or, adaptive engagement) comprises persistence (e.g., “If I can’t understand my schoolwork at first, I keep going over it until I understand it”), planning (e.g., “Before I start an assignment I plan out how I am going to do it”), and task management (e.g., “When I study, I usually study in places where I can concentrate”). Maladaptive cognition (or, maladaptive motivation) comprises anxiety (e.g., “When exams and assignments are coming up, I worry a lot”), failure avoidance (sometimes referred to as performance avoidance; e.g., “Often the main reason I work at school is because I don’t want to disappoint my parents”), and uncertain control (e.g., “I’m often unsure how I can avoid doing poorly at school”). Maladaptive behavior (or, maladaptive engagement) comprises self-handicapping (e.g., “I sometimes don’t study very hard before exams so I have an excuse if I don’t do as well as I hoped”) and disengagement (e.g., “I often feel like giving up at school”). These measures were rated on a 1 (strongly disagree) to 7 (strongly agree) scale. Psychometric properties for all four samples are presented in Table 1.
Table 1. Descriptive Statistics, Cronbach’s Alphas, and Standardized CFA Factor Loadings.
Note. Standardized factor loadings are reported. CFA = Confirmatory factor analysis.
Validational correlates
To examine the pattern of relations between the MES and validational correlates, students were also administered scales that explored their enjoyment of school (four items; e.g., “I like school”), class participation (four items; e.g., “I get involved in things we do in class”), positive academic intentions (four items; e.g., “I intend to complete school”), and academic buoyancy (four items; e.g., “I think I’m good at dealing with schoolwork pressures”). These measures were rated on a 1 (strongly disagree) to 7 (strongly agree) scale. They were drawn from Martin (2007; see also Martin & Hau, 2010; Martin & Marsh, 2008; Martin et al., 2013) who has shown them to be reliable and of sound factor structure.
Statistical Analyses
Confirmatory factor analysis (CFA) was conducted using Mplus Version 6.12 (Muthén & Muthén, 2008) to assess the extent to which the items loaded as expected under each hypothesized factor. Maximum likelihood was the method of estimation. Model fit was evaluated via RMSEA, SRMR, and CFI. For RMSEA and SRMR, values at or below .05 suggest good fit and values at or below .08 suggest acceptable fit (see Jöreskog & Sörbom, 1993). For CFI, values at or above .95 suggest good fit and values at or above .90 suggest acceptable fit (McDonald & Marsh, 1990).
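The fit criteria above can be expressed as a small lookup. This is a minimal sketch rather than part of the authors' analysis: the function name `rate_fit` and the label "poor" for values outside the stated ranges are assumptions introduced for illustration.

```python
def rate_fit(cfi: float, rmsea: float, srmr: float) -> dict:
    """Label each index 'good', 'acceptable', or 'poor' using the cutoffs in the text."""

    def lower_is_better(value: float) -> str:
        # RMSEA and SRMR: at or below .05 is good, at or below .08 is acceptable.
        if value <= 0.05:
            return "good"
        if value <= 0.08:
            return "acceptable"
        return "poor"  # label assumed; the text names no category here

    def higher_is_better(value: float) -> str:
        # CFI: at or above .95 is good, at or above .90 is acceptable.
        if value >= 0.95:
            return "good"
        if value >= 0.90:
            return "acceptable"
        return "poor"  # label assumed; the text names no category here

    return {
        "CFI": higher_is_better(cfi),
        "RMSEA": lower_is_better(rmsea),
        "SRMR": lower_is_better(srmr),
    }


# Values reported for the U.S. subsample in the North America section.
print(rate_fit(cfi=0.96, rmsea=0.06, srmr=0.048))
```

Rating each index separately, instead of returning a single verdict, mirrors the way the text reports the three indices side by side.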
Multi-group CFA invariance tests were performed to formally test for any differences across the four international contexts in terms of metric and scalar invariance (invariance in factor loadings and item intercepts), residual invariance (invariance in uniquenesses), and structural invariance (invariance in correlations and latent means). Although differences in chi-square are the most direct means of comparing nested models, it is known there are problems associated with such tests (e.g., see McDonald & Marsh, 1990; Tabachnick & Fidell, 2013). Hence, in comparing nested models, we give emphasis to differences in CFI and RMSEA, with a difference above .01 in CFI and a difference above .015 in RMSEA deemed to indicate differences between groups (Chen, 2007; Cheung & Rensvold, 2002). We adjust for the clustering of students within schools through the Mplus “cluster” command using the “complex” method. This procedure provides adjusted standard errors and so does not bias tests of statistical significance due to clustering of students within schools (Muthén & Muthén, 2008).
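The decision rule for comparing nested models can be sketched as follows. The function and dictionary names are hypothetical, and rounding the differences to three decimals is an assumption added only to guard against floating-point noise. The two indices are flagged separately because the text evaluates each on its own threshold.

```python
CFI_THRESHOLD = 0.01     # change above this suggests non-invariance
RMSEA_THRESHOLD = 0.015  # change above this suggests non-invariance


def flag_changes(less_constrained: dict, more_constrained: dict) -> dict:
    """Flag whether each index changes by more than its threshold between nested models."""
    # Round to three decimals so that, e.g., .91 - .90 is treated as exactly .01.
    delta_cfi = round(abs(less_constrained["cfi"] - more_constrained["cfi"]), 3)
    delta_rmsea = round(abs(less_constrained["rmsea"] - more_constrained["rmsea"]), 3)
    return {
        "cfi_exceeds": delta_cfi > CFI_THRESHOLD,
        "rmsea_exceeds": delta_rmsea > RMSEA_THRESHOLD,
    }


# Fit indices reported for the U.S./Canada comparison in the North America section.
scalar = {"cfi": 0.95, "rmsea": 0.068}
residual = {"cfi": 0.95, "rmsea": 0.067}
print(flag_changes(scalar, residual))
```

Because published indices are rounded, a comparison that sits exactly at a threshold cannot be resolved from the reported values alone.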
Results
CFA
In the first instance, an 11-factor model for each group was examined using CFA. The CFAs yielded acceptable fit to the data for North America (χ2 = 4,492, df = 847, CFI = .95, RMSEA = .054, SRMR = .047), the United Kingdom (χ2 = 3,501, df = 847, CFI = .93, RMSEA = .045, SRMR = .045), Australia (χ2 = 3,175, df = 847, CFI = .93, RMSEA = .035, SRMR = .041), and China (χ2 = 6,031, df = 847, CFI = .93, RMSEA = .041, SRMR = .046). Standardized factor loading means are presented in Table 1. For each group, the loadings are acceptable (all loadings significant at p < .001), as are the reliability coefficients presented in Table 1 (all above α = .72; Tabachnick & Fidell, 2013). Also in Table 1 are skew and kurtosis values; these suggest each factor is approximately normally distributed for each sample. Inter-factor correlations are presented in Table 2. Correlations are broadly consistent across region (also see invariance tests below) and broadly consistent with previous measurement work with the MES (see Liem & Martin, 2012, for a review).
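The degrees of freedom shared by the four single-group models (df = 847) can be recovered by counting. This is a minimal sketch under one assumed identification, which the text does not state: factor variances fixed at 1, factor means fixed at 0, each item loading on a single factor, and uncorrelated uniquenesses.

```python
n_items = 44    # 11 factors x 4 items each
n_factors = 11

# Observed information: item variances and covariances, plus item means.
n_moments = n_items * (n_items + 1) // 2 + n_items

# Free parameters under the assumed identification.
n_loadings = n_items        # one loading per item
n_intercepts = n_items      # one intercept per item
n_uniquenesses = n_items    # one residual variance per item
n_factor_correlations = n_factors * (n_factors - 1) // 2

n_free = n_loadings + n_intercepts + n_uniquenesses + n_factor_correlations
df = n_moments - n_free

print(n_moments, n_free, df)  # 1034 187 847
```

The count agrees with the df reported for each sample, which is consistent with an 11-factor simple-structure model estimated with a mean structure.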
Table 2. Correlations Between MES Factors and With Validational Constructs.
Note. Decimal points omitted. North America: Correlations > |.09| significant at p < .05; the United Kingdom: Correlations > |.10| significant at p < .05; Australia: Correlations > |.06| significant at p < .05; China: Correlations > |.05| significant at p < .05. MES = Motivation and Engagement Scale; SE = self-efficacy; MAST = mastery orientation; VAL = valuing school; PLAN = planning; TASK = task management; PERS = persistence; ANX = anxiety; AVOID = failure avoidance; UNCERT = uncertain control; S HAND = self-handicapping; DISENG = disengagement.
Multi-Group CFA and Invariance Tests
Five models were tested in each of the multi-group CFAs across international context. The first held factor loadings and item intercepts invariant across groups (Fit: χ2 = 21,454, df = 3575, CFI = .91, RMSEA = .047, SRMR = .050); the second held factor loadings, item intercepts, and correlations invariant (Fit: χ2 = 23,285, df = 3740, CFI = .90, RMSEA = .048, SRMR = .074); the third held factor loadings, item intercepts, and uniquenesses invariant (Fit: χ2 = 23,571, df = 3707, CFI = .90, RMSEA = .049, SRMR = .062); the fourth held factor loadings, item intercepts, uniquenesses, and correlations invariant (Fit: χ2 = 25,288, df = 3872, CFI = .90, RMSEA = .049, SRMR = .083); and the fifth held factor loadings, item intercepts, uniquenesses, correlations, and means invariant (Fit: χ2 = 26,942, df = 3894, CFI = .89, RMSEA = .051, SRMR = .087). The fit indices reported here indicate that when successive elements of the factor structure are held invariant across the four samples, there is metric and scalar (loadings and intercepts), residual (uniquenesses), and predominant structural (correlations and latent means) invariance across parameters as evidenced by no change greater than .01 in CFI (except in the case of the latent means) and no change greater than .015 in RMSEA (Chen, 2007; Cheung & Rensvold, 2002). This suggests invariance in the MES factor structure across North American, U.K., Australian, and Chinese samples.
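The degrees of freedom reported for the multi-group models can be cross-checked by counting constraints: a parameter held equal across four groups adds three constraints. The authors do not spell out this counting, so it is an assumption here, and the dictionary keys are illustrative labels. The sketch covers only the uniqueness and correlation steps, because the count for the means step depends on identification choices the text does not describe.

```python
n_groups = 4
n_items = 44
n_factors = 11
n_factor_correlations = n_factors * (n_factors - 1) // 2  # 55

# Degrees of freedom as reported in the text for the first four models.
reported_df = {
    "loadings_intercepts": 3575,
    "plus_correlations": 3740,
    "plus_uniquenesses": 3707,
    "plus_uniquenesses_correlations": 3872,
}


def added_constraints(n_params: int) -> int:
    """Constraints gained by holding n_params equal across all groups."""
    return n_params * (n_groups - 1)


uniqueness_step = added_constraints(n_items)                # 44 x 3
correlation_step = added_constraints(n_factor_correlations)  # 55 x 3

# Compare the counts with the differences between the reported df values.
checks = {
    "uniquenesses": reported_df["plus_uniquenesses"]
    - reported_df["loadings_intercepts"] == uniqueness_step,
    "correlations": reported_df["plus_correlations"]
    - reported_df["loadings_intercepts"] == correlation_step,
    "correlations_given_uniquenesses": reported_df["plus_uniquenesses_correlations"]
    - reported_df["plus_uniquenesses"] == correlation_step,
}
print(uniqueness_step, correlation_step, checks)
```

Under this counting, each of the three checked differences matches the reported degrees of freedom.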
Motivation, Engagement, and Validational Correlates
We explored the relationships between each MES factor and a set of four validational constructs to examine the extent to which these associations were invariant across the four contexts. The four correlates were as follows: positive academic intentions (reliability: North America α = .84, the United Kingdom α = .83, Australia α = .80, China α = .79), class participation (reliability: North America α = .90, the United Kingdom α = .90, Australia α = .89, China α = .89), school enjoyment (reliability: North America α = .92, the United Kingdom α = .91, Australia α = .91, China α = .90), and academic buoyancy (reliability: North America α = .83, the United Kingdom α = .82, Australia α = .79, China α = .80). In these analyses, we first estimated the correlations between the MES and the four validational factors (shown in Table 2), while holding invariant the MES factor structure and means, the MES inter-factor correlations, and the factor structure of the four validational constructs (Fit: χ2 = 36,822, df = 7038, CFI = .90, RMSEA = .043, SRMR = .077). Second, in addition to these constraints, we held invariant the correlations between the MES factors and the four validational factors (Fit: χ2 = 37,451, df = 7152, CFI = .90, RMSEA = .043, SRMR = .078) to see whether there was a significant decline in fit, which would suggest that these validational correlations varied across samples. Correlations in Table 2 suggest broadly similar correlations between MES and validational factors across the four groups. In addition, from the first to the second correlational model, there was no change greater than .01 in CFI (Cheung & Rensvold, 2002) and no change greater than .015 in RMSEA (Chen, 2007). This suggests invariance in validational correlations across North American, U.K., Australian, and Chinese samples. That is, correlations between the MES factors and a set of external correlates do not appear to vary markedly across the four samples.
Discussion
Motivation and Engagement Across International Contexts
In the main, findings show that there is measurement congruence across the samples from the four international regions. Reliabilities were highly similar, distributional properties of factors (skew and kurtosis) were congruent, factor structure was invariant across groups, MES inter-factor correlations were broadly invariant as were latent means, and correlations between MES factors and validational factors were also broadly invariant. We therefore conclude there exists preliminary evidence that the MES demonstrates cross-cultural generalizability. These findings extend previous research by testing—for the first time—samples from North America (the United States and Canada) and the United Kingdom, and a new Chinese sample comprising students from provinces beyond Beijing (the site of previous Chinese research; Martin et al., 2013).
Substantive and Methodological Implications
The finding that the MES demonstrates good fit across four international regions (and also across the United States and Canada) has substantive and research implications. One substantive implication is that the multi-dimensional factors of motivation and engagement appear to be generalizable across diverse contexts. Students across these contexts differentiate motivation and engagement factors in similar ways. Reliability, loadings, and item intercepts are quite parallel—as are distributional properties—and relationships between MES factors and the validational constructs such as class participation, enjoyment of school, and academic buoyancy are broadly equivalent. Thus, the multi-dimensional instrumentation and the conceptual foundations from which the factors emanate (e.g., goal theory for mastery orientation; expectancy–value theory for self-efficacy and valuing; need achievement theory and self-worth motivation theory for failure avoidance, anxiety, and self-handicapping—see Martin, 2007, for a review of the theories underpinning the MES) show cross-cultural validity.
Findings are also important for motivation and engagement researchers exploring substantive issues. To draw valid substantive conclusions on a particular issue, it is essential that there is strong and relatively invariant instrumentation underpinning the research. For example, if significant effects emerge in cross-cultural motivation and engagement research, it is important to know these effects are due to the substantive issue at hand and not due to differences in measurement properties underlying the research instrumentation.
Implications for data analysis are also noteworthy. With the rise of international testing, it is important to show invariance in measurement before integrative data analysis can take place. One important feature of a data set that determines how analysis should proceed is its invariance across contexts. If instrumentation is not invariant across contexts, then data analysis ought to be disaggregated and not based on pooled data (Tabachnick & Fidell, 2013). Our findings showed that the factor structure and other important components of the structure are invariant across North American (and across the United States and Canada), U.K., Australian, and Chinese samples. This provides measurement support for pooling data in statistical analysis in relation to multi-dimensional motivation and engagement factors assessed under the MES (Martin, 2010).
Limitations, Future Directions, and Conclusion
There are some limitations important to consider when interpreting results and which suggest direction for future research. First, all data were self-reported and so future validation across these four contexts should include objective measures such as verification from teachers and/or parents and also achievement data. Second, data from North America and the United Kingdom were collected online whereas Australian and Chinese data were collected in hard copy (although the study found comparable fit for all contexts). This difference in methodology should be recognized when interpreting findings and future research might collect and compare hard copy data from the former two contexts and online data from the latter two contexts. Third, North American and U.K. data were drawn from a boarding school project and these schools were higher in socio-economic status and achievement (again, though, the study found comparable fit for all contexts). Future work might collect North American and U.K. data from other school sectors. Finally, based on a difference in CFI (but not RMSEA), there was partial evidence of a lack of invariance in latent means across samples and so future research might look to identify whether these mean-level effects represent cultural differences in response bias (e.g., acquiescence) and/or cultural differences in actual levels of motivation and engagement. Notwithstanding these issues, the present study represents a large-scale effort confirming a multi-dimensional framework and instrumentation across four major international contexts. Findings provide preliminary support for the generalizability of multi-dimensional motivation and engagement, a noteworthy finding given increasing attention to international assessment in education.
Acknowledgements
The authors would like to thank all participating schools, the Australian Research Council, and Beijing Normal University for their assistance and support in this research.
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: The first author is also the author of the Motivation and Engagement Scale, a commercially available survey instrument. Otherwise, the authors declared no potential conflicts of interest with respect to the research, authorship, and publication of this article.
Funding
The author(s) received funding from the Australian Research Council, the Australian Boarding School Association, and Beijing Normal University to assist with the research in this article.
