Abstract
Until recently, there were four large-scale sources of cross-national self-report survey data on victim rates: EU Kids Online, Global School Health Survey, Trends in International Mathematics and Science Study, and Health Behaviour in School-aged Children. Smith, Robinson, and Marchi (2016) examined the internal validity and external validity of these data sets, comparing country victimization rates. While internal validity correlations were high, external validity correlations ranged from moderate to zero, raising concerns about using these cross-national data sets to make judgements about which countries are higher or lower in victim rates. Another cross-national source of victim rates was released by PISA in 2017, and here we compare this PISA data with the earlier data sets, and with the most recent data sets from HBSC and TIMSS. Correlations obtained were generally more acceptable than in the previous comparisons, and especially satisfactory for comparing PISA with TIMSS. Implications of the findings are discussed.
Recent years have seen a very rapid increase in publications on the topic of bullying, especially in schools; but also concern about validity, transparency, and comparability of the large number of research findings now being published (Volk, Veenstra, & Espelage, 2017). One aspect of this is comparability of studies across different countries, which raises many issues, including: age range; sampling issues; dates of survey; administration procedures; questionnaire issues; definitions of bullying; time reference period; types of bullying assessed; frequency scales; and linguistic issues (Scheithauer, Smith, & Samara, 2016).
Victimization is generally defined as being subjected to intentional harmful actions, often repeated, and (in the case of bullying victimization) with the victim unable to defend themselves effectively (Olweus, 1999). There are a number of small-scale cross-national comparisons of bullying, typically comparing two or three countries using the same measurement techniques. However there are also some large-scale surveys, typically covering some 25–50 countries, with large sample sizes (usually a minimum of N = 1,000 in each country), and with standard questionnaire assessments in each survey. Smith et al. (2016) compared the validity of four such cross-national data sets on victimization rates, all of which used self-report data: EU Kids Online (EUKO), Global School Health Survey (GSHS), Trends in International Mathematics and Science Study (TIMSS), and Health Behaviour in School-aged Children (HBSC). The findings were variable; across countries, there was modest agreement between TIMSS and HBSC; between TIMSS and GSHS; and between EUKO and HBSC, taking the most age-appropriate correlations. However there was near-zero or negative agreement between EUKO and TIMSS.
Since that analysis, a fifth source for victimization rates across different countries has been released by the Programme for International Student Assessment (PISA) (https://nces.ed.gov/surveys/pisa/). PISA is organised by the OECD. It measures students’ reading, mathematics, and science literacy every three years; students are aged between 15 years 3 months and 16 years 2 months at the time of assessment, and have completed at least 6 years of formal schooling. PISA had previously reported data from one item in their school questionnaire on whether teachers regard bullying as a problem in their school; this was very different from the more extensive pupil-report data from the other four surveys. However the latest publication (OECD, 2017) has victimization data from pupil self-report, like the other surveys; this data, from 52 countries, was gathered in 2015, with an average of 7,500 students per country. This assessment was mainly computer-based, although countries had the option to use a paper-based version. Here, we see how well the country differences reported by PISA match up with those of EUKO, GSHS, HBSC, and TIMSS.
Method
Full characteristics of the surveys by EUKO, GSHS, HBSC, and TIMSS are given in Smith et al. (2016) and can be found on the respective surveys’ websites. They all sample students in the early/mid adolescent period. EUKO sampled 9–16 year olds, and GSHS 11 to 18 year olds. TIMSS reports separately for two age groups (4th grade and 8th grade), and HBSC for three age groups (11, 13, and 15 years). Three used school-based surveys, but EU Kids Online gave a face-to-face interview in survey format. TIMSS gave no definition of bullying; EU Kids Online gave a definition but without power imbalance; and both GSHS and HBSC gave an Olweus-type definition. All gave a time frame, but this varied: from the past 30 days (GSHS), to the past couple of months (HBSC), to the past 12 months (EUKO) or past year (TIMSS). Only EUKO asked explicitly about cyberbullying; GSHS and TIMSS asked about various (different) types of bullying, whereas HBSC asked a global question (the latest HBSC 2013/14 survey also asked two questions about being cyberbullied, not included in analysis here). All had a frequency scale for response, but with differing scale points.
Like TIMSS, PISA does not give a definition of bullying. The assessment asked students how frequently they had been exposed to eight types of bullying behaviours over the past 12 months. The eight types were: I got called names by other students; I got picked on by other students; other students left me out of things on purpose; other students made fun of me; I was threatened by other students; other students took away or destroyed things that belonged to me; I got hit or pushed around by other students; other students spread nasty rumours about me. Response options were: never or almost never; a few times a year; a few times a month; and once a week or more (these last two being merged for analysis).
Two overall measures are presented. One is the percentage of pupils who have been bullied by any of the eight types of bullying at least a few times a month, labelled ‘any type of bullying act’ in the PISA tables (OECD, 2017, Overview, p.17). The second is an index of exposure score, based on the six types of bullying experience which were found to be most reliable in internal analyses (including confirmatory factor analysis); it excludes ‘I got called names by other students’ and ‘I got picked on by other students’, which did not load well onto a unidimensional construct and did not correlate strongly with the other six items. This index had an average Cronbach’s alpha reliability of 0.83 (range across countries: 0.71 to 0.90) (OECD, 2017, p.253). We used the measure labelled ‘percentage of frequently bullied students’ in the PISA tables, which is the percentage of students in each country who are in the top 10% of the index of exposure to bullying among all countries/economies (OECD, 2017, p.370). Here, we use both these publicly available measures of victimization.
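To make the logic of the two measures concrete, the following minimal Python sketch shows how such percentages could be derived from pupil-level responses. It is an illustration only, not the OECD's procedure: the numeric coding of the response options, the function names, and the unweighted pooling of pupils across countries are all assumptions made here, and the published PISA figures rest on survey weights and a scaled index that are not reproduced.

```python
# Hypothetical coding of the response options (an assumption, not PISA's own):
# 0 = never or almost never, 1 = a few times a year,
# 2 = a few times a month, 3 = once a week or more.
FEW_TIMES_A_MONTH = 2


def percent_any_bullying(pupils):
    """Percentage of pupils reporting at least one type of bullying
    at least a few times a month.

    `pupils` is a list of per-pupil response lists (one code per item).
    """
    flagged = sum(
        any(code >= FEW_TIMES_A_MONTH for code in responses)
        for responses in pupils
    )
    return 100.0 * flagged / len(pupils)


def percent_frequently_bullied(index_by_country, top_share=0.10):
    """Percentage of pupils per country whose exposure index falls in the
    top `top_share` of the distribution pooled over all countries.

    `index_by_country` maps a country label to a list of index scores.
    Pooling is unweighted here (an assumption); tied scores at the cutoff
    are all counted, so the pooled share can exceed `top_share`.
    """
    pooled = sorted(v for scores in index_by_country.values() for v in scores)
    n_top = max(1, int(top_share * len(pooled)))
    cutoff = pooled[-n_top]  # lowest score still inside the pooled top share
    return {
        country: 100.0 * sum(v >= cutoff for v in scores) / len(scores)
        for country, scores in index_by_country.items()
    }
```

Because the cutoff in the second function is fixed across countries, a country's percentage can lie well above or below 10%, which is what allows the measure to rank countries.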
To compare with PISA, we selected surveys as reported in Smith et al. (2016), namely: 2010 data from EUKO; 2009/2010 data from HBSC; 2011 data from TIMSS (scale score); and surveys between 2002 and 2012 from GSHS (scale scores) (see Table 2 in Smith et al., 2016). All the data used is readily available in publications (Currie et al., 2012; Livingstone et al., 2011; Mullis, Martin, Foy, & Arora, 2012) and on the surveys’ websites.
In addition, more recent survey data was available from TIMSS and HBSC, which would match better with the PISA survey date of 2015. We therefore also included: 2015 data from TIMSS (http://timss2015.org/; Martin, Mullis, & Hooper, 2016); and 2013/2014 data from HBSC (Inchley et al., 2016). Besides comparisons with PISA, we also took the opportunity to see how consistent country differences were within a survey, across the two time points, for TIMSS and for HBSC.
We compared the victim prevalence rates from PISA with the other four surveys, pairwise for countries in common. PISA had an overlap of 21 countries with EUKO, 9 with GSHS, 32 with TIMSS 4th grade (2011 and 2015), 21 with TIMSS 8th grade (2011 and 2015), and 27 with HBSC 2009/10 and 26 with HBSC 2013/14 (both for all 3 ages). We conducted both Pearson’s and Spearman’s correlations using SPSS version 22. Pearson’s correlations are appropriate in terms of considering actual prevalence rates, as has normally been done in use of these surveys in previous publications. However, Spearman’s correlations are more appropriate if the interest is in comparing by the rank order of countries across surveys, rather than actual prevalence rates. Significance is reported as * = p < 0.05; ** = p < 0.01. Throughout, n refers to the number of countries in each calculation.
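The pairwise procedure can be sketched in a few lines of Python. This is not the SPSS analysis reported here; it is a plain re-statement of the same two coefficients, restricted to the countries two surveys have in common. The function names and the dictionary layout (country label mapped to prevalence rate) are assumptions for illustration, and significance testing is omitted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def compare_surveys(rates_a, rates_b):
    """Correlate two surveys over the countries they have in common.

    Each argument maps a country label to that survey's prevalence rate.
    """
    common = sorted(set(rates_a) & set(rates_b))
    x = [rates_a[c] for c in common]
    y = [rates_b[c] for c in common]
    return {"n": len(common), "pearson": pearson(x, y), "spearman": spearman(x, y)}
```

The two coefficients diverge when surveys agree on the ordering of countries but not on the spacing between them: a monotonic but non-linear relation gives a Spearman coefficient of 1 and a Pearson coefficient below 1.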
Results
Comparisons Within Surveys
The two PISA measures used correlated (Pearson’s; Spearman’s) significantly, r = 0.59**; 0.63** (n = 52). TIMSS correlations across the two survey points (at same grades) were for 4th grade, r = 0.84**; 0.83** (n = 42); and for 8th grade, r = 0.85**; 0.84** (n = 33).
HBSC correlations across the two survey points (at same ages) were for 11 years, r = 0.93**; 0.91**, 13 years, r = 0.91**; 0.92**, and 15 years, r = 0.85**; 0.83** (all n = 37).
Comparisons of PISA with Other Surveys
The correlations of the two PISA measures with those from the other four surveys are shown in Table 1. There are moderate correlations with EUKO; near-zero correlations with GSHS (but with only 9 countries in common); quite high correlations with TIMSS, especially for 2015 8th grade; and only modest correlations with HBSC.
Table 1. Cross-National Correlations (Pearson’s; Spearman’s) between PISA Survey and EUKO, GSHS, TIMSS, and HBSC
* = p < 0.05; ** = p < 0.01.
Countries Involved in the Various Comparisons
Discussion
PISA has recently introduced a pupil-based measure of being bullied at school, and has provided two measures: any type of bullying act based on all eight types assessed, and an index of percentage of frequently bullied students, based on six types of being bullied. The two measures correlate substantially and significantly across countries (0.59; 0.63) but are clearly far from identical. In so far as cross-country concordance with other surveys is concerned, results in Table 1 suggest that the percent bullied measure yields somewhat higher correlations for EUKO, GSHS, and TIMSS, with not much difference for HBSC.
As before, we found that within a survey, cross-country variations are quite consistent. In this case, very high correlations were found between TIMSS 2011 and 2015; and between HBSC 2009/10 and 2013/14. These surveys are producing reliable cross-national figures on victimization, even if they show only modest agreement with each other.
So far as comparing the surveys is concerned, the present findings are most encouraging in comparing PISA with TIMSS (Table 1). Here, the correlations across countries are mostly substantial and significant. Furthermore, they are highest where predicted, namely for the 2015 TIMSS survey (best matching PISA’s 2015 date) and for TIMSS 8th grade (matching PISA’s sample age of 15 years). The correlations with PISA percent bullied here (0.81, 0.82) are much the same as for TIMSS 2011 to 2015 (0.83 to 0.85) and are thus as high as might be reasonably expected.
The correlations of PISA with EUKO are moderate, and mostly reach significance. They are somewhat higher than the modest (and non-significant) correlations reported between HBSC and EUKO (Smith et al., 2016). The correlations of PISA with GSHS are disappointingly low, but need to be treated with caution, as there are only nine countries in common.
The correlations of PISA with HBSC are uniformly very modest, all within the range 0.15 to 0.40. At a given age, the correlations are slightly higher with the more recent, 2013/14, survey, which matches more closely with PISA’s survey date of 2015. However, against expectations, correlations are not higher (in fact they are lower) with HBSC 15 years, which is the age match with PISA’s sample.
Why do PISA and TIMSS agree substantially more closely with each other than each does with the other surveys? A similarity between PISA and TIMSS is that neither gives a definition of bullying (unlike EUKO and HBSC), with both rather giving a range of experiences and asking how often someone has been a victim of these. Both also have the same time frame (12 months/a year). EUKO differs noticeably in using face-to-face interviews rather than anonymous survey methods; and in explicitly covering online experiences. HBSC differs in using the word bullying in the definition, which raises issues of translation and meaning in different languages/countries; plus, the definition explicitly mentions power imbalance. These and other issues were discussed in Smith et al. (2016).
Implications of the Findings
As noted before, we need to be cautious about judging how countries appear in terms of high or low prevalence rates for being bullied, especially if only one survey is relied on. However in terms of experiencing a range of victim-like behaviours, PISA and TIMSS are in high agreement. Because there is no explicit mention of imbalance of power in either of these surveys, they may be picking up a wider range of behaviours than does HBSC. Given that proviso, the agreement between these two surveys attests to a degree of validity of the country differences obtained from either of them.
By contrast, HBSC appears to be measuring bullying in the more traditional sense of clearly embodying an imbalance of power as well as repetition and intent (Olweus, 1999; Volk et al., 2017). This has more face validity in terms of measuring bullying, but at present no other survey correlates very highly with it so far as country differences are concerned. EUKO appears intermediate in this respect, giving a definition but avoiding the word bullying; and having moderate correlations with PISA.
These arguments about possible reasons for agreement/disagreement between the surveys are speculative, and more research is needed to clarify which aspects are most important. PISA comes out well from our analyses, but it is regrettable that none of its eight types of victim experiences cover online or cyber attacks – a deficiency remedied by HBSC in its latest (2013/14) survey.
Strengths and Limitations
Some strengths and limitations of our study should be noted. To our knowledge this is the first study to compare PISA with the other four surveys, so far as victimization experiences are concerned. The high agreement between PISA and TIMSS is an important finding for validating these two independent measures of country differences. However, the number of countries in overlap was low with GSHS (n = 9), meaning the PISA-GSHS correlations must be treated with considerable caution. For other comparisons the number of countries overlapping is more satisfactory, ranging from 21 to 32. Another limitation is that the older survey dates for EUKO and GSHS do not match so well with the more recent survey date for PISA.
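The effect of a small country overlap on the precision of a correlation can be illustrated with an approximate 95% confidence interval based on the Fisher z-transformation. The sketch below is not part of the reported analysis; it assumes bivariate normality, uses the standard error 1/sqrt(n - 3), and the function name and the correlation value fed to it are illustrative only.

```python
from math import atanh, sqrt, tanh


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson correlation.

    Uses the Fisher z-transformation with standard error 1 / sqrt(n - 3);
    z_crit = 1.96 gives roughly 95% coverage under bivariate normality.
    Requires n > 3 and -1 < r < 1.
    """
    z = atanh(r)
    se = 1.0 / sqrt(n - 3)
    return tanh(z - z_crit * se), tanh(z + z_crit * se)


if __name__ == "__main__":
    # Illustrative value only: a correlation of exactly zero, evaluated at the
    # smallest (n = 9) and largest (n = 32) country overlaps in this study.
    for n in (9, 32):
        low, high = fisher_ci(0.0, n)
        print(f"n = {n}: approx. 95% CI from {low:.2f} to {high:.2f}")
```

Under these assumptions, a zero correlation based on nine countries is compatible with population values anywhere between roughly -0.66 and 0.66, whereas with 32 countries the interval narrows to roughly -0.35 to 0.35.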
Author’s Notes
Leticia López Castro is grateful to the Xunta de Galicia and to the ESCULCA-USC research group for support for her stay at Goldsmiths, University of London while working on this project.
