Abstract
This study analyzed the Trends in International Mathematics and Science Study 2011 data on fourth-grade U.S. students’ mathematics performance to answer four research questions: (a) How did U.S. students’ geometry performance compare with their performance on the other mathematics subscales? (b) What were the patterns of student achievement among the mathematics subscales? (c) Was there a group of students who demonstrated specific difficulties in geometry only? and (d) Which demographic variables contributed to students’ classification in the group with geometry difficulties? We found that (a) U.S. students’ performance was poorer on the Geometry subscale than on other mathematics subscales; (b) using latent profile modeling, we identified a group of students with the lowest scores across all three mathematics subscales who showed a significant discrepancy between Geometry and the other subscales that did not exist within the high-achieving and average-achieving groups; and (c) gender, age, home language, race, and preference for mathematics and science significantly influenced the probability of being classified in the group with the lowest performance and the largest gap between Geometry and other mathematics subscales. Implications for educational theory and practice are discussed.
There are many reasons for recommending that schools pay particular attention to children’s achievement in geometry. As a subject in K-12 mathematics education, geometry instruction involves helping students to learn spatial relations and properties of shapes, which are crucial for students to succeed in science, technology, engineering, and mathematics (STEM) subjects at the college level (Hsi et al., 1997; Wai et al., 2010). However, geometry instruction is often overlooked in current education research and practice in the United States (Clements & Sarama, 2011). A historical review of the geometry curriculum in U.S. schools (Sinclair, 2008) suggested a long history of neglecting geometry instruction. Not until 1844, when geometry began to be required for college entrance, did U.S. high school administrators and teachers realize the necessity to add this subject to the mathematics curriculum. Therefore, the geometry curriculum was initiated with an elite college-bound status possibly because of Euclidean geometry’s link to ancient Greek scholars, whereas other mathematics domains are typically rooted in everyday problem solving and applications.
U.S. students’ performance in geometry can be considered less than satisfactory. Although fourth-grade U.S. students showed progress in numerical subscales in the Trends in International Mathematics and Science Study (TIMSS) 2015 report, there was a significant drop of 9 standardized points from the 2011 testing performance in the subscale of Geometric Shapes and Measures (Provasnik et al., 2016). An earlier study showed that although TIMSS reported significant improvement in U.S. eighth-graders’ algebra performance between 1999 and 2003, significant improvement was not found in their geometry performance during that period (Gonzales et al., 2004).
Theoretical Framework
The categorization of subtypes of mathematics learning disabilities or mathematics difficulties remains under discussion. Some researchers have argued that students’ particular difficulties in geometry may represent a unique and specific mathematics learning disability subtype, different from other subtypes of mathematics difficulties. For example, Geary and Hoard (2005) described three subtypes of students with mathematics difficulties, including procedural (i.e., children present a delay in acquiring simple arithmetic strategies), semantic memory (i.e., children show deficits in retrieval of facts because of a long-term memory deficit), and visual-spatial difficulties (i.e., children show deficits in spatial representation). Karagiannakis et al. (2014) categorized mathematics learning disabilities into four subcategories: core number, memory retrieval and processing, reasoning, and visual-spatial. Similar conceptual categorizations were also advocated by the National Center for Learning Disabilities (2006), which listed visuospatial impairments as a distinct area of weakness exhibited by many students with mathematics difficulties. Recently, Bartelet et al. (2014) assessed more than 200 elementary school children with mathematics learning disabilities using a battery of cognitive instruments. With a data-driven approach, they reported six distinct groups of mathematics learning disabilities, including a spatial difficulties group whose members were particularly weak in visual working memory. Evidence from neuropsychology has demonstrated that spatial deficits may be associated with dysfunction in posterior regions of the right hemisphere and may be related to the parietal cortex of the left hemisphere (Malhotra et al., 2009).
Spatial abilities refer to the ability to understand, reason about, retain, retrieve, and transform visual images, and they are highly predictive of students’ geometry performance (Clements et al., 1997; Giofrè et al., 2013; Kyttälä & Lehto, 2008; Spelke et al., 2010). Interestingly, this correlation is even greater in students with poorer geometry performance (Battista, 1990). Deficits in spatial abilities have been documented to explain students’ difficulties with geometry (Clements et al., 1997; Passolunghi & Mammarella, 2012). Given the strong relation between geometry and spatial abilities, it is plausible to hypothesize that geometry difficulties represent a unique and specific learning difficulty because of the subject’s high reliance on spatial abilities.
Conversely, the construct of “geometry difficulty” as a specific mathematics learning difficulty can be questioned for a few reasons. First, many mathematical tasks require spatial thinking (van Garderen, 2006), and a deficit in spatial abilities may affect not only geometry achievement but also a broad range of mathematics domains. Second, the reasons why students encounter geometry difficulties may be multifaceted: A recent meta-analysis (Peng et al., 2016) found that the literature (Giofrè et al., 2013, 2014; Passolunghi et al., 2008) reported a weak relation between geometry achievement and all categories of working memory, including visual-spatial working memory. In addition to spatial abilities, students’ geometry achievement is also influenced by many other cognitive factors, such as verbal working memory (Bizarro et al., 2018), fluid intelligence, and reasoning (Giofrè et al., 2014). In particular, because geometry includes a considerable number of proof-oriented problems, it is typically considered highly related to deductive thinking (Dawkins, 2015) and verbal logical reasoning (Battista, 1990). Geometry learning difficulties can also be related to non-cognitive factors, including motivation and persistence (Nichols, 1996), emotions (Bailey et al., 2014), metacognitive abilities (Aydın & Ubuz, 2010), knowledge (Bokosmaty et al., 2015), and how students use knowledge (Lawson & Chinnappan, 1994). These non-cognitive problems exist across all mathematical domains for struggling students, which also challenges the hypothesis that geometry difficulties should be viewed as a specific and unique subtype of mathematics learning difficulty.
That said, there has been sparse research that provides empirical evidence to support the above theoretical conjectures that geometry learning difficulties are specific and unique among mathematics learning difficulties. There is also little research on the relation between performance on geometry and other mathematics content domains. There is a need to empirically explore the existence of “specific geometry difficulties” by investigating whether there is a special group of students who demonstrate a discrepancy between geometry achievement and achievement in other mathematics areas. Specifically, we will examine whether (a) there is a group with difficulties in geometry who show average or above-average performance in other mathematics areas, or (b) there is a group showing difficulties with performance on multiple mathematics subscales and more severe difficulties on the Geometry subscale than on other mathematics subscales. The availability of existing large-scale achievement assessment data, such as TIMSS, makes it possible to examine the existence of such groups.
Research Questions
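In the present study, we aimed to answer four research questions based on an analysis of TIMSS 2011 fourth-grade mathematics data: (a) How did U.S. students’ geometry performance compare with their performance on the other mathematics subscales? (b) What were the patterns of student achievement among the mathematics subscales? (c) Was there a group of students who demonstrated specific difficulties in geometry only? and (d) Which demographic variables contributed to students’ classification in the group with geometry difficulties?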
Method
TIMSS in Mathematics
The TIMSS provides reliable and timely data on school mathematics and science achievement in the United States and other countries. TIMSS data have been collected every 4 years from students in fourth and eighth grades since 1995. In this research, we analyzed TIMSS (2011) fourth-grade data to examine U.S. students’ geometry performance. It is worth noting that the analyses are nationally generalizable with the available sampling weights (Foy et al., 2013). The 2011 mathematics assessment at the fourth-grade level included three content subscales, namely, Data Display (Display), Geometric Shapes and Measures (Geometry), and Number. TIMSS also provided demographic and related information data about the participating students. We chose the fourth-grade rather than the eighth-grade data because fourth-grade geometry is closer to so-called “intuitive geometry,” which is independent from instruction, experiences, and culture (Giofrè et al., 2013). In contrast, eighth-grade geometry involves a greater proportion of geometry academic content that is closely dependent on learning and education and closely related to other mathematics subjects such as algebra (Giofrè et al., 2014). Specifically, for fourth graders, the TIMSS Geometry scale includes items such as “Length of string pulled straight,” “Position of shape after a
TIMSS provides five plausible values (von Davier et al., 2009) per content subscale. Within the context of large-scale assessments, plausible values are random numbers that are drawn from the distribution of scores of individual students, that is, from the marginal posterior distribution. Multiple plausible values are more accurate than a single score estimate and can better capture the expected values and variance in subgroups, especially when the true distribution of mastery or accurate estimates of individual performance are difficult to obtain on short tests. The analysis of plausible values is based on the method of multiple imputation, which is widely used to analyze missing data (Schafer & Graham, 2002), and this is the main analytic challenge: With latent variables, multiple imputation models are more difficult to converge, and fewer fit statistics are available for model assessment. For this research, we chose Mplus (L. K. Muthén & Muthén, 1998–2017) to analyze the five sets of plausible values with multiple imputation and accordingly used all the fit and descriptive statistics that were available in the software for analysis. For missing values, missing at random was assumed. Finally, appropriate sampling weights were applied based on the TIMSS User Guide (Foy et al., 2013). Note that Mplus was the only software we found that satisfied our requirements, namely, latent profile analysis (LPA) with covariates, multiple imputation, and sampling weights.
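To make the multiple-imputation logic concrete, the following minimal sketch pools a statistic computed once per plausible value using Rubin's rules, the standard combining rules for multiply imputed data. It is an illustration under stated assumptions, not the procedure Mplus runs internally: the function name `pool_plausible_values` and all numbers are hypothetical, and the per-value sampling variances are assumed to have been obtained already (e.g., with the survey's replication weights).

```python
from statistics import mean, variance


def pool_plausible_values(estimates, sampling_variances):
    """Combine per-plausible-value results with Rubin's rules.

    estimates: one point estimate per plausible value (e.g., a subscale mean).
    sampling_variances: the squared standard error of each estimate.
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(sampling_variances):
        raise ValueError("need matching lists with at least two entries")
    pooled = mean(estimates)            # average of the m estimates
    within = mean(sampling_variances)   # average sampling variance
    between = variance(estimates)       # variance across plausible values (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total ** 0.5


# Hypothetical inputs: five estimates of one subscale mean, one per plausible value.
pv_means = [520.0, 522.0, 521.0, 519.0, 523.0]
pv_variances = [4.0, 4.2, 3.9, 4.1, 3.8]
pooled_mean, pooled_se = pool_plausible_values(pv_means, pv_variances)
print(round(pooled_mean, 2), round(pooled_se, 3))
```

The between-value term is what a single-score analysis would miss: it carries the measurement uncertainty that the plausible values are designed to represent.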
Data Analysis Plan
We followed a three-step analysis. First, to answer the first research question, we calculated the mean scores of the three mathematics subscales of TIMSS and compared the means of each pair of subscales using Z tests. We also provided correlations between the three subscales. Second, to answer Research Questions 2 and 3, we used LPA to classify students into different groups. We hypothesized that a group of students would perform poorly on the Geometry subscale only and would perform at average or above-average levels on the other mathematics subscales. Similar to latent class analysis (LCA), LPA is a form of mixture modeling used to identify different latent classes of individuals with similar response patterns based on a set of observed variables (Lubke & Muthén, 2005). The difference is that LPA is for continuous observed data, whereas LCA is for categorical observed data. To determine the number of classes, we compared a range of latent class models with different numbers of classes in terms of their relative model fit. For LCA or LPA with multiple imputation in Mplus, three fit indices are available, that is, Akaike’s (1974) information criterion (AIC), Schwarz’s (1978) Bayesian information criterion (BIC), and entropy. Entropy is an index to measure the variability, or chaos, in a stochastic system; as reported for mixture models, higher values indicate greater precision of classification. Entropy values approaching 1 indicate clear delineation of classes, and values above .8 are generally considered acceptable for assigning individual cases into appropriate classes (Celeux & Soromenho, 1996).
Third, to answer Research Question 4, we examined the effects of demographic variables on class membership using the LPA with covariates. LPA or LCA with covariates is analogous to a multinomial logistic regression approach with latent class membership serving as a categorical dependent variable and the covariates as independent variables (B. O. Muthén & Satorra, 1995). Among different approaches to incorporate covariates in latent classes (Vermunt, 2010), the one-step approach was adopted due to the exploratory nature of the present research. The one-step approach reduces the methodological difficulties in addressing latent classes with plausible values. In a more confirmatory or high-stakes context, however, the more sophisticated three-step approach (see Vermunt, 2010) could be more appropriate.
We tested nine demographic covariates in the latent class models of interest: (a) gender, (b) age, (c) English (English spoken at home, 1 = always to 4 = never), (d) Spanish versus other non-English language spoken at home, (e) how much the student likes mathematics (likemath; 1 = like to 3 = not like), (f) how much the student likes science (likesci; 1 = like to 3 = not like), (g) White or Black (WvsB), (h) White or Hispanic (WvsH), and (i) White or other non-Black or Hispanic races (WvsO). The covariates were included in the model one by one to avoid the possible impact of multicollinearity among the covariates.
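The three fit indices named in the second step of the analysis plan can be sketched as follows. AIC and BIC use their standard definitions. The entropy function implements the normalized (relative) entropy commonly reported for mixture models, which equals 1 for perfectly crisp classification; that it matches the statistic printed by the software is an assumption here. All inputs are hypothetical.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion; smaller is better."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; smaller is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def relative_entropy(posteriors):
    """Normalized entropy of class assignment, between 0 and 1.

    posteriors: one row per student, one posterior class probability per class.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    uncertainty = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # treat 0 * log(0) as 0
                uncertainty -= p * math.log(p)
    return 1.0 - uncertainty / (n * math.log(k))


# Hypothetical posteriors: fully certain vs. fully uncertain assignment.
crisp = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
fuzzy = [[1 / 3, 1 / 3, 1 / 3], [1 / 3, 1 / 3, 1 / 3]]
print(relative_entropy(crisp), round(relative_entropy(fuzzy), 6))
```

Because BIC penalizes each parameter by the log of the sample size, it favors fewer classes than AIC whenever the sample has more than about seven cases, which is one reason to read the indices jointly rather than alone.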
Results
Descriptive Statistics
Descriptive statistics among the three mathematics subscales of TIMSS (i.e., Display, Geometry, and Number) are displayed in Table 1. Students’ performances on the three subscales were highly correlated, with correlation coefficients of .903, .903, and .897 for the Geometry–Display, Geometry–Number, and Display–Number relationships, respectively. The mean score of Geometry was substantially lower than the mean score of Display (Z = 9.953, p < .001) and the mean score of Number (Z = 8.327, p < .001), while the mean scores of Display and Number were close to each other (Z = 1.605, p > .05).
Means and Correlations Among Three Subscales.
Note. Below the diagonal are the correlation coefficients among the three subscales. Above the diagonal are the Z values comparing the mean scores of the three subscales.
p < .1. *p < .05. **p < .01. ***p < .001.
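As a quick check on the significance statements for the three mean comparisons, the reported Z values can be converted to two-sided p values under a standard normal reference distribution. This is a minimal sketch; the dictionary keys are descriptive labels chosen here, and only the Z values come from the text.

```python
import math


def two_sided_p(z):
    """Two-sided p value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Z values as reported in the descriptive statistics.
reported_z = {
    "Geometry vs. Display": 9.953,
    "Geometry vs. Number": 8.327,
    "Display vs. Number": 1.605,
}
p_values = {pair: two_sided_p(z) for pair, z in reported_z.items()}
for pair, p in p_values.items():
    print(f"{pair}: p = {p:.3g}")
```

The two Geometry comparisons fall far below .001, whereas the Display versus Number comparison gives a p value of roughly .11, consistent with the reported p > .05.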
LPA
To determine the appropriate number of classes, we compared models with two to eight classes using LPA. Model fit indices are reported in Table 2; smaller AIC and BIC values indicate a better model. However, AIC and BIC should not be used alone for model comparisons with plausible values because of a concern about their uncertainty (i.e., SD) due to multiple imputation (Chaurasia & Harel, 2012). Instead, we combined them with the entropy index. In Table 2, one can see that both AIC and BIC values decrease as the number of classes increases, but the decrease levels off starting from the five-class model. Because it also had the highest entropy (0.873) among all models, the five-class model seemed to be the best choice. To account for the uncertainty of AIC or BIC, two adjacent models with the second and third highest entropies, that is, the four-class and six-class models, were also selected for analysis simultaneously.
Model Fitting Indices Under Each Classification.
Note. AIC = Akaike information criterion; BIC = Bayesian information criterion.
Four-class model
There were descending trends of all mean scores from the first to the last class. Latent Class 1 (Class 1; 16.9% of the entire sample, N = 2,122) exhibited the highest mean scores of all three subscales (Display, M = 648.16, Geometry, M = 651.40, and Number, M = 649.94). Thus, this class was the group with the highest scores across all three mathematics subjects, and the group members achieved higher than the Advanced International Benchmark (625). In contrast, Class 4 demonstrated the lowest mean scores across all three subscales (Display, M = 416.26, Geometry, M = 389.78, and Number, M = 412.34) and is the only group whose members performed below the Low International Benchmark (400) in geometry, while performing above the Low International Benchmark on both the Number and Display subscales; 1,411 students were categorized into this group. Among the two middle-level classes, Class 2 members performed above the High Benchmark (550) on all three mathematics domains, and the Class 3 members performed above the Intermediate Benchmark (475) on all three mathematics subjects.
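The benchmark comparisons in this paragraph amount to a simple lookup against the four international cut points cited in the text (400, 475, 550, and 625). The sketch below applies that lookup to the reported Class 4 means. It assumes that a score at or above a cut point counts as reaching that benchmark, and the label "Below Low" is introduced here for scores under 400.

```python
# International benchmark cut points cited in the text, highest first.
BENCHMARKS = [(625, "Advanced"), (550, "High"), (475, "Intermediate"), (400, "Low")]


def benchmark_level(score):
    """Return the highest benchmark reached by a scale score."""
    for cutoff, label in BENCHMARKS:
        if score >= cutoff:  # assumption: at-or-above counts as reaching it
            return label
    return "Below Low"


# Class 4 means from the four-class model, as reported above.
class4_means = {"Display": 416.26, "Geometry": 389.78, "Number": 412.34}
levels = {subscale: benchmark_level(m) for subscale, m in class4_means.items()}
print(levels)
```

Applied to Class 4, only the Geometry mean falls below the Low benchmark, which is the pattern the paragraph describes.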
To understand the discrepancy between Geometry and the other two subscales in each of the latent classes, we further examined the differences in the mean level of subscales across different classes of students. Table 3 shows that only in Class 3 and Class 4, the two lowest performing groups, group members’ mean score in geometry was significantly lower than in the other two subscales. No significant gaps between Geometry and the other two subscales were found in Class 1 and Class 2. When we further calculated the differences between the mean scores of Geometry and the other two subscales (see Figure 1), we found a clear pattern of increasing differences from the first to the last class.
Characteristics of Subscales From the Four- to Six-Class Models (Weighted).
Note. Standard errors in parentheses. Means and standard errors are based on weighted data. H0: The subscale and Geometry means are not significantly different.
*p < .05. **p < .01.

The differences between the Geometry and other subscales under the four-class, five-class, and six-class models, respectively.
Five-class model
Latent Class 1 (Class 1; 10.36% of the entire sample, N = 1,303) exhibited the highest mean score (Display, M = 664.01, Geometry, M = 669.45, and Number, M = 666.34) of all three subscales, which surpassed the Advanced International Benchmark (625). Conversely, Class 5 exhibited the lowest mean scores in all three subscales (Display, M = 396.00, Geometry, M = 366.67, and Number, M = 392.82). All three subscale means of Class 5 were below the Low International Benchmark (400), with the Geometry significantly lower than the other two subscales. The second lowest class, Class 4, performed between the Low International (400) and the Intermediate (475) Benchmarks, with the Geometry score (454.31) significantly inferior to the other two measures (Display, M = 473.81 and Number, M = 468.58). Class 3 members performed between the High Benchmark and the Intermediate Benchmark, and Class 2 members performed between the High Benchmark (550) and the Advanced Benchmark across all three mathematics subjects. No significant gaps between Geometry and the other two subscales were found in Class 1, Class 2, and Class 3. Similar to the four-class model, we calculated the differences between the mean score of the Geometry and the two subscales. A similar pattern of increasing differences from the first to the last class was identified, as displayed in Figure 1.
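The widening discrepancy can be reproduced from the class means quoted in this paragraph by subtracting each of the other subscale means from the Geometry mean. The sketch covers only Classes 1, 4, and 5, because the means for Classes 2 and 3 are not given in the text; the helper name `geometry_gaps` is hypothetical.

```python
# Five-class model means as reported in the text (Classes 2 and 3 not given).
five_class_means = {
    "Class 1": {"Display": 664.01, "Geometry": 669.45, "Number": 666.34},
    "Class 4": {"Display": 473.81, "Geometry": 454.31, "Number": 468.58},
    "Class 5": {"Display": 396.00, "Geometry": 366.67, "Number": 392.82},
}


def geometry_gaps(means):
    """Geometry mean minus each other subscale mean; negative means Geometry is lower."""
    return {
        subscale: round(means["Geometry"] - value, 2)
        for subscale, value in means.items()
        if subscale != "Geometry"
    }


gaps = {cls: geometry_gaps(means) for cls, means in five_class_means.items()}
for cls, gap in gaps.items():
    print(cls, gap)
```

In Class 1 the Geometry mean is slightly higher than the other two means, whereas in Classes 4 and 5 it is lower by roughly 14 to 29 points, with the largest gaps in Class 5.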
Six-class model
With this model, we divided students into six groups (see Table 2). The mean score of each class decreased gradually from Class 1 to Class 6. Class 1 was composed of students with the highest achievement and above the Advanced Benchmark (625) across all the three subscales, whereas the lowest achieving students were categorized into Class 6, whose members scored below the Low Benchmark (400). In the second lowest performing class, Class 5, participants scored between the Low International (400) and the Intermediate Benchmark (475). A significant discrepancy between Geometry and the other two subscales was only found in the two lowest performing latent classes, Class 6 and Class 5. Similar to above, one can find increasing differences between the mean scores of Geometry and the other two subscales from the first to the last class (see Figure 1).
Summary
The four-, five-, and six-class models demonstrated very similar patterns, with descending trends of all mean scores from the first to the last class, together with a clear pattern of increasing differences between Geometry and other subscales’ mean scores along the same line. The differences also tended to be increasingly significant from the first to the last class. The results implied that the gap between geometry and other domains tended to be larger when students’ overall mathematical ability decreased. Moreover, the greater the number of latent classes, the more likely that the lowest performing group members showed (a) the poorest performance on all three subscales and (b) a lower performance in Geometry than in the other content domains. In answer to Research Question 3, we did find a lowest performing group whose members performed extremely low in geometry (i.e., did not meet the Low Benchmark of 400 points) and showed a discrepancy between geometry and other areas; however, we cannot make the argument that this group represented a population with learning difficulties in geometry only, because their other subscale scores could be below the Intermediate Benchmark (475) but be significantly higher than their geometry scores.
LPA With Covariates
Last, we tested whether any demographic factors affected class membership as classified in the above LPA. We incorporated the demographic variables as covariates in all latent profile models and found very similar results across all three models. To save space, we present only the results from the five-class model, which appeared to be more representative. Results from the other models were similar and are available upon request. Class 5, the last class with the lowest mean scores and the largest discrepancy between Geometry and the other subscales, was set as the reference group. Table 4 shows the estimates of the intercepts (β0), regression coefficients (β1), and related odds and odds ratios (ORs), with further explanation below.
Five-Class Model With Covariates.
Note. β0 = estimates of the intercepts; β1 = regression coefficients; OR = odds ratio.
p < .1. *p < .05. **p < .01.
Gender
Girls were used as the baseline group (i.e., X = 0). Based on the intercepts and related odds, girls were significantly more likely to be in Classes 2, 3, and 4 (β0 = 1.451, 1.638, 1.203; odds = 4.267, 5.145, 3.330, corresponding to Classes 2, 3, and 4 vs. Class 5, respectively) rather than Class 5, the reference class. Based on the coefficients and related ORs, boys were slightly more likely to be included in Class 5 than Class 2, 3, or 4 in comparison with girls. Moreover, Table 4 shows that the odds for boys of being in Class 1 rather than Class 5 were 1.468 times the corresponding odds for girls (OR = 1.468), and this coefficient was significant (β1 = 0.384, p = .003). The results show that girls were more likely to be found in the middle classes, and boys in the highest or lowest classes.
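The odds and OR values quoted for gender follow directly from exponentiating the logit-scale estimates, with Class 5 as the reference class. A minimal sketch using only the coefficients reported in this paragraph (the variable names are chosen here for illustration):

```python
import math

# Intercepts for girls (X = 0), each class vs. Class 5, as reported.
intercepts = {"Class 2": 1.451, "Class 3": 1.638, "Class 4": 1.203}
# Baseline odds of being in each class rather than Class 5.
odds_girls = {cls: math.exp(b0) for cls, b0 in intercepts.items()}

# Coefficient for boys, Class 1 vs. Class 5, as reported.
beta1_class1 = 0.384
or_boys_class1 = math.exp(beta1_class1)  # odds ratio, boys relative to girls

for cls, odds in odds_girls.items():
    print(f"{cls}: odds = {odds:.3f}")
print(f"Class 1 OR (boys vs. girls) = {or_boys_class1:.3f}")
```

The same transformation applies to every covariate in Table 4: the exponentiated intercept gives the odds for the baseline group, and the exponentiated coefficient gives the multiplicative change in those odds per unit of the covariate.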
Age
The oldest students were used as the baseline group (i.e., X = 0) in the analysis. The significant intercepts and related odds suggested that the oldest students were significantly more likely to be in Classes 2, 3, and 4 rather than Class 5 (β0 = 1.515, 1.683, 1.173; odds = 4.549, 5.382, 3.232, respectively). In contrast, the ORs of regression (β1) were slightly smaller than 1 for each latent class (ORs = 0.975, 0.890, 0.850, 0.942, respectively), which indicated that younger students had a slightly higher probability than older students to be included in Class 5 than in any other classes.
English spoken at home
Students reported the frequency of speaking English in the home environment on a Likert-type scale from 1 (always) to 4 (never), with the category of always as the intercept group in the analysis. The significant intercepts and related odds suggested that students always speaking English at home were significantly less likely to be in Class 5 than in any other class (β0 = 0.727, 1.703, 1.794, 1.248; odds = 2.069, 5.490, 6.013, 3.483, respectively). The regression coefficients for Classes 1, 2, and 3 were negative and significant, suggesting that speaking less English at home significantly increased the chance of being in Class 5 rather than in any other class except Class 4 (β1 = −0.600, −0.507, −0.338, corresponding to Classes 1, 2, and 3 vs. Class 5, respectively). That is, the students who always speak English at home had a greater chance to be in Classes 1, 2, and 3 than in the lowest performing class, Class 5. The shift in odds from Class 1 toward Class 5 was especially salient when comparing native English speakers with non-native speakers, as suggested by the ORs of regression in Table 4.
Spanish spoken at home
Students reported what language was spoken at home other than English, and speaking Spanish was set to be the baseline group. The intercepts and related odds suggested that students speaking Spanish at home were significantly more likely to be in Classes 2, 3, and 4 than Class 5 (β0 = 1.245, 1.633, 1.245; odds = 3.473, 5.119, 3.473, respectively). However, students speaking Spanish were less likely to be in Class 1 than in Class 5. Moreover, the only significant regression coefficient (β1 = 1.024, p < .001) and related OR (= 2.784) suggested that speaking a language other than Spanish at home nearly tripled the odds of being in Class 1 rather than Class 5.
How much students liked mathematics
Students reported how much they liked mathematics on a Likert-type scale from 1 (like) to 3 (not like), with the category of like as the intercept group. The significant intercepts and related odds suggested that students who liked mathematics were significantly less likely to be in Class 5 than in any other classes (β0 = 0.893, 1.734, 1.787, 1.275; odds = 2.442, 5.663, 5.972, 3.579, corresponding to Classes 1, 2, 3, and 4 vs. Class 5, respectively). In contrast, negative regression coefficients were significant, and suggested that students who did not like mathematics had a higher chance to be included in Class 5 than in any of the other classes (β1 = −0.637, −0.368, −0.218, −0.149; ORs = 0.529, 0.692, 0.804, 0.862, corresponding to Classes 1, 2, 3, and 4 vs. Class 5, respectively). For example, if the preference for mathematics decreased by one unit, the odds of being in Class 1 relative to Class 5 dropped by about half (OR = 0.529).
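To show how these logit-scale estimates translate into class membership probabilities, the sketch below applies the multinomial logistic form with Class 5 as the reference class (its logit fixed at 0). It uses the coefficients reported in this paragraph and assumes the covariate enters as x = 0 for "like", 1 for the middle category, and 2 for "not like". That coding is an assumption made for illustration, as is the function name. The result is a model-implied illustration for this single-covariate model, not a set of observed proportions.

```python
import math

# Reported estimates for Classes 1-4 vs. Class 5 (liking mathematics).
B0 = [0.893, 1.734, 1.787, 1.275]
B1 = [-0.637, -0.368, -0.218, -0.149]


def class_probabilities(x):
    """Model-implied probabilities for Classes 1-5 at covariate value x.

    Class 5 is the reference class, so its logit is fixed at 0.
    """
    logits = [b0 + b1 * x for b0, b1 in zip(B0, B1)] + [0.0]
    denominator = sum(math.exp(v) for v in logits)
    return [math.exp(v) / denominator for v in logits]


# Assumed coding: 0 = like, 1 = middle category, 2 = not like.
for x in (0, 1, 2):
    probs = class_probabilities(x)
    print(x, [round(p, 3) for p in probs])
```

Under these assumptions, the implied probability of Class 5 membership rises from about 5% for students who like mathematics to about 9% for those who do not, while the implied probability of Class 1 falls.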
How much students liked science
Students reported how much they liked learning science on a Likert-type scale from 1 (like) to 3 (not like), with the category of “like” as the baseline group in the analysis. The intercepts and related odds suggested that students who liked science were significantly more likely to be in Classes 1, 2, 3, and 4 than Class 5 (β0 = 0.644, 1.617, 1.755, 1.289; odds = 1.904, 5.038, 5.783, 3.629, respectively). Similar to above, all regression coefficients were significant and negative, suggesting that the students who did not like learning science were more likely to be included in Class 5 than in any of the other classes (β1 = −0.335, −0.264, −0.218, −0.200, respectively).
White/Black
Students’ race was coded into a set of dummy variables. White students were set as the reference in the dummy variables. The significant intercepts and related odds suggested that White students were significantly less likely to be in Class 5 than in any other class (β0 = 1.277, 2.228, 2.216, 1.500; odds = 3.586, 9.281, 9.171, 4.482, respectively). All regression coefficients were negative and significant, showing that Black students were more likely to be included in Class 5 than in any other class in comparison with White students (β1 = −4.116, −2.914, −1.866, −0.916, respectively). It is noteworthy that the regression’s ORs were 0.016, 0.054, and 0.155 for Classes 1, 2, and 3 vs. Class 5, respectively, suggesting that the odds of being in Class 5 rather than in Classes 1, 2, and 3 were about 60, 20, and 6 times higher for Black students than for White students, respectively.
White/Hispanic
Similar to above, White students had a higher probability of being in each of the latent classes other than the reference latent class, Class 5. All regression coefficients were significant and negative, suggesting that being a Hispanic student significantly increased the chance to be in Class 5, the lowest performing class, than in any other classes in comparison with White students (β1 = −2.364, −1.637, −0.998, −0.555, respectively). Further analysis with the ORs of the regression coefficients suggests that Hispanic students were approximately 11, 5, 3, and 2 times more likely than White students to be in Class 5 than in Classes 1, 2, 3, and 4 (ORs = 0.094, 0.195, 0.369, 0.574, respectively).
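The "times more likely" figures in this paragraph are the reciprocals of the reported ORs, that is, exp(−β1): an OR below 1 for a higher class versus Class 5 is equivalent to an OR above 1 for Class 5 versus that class. A short sketch with the Hispanic coefficients quoted above (variable names are illustrative):

```python
import math

# Reported coefficients, White (reference) vs. Hispanic, Classes 1-4 vs. Class 5.
beta1_hispanic = {
    "Class 1": -2.364,
    "Class 2": -1.637,
    "Class 3": -0.998,
    "Class 4": -0.555,
}

# Inverse odds ratio: odds of Class 5 rather than the named class,
# for Hispanic students relative to White students.
inverse_or = {cls: math.exp(-b1) for cls, b1 in beta1_hispanic.items()}
for cls, value in inverse_or.items():
    print(f"Class 5 vs. {cls}: {value:.2f}")
```

Strictly speaking, these multipliers apply to odds rather than probabilities, so "times more likely" is best read as "times the odds".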
White/Other races
This dummy variable compares White students’ chances of being in Class 5 versus students of Other races (i.e., non-Hispanic/Black). Similar to above, White students had a lower probability of being in Class 5 than in any other classes. In contrast, being a student of Other races increased the chance to be in Class 5 than in any other classes in comparison with White students, and the increment was significant or marginally significant for the two middle-achieving classes, Classes 2 and 3 (β1 = −0.460, −0.416, respectively).
Summary
As shown in the above five-class model analysis, all covariates were significantly associated with class membership in specific ways. In summary, the following conclusions can be reached: (a) Female students were more likely to be classified in the middle-achieving groups, and male students were more likely to be classified into the advanced or lowest groups; (b) older students were more likely to be classified in a higher achieving group than in the lowest performing group; (c) students who spoke English at home had a greater chance to be classified in a high-achieving group than in Class 5; (d) students’ preference for mathematics and science significantly decreased their likelihood of being in the lowest achieving group; and (e) Black and Hispanic students who scored lower on the Geometry subscale were more likely than White students to be classified in the lowest achieving group.
Discussion
This study aimed to portray a profile of students who performed poorly on a geometry standardized assessment by investigating the TIMSS 2011 mathematics data of fourth graders in the United States. We aimed to answer four major questions: (a) How did U.S. students’ geometry performance compare with their performance on the other mathematics subscales? (b) What were the patterns of student achievement among the mathematics subscales? (c) Was there a group of students who demonstrated specific difficulties in geometry only? and (d) Which demographic variables contributed to students’ classification in the group with geometry difficulties?
First, results suggested that, in general, geometry was the weakest mathematics content domain for fourth graders in the United States as reflected in the comparison of mean scores on the mathematics subscales in the TIMSS 2011 data. This finding echoes a previous study analyzing the TIMSS 2011 data that reported that among the four mathematics content domains (i.e., algebra, number, data and chance, and geometry), geometry was the weakest area of proficiency for U.S. eighth graders (Provasnik et al., 2012). In contrast, Shanghai students scored higher on the Geometry subscale compared with Display and Number (Organisation for Economic Co-Operation and Development, 2014). A recent cross-culture comparison study (Kern & Henrick, 2016) suggested that, although Chinese teachers outperformed U.S. teachers on many dimensions of mathematics teaching quality, the higher achievement of Chinese students was primarily driven by geometry instruction: In geometry instruction, the study showed that Chinese students had more opportunities than U.S. students to develop their conceptual understanding and to make explicit connections between representations and methods for solving problems.
To answer our second and third research questions, “What were the patterns of student achievement among the mathematics subscales?” and “Was there a group of students who demonstrated specific difficulties in geometry only?” we employed LPA to classify the whole student population into groups according to their patterns of performance across the three mathematics domains. We found that students with the lowest Geometry subscale scores typically also had the lowest achievement in the other mathematics domains. Across the four-, five-, and six-class models, a pattern emerged: The more finely we divided the students into groups, the more likely we were to find a group that showed the lowest scores on all three subscales together with a wider gap between geometry and the numerical mathematics domains. That is, students’ geometry performance was consistent with their performance in the non-geometry domains, and the students with the lowest geometry performance were most likely to be found in the group with the poorest overall mathematics performance. In other words, the more problems students displayed with mathematics in general, the greater the difficulties they were likely to manifest in geometry.
In sum, using the TIMSS fourth-grade mathematics data, we did identify a group of students (i.e., the lowest performing class in each of the three models) who showed poorer performance in geometry than in other mathematics areas; however, these students did not reach the average or above-average benchmarks in the other mathematics domains either. Generally speaking, students with the lowest mean geometry scores also had very low mean scores in other mathematics areas, whereas students with high mean scores on the other mathematics subscales also had high mean scores in geometry. The students who performed poorly in the numerical areas tended to have even poorer performance in geometry, but such a discrepancy did not exist among the high-performing and average-performing students.
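The classification step behind such profiles can be sketched as follows. This is a minimal illustration, assuming a fitted latent profile model with normally distributed subscale scores that are independent within each profile; it shows only how one student is assigned to a profile, not how the model is estimated. The profile means, standard deviations, class proportions, and the student’s scores are all hypothetical, and the assumed score order is Number, Geometry, and a third subscale.

```python
import math


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def posterior(scores, profiles):
    """Posterior probability of each profile given one student's scores.

    Assumes the subscale scores are independent within a profile, so the
    log likelihood is a sum of univariate normal log densities.
    """
    log_joint = [
        math.log(p["weight"])
        + sum(normal_logpdf(x, m, p["sd"]) for x, m in zip(scores, p["means"]))
        for p in profiles
    ]
    peak = max(log_joint)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical profiles (illustration only). The low profile has a
# noticeably lower Geometry mean (second entry) than its other means.
HYPOTHETICAL_PROFILES = [
    {"weight": 0.3, "means": (600, 595, 605), "sd": 40},  # high achieving
    {"weight": 0.5, "means": (540, 538, 545), "sd": 40},  # average achieving
    {"weight": 0.2, "means": (450, 420, 455), "sd": 40},  # low achieving
]

student_scores = (460, 425, 450)  # hypothetical student
post = posterior(student_scores, HYPOTHETICAL_PROFILES)
most_likely = max(range(len(post)), key=lambda k: post[k])
print(most_likely)  # index of the profile with the highest posterior probability
```

Under these assumed values, a student whose scores are low overall and lowest in Geometry is assigned to the low-achieving profile, mirroring the pattern described above in which the geometry gap appears within the lowest performing class and not as a separate geometry-only group.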
These results contribute to the literature by showing that geometry performance, as measured by a standardized assessment, is not independent of performance in other mathematics areas; likewise, students’ poor performance in geometry is not separate from their difficulties in other mathematics areas. The results supported the theoretical proposition that performance in different mathematics domains is intercorrelated and reflects shared underlying cognitive abilities (Peng et al., 2018). A student with poor spatial abilities may manifest difficulties in many mathematics areas (e.g., story problems; van Garderen, 2006) and other STEM areas (Wai et al., 2009) rather than in geometry only. Although it sounds plausible that a group of students would demonstrate difficulties only in geometry because of poor spatial abilities, our analysis of the TIMSS 2011 data found no evidence to support this hypothesis.
To answer the fourth research question, “Which demographic variables contributed to students’ classification in the group with geometry difficulties?” we examined the influences of various covariates on students’ probability of being classified into the geometry difficulty group. We found that being White, speaking English at home, and liking mathematics and science significantly decreased students’ probability of being classified into the geometry difficulty group. In contrast, being non-White and disliking mathematics and science significantly increased the chance of being classified into the lowest achieving group with particular geometry difficulties. These socioculturally dependent predictors again indicated that geometry achievement is significantly influenced by many non-cognitive factors, such as ethnicity, culture, language, and interest.
It has been well established that culture and education play an important role in students’ mathematics achievement (Agirdag et al., 2011; Martin, 2000; Starkey & Klein, 2008). Although intuitive geometry is considered to be culture-free and to rely on cognitive abilities (Giofrè et al., 2013), the results of the present study indicated that as early as fourth grade, students’ geometry achievement is already significantly influenced by many educational and cultural factors. Fourth-grade geometry in the TIMSS was primarily about basic spatial manipulation tasks; however, even these so-called “intuitive geometry” tasks, which seem to be cognitively based, were significantly affected by students’ cultural and educational backgrounds. As previous research has shown, a student’s social and cultural background can influence his or her academic achievement through many factors, including parental involvement (Lee & Bowen, 2006; Mau, 1997), self-concept (Chiu & Klassen, 2010), school resources (Roscigno & Ainsworth-Darnell, 1999), social history context (Martin, 2000), and community forces (Martin, 2000).
Previous literature reported a small effect size of male superiority in geometry performance (Hyde et al., 1990). In this study, being a girl significantly increased students’ probability of being included in an average-achieving class, whereas being a boy significantly increased the chance of being classified into either the highest achieving class or the lowest achieving class. Previous literature repeatedly demonstrated that spatial ability is the cognitive domain that shows the most robust gender differences favoring males, especially in the area of spatial rotations (Feng et al., 2007; Terlecki & Newcombe, 2005). However, the results of this study suggested that although males were more likely to show the highest achievement scores in geometry, they also had a greater probability of showing the lowest achievement scores in geometry. This pattern suggests that geometry achievement is not completely dependent on spatial abilities but is a function of multiple cognitive and non-cognitive factors.
Limitations and Future Research
The present study has two major limitations. First, we analyzed fourth graders’ TIMSS 2011 Geometry subscale scores to identify students with difficulties in geometry. The students who were classified into the low-achieving group in this study performed poorly on the TIMSS 2011 assessment and did not meet the Low International Benchmark. However, we should be cautious about labeling these students with “Learning Disabilities (LD)” because low-achieving students are not necessarily students with learning disabilities. Some students do well in class activities but are not good at taking standardized tests (Kearns, 2011), and it is widely recommended (Nitko & Brookhart, 2014) that multiple measures (i.e., teachers’ reports, classroom observations, standardized tests, curriculum-based assessment) be adopted to ensure the validity of referring a child to Tier 2 interventions. Although many students in the lowest performing group in this study may have geometry learning difficulties or learning disabilities, TIMSS does not provide data regarding students’ special education status. Future research is warranted to examine patterns of abilities in different mathematics areas among students who have been officially diagnosed with mathematics learning disabilities.
Second, we need to be cautious regarding conclusions about whether there is a group of students who achieve low scores only in geometry but achieve average or above-average scores in other mathematics domains. Typically, LCA or LPA is carried out in an exploratory manner in which no strong a priori hypothesis exists regarding the number or nature of the latent classes underlying the data (Hoijtink, 2001). In future research, once a well-developed theory is available to hypothesize the classes of students, a more confirmatory approach that allows for testing specific hypotheses could be used to verify the existence of the hypothesized classes.
Implications for Educational Practice
Although this study does not offer educators direct guidance on effective interventions for students who have difficulty solving geometry problems, its results portray a larger picture of the geometry performance of U.S. fourth graders and of the patterns of U.S. students’ achievement in the three mathematics domains assessed by TIMSS 2011. First, the significantly lower performance in geometry than in the other two mathematics domains among U.S. students overall signals a strong need for greater attention from policy makers and educational practitioners. In particular, special attention needs to be paid to students who are experiencing mathematics learning difficulties, as the results of this study showed that students with poor achievement in numerical domains tended to have even poorer performance in geometry. Because geometry topics make up only a relatively small part of the mathematics curriculum that is required to be taught to elementary school students, teachers and parents often overlook the importance of providing early geometry intervention to students who have mathematics learning disabilities or who are at risk. Yet the results of this study suggested that students with overall difficulties in mathematics tend to experience greater difficulties in geometry than in other mathematics areas; thus, intensive prevention and intervention focused on geometry topics are warranted.
This study also sheds light on the importance of teachers’ and policy makers’ understanding of students’ geometry performance from a sociocultural perspective. Achievement in many STEM areas, including mathematics and geometry, is not solely determined by students’ cognitive abilities or personal efforts. Even on tasks that are considered “intuitive geometry,” students’ performance can be influenced by many sociocultural factors. Compared with the rich literature and resources on mathematics remediation programs that use cognitive approaches to help students with mathematics difficulties, less research and practice exist on how to address students’ mathematics learning problems from a sociocultural perspective (Waite, 2017).
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Open Fund of National Higher Education Quality Monitoring Data Center (Higher Education Research Institute), Sun Yat-sen University (Grant Number: M1903), and also supported by the Key Nurturing Program for Young Teachers, Sun Yat-sen University (Grant Number: 18wkzd14).
