Abstract
This article describes the development and validation of a newly designed instrument for measuring the spatial ability of middle school students (11-13 years old). The design of the Spatial Reasoning Instrument (SRI) is based on three constructs (mental rotation, spatial orientation, and spatial visualization) and is aligned to the type of spatial maneuvers and task representations that middle school students may encounter in mathematics and science, technology, engineering, and mathematics (STEM)-related subjects. The instrument was administered to 430 students. Initially, a set of 15 items was devised for each of the three spatial constructs, and the 45 items were eventually reduced to 30 items on the basis of factor analysis. The three underpinning factors accounted for 43% of variance. An internal reliability value of .845 was obtained. Subsequent Rasch analysis revealed appropriate item difficulty fit across each of the constructs. The three constructs of the SRI correlated significantly with existing well-established psychological instruments: for mental rotation (.71), spatial orientation (.41), and spatial visualization (.66). The psychometric characteristics of the SRI substantiate the use of this measurement tool for research and pedagogical purposes.
Introduction
The importance of spatial ability as a distinct form of cognition has been of sustained concern to mathematics educators. This recognition can be seen through the explicit and formal integration of spatial reasoning in a number of national curricula (e.g., Australian Curriculum, Assessment and Reporting Authority [ACARA], 2015; Finnish National Board of Education, 2004; Ministry of Education Singapore, 2006). Influential organizations including the National Research Council (2006) have raised awareness of the critical importance of spatial thinking across the curriculum, with recent research programs demonstrating its importance beyond the traditional domain of geometry (Davis, 2015; Sinclair & Bruce, 2014). Increasingly, there is evidence to suggest strong associations between spatial ability and success in science, technology, engineering, and mathematics (STEM) disciplines (Uttal & Cohen, 2012; Wai, Lubinski, & Benbow, 2009).
In essence, spatial ability refers to the skill of being able to generate and manipulate images or, as Battista (2007) postulates, “the ability to see, inspect, and reflect on spatial objects, images, relationships and transformations” (p. 843). It may involve elements such as holding a visual pattern in memory, comparing visual patterns, or doing a mental transformation and requires the manipulation of internal (mental) representations. Spatial ability includes working with mental imagery or, more specifically, spatial imagery (Hegarty & Waller, 2005). Spatial imagery refers to “a representation of the spatial relationships between parts of an object, the location of objects in space or their movement” (Hegarty & Waller, 2005, p. 144). These processes could involve imagining the transformation of a particular spatial configuration such as performing a mental rotation, visualizing the opening of a cube, or finding the relative positions of objects on a map.
Spatial reasoning is recognized as an ability separate from geometrical reasoning, which is more inclined toward the articulation of axiomatic properties, although the former serves as a foundation for the latter. Spatial reasoning is not only useful within the domain of mathematics but also across a number of subjects in the school curriculum such as geography and the sciences. On the practical side, spatial ability is a vital component of modern living, given the complex and global (geographical) networks in which the individual is asked to operate. Spatial tools, such as Google Maps and the Global Positioning System, have become an integral part of societal norms. From a vocational perspective, spatial ability is an inherent requirement for various professions, not only for highly technical jobs such as architects, engineers, town planners, or surveyors but even for socially related jobs. For example, a police officer needs to choose, swiftly and flexibly, the shortest route to particular locations or vehicles. Thus, given the widespread implications of spatial ability, it is important to ensure that middle school students get the necessary experiences to develop such a habit of mind. An immediate question, then, is how to measure this ability as an entity of its own.
Despite research attention from psychologists and educators, some important questions are yet to be answered. First, there is a lack of consensus about the exact constituents of spatial ability, although it is considered multifaceted. Second, little is known about the developmental progression of spatial ability from the start of adolescence to adulthood. More attention has been afforded to the early years of schooling (Davis, 2015; Sinclair & Bruce, 2014) or to students at an undergraduate level (Hegarty & Waller, 2005). Furthermore, many of the instruments used to measure spatial ability date back to the 1970s. These tests tend to follow a particular mode of measurement in the sense that respondents are asked to solve similar items with varying levels of complexity. For example, in the Paper Folding Test (PFT; Ekstrom, French, & Harman, 1976), respondents are asked to imagine the folding and unfolding of a square sheet of paper when it is folded in different ways and where the position of a punched hole is varied. Similarly, in the Vandenberg and Kuse (1978) test, the orientation of a three-dimensional array of cubes is varied to observe how respondents identify pairs of identical arrangements as particular parts become occluded. Furthermore, psychologists tend to give much attention to speed in measuring spatial ability. The drawback of this approach to the measurement of spatial ability is that it assesses a respondent’s skill in only one type of task.
It is our contention that a more valid measurement of a person’s spatial ability can be obtained when he or she has to solve a set of varied tasks involving hypothetically similar spatial maneuvers rather than solving similar task types. We thus diverge slightly from the traditional psychological approach in the measurement of spatial ability in two ways. First, the nature of our interest is in diagnosing and helping students to learn to think spatially. As such, we give more importance to accuracy than speed, as the instrument is designed to be used with middle school students who are establishing spatial understandings. We acknowledge that speed is also a consideration, with the entire instrument needing to be completed within 45 min. This is consistent with the average duration students are given to complete national and international assessments (e.g., Trends in International Mathematics and Science Study [TIMSS]).
Second, as mentioned above, in the psychology/spatial cognition tradition, when a spatial reasoning instrument is designed, there is a tendency to use only one type of task (a stem), and the complexity of this task is varied by changing different features of the task. For instance, if one looks at the well-known PFT (Ekstrom et al., 1976) for measuring spatial visualization, the stem (where a square sheet of paper is folded and a hole is punched) is kept constant and spatial features such as the number of folds, the type of folds (horizontal, vertical, or diagonal), and the position of the punched hole are varied. The drawback of this approach is that it is context-specific and does not allow us to know how the respondent would fare in a different scenario that involves spatial visualization. As such, we depart from this mode of measurement by providing a range of varied situations where respondents are required to perform similar spatial maneuvers and complexity is potentially varied by the inclusion of more spatial elements. Given that spatial ability is a multidimensional construct, the use of multiple tasks in one instrument is particularly important.
This study presents the design and validation of a new instrument, labeled the Spatial Reasoning Instrument (SRI), for assessing the spatial ability of middle school students. The purpose of the instrument is to measure three dimensions of spatial ability (mental rotation [MR], spatial orientation [SOR], and spatial visualization [SV], as described in the ensuing section) from a cognitive functioning perspective. It is also meant for pedagogical purposes, namely diagnosing students’ spatial ability. The SRI is not a test for job placement or selection procedures but rather is an instrument designed to be of more use to teachers, researchers in education, and possibly psychologists too. The development of the SRI was motivated by the lack of an instrument that is aligned with the type of curricular representations that students are likely to encounter in their middle years of schooling. Building on the psychological foundation of spatial reasoning, it integrates different types of spatial maneuvers.
A Three-Tier Framework of Spatial Ability
Spatial ability is not a single unitary construct; rather, it consists of several dimensions. Although there is agreement about the multidimensionality of spatial ability, what has not yet been fully settled is the number of sub-factors (Yilmaz, 2009). Furthermore, the non-normative use of closely related terms such as spatial visualization, spatial relation, and orientation makes this area of research problematic, partly due to the complex nature of spatial ability. As researchers, we face the challenge of setting the boundaries, often blurred, between these closely related spatial constructs. One way to demarcate these constructs is to analyze the processing requirements of spatial tasks. For the purposes of this article, we have analyzed a range of tasks from the primary and lower secondary mathematics curricula and come to the conclusion that the three constructs (mental rotation, spatial orientation, and spatial visualization) capture much of the spatial ability requirements at the middle school level. In the following section, we describe the three constructs as we interpret them.
Mental Rotation
Mental rotation is a cognitive process in which a person imagines how 2D and 3D objects would appear after they have been turned around a point by a certain angle (Shepard & Metzler, 1971). As a dimension of spatial ability, this form of mental transformation has received considerable research attention from psychologists. From the psychological standpoint, mental rotation has been measured as a speeded performance, that is, how fast students carry out mental rotations. Experiments conducted in the 1970s (Cooper, 1975; Shepard & Metzler, 1971) attempted to measure the relationship between angle of rotation, configuration complexity of the object to be rotated, and reaction time. Typically, students are given a target object and are asked to compare it with the rotated or reflected object, and the time that they take to do these mental rotations is measured. It appears that the cognitive load to perform a mental rotation depends more on the angle of orientation than the complexity of the object (Cooper, 1975). What is also known is that representations of 3D objects are more difficult to mentally rotate than representations of 2D objects (Jolicoeur, Regehr, Smith, & Smith, 1985). The accentuated difficulty in rotating a 3D object lies in the fact that it also requires the rotation of depth (i.e., not only length and width as in 2D shapes). Furthermore, when rotating a 3D object, some parts may become occluded and previously occluded portions may come into view. The Vandenberg and Kuse test (Vandenberg & Kuse, 1978) is a commonly used instrument to measure mental rotation. Subjects are asked to compare 3D figures to decide whether they are similar or not. However, this test has been reported to be unsuitable for elementary school students (Hoyek, Collet, Fargier, & Guillot, 2012). The more recent mental rotation test (Picture Rotation Test) developed by Quaiser-Pohl (2003) is designed for pre- and early primary school children (ages 4-6). Thus, there is an identified need for a mental rotation test for students in the middle years of schooling.
It seems to be the case that the majority of the experimental studies in psychology were conducted with undergraduate students. It is therefore difficult to gauge the extent to which those findings may be informative for primary and secondary school students. As mental rotation has been shown to be related to mathematics performance (Cheng & Mix, 2014), more research is required on this construct with school-age students.
Spatial Orientation
The previous construct focused on mental rotation as a form of mental spatial transformation. More generally, there are two classes of mental spatial transformations: object-based spatial transformations and egocentric perspective transformations (Zacks, Ollinger, Sheridan, & Tversky, 2002). Object-based transformations involve imagined movements of objects, as in performing a mental rotation. Here, the object is mentally manipulated but the observer’s point of view remains fixed. By contrast, egocentric perspective transformations are imagined movements of one’s point of view. An egocentric representation involves locating an object with respect to one’s body as reference. For example, we can describe the location of a computer on the desk to the right and a book to the left when we take ourselves as the reference point. Similarly, when we enact a left turn and right turn, it is in relation to ourselves. The critical difference between mental rotation and spatial orientation lies in the frame of reference that we use to interpret the situation, that is, whether it is egocentric or object-based. Kozhevnikov and Hegarty (2001) and Hegarty and Waller (2004) maintain that the ability to mentally rotate and the ability to reorient the imagined self are separate spatial abilities.
Included within spatial orientation is the notion of perspective taking. This is the ability to imagine how an object or scene looks from perspectives different from the observer’s. It is regarded as the anticipation of location from different vantage points (Newcombe & Huttenlocher, 1992). In a spatial orientation task, one has to mentally or physically position himself or herself in the place of an object to be manipulated to determine the position of the object or the result of a transformation on the object. The problem solver is required to analyze an object with respect to his or her position. For example, in determining whether something is to the right or left, one has to use his or her own body as a frame of reference.
In terms of measurement of spatial orientation ability, several tests are prominent in the field of psychology, such as the Road Map Test of Reading Direction (Money, Alexander, & Walker, 1965), the Guilford–Zimmerman Perspective Taking Test (Guilford & Zimmerman, 1948), a test involving four cameras (De Lange, 1984), and the Perspective Taking/Spatial Orientation Test (SOT; Hegarty & Waller, 2004; Kozhevnikov & Hegarty, 2001). As is the case with mental rotation tests, respondents are required to perform the same type of task while specific problem parameters are varied, which is typical of the psychological approach to measurement. It was thus necessary to develop items that involve a variety of situations that require respondents to perform spatial orientation.
Spatial Visualization
Spatial visualization is the concept that has the least rigid boundaries as reflected in the lack of specificity in the definitions given in the literature, especially when compared with mental rotation and spatial orientation. Spatial visualization seems to be an umbrella term for a number of spatial maneuvers. In fact, a number of researchers (Linn & Petersen, 1985; McGee, 1979; Sorby, 2009) include mental rotation as a type of spatial visualization. For example, according to Linn and Petersen (1985),
Spatial visualization is the label commonly associated with those spatial ability tasks that involve complicated, multistep manipulations of spatially presented information. These tasks may involve the processes required for spatial perception and mental rotations but are distinguished by the possibility of multiple solution strategies. (p. 1484)
Yakimanskaya (1991) characterizes spatial visualization as the ability to generate and manipulate images. For Carroll (1993), spatial visualization involves the “processes of apprehending, encoding and mentally transforming spatial forms” (p. 309). These definitions can be interpreted to mean different things and bring elements of inconsistency and ambiguity in the way they are articulated by researchers. However, the common thread among these definitions is the emphasis on the relatively complex manipulations that they may require. Salthouse, Babcock, Mitchell, Palmon, and Skovronek (1990) assert that spatial visualization may involve the execution of a sequence of mental transformations and it may be necessary to store intermediate products temporarily during the processing of information. It may involve the mental manipulation of entire spatial configurations and generally necessitates a greater number of processing operations.
Another way to discern the characteristics of spatial visualization is to consider the tasks that are generally used to measure it. One of the commonly used test batteries is from the Educational Testing Service (ETS; Ekstrom et al., 1976), which includes different spatial visualization instruments. For example, in the PFT, respondents are required to visualize the folding and unfolding of a piece of paper with punched holes. In the Surface Development Test, respondents are required to construct a solid from its given net. The Form Board Test involves joining the parts of a polygon to construct a whole. These types of situations involve neither mental rotation nor spatial orientation as described above. Given the broad range of spatial maneuvers encompassed by spatial visualization, we were inclined to consider spatial visualization from a complementary perspective: spatial tasks that involve neither mental rotation nor spatial orientation are considered as involving spatial visualization. Such a complementary outlook is necessary for measurement purposes. It should also be pointed out that these tests of spatial visualization date back to the 1970s and have not been subject to much scrutiny in recent times. This provides another rationale for a test battery for spatial reasoning. It is acknowledged that there may be other dimensions of spatial ability, but these tend to be less apparent in the school curricula.
Design Principles
SRI Item Construction
As we delineated the three constructs described above, we simultaneously perused and analyzed accessible school mathematics curricula (from online curricular documents, textbooks, and examples of spatial problems from national testing bodies wherever available) in relation to the importance given to spatial reasoning. For example, in the Australian curriculum, the following aspects of space are highlighted: positional language, 2D and 3D shapes and their relationships, orientation, motion, transformation, nets and cross sections, location, compass directions, and reading maps (ACARA, 2015). The aim of this exercise was to observe the nature and type of spatial maneuvers that middle school students are required to perform. This helped us to identify the extent to which the three constructs reflect the range of spatial maneuvers identified within curricula and the types of task representation commonly found in school-based assessment.
We also analyzed commonly used spatial reasoning instruments from the psychology and mathematics education literature, such as the test batteries from the Educational Testing Service (Ekstrom et al., 1976), the Mental Rotations Test (Vandenberg & Kuse, 1978), the Picture Rotation Test (Quaiser-Pohl, 2003), the Middle Grades Mathematics Project (MGMP) Spatial Visualization Test (Ben-Chaim, Lappan, & Houang, 1988), the Perspective Taking Test (Eliot & Smith, 1983; Kozhevnikov & Hegarty, 2001), and spatial ability practice tests (e.g., Newton & Bristoll, 2009).
We designed 15 items in each of the three constructs. Thus, the initial version of the SRI had 45 questions. We were inclined to develop an instrument that takes less than 1 hr to complete, given the cognitive load that such spatial tasks may demand. We also took into consideration the fact that the respondents were in the age range 11 to 13 years. The school mathematics curriculum involves a range of spatial concepts. Key aspects of space include the concepts of location and position (e.g., compass directions and map interpretation), orientation, two- and three-dimensional shapes and their relationships (e.g., constructing the nets of objects), perspective taking, spatial transformations, and graphs, among others. Notably, spatial reasoning involves both static and dynamic aspects (e.g., inferring the front view of a configuration or rotating a particular configuration by a right angle anticlockwise; see Table 1).
Table 1. Design Framework for the Spatial Reasoning Instrument.
Figures 1 to 4 present example items from the SRI. Figures 1 and 2 require the mental rotation of a 2D and 3D object, respectively, while Figures 3 and 4 involve spatial orientation and spatial visualization, respectively. These items are novel in that the tasks are embedded within school-based representations while still measuring the three constructs individually. Many international mathematics assessment items are contextually based and, given this instrument is predominantly for teachers and educational researchers, the inclusion of contextual representations provides a level of familiarity and access to middle school students.

Figure 1. Sample mental rotation item involving 2D objects.

Figure 2. Sample mental rotation item involving 3D objects.

Figure 3. Sample spatial orientation item.

Figure 4. Sample spatial visualization item.
As well as varying the contexts of the tasks, we also increased their complexity by including more spatial elements. Thus, we varied task features such as the spatial configuration and the relative positions of objects such that the problem solver had to handle more information. For instance, in designing the mental rotation tasks, we included both two-dimensional and three-dimensional objects (see Figures 1 and 2), with previous research suggesting that three-dimensional rotation is more difficult (Shepard & Metzler, 1988). Furthermore, the objects to be rotated were carefully chosen or designed on the basis of their spatial configurations with the support of a graphic artist. Similarly, for the spatial orientation tasks, the initial questions were relatively accessible in that the problem solver just had to position himself or herself in the situation to answer the questions. Gradually, we varied the tasks such that they required the coordination of the positions of different objects, where the problem solver had to make intermediate deductions. In a similar way, in choosing and designing the items for the spatial visualization dimension, we looked at aspects such as vertical and slanting lines of symmetry, complexity of the nets to be folded, and configurations of shapes to be joined. Hence, we were able to develop items with differing levels of complexity for each of the three constructs.
Method
Design of Data Analysis
The analysis was undertaken in two phases. The first phase of analysis was concerned with the reliability of the SRI. Initially, exploratory factor analyses were undertaken on the 45-item SRI data to identify the strongest loading items within each construct. This reduced the SRI to 30 items, 10 for each construct (the results of the factor analyses are provided in the “Results” section). A Rasch analysis was then undertaken on each of the three constructs. The second phase of analysis was to determine the extent to which the three constructs within the SRI aligned with the existing tests for mental rotation, spatial orientation, and spatial visualization. This was undertaken using correlation analysis and person separation reliability.
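For readers unfamiliar with Rasch analysis, the following minimal sketch shows the standard dichotomous Rasch model, in which the probability of a correct answer depends only on the difference between a person's ability and an item's difficulty (both in logits). The function name and example values are illustrative assumptions; this is not the estimation procedure used in the study.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are on the same logit scale (illustrative values only).
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# When ability equals item difficulty, the chance of success is one half.
print(rasch_probability(0.0, 0.0))
# An easier item (lower difficulty) yields a higher probability of success.
print(rasch_probability(0.0, -1.0) > rasch_probability(0.0, 1.0))
```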
Participants
The participants for the first phase of analysis were 430 students (219 boys and 211 girls) from Grades 5, 6, and 7 aged between 11 and 13 years (M = 11.79, SD = 0.72). The six schools involved in this stage were located in a large metropolitan city in Australia and chosen from government, Catholic, and private jurisdictions through a convenience sampling method. The schools had a broad socio-economic demographic, with Index of Community Socio-Educational Advantage (ICSEA) scores ranging from 996 to 1,194. In all, 16 intact classes of students completed the SRI in their own classrooms to ensure a range of academic ability levels within the respective schools.
Based on availability, time constraints, and other limitations within the six individual schools, a sub-sample of students from the original 430 participants was identified for the second phase of analysis. Nine classes from one school were administered an additional test along with the SRI. Table 2 describes the distribution of participants who undertook either the Card Rotation Test (CRT) and Cube Comparison Test (CCT; three classes), the PFT (three classes), or the Perspective Taking (Spatial Orientation) Test (three classes).
Table 2. Distribution of Participants for Additional Testing.
Instruments
Data were collected from participants using the SRI and four well-established paper-and-pencil tests of spatial abilities: CRT, CCT, PFT (Ekstrom et al., 1976), and Perspective Taking or SOT (Kozhevnikov & Hegarty, 2001). Each of these instruments has been used in the psychological literature to measure various aspects of spatial ability. For example, the CRT and CCT have been used as measures of mental rotation ability (e.g., De Lisi & Wolford, 2002; Gluck & Fitting, 2003). The PFT has been used as the standard measure for spatial visualization for decades (see, for example, Kozhevnikov, Hegarty, & Mayer, 2002; McVey, 2001; Whitlock, McLaughlin, & Allaire, 2012). Kozhevnikov and Hegarty’s (2001) SOT has been utilized in a number of studies with undergraduate students across various science and cognitive psychology experimental designs to measure spatial orientation and perspective taking (e.g., Hegarty & Waller, 2004).
The three spatial reasoning tests (CRT, CCT, and PFT) developed by ETS date back to the 1960s, and the available copies of the tests sometimes lack clarity. Furthermore, ETS tends to condense all the items of one test on one page, which can be visually overwhelming. Thus, the items in the CRT, CCT, and PFT were made larger and distributed across several sheets to ensure that extraneous variables did not tamper with the measures.
CRT
The CRT assesses the mental rotation ability of the participants. In this timed test, respondents are required to compare pairs of congruent two-dimensional shapes (a target and a response shape) to determine whether one shape is rotated to form the other. The response shape is either a reflection or a rotation of the target shape. The CRT consists of 10 questions with eight parts in each question. Participants were given 6 min to complete the 80 comparisons. A student’s score on the test is the number of items answered correctly minus the number answered incorrectly.
CCT
The 21-item CCT tends to assess both mental rotation and spatial visualization ability. It involves the comparison of two cubes to determine whether they are similar or different. The sides of the cubes consist of shapes and letters. Participants were given 6 min to complete the test. The score in the test is calculated as the number of items answered correctly minus the number of items answered incorrectly.
PFT
The PFT is a commonly used instrument to assess spatial visualization ability. This test consists of two sets of 10 questions where participants are to imagine (a) folding a piece of squared paper, (b) punching a hole through the layers of the folded paper, and (c) unfolding the paper. The type of fold that students are required to visualize becomes increasingly challenging as the test progresses. Thus, all the items are identical in context but involve varying levels of visualization. Worth noting is the negative scoring procedure, where an incorrect answer is negatively marked. A student’s score on the test is the number of items marked correctly minus one fourth of the number of items marked incorrectly. Thus, respondents are informed a priori that it is to their advantage not to guess but to skip hard items. The students were given 12 min to undertake the two sets of items.
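The scoring rules described for the CRT, CCT, and PFT all subtract a penalty for incorrect answers, while skipped items neither add nor subtract. A minimal sketch of that arithmetic follows; the function name and the response counts are hypothetical and serve only to illustrate the two penalty weights stated above.

```python
def penalized_score(n_correct: int, n_incorrect: int, penalty: float = 1.0) -> float:
    """Number correct minus a penalty per incorrect answer.

    Skipped items are simply not counted, so they contribute zero.
    """
    return n_correct - penalty * n_incorrect


# Hypothetical response counts, for illustration only.
crt_score = penalized_score(60, 8)                # CRT/CCT rule: correct minus incorrect
pft_score = penalized_score(12, 4, penalty=0.25)  # PFT rule: correct minus one fourth of incorrect
print(crt_score, pft_score)
```

Under the PFT rule, four wrong guesses cancel one correct answer, which is why respondents are told in advance that guessing is not to their advantage.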
The Perspective Taking/SOT
The SOT assesses a respondent’s ability to imagine different perspectives or orientations in space on the basis of 12 questions. In this test, the respondent is asked to orient himself or herself among a set of seven objects (a house, a stop sign, a car, a flower, a cat, a traffic light, and a tree) spaced out on a page. He or she has to imagine standing at a given object (Position 1) and facing another object (Position 2), and is then asked to indicate the relative position of a third object (Position 3). For example, imagine that you are standing at the flower (Position 1) and facing the tree (Position 2) and you are asked to point at the cat (Position 3). The respondent is asked to represent Positions 1, 2, and 3 relatively on a circle. Given the age profile of the respondents, we used a simpler marking scheme than that suggested by Hegarty and Waller (2004), which relies on the precise angular measurement of the positions. We marked an answer as correct when it fell within 15° on either side of the exact location. Thus, in this new marking scheme, the minimum score is 0 and the maximum score is 12, as there are 12 questions in the test.
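The simplified SOT marking scheme can be sketched as a tolerance check on the angular error. The sketch below assumes that responses and correct directions are recorded in degrees on the circle and that an error of exactly 15° counts as correct; both are assumptions, as are the function names and the sample angles.

```python
def angular_error(response_deg: float, correct_deg: float) -> float:
    """Smallest absolute difference between two directions, in degrees (0 to 180)."""
    diff = abs(response_deg - correct_deg) % 360.0
    return min(diff, 360.0 - diff)


def sot_score(responses, answers, tolerance_deg: float = 15.0) -> int:
    """Count responses that fall within the tolerance on either side of the answer."""
    return sum(
        1 for r, a in zip(responses, answers)
        if angular_error(r, a) <= tolerance_deg  # boundary treated as correct (assumption)
    )


# Hypothetical angles: 350 vs 5 differs by 15 degrees across the 0/360 wrap-around.
print(angular_error(350, 5))
print(sot_score([350, 10, 100], [5, 40, 100]))
```

Taking the difference modulo 360 matters here because directions on a circle wrap around; without it, a response of 350° to a target at 5° would be treated as 345° off rather than 15°.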
Instrument Administration and Scoring
The SRI and four established psychological measures (CRT, CCT, PFT, and SOT) were administered by the research team and classroom teachers in two parts. First, the SRI was administered to the 16 classes in the sample. The students had 45 min to complete the instrument. The sub-sample identified for the second stage was then administered the additional testing and answered either the CRT/CCT, the PFT, or the SOT. These participants were given a break of approximately 20 to 30 min between administration of the SRI and the additional instrument. This took place over a 2-week period approximately halfway through the school year.
It was decided to ask the respondents to answer the SRI and only one of the additional instruments so as not to overburden them with the spatial tasks. Given the age range of the students, it was necessary to give additional instructions (e.g., verbally explaining the instructions) and time for the students to complete the four instruments.
Test duration
As mentioned earlier, the SRI is not predominantly a speed test. Students were given 45 min to complete the 45 items. Based on a sample of 200 students, the mean time taken to complete the 45 items was 26 min, with a minimum of 13 min and a maximum of 45 min.
Answer format and scoring
The items are presented in multiple-choice format with 42 of the 45 items involving four answer choices and three items offering two choices. The three items with only two answer choices were designed to test students’ understanding of left and right and clockwise and anticlockwise and as such, only two options were appropriate. A correct item was scored as 1 and an incorrect item as 0. Thus, for the first stage of analysis, the minimum score was 0 and the maximum was 45 and for the second stage of analysis and for the final 30-item instrument, the minimum score was 0 and the maximum was 30.
Given that the instrument is based on three constructs, the following scores were computed for each construct. The mental rotation score (MRSCORE), spatial orientation score (SORSCORE), and spatial visualization score (SVSCORE) were computed by averaging each set of 10 items from the respective constructs. The TOTALSCORE, which gives a measure of spatial ability, is obtained by summing the MRSCORE, SORSCORE, and SVSCORE.
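The score computation described above can be sketched as follows, assuming each construct's responses are stored as ten 0/1 values. The function name and the sample responses are hypothetical. The sketch also reports the raw number correct out of 30, which corresponds to the 0 to 30 range given for the final instrument.

```python
def sri_scores(mr_items, sor_items, sv_items):
    """Compute construct scores (mean of 0/1 item scores) and their sum."""
    def average(items):
        return sum(items) / len(items)

    scores = {
        "MRSCORE": average(mr_items),
        "SORSCORE": average(sor_items),
        "SVSCORE": average(sv_items),
    }
    # TOTALSCORE is the sum of the three construct scores, as described in the text.
    scores["TOTALSCORE"] = scores["MRSCORE"] + scores["SORSCORE"] + scores["SVSCORE"]
    # Raw count of correct answers across all 30 items (0 to 30).
    scores["RAW_CORRECT"] = sum(mr_items) + sum(sor_items) + sum(sv_items)
    return scores


# Hypothetical responses for one student (1 = correct, 0 = incorrect).
mr = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
sor = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
sv = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
print(sri_scores(mr, sor, sv))
```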
Results
Phase 1
Factor analyses
We purposively devised 15 items for each of the three constructs, with the aim of retaining the 10 items that best capture each construct. The objective was to obtain a measure of each of the three constructs as well as a total measure of spatial ability. To reduce the number of items, we conducted three separate exploratory factor analyses, one for each set of 15 items. These analyses were based on polychoric correlations computed in the R software (R Core Team, 2013), which was chosen for its flexibility in working with dichotomous data (Li, 2008). Items with a factor loading lower than 0.4 were eliminated from the respective scales, which resulted in revised 10-item scales for each construct. Factor loadings for the final 10-item constructs are presented in Table 3. The proportion of variance explained for each construct was 35% for mental rotation, 34% for spatial orientation, and 28% for spatial visualization.
Table 3. Factor Loadings Based on Final 10 Items.
Note. MR = mental rotation; SOR = spatial orientation; SV = spatial visualization.
Table 4 presents the proportion of variance explained and the reliability of the three separate measures based on the 10 items. Although the three individual scales do not have especially high reliability, the 30 items as a whole have a Cronbach’s alpha value of .845. Henceforth, all analyses are presented based on the 30 items.
Table 4. Descriptive Statistics, Proportion Variance, and Reliability.
Note. MRSCORE = mental rotation score; SORSCORE = spatial orientation score; SVSCORE = spatial visualization score.
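For readers who want to reproduce a reliability coefficient of this kind, the following is a minimal Python sketch of Cronbach’s alpha for a matrix of 0/1 item scores. It is not the authors’ code (the analyses in this article were run in R), and the function name `cronbach_alpha` is an assumption made for illustration.

```python
from statistics import pvariance


def cronbach_alpha(data):
    """Cronbach's alpha for a list of rows (one per student) of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(data[0])  # number of items
    item_vars = [pvariance([row[j] for row in data]) for j in range(k)]
    total_var = pvariance([sum(row) for row in data])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data: 4 students by 3 items (illustrative only, not SRI data).
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
alpha = cronbach_alpha(toy)
```

Because the same variance estimator is used for the items and for the total, the choice between population and sample variance does not change the result. For dichotomous items, this coefficient coincides with KR-20.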
Item difficulty
Table 5 presents the item difficulty and item discrimination indices based on classical test theory analysis. The difficulty level for mental rotation ranges from 0.28 to 0.84 while that for spatial visualization ranges from 0.35 to 0.85. The relatively high item difficulty values (0.60 to 0.92) for spatial orientation suggest that these items were found to be relatively easy. The discrimination indices ranged from 0.27 to 0.48, all indicating medium or high values of discrimination.
Table 5. Item Difficulty (p) and Item Discrimination (D) Based on Classical Test Theory.
Note. MR = mental rotation; SOR = spatial orientation; SV = spatial visualization.
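The two classical indices can be sketched in Python as follows. The difficulty index p is the proportion of students answering an item correctly. The article does not state how D was computed, so this sketch assumes the common upper-lower group method, using the top and bottom 27% of students by total score. The function name `item_indices` and the `frac` parameter are hypothetical.

```python
def item_indices(data, frac=0.27):
    """Return a list of (p, D) pairs, one per item.

    data: list of rows (one per student) of 0/1 item scores
    frac: share of students in each extreme group (assumed to be 27%)
    """
    n = len(data)
    k = len(data[0])
    g = max(1, round(n * frac))      # size of the upper and lower groups
    ranked = sorted(data, key=sum)   # ascending by total score
    lower, upper = ranked[:g], ranked[-g:]
    results = []
    for j in range(k):
        # Difficulty: proportion correct in the whole sample (higher = easier).
        p = sum(row[j] for row in data) / n
        # Discrimination: proportion correct in upper group minus lower group.
        d = (sum(row[j] for row in upper) - sum(row[j] for row in lower)) / g
        results.append((p, d))
    return results
```

Under this convention, a high p value marks an easy item, which is why the spatial orientation range of 0.60 to 0.92 is read as indicating relatively easy items.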
Rasch analysis
At this stage of the analysis, we chose to use Rasch model analysis to understand the functioning of the items and respondents simultaneously, a process that could not be achieved with confirmatory factor analysis (Chang & Engelhard, 2016). In addition, the Rasch analysis allowed us to explore item difficulty, which we felt was necessary to determine the usefulness of the instrument across grade levels. Analyses were conducted on each of the three constructs to establish a sense of item quality in relation to infit and outfit measures. The item difficulty estimates and corresponding standard errors are presented in Table 6. The magnitude of the estimates indicates the level of difficulty of the items, with negative values suggesting easier items and positive values highlighting more difficult items. We also provide the infit and outfit indices, which show how accurately the data fit the Rasch model. An item with a fit statistic > 1.4 may not contribute to the underlying trait as well as the other items in the scale. An item with a fit statistic < 0.6 may be redundant (Bond & Fox, 2007; Wright, Linacre, Gustafson, & Martin-Lof, 1994). The values in Table 6 meet the item fit criteria, suggesting that each item contributes to the understanding of the total construct. Another indicator of item fit is the t-statistic, with values in the interval −2 to 2 suggesting good fit. All items apart from MR7 and MR43 meet this criterion.
Table 6. Item Difficulty Based on Rasch Analysis.
Note. MSQ = mean square statistics; MR = mental rotation; SOR = spatial orientation; SV = spatial visualization.
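To make the mean square (MSQ) fit statistics concrete, the sketch below computes infit and outfit for a single item from already estimated person abilities and an item difficulty, both in logits. It assumes the usual dichotomous Rasch parameterization, in which the probability of success depends on ability minus difficulty. The estimation step is omitted, and the names `rasch_item_fit` and `flag_fit` are hypothetical.

```python
import math


def rasch_item_fit(responses, abilities, difficulty):
    """Infit and outfit mean squares for one item.

    responses:  0/1 scores of each person on the item
    abilities:  estimated person measures (logits), same order as responses
    difficulty: estimated item difficulty (logits)
    """
    sum_sq = sum_var = sum_z_sq = 0.0
    for x, theta in zip(responses, abilities):
        p = 1.0 / (1.0 + math.exp(-(theta - difficulty)))  # expected score
        var = p * (1.0 - p)                                 # model variance
        sq_resid = (x - p) ** 2
        sum_sq += sq_resid
        sum_var += var
        sum_z_sq += sq_resid / var        # squared standardized residual
    infit = sum_sq / sum_var              # information-weighted mean square
    outfit = sum_z_sq / len(responses)    # unweighted mean square
    return infit, outfit


def flag_fit(msq, low=0.6, high=1.4):
    """Apply the 0.6 to 1.4 criterion quoted in the text."""
    if msq > high:
        return "misfit"
    if msq < low:
        return "possibly redundant"
    return "ok"
```

Values near 1 indicate that the observed responses vary about as much as the model predicts. Outfit is more sensitive to unexpected responses from persons far from the item's difficulty, whereas infit weights responses from persons near it more heavily.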
Test norms
Norms for SRI were computed based on the percentiles for the three constructs (mental rotation, spatial orientation, and spatial visualization) as well as the total score (see Table 7).
Table 7. Test Norms Based on Percentiles.
Note. MRSCORE = mental rotation score; SORSCORE = spatial orientation score; SVSCORE = spatial visualization score.
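A percentile norm table of this kind can be built from raw scores in a few lines of Python. The sketch assumes one common definition of percentile rank, the percentage of the norm sample scoring at or below a given score. The article does not specify which definition was used, and the function names are hypothetical.

```python
def percentile_rank(scores, value):
    """Percentage of the norm sample scoring at or below `value`."""
    return 100.0 * sum(1 for s in scores if s <= value) / len(scores)


def norm_table(scores, max_score):
    """Map every possible raw score (0..max_score) to its percentile rank."""
    return {v: percentile_rank(scores, v) for v in range(max_score + 1)}


# Toy construct scores on a 0-10 scale (illustrative only, not SRI data).
toy_scores = [2, 4, 4, 7, 10]
table = norm_table(toy_scores, 10)
```

The same function would be applied separately to MRSCORE, SORSCORE, SVSCORE (maximum 10 each), and TOTALSCORE (maximum 30).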
Phase 2
Comparison of SRI with existing spatial reasoning instruments
This analysis was undertaken to determine the extent to which the three constructs of our combined SRI related to existing well-regarded instruments in the literature. This analysis allowed us to evaluate the extent to which the respective components of our instrument aligned with those instruments typically used in psychology and in education. Thus, to show the comparability of SRI with existing instruments (PFT, CCT, CRT, and SOT), which measure relatively similar constructs, both correlations and the person separation reliability were computed.
The correlations of each of the three dimensions of SRI, as well as the total score on the 30 items, with the existing instruments (CRT, CCT, SOT, and PFT) are presented in Table 8. The correlations are significant and range from .33 to .62. This gives credibility to the construct validity of the instrument.
Table 8. Correlations Between Measures.
Note. CCT = Cube Comparison Test; MRSCORE = mental rotation score; SORSCORE = spatial orientation score; SVSCORE = spatial visualization score; SOT = Spatial Orientation Test; PFT = Paper Folding Test.
p < .01.
The person separation reliability (an index comparable with Cronbach’s alpha) gives an indication of the extent to which the items in a test are able to separate the people in a sample. SRI has comparable separation reliabilities to the existing instruments for the mental rotation and spatial visualization constructs (see Table 9). For instance, with only 10 mental rotation items, the SRI compared reasonably well with the 80-item CRT. We attribute the lower separation reliability for the spatial orientation dimension of SRI (compared with SOT) to the relative ease with which the participants answered these items (see Table 5).
Table 9. Comparison of Separation Reliabilities With Existing Instruments.
Note. SRI = Spatial Reasoning Instrument; MR = mental rotation; CCT = Cube Comparison Test; CRT = Card Rotation Test; SOR = spatial orientation; SOT = Spatial Orientation Test; SV = spatial visualization; PFT = Paper Folding Test.
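As an illustration of the index, person separation reliability is commonly defined as the share of observed variance in the person measures that is not attributable to measurement error. The Python sketch below follows that definition, using population variance as an assumption. It takes person measures and their standard errors as given, and the function name is hypothetical.

```python
from statistics import mean, pvariance


def person_separation_reliability(measures, std_errors):
    """(observed variance - mean error variance) / observed variance.

    measures:   estimated person measures (logits)
    std_errors: standard error of each person measure
    """
    observed_var = pvariance(measures)
    error_var = mean(se ** 2 for se in std_errors)
    return (observed_var - error_var) / observed_var


# Toy values (illustrative only, not SRI data).
r = person_separation_reliability([-1.0, 0.0, 1.0, 2.0], [0.5, 0.5, 0.5, 0.5])
```

When items are easy for most respondents, person measures bunch together and their standard errors grow, both of which lower this ratio. That is in keeping with the explanation offered for the spatial orientation scale.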
Discussion and Conclusion
The SRI is designed to be used for two main purposes, namely, as a measure of spatial reasoning and as a research tool. First, it provides a measure of spatial reasoning ability for students in the age range 11 to 13 years. As mentioned earlier, the items in SRI are drawn from different contexts and are aligned with the representations and experiences of school students (see, for example, Table 1). We maintain that the SRI presents items that are much more familiar to middle school students than is the case with existing psychological tests. Most of the items relate to mathematics concepts that would typically be classified as geometry and/or measurement knowledge in the mathematics curricula and frameworks of most countries. As a consequence, the SRI has added utility as a measure of specific aspects of mathematics knowledge. Second, it can be used as a research tool, both as a pre- and posttest measure and in correlational studies of mathematics achievement. It can be used to screen respondents with difficulties in spatial reasoning, and it provides three different spatial measures: mental rotation, spatial orientation, and spatial visualization.
The SRI fills a gap in the spatial ability research literature by providing a scale or metric to assess the spatial maneuvers of middle school students (11- to 13-year-olds). The instrument is aligned with the curricular experiences of middle school students, in contrast to existing instruments that have a more psychological orientation. The test is intended to measure cognitive functioning in terms of spatial manipulations as well as to serve pedagogical purposes. The instrument is designed on a theoretical foundation spanning three established dimensions of spatial reasoning, namely, mental rotation, spatial orientation, and spatial visualization. Thus, it provides three separate measures of spatial ability. Consequently, SRI offers the advantage of providing multiple measures in one administration of the test. This adds to the utility of the SRI, because it is unreasonable to expect classroom teachers to organize the implementation of three separate tests. Furthermore, compared with other existing instruments (e.g., PFT, CRT, and CCT), it assesses the respondent’s spatial ability on a relatively broader range of tasks within each construct and as such has more face validity.
The instrument exhibits sound psychometric properties in terms of reliability and validity. Evidence for the validity of the instrument was obtained from three types of information: (a) exploratory factor analysis of the individual scales (mental rotation, spatial orientation, and spatial visualization), (b) Rasch analysis of item quality within the respective constructs, and (c) correlations and person separation reliability with existing spatial reasoning instruments. The results suggest that the reliability of the 30-item instrument is high.
As with any measuring instrument, the validity of the SRI scale is subject to the theoretical foundation that underpins it. We have chosen to design the instrument on the basis of three constructs. It is possible that such a focus may have overlooked other dimensions of spatial ability. We also acknowledge that the spatial maneuvers hypothesized in the design of the tasks may not always match the ones used by students. A task theoretically designed to measure spatial orientation may be solved by mental rotation, depending on how the respondent views the problem. This type of disparity may occur given the knotted relationship among the spatial constructs, although we expect such cases to be minimal. At this stage of our project, we see the need for further characterization of spatial visualization (as it occurs in the experiences of the school curriculum) given its lack of specificity as a spatial construct among researchers. Furthermore, as more validation work is carried out, from both quantitative and qualitative perspectives, it will be possible to establish the test–retest reliability and stability of the SRI scale over time.
It is envisaged that the SRI will be significantly helpful for researchers working with middle school children, especially those who require a spatial metric. Compared with the tests designed in the 1970s, SRI brings a novel perspective in spatial ability measurement, consistent with contemporary demands. It is hoped that the SRI will be useful for researchers in both education and psychology.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
