STEM Interest Complexity Inventory Short Form With IRT and DIF Applications

Abstract

The 127-item Science, Technology, Engineering, Mathematics (STEM) Interest Complexity Inventory and 15-item General STEM Interests Scale, each of which were previously developed to assess interests toward increasingly complex tasks, were shortened to 37-item and 12-item measures. Item response theory analyses employed on the data of 930 students in STEM majors indicated items with higher discrimination parameters and equivalent functioning across genders. The short form (SF) supported a four-factor structure of interests toward interacting with numerical data, symbolic data, spatial data, and STEM-related ideas. Concurrent criterion–related validation was supported with relevant vocational fit criteria. Hierarchical regression analysis revealed that STEM interest complexity added incremental variance over achievement motivation and test anxiety in predicting fit. Measurement invariance was demonstrated across samples from Turkey and the United States. The STEM Interest Complexity Inventory SF is a valid measure of vocational interests for research at the college level. Validities with high school and working samples are yet to be demonstrated.

Keywords

STEM vocational interests interest complexity occupational complexity item response theory differential item functioning

Vocational interest assessment is an area that is continually improving. Researchers are introducing new models (e.g., Armstrong & Rounds, 2010; Toker & Ackerman, 2012; Tracey, 2002) and are working on improving existing assessment methods through psychometric theory (e.g., Einarsdottir & Rounds, 2009; Tay, Drasgow, Rounds, & Williams, 2009) to enhance prediction and reduce measurement bias. Our aim with the present study was to contribute to this growing body of literature by introducing a practically useful version of a vocational interest assessment that was developed by taking account of differing levels of occupational complexity, pertaining to investigative and realistic work environments (Holland, 1997). The measure was developed and initially validated as a 127-item instrument (Toker & Ackerman, 2012). In the present study, item response theory (IRT) was applied to obtain a shorter and valid instrument that could be easily utilized in research and application.

The Science, Technology, Engineering, Mathematics (STEM) Interest Complexity Inventory was developed to fill a gap in the literature that has to do with assessing interests toward work that differ in their complexity levels. STEM occupations differ in their level of task complexity and are included in the realistic and investigative work environments together with very low-complexity occupations in the current vocational assessment inventories. Thus, identifying an individual’s direction of vocational interest does not tell us anything about potential fit for occupational work with varying levels of complexity.

The development of the STEM Interest Complexity Inventory is described in Toker and Ackerman (2012) in detail. Items were formed based on occupational databases including the occupational aptitudes map (Gottfredson, 1986), the Dictionary of Occupational Titles (DOT; U.S. Department of Labor, 1991), and the Occupational Information Network (O*NET, 2007). Four content factors of numeric, symbolic, and spatial interests and interests toward STEM-related ideas were formed as such content was shared across relevant occupations. Data complexity levels, as described in the DOT, were used to identify low-, moderate-, and high-complexity level items. Accordingly, within realistic and investigative environments, low-complexity occupations (e.g., machine operator) included work tasks to do with “copying,” “comparing,” and “computing”; moderate-complexity occupations (e.g., quality-control technician) included work tasks to do with “compiling data” and “analyzing”; and high-complexity occupations (e.g., engineering) included tasks to do with “synthesizing” and “generating new ideas or products.” Scale validation included two consecutive studies with students at a technical college in the Southeast region of the United States (total N = 691, Toker & Ackerman, 2012). Construct validity was shown based on converging associations with the relevant Holland interests, self-concept variables, and cognitive abilities. Factorial structure was supported with bifactor confirmatory factor analysis (CFA), modeling both the general factor and four content factors and another CFA model in which each indicator scale (e.g., numeric moderate-complexity interests) loaded on its respective content and complexity factors. Criterion-related validity was shown with positive moderate associations with vocational fit criteria, ranging from .31 to .39 and based on incremental variance over traditional interest assessments.

A shorter version of the inventory was in order for practical use of it. With large samples, IRT offers a psychometrically sound method to refine a scale through identifying items that do not discriminate well enough between latent trait levels and items that function differentially across groups. Differential item functioning (DIF) across genders have been shown to exist in some interest inventories that are currently in use (e.g., Einarsdottir & Rounds, 2009). Since men and women have consistent mean differences across STEM-related vocational interests (Inda, Rodriguez, & Peňa, 2013; Su, Rounds, & Armstrong, 2009), it is imperative that we have an instrument without measurement bias across genders.

IRT and DIF Use in Scale Refinement

Researchers have utilized IRT techniques to produce shorter versions of measures (e.g., Betancourt, Yang, Bolton, & Normand, 2014; Gültas, 2014; Hays, Morales, & Reise, 2000). Scale refinement based on retaining items according to their discrimination parameters does not necessarily increase the validity of the measure nevertheless; a shorter scale can be produced by preserving validities (e.g., Gültas, 2014).

In the present study, a two-parameter graded response model (GRM; Samejima, 1997) is used to identify discrimination parameters. Higher discrimination indicates that the item can better differentiate across the trait range. DIF was applied using the R lordif package version 0.3-3 (Choi, Gibbons, & Crane, 2011) based on GRM within the R statistical environment (R Core Team, 2016). The package employs ordinal logistic regressions for polytomous items (Miller & Spray, 1993; Zumbo, 1999). DIF detection is based on model comparisons. In identifying the ordinal logistic regression probabilities of responding to a scale point, Model 1 includes trait estimates as the predictor. A grouping variable (gender in this case) is added to the predictors of the regression equation in Model 2, and whether or not there is a difference across groups along the trait continuum (in this case STEM interests) is evaluated. For each item of the scale, a statistically significant χ² difference between Model 2 and Model 1 at the .01 level indicates probable differential functioning. In other words, even though men and women have the same true trait levels, their estimated trait scores appear different. In Model 3, an interaction, that is the product of the trait estimate and grouping variable, is added. A significant χ² difference between Model 3 and Model 2 indicates that the direction of the difference observed between groups changes as the trait estimates change.

Validation of the STEM Interest Complexity Inventory Short Form (SF)

Once the SF of the Inventory was produced, construct validation was studied based on factor analyses. Criterion-related validation was studied concurrently with variables typically employed in the validation of interest inventories such as vocational area satisfaction, persistence intentions, and performance. In addition, motivational variables other than vocational interests were included in the test of validation in order to see the additional contribution of vocational interests as individual differences in different domains have been shown to jointly contribute to vocational outcomes (e.g., Ackerman, 1997; Kanfer, Wolf, Kantrowitz, & Ackerman, 2010; Lubinski, 2000). The personality variable of achievement motivation and the affective state of test anxiety are motivational variables that have been repeatedly shown to be related to academic and career success (e.g., James, 1998; Kanfer et al., 2010). Accordingly, the study hypotheses were as follows:
Hypothesis 1: The STEM Interest Complexity Inventory SF will have a four content factor structure indicating numerical interests, symbolic interests, spatial interests, and interests toward STEM ideas.

Hypothesis 2: The STEM Interest Complexity Inventory SF will have moderate associations with STEM-related vocational fit criteria including STEM area satisfaction and persistence intentions and with academic success in STEM.

Hypothesis 3: The STEM Interest Complexity Inventory SF will explain unique variance in the prediction of STEM satisfaction, persistence, and intentions to pursue a complex STEM career above achievement motivation with moderate effect sizes.

Hypothesis 4: The STEM Interest Complexity Inventory SF will explain unique variance when analyzed together with achievement motivation and test anxiety in the prediction of STEM-related cumulative grade point average (CGPA), mathematics, and physics grades, with small effect sizes.

Method

Sample and Procedure

The study was announced on bulletin boards of five university campuses in Ankara and Istanbul, Turkey. In identifying the universities for data collection, those that have numerous STEM departments and were easily accessible by the researchers were given priority. In addition, attention was paid to the University Entrance Examination minimum scores necessary to be admitted into these universities, such that the minimum scores would span a wide range. Approvals of the institutional review boards of the universities were obtained prior to data collection.

Participants responded to questionnaires through the most up to date version of Qualtrics in 2015, which included demographic information questions, scales of interest complexity, achievement motivation, test anxiety, academic domain satisfaction, and intentions to persist in STEM. Participants provided written permission that allowed the researchers to reach their course grades over the online database of the University Registrar. Upon completion of the study, participants were reimbursed either with extra course credit or with monetary compensation.

Undergraduate students enrolled in STEM departments formed the sample. Data from the online survey system showed that participants had responded to the informed consent 2,484 times either accepting or declining participation. Of that sum, 1,319 completed the survey fully or partially. Raw data of the remaining participants were examined carefully, and responses that did not meet predetermined criteria were removed. For a participant’s data to be included in any further analysis, she or he must have responded to 70% or more of the survey questions (to ensure that all sections of the focal STEM Interest Complexity Measure Scales were answered), must not have completed the survey in less than 10 min, must have correctly answered at least three of the six control questions (e.g., answer this question as 4 = is somewhat true of me), and should not have consistently selected the same values in a Likert-type question sequence that also contains several reverse-coded items (i.e., answering 20 or more questions as 5 = is very true of me). In addition, the responses of participants who completed the survey twice or more were removed from the data set.

Data of 930 students remained for IRT analyses. Of this final total, 595 (64%) were men, and 335 (36%) were women. The ratio of women in the data set was higher than the percentage of women (i.e., 28%) reported to be enrolled in engineering departments according to the 2005 statistics of the Higher Education Council (HEC) of Turkey (Smith & Dengiz, 2009). The representation of men in engineering departments was higher compared to women, and the representation of women in science departments was higher compared to men.

Of the participating students, 638 (68.6%) of them were enrolled in a public technical university in Ankara, 224 (24.1%) of them were enrolled in a private university in Istanbul, 43 (4.6%) of them were enrolled in another public university in Ankara, and 25 (2.7%) of them were enrolled in two other private universities in Ankara. In terms of duration of education, 168 (18.1%) were freshman, 276 (29.7%) were sophomore, 210 (22.6%) were junior, 176 (18.9%) were senior, 62 (6.7%) of them were in their fifth year, and 28 (2.8%) of them were in their sixth or higher years of education. CGPA of STEM courses were calculated for 311 participants.

Measures

Demographic information form

Participants responded to questions related to their gender, department, scores on a nationwide quantitative reasoning test, and CGPA. In addition, to be able to calculate CGPA of STEM-related courses, participants’ consent to reach their transcripts from the registrar’s online system was taken. Academic transcripts were obtained only from the technical university students in Ankara; thus, grade conversion across universities was not an issue.

STEM Interest Complexity Inventory

The measure was developed (Toker, 2010; Toker & Ackerman, 2012) based on activities relating to occupations with varying levels of complexity under Holland’s (1997) realistic and investigative work environments. Items were designed to assess interests toward work activities in STEM occupations that necessitate interacting with numerical (25 items), symbolic (30 items), and spatial data (22 items) and with STEM-related ideas (30 items). The inventory has 127 items. In each one of the subscales, there are items assessing low-, medium-, and high-level complexity in its respective domain. Additionally, a general STEM Interest Complexity Scale with 15 items was used, which in order of increased complexity measures (1) getting the general idea, without going into technical jargon or detail; (2) acquiring specialized knowledge, without learning about the empirical studies that form the basis for the knowledge; (3) following the empirical literature; (4) critically evaluating the empirical literature; and (5) formulating ideas to investigate. Items were rated on a 6-point Likert-type scale ranging from 1 = not at all true of me to 6 = very true of me. Reliability and validity evidence with U.S. college students are presented in Toker and Ackerman (2012). Internal consistency reliabilities ranged from .93 to .97. Translation of the measure to Turkish was undertaken by the developer of the scale who is a bilingual in Turkish and English and an industrial/organizational psychologist. Back-translation was undertaken by another bilingual with a degree in business administration. The two translators compared the original English items with back-translated items for conceptual equivalence. Necessary revisions were undertaken to maintain conceptual equivalence of the Turkish form.

International Personality Item Pool (IPIP) Achievement Motivation Scale

The scale, a conscientiousness facet in the IPIP (Goldberg et al., 2006), has 10 items, rated on a 6-point scale ranging from 1 = not at all true of me to 6 = very true of me. Reliability and validity information are provided by Goldberg et al. (2006). Translation to Turkish, back-translation, and evaluation of the translation were undertaken by the bilingual individuals.

Test Anxiety Inventory

The scale includes 20 items from Mandler and Sarason’s (1952) original inventory translated and adapted to Turkish by Öner (1990). Cronbach’s α coefficient was reported to be .89. In its validity study, correlations as high as .43 were reported between the inventory and mathematics grades and also CGPA. Items were rated on a 4-point Likert-type scale ranging from 1 = almost never to 4 = almost always.

STEM persistence intentions

The scale, developed by Toker (Toker, 2010; Toker & Ackerman, 2012), measures intentions to obtain an STEM bachelor’s degree and to pursue a graduate education and career in an STEM field. Ten items are rated on a 6-point scale ranging from 1 = not at all true of me to 6 = very true of me. Internal consistency reliabilities of the three dimensions ranged from .79 to .88. Construct validity was demonstrated with STEM interest complexity, realistic and investigative interests, self-concept in mathematics and sciences, and STEM-course CGPA, with correlations ranging from .20 to .34 (Toker, 2010).

Academic Domain Satisfaction Scale

This Likert-type scale, developed by Lent et al. (2005), measures the level of satisfaction students get from the experiences in their departments and area of specialization (i.e., classes they are enrolled in, the level of intellectual stimulation). Lent et al. reported an internal consistency reliability of .86 and showed moderate-to-high associations with engineering self-efficacy, self-efficacy in overcoming obstacles in an engineering major, and outcome expectancy in engineering.

Vertical career intentions measure

Developed by Toker (Toker, 2010; Toker & Ackerman, 2012), the measure lists three of the most complex STEM clusters based on the ordering of occupations from highly complex ones to less complex ones (Gottfredson, 1986). The highest complexity cluster includes engineering occupations, the moderate-to-high complexity cluster includes occupations such as that of a quality-control technician, and the low-to-moderate complexity cluster includes occupations such as that of a machine operator. In each of the clusters presented to participants, names of occupations were listed along with the required tasks, activities, skills, and abilities. Participants were asked whether they would want to work in one of the jobs listed in the presented clusters. They were instructed to assume that these occupational clusters were equivalent in terms of status and pay. In order to check whether or not participants perceived the clusters to indeed vary in their level of complexity, they were asked the question “in your opinion, how cognitively demanding are the tasks involved in the occupations of this cluster?” to be rated on a 6-point Likert-type scale ranging from 1 = not cognitively demanding at all to 6 = very cognitively demanding. It was previously observed that perceptions of the level of cognitive demand required by the clusters increased as the complexity of the occupational clusters increased (Toker & Ackerman, 2012). The same results were observed in the current data set.

Results

IRT analyses were performed with the two-parameter GRM using EQSIRT. Unidimensional models provide more reliable parameter estimates (e.g., Edelen & Reeve, 2007; Embretson & Reise, 2000), and we wanted to show that we satisfy the basic unidimensionality assumption. Before conducting the unidimensional IRT analyses, each content area was analyzed for its factor structure to see if the data would yield a multifactor structure based on complexity levels. The numeric, symbolic, spatial, and the general interest scales each had one eigenvalue surpassing 1 based on the calculation of principal axis eigenvalues. Parallel analysis supported that one factor clearly surpassed random data generated eigenvalues. Only the STEM-Ideas Scale yielded a two-factor structure; however, the factors did not lend themselves well to a distinction based on complexity levels. Moreover, the first factor explained a much greater portion of variance in the data, with eigenvalues of 13.8 and 1.5, respectively. Thus, in the following IRT analyses, all three-complexity level items were included together under each content factor in order to identify which complexity level had better discrimination parameters.

One noteworthy finding was that the discrimination parameters and factor loadings of the reverse-coded items were much lower than those of nonreverse-coded items. Discrimination parameters ranged from .26 to .98 for the reverse-coded items (median = 0.81) and they ranged from .48 to 3.27 (median = 2.02) for the nonreverse-coded items. Factor loadings ranged from .15 to .50 for the reverse-coded items (median = 0.43), whereas they ranged from .27 to .90 for the nonreverse-coded items (median = 0.72). Mean, median, and ranges for the discrimination parameters and factor loadings are presented in Table 1 based on content and complexity factors. Reverse-coded items were not included in the calculation of these descriptive statistics; they are presented separately. Reverse-coded items were eliminated from the measure.

Table 1.
Item Discrimination and Threshold Parameters and Factor Loadings Based on Content/Complexity Scales.

Scale N ^a Discrimination Parameter Factor Loadings

Range Mean Median Range Mean Median

Numeric

Low 5 0.484–2.020 1.035 0.755 .274–.765 .475 .406

Moderate 12 1.083–3.273 2.010 1.884 .537–.887 .734 .741

High 5 0.787–3.263 2.369 2.567 .420–.887 .758 .834

Symbolic

Low 4 1.598–1.978 1.804 1.820 .685–.758 .726 .730

Moderate 14 1.601–2.722 2.290 2.340 .686–.848 .798 .809

High 9 1.728–2.528 2.039 2.008 .713–.830 .765 .763

Spatial

Low 4 0.832–1.671 1.167 1.082 .440–.701 .553 .536

Moderate 8 1.262–3.503 2.070 1.883 .596–.900 .741 .742

High 8 1.358–3.152 2.430 2.641 .624–.880 .803 .841

STEM ideas

Low 1 2.492 — — .826 — —

Moderate 10 1.613–2.521 2.102 2.034 .697–.829 .771 .767

High 15 1.681–2.584 2.110 2.118 .703–.835 .774 .780

General STEM

Low 2 1.242–1.620 1.430 — .590–.690 .640 —

Moderate 9 1.928–3.114 2.611 2.526 .750–.878 .833 .830

High 4 2.629–3.127 2.850 2.821 .840–.879 .858 .856

Reverse-coded items 12 0.260–0.982 0.810 0.807 .151–.500 .429 .429

Note. Low = low-complexity scale; moderate = moderate-complexity scale; high = high-complexity scale.

^aDescriptive statistics for the reverse-coded items are given separately and are not included in the low-, moderate-, and high-complexity categories.

All low-complexity items, with the exception of one STEM-ideas factor item, have relatively lower discrimination parameters (ranging from .48 to 2.02) and lower factor loadings (ranging from .27 to .76). Discrimination parameters are observed to increase as the complexity level of factors increases; means of the discrimination parameter medians for the low-complexity factors were 1.22, for the moderate-complexity factors, it was 2.15, and for the high-complexity factors, it was 2.43.

Item discrimination parameters and factor loadings were taken into consideration while determining which items would be retained in the SF of the STEM Interest Complexity measure. Most items had discrimination parameters higher than 2.00 as determined with the logistic metric. Most items had factor loadings higher than .80. To come up with a shorter version of the measure, cutoff values of 2.00 and .80 for discrimination and factor loadings, respectively, were set. Following these criteria, all low-complexity items were dropped from the measure, except for one STEM-ideas factor item (i.e., “I enjoy comparing and contrasting two perspectives on a topic related to STEM areas”). Several moderate- and high-complexity items were also dropped. In addition, dropping items that displayed DIF across men and women were considered.

For each of the scales, DIF was tested using the R lordif package (Choi et al., 2011). According to model comparison χ² difference tests, only 1 item (of 25) in the numerical interests scale, 14 items (of 30) in the symbolic interests scale, 7 items (of 22) in the spatial interests scale, 3 items (of 30) in the STEM-Ideas Scale, and finally, 2 items (of 15) in the general STEM interests scale were found to display DIF across genders. Effect sizes of model comparison results were also considered (see Choi et al., 2011) while taking out items from the measure since the probability of committing Type 1 error increases when performing χ² tests with large samples. For instance, 3 items with statistically significant DIF were retained, as adding the gender grouping variable to the model over the interest complexity trait estimate did not explain even over 1% additional variance. The symbolic high complexity item “I enjoy thinking in terms of abstractions and formulas while trying to develop new ideas” had a significant χ² difference when Model 2 was compared with Model 1. Nevertheless, the effect sizes of the difference in terms of ▵R² and ▵β were below .01. Such items were retained also because gender did not interact with interest complexity trait estimates and because the discrimination parameters and factor loadings were above the criteria.

Items with significant interaction effects, indicating that different trait estimates are required across men and women for choosing a given response category, were dropped (e.g., “I would enjoy thinking about how several three-dimensional systems could be combined in the most optimal way for a better working system”). In this example, the item function changes across trait levels as indicated with a significant difference between Model 2 and Model 3, P( $χ_{23}^{2}$ , 1) < .001. Most reverse-coded items displayed such interaction effects.

As a result of IRT and DIF analyses, 6 items (3 moderate and 3 high complexity) were retained in the numerical scale, 13 items (10 moderate and 3 high complexity) were retained in the symbolic scale, 7 items (3 moderate and 4 high complexity) were retained in the spatial scale, 11 items (1 low-later categorized as moderate, 3 moderate, and 7 high complexity) were retained in the STEM-Ideas Scale, and finally 12 items (8 moderate and 4 high complexity) were retained in the general STEM interests scale. All further analyses were based on this shortened form.

SF Construct Validation

We subjected the 37 items to a four-factor CFA using EQS6.2, by freely correlating the latent factors of numerical interests, symbolic interests, spatial interests, and ideas. Robust statistics (Satorra & Bentler, 1988) were evaluated as the data had multivariate kurtosis (Mardia’s Z = 188). With the addition of five correlated errors on symbolic items, the four-factor model fits the data well, S-Bχ²(618) = 1,784.63, p < .001, confirmatory fit index (CFI) = .94, root mean square error of approximation (RMSEA) = .045, 95% confidence interval (CI) = [.043, .047], ρ = .97. Average off-diagonal standardized residuals were small (.04), and 91% of residuals were distributed normally between −0.1 and +0.1. Factor loadings ranged from .77 to .86 on numeric interests, from .74 to .81 on symbolic interests, from .78 to .89 on spatial interests, and from .70 to .78 on the STEM-ideas factor. Items and their loadings are presented in Table 2. Thus, Hypothesis 1 pertaining to the four-factor construct validity of the SF found support. CFA factor loadings of the 12-item General STEM Interests Scale for the moderate- and high-complexity items (see Table 3) ranged from .73 to .83, S-Bχ²(52) = 320.27, p < .001 CFI = .95, RMSEA = .075, 95% CI [.067, .082], ρ = .97. A separate analysis including the two low-complexity items yielded loadings of .48 and .59. Internal consistency reliabilities were .96 (6 items) for numeric interests, .96 (13 items) for symbolic interests, .93 (7 items) for spatial interests, .93 (11 items) for STEM ideas, and .95 (12 items) for general STEM interests. Reliabilities ranged from .82 to .93 for the scales formed based on their complexity levels.

Table 2.
STEM Interest Complexity Inventory Short Form Items and CFA Factor Loadings.

Code Level of Data Complexity Items Nu Sym Sp Id R ²

Nu_high Synthesizing I would not hesitate to evaluate numerical results to arrive at a conclusion .86 .74

Nu_high Synthesizing I would like working on problems that necessitate integrating numerical results .83 .70

Nu_mod Analyzing I enjoy analyzing the given information in a numerical problem .83 .70

Nu_mod Analyzing When reading something technical, I like to analyze the numerical evidence presented to check its accuracy .82 .67

Nu_mod Compiling While reading a technical article, I like paying attention to the numerical evidence they present .81 .65

Nu_high Synthesizing I like drawing conclusions or forming relationships between concepts based on numerical information .77 .60

Sym_high Generating I enjoy thinking in terms of abstractions and formulas while trying to develop new ideas .81 .66

Sym_mod Analyzing I would like to evaluate the accuracy of a solution involving formulas .81 .65

Sym_mod Analyzing I enjoy trying to solve a complicated abstract problem indicated with scientific notations .81 .66

Sym_mod Analyzing While reading a technical report, I like to analyze the evidence they present as formulas .80 .65

Sym_high Generating I would be interested in using mathematical and scientific tools (e.g., formulas) to define new problems .78 .61

Sym_mod Compiling I would not mind compiling information that is based on formulas .78 .62

Sym_mod Analyzing I enjoy working with abstract parameters in a problem (e.g., d/dx sin x) .78 .61

Sym_mod Analyzing I like trying to figure out what mathematical or scientific formulas to use in a problem .76 .58

Sym_mod Analyzing I am interested in learning complex forms of mathematical representations (such as a complex Fourier integral: $y (t) = ½π \int \int y (v) exp (i ω (t - v)) d v d ω)$ . .76 .58

Sym_mod Analyzing I find it exciting to solve problems with abstractions (e.g., functions, trigonometry, differentiation, and integrals) .76 .60

Sym_mod Analyzing I enjoy learning symbolic representations of concepts (e.g., mathematical symbols such as ∫dx) .76 .58

Sym_high Synthesizing I like to finding relationships between pieces of information in symbolic form .75 .57

Sym_mod Analyzing I like solving for equations represented only by symbolic formulas with very few numbers (e.g., $y (t) = 1/ \sqrt{2} \int s i n \sqrt 2 (t - τ .) d τ$ ). .74 .55

Sp_mod Analyzing I would like working on analyzing a complex three-dimensional (3-D) system .89 .80

Sp_high Generating I would like working on designing a 3-D representation of an object/system .85 .72

Sp_high Generating I would be interested in thinking about how to create an original mechanical system .82 .67

Sp_mod Analyzing I enjoy solving a problem based on information of a 3-D system (e.g., mechanical system) .80 .64

Sp_high Generating I would like to think about how to adapt a 3-D technological system to user needs .79 .63

Sp_high Generating I would be interested in thinking about how to create a 3-D system (e.g., designing part of an original mechanical system or a 3-D animation design) .78 .61

Sp_mod Analyzing I would like to evaluate the advantages and disadvantages of a new 3-D design .78 .61

Id_mod Compiling In STEM-related areas, I like learning the basic principles and applying them to specific problems .78 .61

Id_low Comparing^a I enjoy comparing and contrasting two perspectives on a topic related to STEM areas .78 .61

Id_high Generating I would enjoy spending my time on thinking to come up with a number of ideas about a topic related to STEM .77 .59

Id_high Synthesizing I would enjoy using my accumulated knowledge and ideas in engineering and sciences to further the advancements of the STEM fields .76 .58

Id_high Synthesizing I get satisfaction out of working on integrating ideas from the different areas of sciences and engineering .76 .57

Id_mod Analyzing I would like to critically analyze the strengths and weaknesses of technical propositions in STEM fields .76 .58

Id_high Synthesizing I like to integrate findings to make inferences on a STEM-related technical topic .75 .57

Id_high Synthesizing I like thinking about the implications and future applications of findings in STEM-related areas .74 .54

Id_high Generating I would find it exciting to develop creative ways to solve a STEM-related problem .74 .54

Id_mod Analyzing I enjoy reasoning about different perspectives in a STEM area–related technical debate .74 .55

Id_high Synthesizing I like thinking what the real-world applications of mathematical and scientific concepts could be .70 .49

Note. R² = percentage of variance explained; Nu = numeric interests; Sym = symbolic interests; Sp = spatial interests; Id = STEM ideas; mod = moderate complexity; high = high complexity; CFA = confirmatory factor analysis; STEM = Science, Technology, Engineering, Mathematics.

^aCould be considered a moderate-complexity item related to analyzing.

Table 3.
STEM Interest Complexity Inventory—General Interests Scale Items and CFA Factor Loadings.

Code Level of Data Complexity Items Loading R ²

Gen_high 5 Putting extra effort in working on improving an aspect in the fields of math, science, or technology .83 .69

Gen_mod 2 Searching for advanced knowledge in scientific and technological developments .82 .67

Gen_mod 2 Grasping mathematics and scientific principles at a level that enables understanding of the empirical research literature .82 .67

Gen_mod 4 Analyzing the scientific and/or technology literature and the existing technological systems to identify areas of improvement .82 .67

Gen_high 5 Using advanced levels of mathematics and/or sciences to formulate new technological designs .81 .65

Gen_high 5 Taking the challenge of building theories based on your advanced knowledge of mathematics, sciences, and technology .80 .64

Gen_high 5 Trying to come up with new ideas that would contribute to the fields of sciences or technology .78 .61

Gen_mod 3 Accumulating your in-depth knowledge on mathematics and/or a scientific or technological topic at a professional level .78 .61

Gen_mod 3 Following the controversies and debates in the sciences or technology development literature .76 .58

Gen_mod 3 Following the empirical scientific and technological research literature .76 .57

Gen_mod 4 Analyzing the pros and cons of specific research conducted in sciences and technology development .75 .55

Gen_mod 2 Learning about the underlying mathematical and scientific principles of new technologies .73 .54

Gen_low 1 Browsing through conceptual narrative books/magazines related to physical sciences, engineering, or technology (e.g., Science magazine) .59 .35

Gen_low 1 Learning about interesting facts in sciences or technology from magazines or the TV .48 .23

Note. R² = percentage of variance explained; Gen = general STEM interests scale; low = low complexity; mod = moderate complexity; high = high complexity; CFA = confirmatory factor analysis; STEM = Science, Technology, Engineering, Mathematics; 1 = getting the general idea without going into technical jargon or detail; 2 = acquiring specialized knowledge without learning about the empirical studies that form the basis for the knowledge; 3 = following the empirical literature; 4 = critically evaluating the empirical literature; 5 = formulating ideas to investigate.

A bifactor CFA model (e.g., Chen, West, & Sousa, 2006; Reise, Morizot, & Hays, 2007) was also tested in order to support that the eight content-complexity variables would be represented both by a global factor in addition to the four content factors. Here, item parceling was used to form two indicators for each content factor: a composite of the moderate-complexity items and another composite of the high-complexity items. Four latent content factors and one latent global factor were modeled in which each manifest indicator (e.g., numerical moderate complexity) loaded on its respective content factor and also the global factor. The bifactor model was built as described by Toker and Ackerman (2012, p. 538) for testing the long version of the scale. Accordingly, the model had uncorrelated content factors and uncorrelated errors. Robust statistics were evaluated as the data had multivariate kurtosis (Mardia’s Z = 49.2). The bifactor model fits the data very well, S-Bχ²(12) = 11.41, p = .494, CFI = 1.00, RMSEA = .000, 95% CI [.00, .03], ρ = .98. All loadings were significant. The numeric and symbolic scales loaded more heavily on the global factor (from .84 to .90) than their respective content factors (from .20 to .50). Spatial interests loaded slightly higher on the global factor (.66 for moderate complexity and .84 for high complexity) than the content factor (.62 and .60, respectively). The ideas scales loaded equivalently on the global and content factors (from .63 to .68). Such a pattern of loadings on the global and content factors was consistent with the scale’s long version studied in the United States (Toker & Ackerman, 2012).

SF Concurrent Criterion-Related Validation

The criterion-related validity of the STEM Interest Complexity Inventory SF was studied concurrently based on associations with vocational fit criteria including STEM academic major satisfaction; intentions to pursue an STEM bachelor’s degree, graduate degree, and career; and intentions to work in an STEM occupational group characterized by high levels of complexity and also CGPA calculated based on STEM courses. Descriptive statistics of STEM content scales, vocational criteria, achievement motivation, and test anxiety are presented in Table 4. It was observed that the moderate- and high-complexity scales did not differ in their magnitude of associations with the relevant criteria; thus, the results are given for the content domains (e.g., numeric) each of which is a composite of moderate- and high-complexity items. The general STEM interests scale is presented based on its low-, moderate-, and high-complexity scales as they display somewhat differential descriptive statistics and associations.

Table 4.
Descriptive Statistics of Study Variables.

Scale M SD Min. Max. Skew α k

SF: numeric 4.40 1.00 1.00 6.00 −0.72 .93 6

SF: symbolic 4.03 1.12 1.00 6.00 −0.42 .96 13

SF: spatial 4.24 1.16 1.00 6.00 −0.53 .93 7

SF: STEM ideas 4.66 0.84 1.00 6.00 −0.71 .94 11

General STEM low 4.54 1.00 1.00 6.00 −0.66 .58 2

General STEM mod. 4.38 0.97 1.00 6.00 −0.56 .93 8

General STEM high 4.36 1.10 1.00 6.00 −0.64 .89 4

STEM-SF composite 4.34 0.86 1.35 6.00 −0.48 .97 37

General STEM composite 4.37 1.01 1.00 6.00 −0.60 .95 12

STEM area satisfaction 4.41 1.14 1.00 6.00 −0.72 .93 7

STEM BS degree intent 5.20 0.99 1.00 6.00 −1.49 .77 2

STEM graduate degree intent 4.17 1.45 1.00 6.00 −0.57 .90 3

STEM career intent 4.92 1.14 1.00 6.00 −1.23 .91 5

Composite STEM intent 4.77 1.03 1.00 6.00 −0.98 .91 10

Test anxiety 4.00 0.71 1.00 4.00 0.61 .96 20

Achievement motivation (IPIP) 4.32 0.77 1.60 6.00 −0.34 .87 10

CGPA 2.42 0.66 0.10 4.00 −0.16

STEM CGPA 2.32 0.70 0.27 3.93 −0.04 — —

Math course grades 2.35 1.10 0 4 −0.31 — —

Physics course grades 2.21 1.12 0 4 −0.10 — —

Chemistry course grades 2.70 0.75 0 4 −0.10 — —

Quantitative reasoning scores 82.30 7.93 67.70 98.20 0.02 — —

Note. STEM-SF composite includes numeric, symbolic, spatial, and STEM-Ideas Scales. α = Cronbach’s α internal consistency reliability; k = number of items; SF = short form; STEM = Science, Technology, Engineering, Mathematics; low = low-complexity scale; mod. = moderate-complexity scale; high = high-complexity scale; IPIP = International Personality Item Pool; CGPA = cumulative grade point average.

Correlations between STEM interest complexity scales and vocational fit criteria are presented in Table 5, together with correlations among vocational criteria. Statistically significant correlations that ranged from .19 to .50, mostly in the moderate range, are observed between the SF scales and vocational fit criteria. Lowest of the associations are observed with the spatial interests scales (r range = .19 and .29). Highest of them were observed with the STEM ideas and general STEM interests scales (r range = .27 and .50). Numerical and symbolic interest scales displayed moderate correlations as expected (r range = .22 and .34). A composite score based on all four content factors yielded correlations with vocational fit criteria that ranged from .30 to .41. Associations for the composite score based on the moderate- and high-complexity general STEM interests scales ranged from .29 to .51. A noteworthy finding was that the composite interest scale correlations were moderate (r = .30) for the dichotomously scored intentions toward working in a high-complexity STEM occupation, small (r = .16) for intentions toward working in a moderately complex STEM occupation, and negligible (r = .05) for intentions toward working in an STEM occupation with lower levels of complexity. Hypothesis 2 pertaining to observing moderate associations between interest complexity scales and fit criteria received support.

Table 5.
STEM Interest Complexity Inventory—Short Form Correlations With Vocational Fit Criteria.

Scale STEM Area Satisfaction STEM BS Degree Intent STEM Graduate Degree Intent STEM Career Intent Composite STEM Intent Work in Highly Complex STEM Work in Moderate- Complex STEM Work in Low- Complex STEM

SF: numeric .33 .24 .23 .32 .32 .23 .11 .03

SF: symbolic .33 .22 .29 .31 .34 .24 .09 .03

SF: spatial .24 .20 .19 .29 .28 .25 .23 .07

SF: STEM ideas .39 .36 .36 .43 .46 .27 .10 .03

STEM-SF composite .38 .30 .31 .40 .41 .30 .16 .05

General STEM low .30 .25 .30 .35 .37 .15 .09 .07

General STEM mod. .40 .31 .43 .46 .50 .29 .12 .10

General STEM high .40 .31 .43 .46 .50 .27 .09 .11

General STEM composite .41 .32 .46 .47 .51 .29 .11 .11

STEM area satisfaction .48 .46 .55 .59 .27 .11 .04

STEM BS degree intent .43 .68 .75 .35 .13 .01

STEM graduate degree intent .56 .82 .21 .12 .10

STEM career intent .92 .39 .16 .07

Composite STEM intent .37 .17 .08

Note. N = 930. Work in a highly complex STEM occupation: 0 = I do not intend, 1 = I do intend. STEM-SF composite includes numeric, symbolic, spatial, and STEM-Ideas Scales. All correlations above .09 are significant below .001. SF = short form; STEM = Science, Technology, Engineering, Mathematics; low = low-complexity scale; mod. = moderate-complexity scale; high = high-complexity scale. Values printed in Boldface type are for the composite scales.

STEM Interest Complexity Scales were also found to have statistically significant associations with STEM academic success criteria (see Table 6). Correlations with STEM CGPA ranged from .15 to .18, except for spatial interests. Relatively higher significant correlations were observed with mathematics course grades that ranged from .25 and .34. Significant correlations with physics and chemistry course grades ranged from .15 to .21. A smaller sample (N = 63) reported scores on a nationwide quantitative reasoning test, which yielded correlations from .23 to .31 with the interest complexity scales. Associations with such STEM academic success criteria also supported Hypothesis 2.

Table 6.
STEM Interest Complexity Inventory—Short Form Correlations With Academic Success Criteria.

Scale STEM CGPA Math Grade Physics Grade Chemistry Grade Quant. Reasoning CGPA

SF: numeric .15* .25 .12 .16 .31* .10**

SF: symbolic .18* .34 .21 .16 .24 .14

SF: spatial .03 .13 .15* .09 .23 −.02

SF: STEM ideas .14 .31 .17 .18 .24 .12

STEM-SF composite .14 .30 .19 .16 .31* .09

General STEM low .03 .09 .09 .02 .00 .04

General STEM mod. .08 .25 .12* .12* .05 .07*

General STEM high .13 .29 .19 .16 .10 .10

General STEM Composite .11 .28 .16* .14* .07 .09**

Note. STEM-SF composite includes numeric, symbolic, spatial, and STEM-Ideas Scales. Sample sizes are 232 for STEM CGPA, 198 for mathematics course grades, 279 for physics course grades, 278 for chemistry course grades, 63 for the quantitative reasoning exam scores, and 886 for CGPA. SF = short form; STEM = Science, Technology, Engineering, Mathematics; low = low-complexity scale; mod. = moderate-complexity scale; high = high-complexity scale; CGPA = cumulative grade point average; Quant. = quantitative. Values printed in Boldface type are for the composite scales.

*p < .05. **p < .01.

To test Hypothesis 3 that STEM interest complexity would predict attitudinal vocational outcomes about further pursuing STEM areas, over the personality variable relating to achievement motivation, a series of hierarchical regression analyses were performed. The SF composite score of the four content domains was used for interest complexity. According to regression results, achievement motivation was a significant predictor of STEM area satisfaction (β = .40, p < .001), of composite intentions to further pursue STEM areas (β = .34, p < .001), and of intentions to choose a highly complex STEM occupation (β = .13, p < .001). STEM interest complexity added significant incremental variance from 5% to 9% and offered moderate effect sizes in the prediction of intentions to further pursue STEM areas (β = .33, p < .001), of intentions to choose a highly complex STEM occupation (β = .28, p < .001), and of STEM area satisfaction (β = .25, p < .001). Hypothesis 3 found support.

To test Hypothesis 4 that expected that STEM interest complexity would still be an important predictor of STEM CGPA, and math and physics grades, in the presence of achievement motivation and an affective state variable relating to test anxiety, again hierarchical regression analyses were performed. Achievement motivation and test anxiety were significant predictors and together explained from 5% to 8% of variance. STEM interest complexity added a small amount of incremental variance in the prediction of mathematics (β = .22, p = .005) and physics grades (β = .13, p = .045) but not in predicting overall STEM CGPA (β = .05, ns). Hypothesis 4 was only partially supported.

Measurement Invariance Across Turkish and U.S. Samples

The STEM Interest Complexity measure was originally developed and validated in the Unites States. The measure was translated into Turkish and was shortened through IRT techniques using a large sample in Turkey. In order to ascertain that the SF could be used across both Turkey (TR) and the United States, a test of measurement equivalence was in order. The SF was subjected to measurement invariance across university student samples obtained from TR and the United States.

Sample and procedure

The TR sample used for invariance analysis was the same as described in the main study (N = 930). The U.S. sample had 274 STEM major participants (55% men) from a technical university in the Southeast region. This sample is part of the one that was used for the initial validation of the measure (Toker & Ackerman, 2012). Of the sample, 65% was enrolled in engineering majors, and the rest was enrolled in biology, mathematics, and computer science, with 23.7% freshmen, 32.1% junior, 17.5% sophomore, 15% senior, and 10.2% at their fifth grade. Demographic characteristics of the samples were deemed comparable. Measurement invariance with an emphasis on metric and structural invariances was undertaken based on the covariances approach. First, baseline models for the two samples were identified, followed by tests for configural, metric, and structural invariance. Thirty-five items could be utilized in the analysis of the four-factor model, as one symbolic and one spatial interest items were used as reverse coded in the U.S. sample but not in the TR sample. The 12-item General STEM Interests Scale was also tested as a separate one-factor model.

All model fit results are presented in Table 7. The four-factor model had configural, metric, and structural invariance across the TR and U.S. samples. Only 1 item in the symbolic factor (“I find it exciting to solve problems with abstractions such as functions, trigonometry, differentiation, and integrals.”) had metric noninvariance at an α level of .001. The one-factor General STEM Interests Scale also displayed configural and metric invariance, with no items displaying metric noninvariance.

Table 7.
Measurement Invariance Results Across Turkish and U.S. Student Samples.

Model S-B χ² df χ²/df CFI RMSEA CI for RMSEA

Four-factor model

TR baseline 1,597.22 549 2.90 .939 .048 [.043, .052]

U.S. baseline 893.45 549 1.63 .932 .048 [.042, .053]

Configural invariance 2,569.01 1,098 2.34 .936 .047 [.045, .050]

Metric invariance 2,662.85 1,134 2.35 .934 .047 [.045, .050]

Structural invariance 2,675.48 1,140 2.35 .933 .047 [.045, .050]

General STEM Interests Scale

TR baseline 279.29 52 5.37 .957 .069 [.061, .076]

U.S. baseline 113.06 54 2.09 .951 .063 [.047, .079]

Configural invariance 410.50 106 3.87 .953 .069 [.062, .076]

Metric invariance 433.37 117 3.70 .951 .067 [.060, .074]

Note. n for the TR sample is 930. n for the U.S. sample is 274. Total N = 1,204. CFI values above .90, RMSEA values below .06, and χ²/df values below 3 indicate adequate fit of the model to the data. Baseline models show fit of the models in the samples separately, configural invariance shows fit in the combined sample, metric invariance shows fit for indicator loadings across samples, and structural invariance shows fit for factor intercorrelations across samples. TR = Turkey; U.S. = United States; S-B = Satorra–Bentler scaled χ²; df = degrees of freedom; CFI = confirmatory fit index; RMSEA = root mean square error of approximation; CI = confidence interval; STEM = Science, Technology, Engineering, Mathematics.

Discussion

The main purpose of the present investigation was to construct the STEM Interest Complexity Inventory SF and demonstrate that it is valid in predicting vocational fit criteria. Sound psychometric procedures based on IRT and DIF applications were employed in the identification of items to be retained. The resulting SF displayed equivalent validity to that of the original long form, which was shown in Toker and Ackerman (2012). Analyses rest on a sample of 930 participants with 36% women and 64% men; thus, the results are deemed reliable since sample size was adequate and surpassed 500 to perform IRT (Reise & Yu, 1990) and was again adequate and passed the required minimum of 250 per group to perform DIF (Herrera & Gomez, 2008). The gender ratio is almost equivalent to the gender ratio observed in college STEM majors in Turkey according to the 2005 statistics reported by the HEC (Smith & Dengiz, 2009). Actually, the HEC reported that 28% STEM majors are women in Turkey. Our study sample surpasses that with 36%, which was an advantage to utilize IRT analyses. Our main aim of producing a shorter and valid STEM Interest Complexity measure was fulfilled, which added to the scarce literature demonstrating the validity of instruments shortened with IRT applications (e.g., Gültas, 2014). The original 127-item measure was reduced to a 37-item valid measure.

Measurement invariance of the SF across the TR and U.S. samples was demonstrated. Historically, several studies in the vocational interest literature have shown cross-cultural noninvariance of measurement instruments or their underlying construct structure (e.g., Einarsdottir & Rounds, 2009; Tracey, Watanabe, & Schneider, 1997). It is imperative to show that what an instrument is measuring is the same across different groups. We were able to show conceptual equivalence based on metric invariance and to show construct equivalence based on structural invariance and based on factor analysis results. CFA results supported the four content factors based on the shortened form. Furthermore, bifactor modeling yielded a pattern of content- and global-factor associations that mirrored those obtained in the United States with the long form (Toker & Ackerman, 2012). We can safely conclude that the SF is equivalent across the U.S. and TR samples. One could argue that integrating complexity levels into an interest assessment increases the fidelity, reducing ambiguity in item content. As also argued by Toker and Ackerman (2012, p. 526), interest assessments based on Holland’s themes include items that are ambiguous in terms of complexity and the level of required cognitive ability. Individuals from different cultures might have a different anchor point in mind with regard to the level of complexity the item is displaying, hence might be yielding different responses. Differential perception of item content does not seem to be the case with the STEM Interest Complexity Measure SF.

In addition to producing a valid shorter instrument, the IRT analyses on the items of the STEM Interest Complexity Inventory supported the theoretical basis of the measure pertaining to varying levels of difficulty. Specifically, the long-form item groups originally designed to express low-, moderate-, and high-complexity levels could be differentiated from each other based on item discrimination parameters. The observed trend was that as the level of complexity expressed in an item increased, that item better discriminated across the latent STEM interest complexity trait continuum. The explanation for this trend most probably is related to participants perceiving items as more difficult as the level of complexity an item was designed to express increased. When participants were shown three clusters of STEM occupations based on the work of Gottfredson (1986) together with their work tasks and required skills and abilities that varied across the clusters, and asked how difficult they perceived each cluster in terms of the cognitive effort required to perform in each, they indeed reported on average that the highest complexity-cluster occupations were the most difficult followed by the moderate-to-high-complexity cluster. The low-to-moderate complexity cluster was perceived to be the easiest among all three. This finding supports that having integrated levels of complexity led to a better discriminating measure.

The IRT applications of the present investigation served to justify the theoretical and conceptual basis of occupational complexity (see Gottfredson, 1986; Gottfredson & Holland, 1996; Spaeth, 1979; U.S. Department of Labor, 1991), which formed the basis of developing the STEM Interest Complexity measure in the first place and served to save testing time in the application of a valid instrument. Furthermore, items retained based on good discrimination values still span moderate- and high-complexity levels, rendering the SF parallel to its theoretical grounds.

Using the R lordif package (Choi et al., 2011) in DIF detection can be considered a strength of the present study. This method rests on IRT procedures, rather than non-IRT-based DIF applications such as the Mantel–Haenszel procedure. It simultaneously takes account of item discrimination and location parameters, estimates trait levels, and compares models to see the effects of the grouping variable. Following the logic of model comparisons, items that display group and trait-level interactions can also be identified. Such items, most of which were reverse-coded items, were eliminated.

Following IRT and DIF analyses, the SF was subjected to concurrent criterion-related validation. Associations with vocational fit criteria were as expected, and small-to-moderate statistically significant correlations were also obtained with STEM academic success criteria. These results are consistent with those reported in Toker and Ackerman (2012) and provide further support for the use of the measure. Associations on the spatial interest scale were lower, which is a finding again consistent with those reported by Toker and Ackerman (2012, p. 531). STEM interest complexity added incremental variance over achievement motivation in the prediction of attitudinal variables related to satisfaction and further persisting in STEM areas. The motivational influence of vocational interests giving direction to career choices was shown once more, supporting the motivational theories (e.g., Ackerman, 1996; Holland, 1997) and findings related to vocational interests (Tracey & Robbins, 2006; Van Iddekinge, Putka & Campbell, 2011), but with higher associations with criteria.

STEM interest complexity content scales had small-to-moderate associations with STEM academic performance indices. Effects were comparable or better than the meta-analytic corrected effects between various interest inventories and academic performance (Nye, Rong, James, & Fritz, 2012). The composite score added incremental variance over test anxiety and achievement motivation, in the prediction of mathematics and physics grades, albeit with small effects, but not in the prediction of overall STEM CGPA. Spatial interests were not associated with overall STEM success, but we did observe a small significant association with physics grades, which makes theoretical sense considering the spatial nature of the course content. Only numeric interests’ association with quantitative reasoning abilities was significant in the small sample we used to investigate this relationship. Consistent with the educational and work psychology literatures, achievement motivation and test anxiety were found to play a role in academic performance (e.g., James, 1998). It can be argued that cognitive abilities (Guion, 2011) and grit (Duckworth & Quinn, 2009) are potential predictors that would increase explained variance.

At this point, a discussion on the cultural context with regard to criterion associations is in order. Even though there are no studies located that have investigated differential interest criterion associations based on culture, one can still argue that culture can shape the degree to which interests can shape vocational decisions. The social cognitive career theory (Lent, Brown, & Hackett, 1994) takes account of environmental variables that can affect one’s career choices. Societal and family values are one of these potential influences, and in-group collectivism at the societal level has been shown to be a moderator of the developmental association between personality traits (e.g., extroversion) and vocational interests (e.g., enterprising interests; Ott-Holland, Huang, Ryan, Elizondo, & Wadlington, 2013). Nevertheless, the magnitudes of associations with fit criteria that we obtained in the current study were consistent with those obtained in the U.S. samples. This is perhaps due to the fact that we assessed intentions rather than actual behavior and that we assessed these intentions in the samples of college students who attained enrollment in highly ranked departments through passing a rigorous university selection system, restricting them in terms of their cognitive abilities and perhaps STEM interests (see Ackerman, 1996; Cattell, 1987, for a discussion on the role of cognitive abilities in shaping vocational interests). Future studies that would include more diverse university samples across the country could expect differential associations between interests and fit.

Conclusion, Limitations, and Future Research

The 37-item STEM Interest Complexity Inventory SF spanning the numerical, symbolic, spatial, and STEM-related ideas domains, and the 12-item General STEM Interest Scale, prove to be valid measures of interest complexity. The two low-complexity items are also provided to readers should one wish to use them in assessment to observe the complexity level one is at.

The SF can be used confidently in research to measure STEM interests in samples of men and women. Even though sound psychometric procedures were applied in shortening the measure, its validation was studied concurrently which poses a limitation. Furthermore, data were collected from college students, which again pose a limitation to the generalizability of complexity levels and associations with criteria to different ages and levels.

Currently, the SF can be used in research with college students or to assess interests of entry-level college students to aid them in declaring a major. It would be possible to identify the level of occupational complexity one is interested in once interest in Holland’s realistic and investigative domains has been indicated. Nevertheless, we currently do not know whether or not item content would apply to aiding students at the high-school level. Even though item content in the SF is not far from the content students are exposed to at the high-school level, scale validation still needs to be investigated. The best strategy would be to undertake a longitudinal investigation of STEM interest complexity starting at the high-school level and to follow students through college for at least 2 years.

Vocational interest inventories exist to aid individuals in making healthy career choices. Thus, it follows that interest inventory validation ought to include validation with working samples. One such attempt was undertaken by Van Iddekinge, Putka et al. (2011) in terms of studying the validity of vocational interest inventories against job knowledge, job performance, and continuance intentions at work, which yielded moderate associations. A meta-analysis of associations of vocational interests with job performance, training performance, turnover intentions, and actual turnover produced small effect sizes (Van Iddekinge, Roth, Putka, & Lanivich, 2011), and another study (Nye et al., 2012) also produced small corrected effects with job performance indices and persistence. Similarly, the STEM Interest Complexity Inventory would benefit from an investigation among working samples across varying levels of work complexity.

Scale	N ^a	Discrimination Parameter	Factor Loadings
Numeric
Low	5	0.484–2.020	1.035	0.755	.274–.765	.475	.406
Moderate	12	1.083–3.273	2.010	1.884	.537–.887	.734	.741
High	5	0.787–3.263	2.369	2.567	.420–.887	.758	.834
Symbolic
Low	4	1.598–1.978	1.804	1.820	.685–.758	.726	.730
Moderate	14	1.601–2.722	2.290	2.340	.686–.848	.798	.809
High	9	1.728–2.528	2.039	2.008	.713–.830	.765	.763
Spatial
Low	4	0.832–1.671	1.167	1.082	.440–.701	.553	.536
Moderate	8	1.262–3.503	2.070	1.883	.596–.900	.741	.742
High	8	1.358–3.152	2.430	2.641	.624–.880	.803	.841
STEM ideas
Low	1	2.492	—	—	.826	—	—
Moderate	10	1.613–2.521	2.102	2.034	.697–.829	.771	.767
High	15	1.681–2.584	2.110	2.118	.703–.835	.774	.780
General STEM
Low	2	1.242–1.620	1.430	—	.590–.690	.640	—
Moderate	9	1.928–3.114	2.611	2.526	.750–.878	.833	.830
High	4	2.629–3.127	2.850	2.821	.840–.879	.858	.856
Reverse-coded items	12	0.260–0.982	0.810	0.807	.151–.500	.429	.429

Code	Level of Data Complexity	Items	Nu	Sym	Sp	Id	R ²
Nu_high	Synthesizing	I would not hesitate to evaluate numerical results to arrive at a conclusion	.86				.74
Nu_high	Synthesizing	I would like working on problems that necessitate integrating numerical results	.83				.70
Nu_mod	Analyzing	I enjoy analyzing the given information in a numerical problem	.83				.70
Nu_mod	Analyzing	When reading something technical, I like to analyze the numerical evidence presented to check its accuracy	.82				.67
Nu_mod	Compiling	While reading a technical article, I like paying attention to the numerical evidence they present	.81				.65
Nu_high	Synthesizing	I like drawing conclusions or forming relationships between concepts based on numerical information	.77				.60
Sym_high	Generating	I enjoy thinking in terms of abstractions and formulas while trying to develop new ideas		.81			.66
Sym_mod	Analyzing	I would like to evaluate the accuracy of a solution involving formulas		.81			.65
Sym_mod	Analyzing	I enjoy trying to solve a complicated abstract problem indicated with scientific notations		.81			.66
Sym_mod	Analyzing	While reading a technical report, I like to analyze the evidence they present as formulas		.80			.65
Sym_high	Generating	I would be interested in using mathematical and scientific tools (e.g., formulas) to define new problems		.78			.61
Sym_mod	Compiling	I would not mind compiling information that is based on formulas		.78			.62
Sym_mod	Analyzing	I enjoy working with abstract parameters in a problem (e.g., d/dx sin x)		.78			.61
Sym_mod	Analyzing	I like trying to figure out what mathematical or scientific formulas to use in a problem		.76			.58
Sym_mod	Analyzing	I am interested in learning complex forms of mathematical representations (such as a complex Fourier integral: $y (t) = ½π \int \int y (v) exp (i ω (t - v)) d v d ω)$ .		.76			.58
Sym_mod	Analyzing	I find it exciting to solve problems with abstractions (e.g., functions, trigonometry, differentiation, and integrals)		.76			.60
Sym_mod	Analyzing	I enjoy learning symbolic representations of concepts (e.g., mathematical symbols such as ∫dx)		.76			.58
Sym_high	Synthesizing	I like to finding relationships between pieces of information in symbolic form		.75			.57
Sym_mod	Analyzing	I like solving for equations represented only by symbolic formulas with very few numbers (e.g., $y (t) = 1/ \sqrt{2} \int s i n \sqrt 2 (t - τ .) d τ$ ).		.74			.55
Sp_mod	Analyzing	I would like working on analyzing a complex three-dimensional (3-D) system			.89		.80
Sp_high	Generating	I would like working on designing a 3-D representation of an object/system			.85		.72
Sp_high	Generating	I would be interested in thinking about how to create an original mechanical system			.82		.67
Sp_mod	Analyzing	I enjoy solving a problem based on information of a 3-D system (e.g., mechanical system)			.80		.64
Sp_high	Generating	I would like to think about how to adapt a 3-D technological system to user needs			.79		.63
Sp_high	Generating	I would be interested in thinking about how to create a 3-D system (e.g., designing part of an original mechanical system or a 3-D animation design)			.78		.61
Sp_mod	Analyzing	I would like to evaluate the advantages and disadvantages of a new 3-D design			.78		.61
Id_mod	Compiling	In STEM-related areas, I like learning the basic principles and applying them to specific problems				.78	.61
Id_low	Comparing^a	I enjoy comparing and contrasting two perspectives on a topic related to STEM areas				.78	.61
Id_high	Generating	I would enjoy spending my time on thinking to come up with a number of ideas about a topic related to STEM				.77	.59
Id_high	Synthesizing	I would enjoy using my accumulated knowledge and ideas in engineering and sciences to further the advancements of the STEM fields				.76	.58
Id_high	Synthesizing	I get satisfaction out of working on integrating ideas from the different areas of sciences and engineering				.76	.57
Id_mod	Analyzing	I would like to critically analyze the strengths and weaknesses of technical propositions in STEM fields				.76	.58
Id_high	Synthesizing	I like to integrate findings to make inferences on a STEM-related technical topic				.75	.57
Id_high	Synthesizing	I like thinking about the implications and future applications of findings in STEM-related areas				.74	.54
Id_high	Generating	I would find it exciting to develop creative ways to solve a STEM-related problem				.74	.54
Id_mod	Analyzing	I enjoy reasoning about different perspectives in a STEM area–related technical debate				.74	.55
Id_high	Synthesizing	I like thinking what the real-world applications of mathematical and scientific concepts could be				.70	.49

Code	Level of Data Complexity	Items	Loading	R ²
Gen_high	5	Putting extra effort in working on improving an aspect in the fields of math, science, or technology	.83	.69
Gen_mod	2	Searching for advanced knowledge in scientific and technological developments	.82	.67
Gen_mod	2	Grasping mathematics and scientific principles at a level that enables understanding of the empirical research literature	.82	.67
Gen_mod	4	Analyzing the scientific and/or technology literature and the existing technological systems to identify areas of improvement	.82	.67
Gen_high	5	Using advanced levels of mathematics and/or sciences to formulate new technological designs	.81	.65
Gen_high	5	Taking the challenge of building theories based on your advanced knowledge of mathematics, sciences, and technology	.80	.64
Gen_high	5	Trying to come up with new ideas that would contribute to the fields of sciences or technology	.78	.61
Gen_mod	3	Accumulating your in-depth knowledge on mathematics and/or a scientific or technological topic at a professional level	.78	.61
Gen_mod	3	Following the controversies and debates in the sciences or technology development literature	.76	.58
Gen_mod	3	Following the empirical scientific and technological research literature	.76	.57
Gen_mod	4	Analyzing the pros and cons of specific research conducted in sciences and technology development	.75	.55
Gen_mod	2	Learning about the underlying mathematical and scientific principles of new technologies	.73	.54
Gen_low	1	Browsing through conceptual narrative books/magazines related to physical sciences, engineering, or technology (e.g., Science magazine)	.59	.35
Gen_low	1	Learning about interesting facts in sciences or technology from magazines or the TV	.48	.23

Scale	M	SD	Min.	Max.	Skew	α	k
SF: numeric	4.40	1.00	1.00	6.00	−0.72	.93	6
SF: symbolic	4.03	1.12	1.00	6.00	−0.42	.96	13
SF: spatial	4.24	1.16	1.00	6.00	−0.53	.93	7
SF: STEM ideas	4.66	0.84	1.00	6.00	−0.71	.94	11
General STEM low	4.54	1.00	1.00	6.00	−0.66	.58	2
General STEM mod.	4.38	0.97	1.00	6.00	−0.56	.93	8
General STEM high	4.36	1.10	1.00	6.00	−0.64	.89	4
STEM-SF composite	4.34	0.86	1.35	6.00	−0.48	.97	37
General STEM composite	4.37	1.01	1.00	6.00	−0.60	.95	12
STEM area satisfaction	4.41	1.14	1.00	6.00	−0.72	.93	7
STEM BS degree intent	5.20	0.99	1.00	6.00	−1.49	.77	2
STEM graduate degree intent	4.17	1.45	1.00	6.00	−0.57	.90	3
STEM career intent	4.92	1.14	1.00	6.00	−1.23	.91	5
Composite STEM intent	4.77	1.03	1.00	6.00	−0.98	.91	10
Test anxiety	4.00	0.71	1.00	4.00	0.61	.96	20
Achievement motivation (IPIP)	4.32	0.77	1.60	6.00	−0.34	.87	10
CGPA	2.42	0.66	0.10	4.00	−0.16
STEM CGPA	2.32	0.70	0.27	3.93	−0.04	—	—
Math course grades	2.35	1.10	0	4	−0.31	—	—
Physics course grades	2.21	1.12	0	4	−0.10	—	—
Chemistry course grades	2.70	0.75	0	4	−0.10	—	—
Quantitative reasoning scores	82.30	7.93	67.70	98.20	0.02	—	—

Scale	STEM Area Satisfaction	STEM BS Degree Intent	STEM Graduate Degree Intent	STEM Career Intent	Composite STEM Intent	Work in Highly Complex STEM	Work in Moderate- Complex STEM	Work in Low- Complex STEM
SF: numeric	.33	.24	.23	.32	.32	.23	.11	.03
SF: symbolic	.33	.22	.29	.31	.34	.24	.09	.03
SF: spatial	.24	.20	.19	.29	.28	.25	.23	.07
SF: STEM ideas	.39	.36	.36	.43	.46	.27	.10	.03
STEM-SF composite	.38	.30	.31	.40	.41	.30	.16	.05
General STEM low	.30	.25	.30	.35	.37	.15	.09	.07
General STEM mod.	.40	.31	.43	.46	.50	.29	.12	.10
General STEM high	.40	.31	.43	.46	.50	.27	.09	.11
General STEM composite	.41	.32	.46	.47	.51	.29	.11	.11
STEM area satisfaction		.48	.46	.55	.59	.27	.11	.04
STEM BS degree intent			.43	.68	.75	.35	.13	.01
STEM graduate degree intent				.56	.82	.21	.12	.10
STEM career intent					.92	.39	.16	.07
Composite STEM intent						.37	.17	.08

Scale	STEM CGPA	Math Grade	Physics Grade	Chemistry Grade	Quant. Reasoning	CGPA
SF: numeric	.15*	.25**	.12	.16**	.31*	.10**
SF: symbolic	.18*	.34**	.21**	.16**	.24	.14**
SF: spatial	.03	.13	.15*	.09	.23	−.02
SF: STEM ideas	.14**	.31**	.17**	.18**	.24	.12**
STEM-SF composite	.14	.30**	.19**	.16**	.31*	.09**
General STEM low	.03	.09	.09	.02	.00	.04
General STEM mod.	.08	.25**	.12*	.12*	.05	.07*
General STEM high	.13**	.29**	.19**	.16**	.10	.10**
General STEM Composite	.11	.28**	.16*	.14*	.07	.09**

Model	S-B χ²	df	χ²/df	CFI	RMSEA	CI for RMSEA
Four-factor model
TR baseline	1,597.22	549	2.90	.939	.048	[.043, .052]
U.S. baseline	893.45	549	1.63	.932	.048	[.042, .053]
Configural invariance	2,569.01	1,098	2.34	.936	.047	[.045, .050]
Metric invariance	2,662.85	1,134	2.35	.934	.047	[.045, .050]
Structural invariance	2,675.48	1,140	2.35	.933	.047	[.045, .050]
General STEM Interests Scale
TR baseline	279.29	52	5.37	.957	.069	[.061, .076]
U.S. baseline	113.06	54	2.09	.951	.063	[.047, .079]
Configural invariance	410.50	106	3.87	.953	.069	[.062, .076]
Metric invariance	433.37	117	3.70	.951	.067	[.060, .074]

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by The Scientific and Technological Research Council of Turkey (grant number 114K034).

References

Ackerman

P. L.

(1996). A theory of adult intellectual development: Process, personality, interests, and knowledge. Intelligence, 22, 227–257. doi:10.1016/S0160-2896(96)90016-1

Ackerman

P. L.

(1997). Personality, self-concept, interests, and intelligence: Which construct doesn’t fit? Journal of Personality, 65, 171–204. doi:10.1111/j.1467-6494.1997.tb00952.x

Armstrong

P. I.

Rounds

(2010). Integrating individual differences in career assessment: The atlas model of individual differences and the strong ring. Career Development Quarterly, 59, 143–153. doi:10.1002/j.2161-0045.2010.tb00058.x

Betancourt

T. S.

Yang

Bolton

Normand

(2014). Developing an African youth psychosocial assessment: An application of item response theory. International Journal of Methods Psychiatric Research, 23, 142–160. doi:10.1002/mpr.1420

Cattell

R. B.

(1987). Intelligence: Its structure, growth, and action. New York, NY: North-Holland.

Chen

F. F.

West

S. G.

Sousa

K. H.

(2006). A comparison of bifactor and second order models of quality of life. Multivariate Behavioral Research, 41, 189–225. doi:10.1207/s15327906mbr4102_5

Choi

S. W.

Gibbons

L. E.

Crane

P. K.

(2011). lordif: An R package for detecting differential item functioning using iterative hybrid ordinal logistic regression/item response theory and Monte Carlo simulations. Journal of Statistical Software, 39, 1–30. doi:10.18637/jss.v039.i08

Duckworth

A. L.

Quinn

P. D.

(2009). Development and validation of the Short Grit Scale (Grit-S). Journal of Personality Assessment, 91, 166–174. doi:10.1080/00223890802634290

Edelen

M. O.

Reeve

B. B.

(2007). Applying item response theory modeling to questionnaire development, evaluation, and refinement. Quality of Life Research, 16, 5–18. doi:10.1007/s11136-007-9198-0

10.

Einarsdottir

Rounds

(2009). Gender bias and construct validity in vocational interest measurement: Differential item functioning in the Strong Interest Inventory. Journal of Vocational Behavior, 74, 295–307. doi:10.1016/j.jvb.2009.01.003

11.

Embretson

S. E.

Reise

S. P.

(2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum.

12.

Goldberg

L. R.

Johnson

J. A.

Eber

H. W.

Hogan

Ashton

M. C.

Cloninger

C. R.

Gough

H. G.

(2006). The international personality item pool and the future of public-domain personality measures. Journal of Research in Personality, 40, 84–96. doi:10.1016/j.jrp.2005.08.007

13.

Gottfredson

L. S.

(1986). Occupational aptitude patterns map: Development and implications for a theory of job aptitude requirements. Journal of Vocational Behavior, 29, 254–291. doi:10.1016/0001-8791(86)90008-4

14.

Gottfredson

G. D.

Holland

J. L.

(1996). Dictionary of Holland occupational codes (3rd ed.). Odessa, FL: Psychological Assessment Resources.

15.

Guion

R. M.

(2011). Assessment, measurement, and prediction for personnel decisions (2nd ed.). New York, NY: Routledge.

16.

Gültas

(2014). Work-discipline compound personality scale development with item response theory (Unpublished master’s thesis). Middle East Technical University, Ankara, Turkey.

17.

Hays

R. D.

Morales

L. S.

Reise

S. P

. (2000). Item response theory and healthy outcomes measurement in the 21th century. Medical Care, 38, II28.

18.

Herrera

Gomez

(2008). Influence of equal or unequal comparison group sample sizes on the detection of differential item functioning using the Mantel-Haenszel and logistic regression techniques. Quality & Quantity, 42, 739–755. doi:10.1007/s11135-006-9065-z

19.

Holland

J. L.

(1997). Making vocational choices: A theory of vocational personalities and work environments (3rd ed.). Odessa, FL: Psychological Assessment Resources.

20.

Inda

Rodriguez

Peňa

J. V.

(2013). Gender differences in applying social cognitive career theory in engineering students. Journal of Vocational Behavior, 83, 346–355. doi:10.1016/j.jvb.2013.06.010

21.

James

L. R.

(1998). Measurement of personality via conditional reasoning. Organizational Research Methods, 1, 131–163. doi:10.1177/109442819812001

22.

Kanfer

Wolf

M. B.

Kantrowitz

T. M.

Ackerman

P. L.

(2010). Ability and trait complex predictors of academic and job performance: A person-situation approach. Applied Psychology: An International Review, 59, 40–69. doi:10.1111/j.1464-0597.2009.00415.x

23.

Lent

R. W.

Brown

S. D.

Hackett

(1994). Toward a unifying social cognitive theory of career and academic interest, choice, and performance [Monograph]. Journal of Vocational Behavior, 45, 79–122.

24.

Lent

R. W.

Singley

Sheu

Gainor

K. A.

Brenner

Treistman

Ades

(2005). Social cognitive predictors of domain and life satisfaction: Exploring the theoretical precursors of subjective well-being. Journal of Counseling Psychology, 52, 429–442. doi:10.1037/0022-0167.52.3.429

25.

Lubinski

(2000). Scientific and social significance of assessing individual differences: Sinking shafts at a few critical points. Annual Review of Psychology, 51, 405–444. doi:10.1146/annurev.psych.51.1.405

26.

Mandler

Sarason

S. B.

(1952). A study of anxiety and learning. Journal of Abnormal Psychology, 47, 166–173. doi:10.1037/h0062855

27.

Miller

T. R.

Spray

J. A.

(1993). Logistic discriminant function analysis for DIF: Identification of polytomously scored items. Journal of Educational Measurement, 30, 107–122. doi:10.1111/j.1745-3984.1993.tb01069.x

28.

Nye

C. D.

Rong

James

Fritz

(2012). Vocational interests and performance: A quantitative summary of over 60 years of research. Perspectives on Psychological Science, 7, 384–403. doi:10.1177/1745691612449021

29.

O*NET. (2007). O*NET-SOC 2000 occupations. Retrieved August 5, 2007, from http://online.onetcenter.org/find/career?%20c=15&g=Go

30.

Öner

(1990). Sinav Kaygisi Envanteri El Kitabi. (1. Baski). Istanbul: Yöret Vakfi Yayini.

31.

Ott-Holland

C. J.

Huang

J. L.

Ryan

A. M.

Elizondo

F. E.

Wadlington

P. L.

(2013). Culture and vocational interests: The moderating role of collectivism and gender egalitarianism. Journal of Counseling Psychology, 60, 569–581. doi:10.1037/a0033587

32.

R Core Team. (2016). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from https://www.R-project.org/

33.

Reise

S. P.

Morizot

Hays

R.D.

(2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16, 19–31. doi:10.1007/s11136-007-9183-7

34.

Reise

S. P.

(1990). Parameter recovery in the graded response model using MULTILOG. Journal of Educational Measurement, 27, 133–144. doi:10.1111/j.1745-3984.1990.tb00738.x

35.

Samejima

(1997). Graded response model. In van der Linden

W. J.

Hambleton

R. K.

(Eds.), Handbook of modern item response theory (pp. 85–100). New York, NY: Springer-Verlag.

36.

Satorra

Bentler

P. M.

(1988). Scaling corrections for chi-square statistics in covariance structure analysis. ASA 1988 Proceedings of the Business and Economic Statistics Section. Alexandria, VA: American Statistical Association, 308–313.

37.

Smith

A. E.

Dengiz

(2009). Women in engineering in Turkey: A large scale quantitative and qualitative examination. European Journal of Engineering Education, 35, 1–13. doi:10.1080/03043790903406345

38.

Spaeth

J. L.

(1979). Vertical differentiation among occupations. American Sociological Review, 44, 746–762. Retrieved from http://www.jstor.org/stable/2094526

39.

Rounds

Armstrong

P. I.

(2009). Men and things, women and people: A meta-analysis of sex differences in interests. Psychological Bulletin, 135, 859–884. doi:10.1037/a0017364

40.

Tay

Drasgow

Rounds

Williams

B. A.

(2009). Fitting measurement models to vocational interest data: Are dominance models ideal? Journal of Applied Psychology, 94, 1287. doi:10.1037/a0015899

41.

Toker

(2010). Non-ability correlates of the science/math trait complex: Searching for personality characteristics and revisiting vocational interests (Doctoral dissertation). Georgia Institute of Technology , Atlanta, GA.

42.

Toker

Ackerman

P. L.

(2012). Utilizing occupational complexity levels in vocational interest assessments: Assessing interests for STEM areas. Journal of Vocational Behavior, 80, 524–544. doi:10.1016/j.jvb.2011.09.001

43.

Tracey

T. J. G.

(2002). Personal globe inventory: Measurement of the spherical model of interests and competence beliefs. Journal of Vocational Behavior, 60, 113–172. doi:10.1006/jvbe.2001.1817

44.

Tracey

T. J. G.

Robbins

S. B.

(2006). The interest-major congruence and college success relation: A longitudinal study. Journal of Vocational Behavior, 69, 64–89. doi:10.1016/j.jvb.2005.11.003

45.

Tracey

T. J. G.

Watanabe

Schneider

P. L.

(1997). Structural invariance of vocational interests across Japanese and American cultures. Journal of Counseling Psychology, Special Section: Cross-Cultural Career Psychology, 44, 346–354.

46.

U.S. Department of Labor. (1991). Dictionary of occupational titles (4th ed.). Washington, DC: U.S. Government Printing Office.

47.

Van Iddekinge

C. H.

Putka

D. J.

Campbell

J. P.

(2011). Reconsidering vocational interests for personnel selection: The validity of an interest-based selection test in relation to job knowledge, job performance, and continuance intentions. Journal of Applied Psychology, 96, 13–33. doi:10.1037/a0021193

48.

Van Iddekinge

C. H.

Roth

P. L.

Putka

D. J.

Lanivich

S. E.

(2011). Are you interested? A meta-analysis of relations between vocational interests and employee performance and turnover. Journal of Applied Psychology, 96, 1167–1194. doi:10.5465/AMBPP.2011.65869729

49.

Zumbo

B. D.

(1999). A handbook on the theory and methods of Differential Item Functioning (DIF): Logistic regression modeling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa, Ontario: Directorate of Human Resources Research and Evaluation, Department of National Defense.

Scale	N ^a	Discrimination Parameter			Factor Loadings
Scale	N ^a	Range	Mean	Median	Range	Mean	Median
Numeric
Low	5	0.484–2.020	1.035	0.755	.274–.765	.475	.406
Moderate	12	1.083–3.273	2.010	1.884	.537–.887	.734	.741
High	5	0.787–3.263	2.369	2.567	.420–.887	.758	.834
Symbolic
Low	4	1.598–1.978	1.804	1.820	.685–.758	.726	.730
Moderate	14	1.601–2.722	2.290	2.340	.686–.848	.798	.809
High	9	1.728–2.528	2.039	2.008	.713–.830	.765	.763
Spatial
Low	4	0.832–1.671	1.167	1.082	.440–.701	.553	.536
Moderate	8	1.262–3.503	2.070	1.883	.596–.900	.741	.742
High	8	1.358–3.152	2.430	2.641	.624–.880	.803	.841
STEM ideas
Low	1	2.492	—	—	.826	—	—
Moderate	10	1.613–2.521	2.102	2.034	.697–.829	.771	.767
High	15	1.681–2.584	2.110	2.118	.703–.835	.774	.780
General STEM
Low	2	1.242–1.620	1.430	—	.590–.690	.640	—
Moderate	9	1.928–3.114	2.611	2.526	.750–.878	.833	.830
High	4	2.629–3.127	2.850	2.821	.840–.879	.858	.856
Reverse-coded items	12	0.260–0.982	0.810	0.807	.151–.500	.429	.429