Evaluating Construct Validity of the Middle School Self-Efficacy Scale With High School Adolescents

Abstract

This study presents an exemplar for psychometric evaluation and modification of established measures when applied to new populations. Specifically, we describe the use of two subscales (Career Decision-Making Self-Efficacy Scale and Math/Science Self-Efficacy Scale) from the Middle School Self-Efficacy Scale (MSSES) as outcome measures in an intervention study of high school students. Several researchers have utilized the MSSES with high school students since it was developed by Fouad, Smith, and Enochs 20 years ago, but few studies have examined it for construct validity with a high school sample, even though the measure was designed for a middle school population. Our findings demonstrated that the MSSES required modification for high school students in order to meet the standards of reliability and validity in a counseling intervention study. The discussion focuses on implications for career counseling and research, limitations of the findings, and suggestions for future research.

Keywords

career decision-making, career-related self-efficacy, high school students, psychometric evaluation

Career counseling researchers recommend that counseling research not be published unless researchers have “minimal indications of the reliability and validity of the instrument on which the research is based” (Zytowski & Betz, 1972, p. 78) and recommend construct validity as the most feasible method for establishing new and existing instruments in counseling research (Oliver, 1979). In particular, only valid and reliable scores of outcome measures indicate effectiveness of a planned intervention in experimental design research. Throughout the research community, investigators agree that using an existing measure in a contextually different manner (e.g., with a new population) requires a pilot study of reliability and validity before employing that measure in a new study (Kimberlin & Winterstein, 2008; Switzer, Wisniewski, Belle, Dew, & Schultz, 1999). Modifications to an existing measure may be necessary if the original validity study for an outcome measure employed a different sample than the current research (Stewart, Thrasher, Goldberg, & Shea, 2012).

The purpose of this research is to present an exemplar validity study of an established measure in career counseling, the Middle School Self-Efficacy Scale (MSSES; Fouad, Smith, & Enochs, 1997) using a sample of high school students. Results of this study were used to modify the MSSES for an intervention study using a similar sample (Falco & Summers, 2017). The following sections describe the need for measures specific to social cognitive career theory (SCCT), foundations for developing the MSSES with previous reliability/validity information, the importance of testing validity on existing measures in new populations, and the need for assessment studies whenever modifications on existing measures are made.

SCCT

Self-efficacy refers to individuals’ perceptions about their capabilities for learning or performing tasks within specific domains. Since Bandura (1977, 1997) introduced the construct of self-efficacy, researchers have explored its role in various contexts including career development. In social cognitive theory, self-efficacy is said to influence behaviors and environments and, in turn, is influenced by them (Bandura, 1986, 1997). Students with strong self-efficacy are more likely to set goals and create adaptive learning environments for themselves. Likewise, self-efficacy can be influenced by the outcomes of such behaviors (e.g., goal progress, achievement) and by input from the environment (e.g., feedback from teachers, social comparisons with peers). Bandura (1997) theorized that people acquire their self-efficacy beliefs from four sources: interpretations of performances, vicarious (modeled) experiences, social (verbal) persuasion, and physiological indexes (emotional arousal). It is generally agreed that even very young children differentiate their beliefs of competence and task value in different domains of functioning (e.g., Eccles et al., 1993; Marsh, Craven, & Debus, 1991). Studies with middle and high school students often assess students’ motivational orientations toward specific academic domains, with an understanding that they hold more or less differentiated perceptions toward these areas. High school students have more academic experience, which can help them attune to the demands and possibilities of each domain and, in turn, contribute to finer differentiation between domains. In particular, as a result of their greater concern with future college majors and career choices, high school students are believed to hold more differentiated task-value beliefs than middle school students (Bong, 2001).

Bandura’s (1997) self-efficacy construct, situated within social cognitive theory, is considered to have broad and important implications for career development theory and counseling (Betz, 2000). It has generated a great deal of research on career behavior in large part because of its explanatory potential. For example, Hackett and Betz (1981) first used self-efficacy theory, within a career development context, to explain women’s avoidance of math and science careers. Their subsequent studies (e.g., Betz & Hackett, 1981, 1983) oriented researchers to the ways in which self-efficacy can be used to understand, more broadly, individuals’ vocational behaviors.

Similar to Bandura’s (1986) social cognitive theory, SCCT (Lent, Brown, & Hackett, 1994) posits that self-efficacy is developed through learning experiences that interact with person/environment variables such as gender, ethnicity, social support, and barriers. Within SCCT, individuals’ choices to pursue or avoid certain academic coursework and careers can be understood as the interplay between their self-efficacy beliefs, outcome expectations, and interests. The theory proposes an interactional influence of external/environmental factors and individual/cognitive variables on individuals’ career development. Within this model, one’s background influences one’s learning experiences, which influence self-efficacy. Self-efficacy shapes one’s interests and outcome expectations, which ultimately influence career choice (Lent, Brown, & Hackett, 2000).

The concept of self-efficacy has important theoretical, empirical, and practical applications because of the nomological network in which it is embedded (Cronbach & Meehl, 1955). Specifically, self-efficacy expectations have at least three behavioral outcomes (approach or avoidance, performance, and persistence), and it is these outcomes that make perceived self-efficacy an important explanatory construct (Betz, 2000; Klassen & Usher, 2010). By focusing on the relationship between cognitions, behaviors, and other environmental factors that, presumably, are malleable and responsive to intervention, SCCT provides researchers and practitioners a heuristic for both understanding and assessing the core constructs. Because of its precise nomological network, SCCT provides a mechanism for (1) identifying specific variables (such as self-efficacy) for intervention and (2) identifying specific outcomes (such as career choice behaviors) for assessment.

In this vein, SCCT has received much attention in the literature because of its prominent role in the implementation of career development interventions and the assessment of such interventions (Betz, 2004; Betz & Luzzo, 1996). Several studies evaluating career or vocational guidance interventions have demonstrated increases in self-efficacy by addressing one or more of the sources of self-efficacy originally proposed by Bandura (1997) including verbal persuasions, previous performance accomplishments, vicarious or observational learning, and emotional arousal (see Betz & Schifano, 2000; Falco & Summers, 2017; Hackett & Betz, 1981; Luzzo, Hasper, Albert, Bibby, & Martinelli, 1999; Sullivan & Mahalik, 2000).

However, as Betz and Hackett (2006) caution, researchers and practitioners interested in the application of SCCT must fully understand its meaning and implications, particularly with regard to measuring its constructs. This suggests that any measure attempting to assess constructs within the SCCT must consider the domain specificity of the behaviors being examined while also considering traditional methods of evaluation such as factor structure, internal consistency, and construct validity based on such concepts as Cronbach and Meehl’s (1955) nomological network. It is important to keep in mind that a score is valid if what was intended to be measured was in fact measured. Validity refers to the degree to which evidence supports the inferences that are drawn from the measurement instruments or procedures (e.g., interventions) themselves. Therefore, a particular score derived from an established measure may be valid for one purpose or population but have little or no validity for another (Gullickson & Howard, 2009; Yarbrough, Shula, Hopson, & Caruthers, 2010). For the purposes of our own intervention study (Falco & Summers, 2017), we wished to use Fouad, Smith, and Enochs’s (1997) MSSES to investigate change over time and between an experimental and control group of high school girls. Girls were particularly salient in our evaluation of the MSSES because, compared to boys, their self-efficacy for learning STEM content and pursuing STEM careers tends to decline in adolescence (Bandura, Barbaranelli, Caprara, & Pastorelli, 2001). Our intervention was specifically designed to support and increase girls’ sense of self-efficacy for STEM (Falco & Summers, 2017). However, the MSSES had not been evaluated for construct validity among high school samples, even though several studies have used the measure with similar samples. The following section provides a review of the MSSES and a rationale for conducting further validity testing with an established measure.

The MSSES

The MSSES (Fouad et al., 1997) is widely used in career counseling interventions for adolescents and was originally developed to assess a career-related self-efficacy intervention for Hispanic and Latino middle school students. The instrument consists of 46 scale-response items total with two subscales (24 items) designed to specifically measure aspects of self-efficacy: career decision-making self-efficacy (CDMSE), a process variable (12 items), and math/science (STEM; Science, Technology, Engineering, and Math) self-efficacy (MSSE), a content variable (12 items). Responses are obtained using a 5-point Likert-type scale asking students to rate the degree to which they agree or disagree with a series of statements ranging from strongly agree (1) to strongly disagree (5).

When Fouad and her colleagues (1997) first developed and established the MSSES, items for CDMSE were modified from the CDMSE Scale (CDMSES; Taylor & Betz, 1983). Items in the Math and Science Self-Efficacy subscale followed the format used in the Math Tasks subscale of the Mathematics Self-Efficacy Scale (Betz & Hackett, 1983; Lent, Lopez, & Bieschke, 1993). A reliability and validity analysis was first conducted separately for the process items (which included the CDMSE) and the content items (which included the MSSE) to determine their initial factor structure. Fouad and her colleagues concluded that all 12 items from the CDMSE formed a distinct factor, while only the math items from the MSSE held up under the scrutiny of a validity test. Second, the remaining items of the process and content scales were combined to determine whether they represented distinct constructs, and the results indicated that they did. Criterion validity was established by calculating subscale means and applying the instrument to an intervention. Fouad and her colleagues claim that their study demonstrated adequate reliability and validity of their instrument and that these scales measure outcomes of intervention programs designed to promote career decision-making and math/science career awareness among middle school students, particularly for female and minority students. It appears that the three recommended steps for establishing construct validity of their instrument were followed using structural validity techniques (Clark & Watson, 1995; Cronbach & Meehl, 1955): “(a) articulating a set of theoretical concepts and their interrelations, (b) developing ways to measure the hypothetical constructs proposed by the theory, and (c) empirically testing the hypothesized relations among constructs and their observable manifestations” (Clark & Watson, 1995, p. 310).

Since the publication of the MSSES, other researchers have used it in a myriad of studies with varied selections of items and samples (see Table 1). By conducting a search of published articles and dissertations citing Fouad et al. (1997), we found that 10 studies used some version of the CDMSE: 6 with middle school samples, 1 with a combined middle and high school sample, and 3 with high school samples. We found that six studies used some version of the MSSE: four with middle school samples, one with a high school sample, and one with a college sample. Only one study used both the CDMSE and the MSSE, with a middle school sample. These studies are summarized in Table 1, which provides a description of the items, samples, reliability estimates, and validity evidence. We chose to use the MSSES because it was designed to measure STEM career self-efficacy using SCCT as a framework. Our intervention was designed to support STEM career self-efficacy of high school girls incorporating the four sources of self-efficacy while also addressing perceived barriers, including gender issues in career development (Falco & Summers, 2017). Low self-efficacy and career indecision, together, may create important psychological barriers to girls’ choice and persistence in career decision-making, especially for traditionally male-dominated occupations in STEM. At the time of the intervention, the MSSES was established and widely used, and it seemed to be the best measure available to tap these constructs (MSSE and career decision self-efficacy [CDSE]) for adolescents.

Table 1.

Summary of Research Using Career Decision-Making Self-Efficacy (CDSE), Math/Science Self-Efficacy (MSSE), or Both From the Middle School Self-Efficacy Scale.

Citations	Items Used (of 12)	Population	Location/Sample	n	Reliability	Validity
Used all or part of CDSE
Arulmani, Van Laar, and Easton (2001)	All	High school boys	Bangalore, South India	755	No	No
Creed, Tilbury, Buys, and Crawford (2011)	Nine items	Adolescents (13–18)	Queensland, Australia—in out-of-home care (80% White)	202	0.89	Factor analysis indicated one factor accounting for 55% of variance
Hamel (2014)	All	Middle school	Midwest, 44% White	52	0.83 (all 22 items of Part I)	No
Keller and Whiston (2008)	All	Middle school	Mostly White	192	0.77	No
Macht Jantzer, Stalides, and Rottinghaus (2009)	All	Middle school	Rural midwest, 67% White	820	0.81	No
Ojeda et al. (2012)	All	Middle school	Texas, Latino	338	0.74	No
Olle and Fouad (2015)	All	High school	Midwestern city, 78% Latino	137	0.85	No
Sawitri, Creed, and Zimmer-Gembeck (2014)	All	High school	Indonesia	351	0.78	Used item parceling to reduce the number of indicators per factor (3) before testing model fit followed by exploratory factor analysis and confirmatory factor analysis
Sickinger (2013)	All	Middle school	New England, mostly White	200	0.79	No
Turner, Alliman-Brissett, Lapan, Udipi, and Ergun (2003)	All	Middle school	Midwestern city, 34% Native American, 31% Black, and 23% White	293	0.88	No
Used part or all of MSSE
Ferry, Fouad, and Smith (2000)	Six items (math only)	College	Midwestern, mostly White (85%)	791	0.81	No
Garcia (2012)	All	High school	Orange County, CA, Latino	317	0.88	No
Jackson (2014)	Six items (science only)	Middle school girls	Texas, Latina	90	0.67	No
Mueller, Hall, and Miro (2015)	Seven items	Middle school	Midsouth, 45% Black	104	0.80	No
Navarro, Flores, and Worthington (2007)	All	Middle school	Texas, Latino	426	0.86	No
Song (2004)	All	Middle school	Midwest	90	0.81	No
Used both CDSE and MSSE
Howard et al. (2012)	All	Middle school	Midwest	Six schools	No	No

As is evident from Table 1, most of the published studies and dissertations reported reliability evidence for their samples, but few conducted a validity analysis on self-efficacy scores from the MSSES, even when their sample demographics differed from those of Fouad et al. (1997). According to the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014), Standard 1.4 states,

If a test score is interpreted for a given use in a way that has not been validated, it is incumbent on the user to justify the new interpretation for that use, providing a rationale and collecting new evidence, if necessary (p. 24).

However, many of the studies in Table 1 simply cited the evidence from the Fouad et al. study as evidence of validity. Even Fouad and Guillen (2006) recommend that because there have been many adaptations of the Fouad et al. (1997) instrument without replication of the studies, the reliability and validity of the adapted instruments need to be further evaluated.

Evaluating Validity of Existing Measures for New Populations

Consistent with the concept of measurement evolution is the perspective that validity is not a property of a test or measure, but of a measure tested under a particular set of conditions (Messick, 1995; Sechrest, 2005). This perspective holds that because validity pertains to understanding the meaning of scores, construct validity can only be established incrementally based on the accumulation of evidence on how the measure relates to other measures (Sechrest, 2005). Well-designed modifications thus contribute to enhancing the validity of measures in new and diverse populations. According to Zumbo (2006), “construct validity involves generalizing from our behavioral or social observations to the conceptualization of our behavioral or social observations in the form of the construct” (p. 49). Tests of construct validity are strong when they are based on well-articulated theory and well-planned empirical tests. When trying to establish (or reestablish) construct validity of a measure, factor analytic techniques are useful in determining whether a group of items hypothesized to assess a construct actually do cluster together when they are analyzed with items from other scales and whether items within a measure describe a unified construct (Cronbach & Meehl, 1955). Specific procedural recommendations for establishing construct validity in survey research are as follows (Bohrnstedt, 2010, p. 377):

Do an exploratory factor analysis (EFA) of all items in an initial pool.

Retain enough factors (m) to explain the covariation among the items using fit statistics as a guide.

When m > 1, examine both the rotated and unrotated solutions to determine whether factors beyond the first are substantively meaningful or unwanted “nuisance” factors.

Remove items that are poorly related to every factor or that clearly represent more than one domain.

Refactor the remaining items using confirmatory factor analysis (CFA) to verify that they are congeneric or near congeneric.
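The item-removal step in the list above can be sketched in a few lines. This is a minimal illustration, not the procedure used in this study: it assumes a magnitude cutoff of .30 for a "salient" loading (a hypothetical value chosen for the example, whereas the analyses reported below rely on significance tests), and it takes as input the EFA pattern loadings printed in Table 3. The function name `screen_items` is likewise illustrative.

```python
# EFA pattern loadings (self-appraisal, exploring options) as printed in Table 3.
LOADINGS = {
    1: (0.519, 0.034),
    2: (0.830, -0.059),
    3: (0.819, 0.072),
    4: (0.893, -0.012),
    5: (0.854, -0.022),
    6: (0.671, -0.145),
    7: (-0.072, 0.656),
    8: (-0.098, 0.060),
    9: (-0.074, 0.713),
    10: (-0.056, 0.765),
    11: (0.121, 0.616),
    12: (0.002, 0.587),
}

SALIENT = 0.30  # hypothetical cutoff for illustration, not the article's criterion


def screen_items(loadings, cutoff=SALIENT):
    """Keep items with exactly one salient loading; report why others are dropped."""
    retained, dropped = {}, {}
    for item, row in loadings.items():
        salient = [k for k, value in enumerate(row) if abs(value) >= cutoff]
        if len(salient) == 1:
            retained[item] = salient[0]  # index of the factor the item belongs to
        elif not salient:
            dropped[item] = "no salient loading"
        else:
            dropped[item] = "cross-loading"
    return retained, dropped


retained, dropped = screen_items(LOADINGS)
print("retained:", retained)
print("dropped:", dropped)
```

Under this cutoff only Item 8 is flagged, and the remaining items split into the same two groups that the CFA later treats as separate factors. A different cutoff, or a significance-based rule, could flag a different set of items.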

If a measure does not perform in the above steps as expected, it behooves researchers to make modifications to the measure to meet the standards of validity and reliability while maintaining theoretical integrity of the instrument. According to Stewart, Thrasher, Goldberg, and Shea (2012), publishing assessment studies of measure modifications would yield a great deal of information on how various modifications affect the reliability and validity of measures in new populations and would point to new strategies and methods for test and measurement assessment. Stewart and colleagues suggest that researchers provide details of the modification and its assessment in a separate methods paper by reporting (1) features of the original measure that required modification, (2) the source of information on the basis for modifications, (3) the specific type of modification made, and (4) how the modified measure was tested for psychometric adequacy, along with the results.

The recommendation in measurement research is to test for the applicability of existing measures for a new sample by following the aforementioned process (EFA followed by CFA), as well as testing factor invariance across groups (Sass, 2011; Van de Schoot, Lugtig, & Hox, 2012). Examining measurement invariance involves evaluation of the latent variable model underlying a set of test scores and testing for numerical equality across groups. Latent variables are the explicit definitions of psychological constructs (Byrne, Shavelson, & Muthén, 1989; Widaman & Reise, 1997). When researchers are concerned only with the extent to which an instrument is equivalent across independent samples, measurement equivalence generally focuses solely on the invariant operation of the items and, in particular, on the factor loadings (Byrne, 2012). Should a researcher be interested in subsequently testing for latent factor mean differences, then tests for measurement equivalence must include a test for the equality of the observed variable intercepts, as such equality is assumed in tests for factor mean differences. This was an important consideration in our analysis because many researchers listed in Table 1 have tested for gender differences using the MSSES, but it has not been established that the measure is valid for boys and girls using factor loading invariance and/or tests for latent factor mean differences. Therefore, our research set out to test the structural validity of the MSSES with a sample of high school students in the hope that the results would be applicable in a separate intervention study with a similar sample (Falco & Summers, 2017).

Method

Participants

Paper-and-pencil surveys were administered to 368 tenth graders attending a medium-sized, public high school in southeastern Arizona (47% male; 31% White, 27% Latino/a, 24% Asian American, 1% Native American, 1% African American, and 13% mixed race or Other). Data were collected anonymously by the school counselor. The sample was randomly divided into two groups so that we could run exploratory factor analyses on a sample separate from the CFAs (see Table 2 for descriptive data of each sample).

Table 2.

Sample Demographics.

Sample	Random Sample A (n = 182)	Random Sample B (n = 186)	Total (N = 368)
Ethnicity
White	57	58	115
African American	2	4	6
Hispanic	49	51	100
Asian American	45	42	87
Native American	2	2	4
Other	24	25	49
Gender
Male	86	88	174
Female	96	98	194
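The random division into Sample A and Sample B can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the participant IDs, the seed, and the function name `split_sample` are hypothetical, and only the group sizes (182 and 186) come from Table 2.

```python
import random


def split_sample(participant_ids, n_first, seed=2017):
    """Shuffle IDs reproducibly and return two disjoint subsamples."""
    rng = random.Random(seed)  # arbitrary seed, fixed so the split can be repeated
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[:n_first]), sorted(shuffled[n_first:])


ids = range(1, 369)  # hypothetical IDs for the 368 respondents
sample_a, sample_b = split_sample(ids, n_first=182)
print(len(sample_a), len(sample_b))  # 182 186
```

Keeping the exploratory and confirmatory subsamples disjoint is what allows the CFA to serve as a check on the EFA solution rather than a refit of the same data.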

Measures

CDMSE

Fouad et al. (1997) adapted their measure of career decision-making for middle school students from the items originally developed for adult populations by Taylor and Betz (CDMSE: Taylor & Betz, 1983; CDMSES–Short Form: Betz, Klein, & Taylor, 1996), particularly as they relate to the “process items” in Part I of the Fouad et al. (1997) scale. Items from this and all other subscales begin with the prompt, “Please indicate the degree to which you agree or disagree that you could do each statement below by writing the appropriate number to the right of each statement” using the following rating system: 1 = very high ability, 2 = high ability, 3 = uncertain, 4 = low ability, and 5 = very low ability (item values were reversed after scoring such that a higher value indicated higher self-efficacy). We specifically administered the 12 self-efficacy items of the CDMSE measure, which had an overall reliability estimate of α = .77 for our initial sample (n = 182). See Table 3 for a complete list of items.
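The reversal of item values mentioned above amounts to swapping the endpoints of the 5-point scale. A minimal sketch, with the function name `reverse_score` chosen only for illustration:

```python
def reverse_score(response, low=1, high=5):
    """Reverse a Likert response so that the scale endpoints swap."""
    if not low <= response <= high:
        raise ValueError(f"response {response} is outside {low}-{high}")
    return low + high - response


raw = [1, 2, 3, 4, 5]  # 1 = very high ability ... 5 = very low ability
recoded = [reverse_score(r) for r in raw]
print(recoded)  # [5, 4, 3, 2, 1]
```

After recoding, a response of 5 corresponds to "very high ability," so higher subscale scores indicate higher self-efficacy.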

Table 3.

Descriptive Statistics, Standardized Loadings for Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA), and Solution Uniqueness for the Career Decision-Making Self-Efficacy Scale.

Items: Middle School Career Decision-Making Self-Efficacy (Indicate the Degree to Which You Agree or Disagree That You Could Do Each Statement)	Descriptive Statistics				EFA Solution		CFA Solution
	M	Variance	Skew	Kurt	Self- Appraisal	Exploring Option	Factor Loadings		Solution Uniqueness
	M	Variance	Skew	Kurt	Self- Appraisal	Exploring Option	Male	Female	Male	Female
1. Find information in the library about five occupations that I am interested in	2.807	1.239	0.122	−0.668	.519*	.034	.735***	.396***	.540	.157
2. Make a plan of my educational goals for the next 3 years	2.824	1.859	0.151	−1.236	.830*	−.059	.909***	.603***	.825	.364
3. Select one occupation from a list of possible occupations I am considering	2.978	1.251	0.116	−0.780	.819*	.072	.872***	.503***	.761	.253
4. Determine what occupation would be best for me	2.989	1.654	0.084	−1.118	.893*	−.012	.938***	.710***	.879	.503
5. Decide what I value most in an occupation	2.944	1.864	0.074	−1.270	.854*	−.022	.929***	.635***	.863	.403
6. Resist attempts of parents or friends to push me into a career that I believe is beyond my abilities or not for me	2.899	1.979	0.143	−1.232	.671*	−.145*	.732***	.367***	.536	.134
7. Describe the job skills of a career I might like to enter	4.056	0.824	−1.052	1.320	−.072	.656*	.797***	.732***	.635	.536
8. Choose a career in which most workers are the opposite sex	3.339	0.768	0.028	0.130	−.098	.060
9. Choose a career that will fit my interests	4.461	0.660	−1.834	3.738	−.074	.713*	.872***	.814***	.760	.663
10. Decide what kind of schooling I will need to achieve my career goal	4.122	0.918	−0.890	0.280	−.056	.765*	.920***	.865***	.847	.749
11. Find out the average salary of people in an occupation	3.950	0.919	−0.736	0.300	.121	.616*	.801***	.704***	.642	.496
12. Talk with a person already employed in a field I am interested in	4.128	0.945	−1.129	1.106	.002	.587*	.785***	.723***	.616	.523

Note: Gray shadings indicate items retained for the current study. *p < .05; **p < .01; ***p < .001
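The values printed under "Solution Uniqueness" in Table 3 agree, to within rounding, with the squared standardized CFA loadings, which is the R² that the Results section quotes for individual items. The sketch below checks that reading for three items; treating the column as R² is an interpretation of the table rather than something the table states, and the helper names are illustrative.

```python
def r_squared(std_loading):
    """Share of item variance explained by its factor (standardized solution)."""
    return std_loading ** 2


def uniqueness(std_loading):
    """Share of item variance left unexplained, i.e. 1 - R^2."""
    return 1.0 - r_squared(std_loading)


# (item, group, standardized loading, value printed in Table 3)
PRINTED = [
    (1, "male", 0.735, 0.540),
    (1, "female", 0.396, 0.157),
    (4, "male", 0.938, 0.879),
    (4, "female", 0.710, 0.503),
    (9, "male", 0.872, 0.760),
    (9, "female", 0.814, 0.663),
]

for item, group, loading, printed in PRINTED:
    computed = r_squared(loading)
    # Loadings are themselves rounded to three decimals, so allow a small tolerance.
    print(item, group, round(computed, 3), printed, abs(computed - printed) < 0.002)
```

On this reading, Item 1 explains roughly 16% of the variance in girls' responses, which leaves about 84% unexplained.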

MSSE

Fouad et al. (1997) wrote the MSSE items in a format similar to that of the Math Tasks subscale of the Mathematics Self-Efficacy Scale (Betz & Hackett, 1983; Lent et al., 1993); the subscale was intended to measure MSSE beliefs for middle school–related tasks. Students were asked to rate their level of agreement/disagreement for each item using the following rating system: 1 = strongly agree, 2 = agree, 3 = uncertain, 4 = disagree, and 5 = strongly disagree. We specifically administered the 12 self-efficacy items of the MSSE, which had an overall reliability estimate of α = .86 for our initial sample (n = 182). See Table 4 for a complete list of items.
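The reliability estimates reported for both subscales are Cronbach's α, which compares the sum of the item variances with the variance of the total score. The sketch below implements the standard formula; the toy responses are hypothetical and are not data from this study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, with no missing values."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses (4 respondents x 3 items), for illustration only.
toy = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 4, 4],
]
print(round(cronbach_alpha(toy), 3))
```

Because α depends on the sample's item variances and covariances, it needs to be re-estimated whenever items are dropped or the population changes, which is the situation this study addresses.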

Table 4.

Descriptive Statistics, Standardized Loadings for Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA), and Solution Uniqueness for the Math/Science Self-Efficacy Scale.

Items: Middle School Math/Science Self-Efficacy (Indicate the Degree to Which You Agree or Disagree That You Could Do Each Statement)	Descriptive Statistics				EFA Solution		CFA Solution
	M	Variance	Skew	Kurt	Grades	Content	Factor Loadings		CFA Solution Uniqueness
	M	Variance	Skew	Kurt	Grades	Content	Male	Female	Male	Female
1. Earn an A in math	3.503	1.401	−.462	−.746	1.025*	−.162*	.467***	.918***	.218	.843
2. Earn an A in science	3.928	1.000	−.955	.691	0.487*	.198*	.380***	.386***	.144	.149
3. Get an A in math in high school	3.461	1.537	−.453	−.770	0.922*	−.028	.962***	.914***	.925	.835
4. Get an A in science in high school	3.806	1.168	−.771	.067	0.472*	.267*				.503
5. Determine the amount of sales tax on clothes I want to buy	3.358	1.157	−.316	−.393	0.194*	.261*
6. Collect dues and determine how much to spend for a school club	3.316	1.109	−.280	−.411	0.120	.411*	.773***	.703***	.598	.495
7. Figure out how long it will take to travel from Milwaukee to Madison, driving at 55 mph	3.244	1.585	−.250	−.918	0.121	.626*	.700***	.631***	.490	.398
8. Design and describe a science experiment that I want to do	3.358	1.275	−.308	−.599	−0.082	.645*	.752***	.613***	.566	.376
9. Classify animals that I observe	3.661	1.135	−.644	−.104	−0.192*	.765*	.647***	.567***	.418	.321
10. Predict the weather from weather maps	3.056	1.267	−.110	−.520	0.163*	.739*	.763***	.602***	.581	.362
11. Construct and interpret a graph of rainfall amounts by state	3.045	1.481	−.067	−.816	−0.045	.778*	.804***	.720***	.646	.519
12. Develop a hypothesis about why kids watch a particular TV show	3.525	1.136	−.501	−.426	−0.052	.603*	.792***	.615***	.550	.379

Note: Gray shadings indicate items retained for the current study. *p < .05; **p < .01; ***p < .001

Analyses

CFAs of an instrument are most appropriately applied to measures that have been fully developed and their factor structures validated (Byrne, 2012). However, Fouad et al.’s original factor structure was tested and recommended for middle school students who were predominantly Latino/a, even though it has been used widely for other samples (see Table 1). For our purposes, we wanted to validate the measure for 10th-grade students attending public high school in a midsize southwestern city, since our intervention study was conducted with this population (see Falco & Summers, 2017). Therefore, we decided to begin with an EFA for self-efficacy items on a separate sample before conducting a CFA, a method commonly used in scale development research to establish construct validity (Worthington & Whittaker, 2006). Once the factor structure was established with CFA, we followed up with a test of structural invariance (including latent factor mean invariance) by comparing the equivalence of responses for boys and girls, then selected items with structural validity for girls only, since this was the population used in our intervention (Falco & Summers, 2017).

Results

Exploratory Factor Analysis

From the full sample of 368 tenth graders, we randomly selected roughly half (182: 86 boys and 96 girls) to conduct the two exploratory factor analyses, the first for 12 items from the CDMSE and the second for 12 items from the MSSE. For each test, we tried a one- and two-factor model solution in Mplus version 8 (Muthén & Muthén, 2012) based on the number of eigenvalues >1 accounted for by the data (Worthington & Whittaker, 2006). The data were normally distributed for each of the two scales, so we used maximum likelihood estimation, applied oblimin rotation (because several items appeared to be correlated), and analyzed each model for goodness of fit (see Tables 3 and 4). The two-factor model fit the data best for both the CDMSE and the MSSE, with a significant χ² difference test between the one-factor and two-factor models: CDMSE, Δχ²(11) = 240.191, p < .001; MSSE, Δχ²(11) = 255.410, p < .001. Item 8 from the CDMSE did not load significantly on either factor, so it was not included in subsequent analyses. Items 4 and 5 from the MSSE cross-loaded significantly on both factors, so they were not included in subsequent analyses. Upon initial examination, the exploratory factor structure for our sample differs from that of Fouad et al. (1997), who found that all 12 CDMSE items loaded on one factor and that only the math items (6 items) from the MSSE loaded on one factor in their CFA (Fouad et al. did not start with an EFA).
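The χ² difference tests reported above can be checked by hand: the difference in χ² between nested models is referred to a χ² distribution whose degrees of freedom equal the difference in model degrees of freedom (11 here). The sketch below computes the upper-tail probability with the standard library only, using the recurrence for the regularized incomplete gamma function. The helper name `chi2_sf` is illustrative, and a statistics package would normally be used instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variate with integer df."""
    if df < 1 or x < 0:
        raise ValueError("df must be >= 1 and x must be >= 0")
    if x == 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting values: df = 1 (odd df) or df = 2 (even df).
    k = 1 if df % 2 else 2
    tail = math.erfc(math.sqrt(half)) if k == 1 else math.exp(-half)
    # Recurrence: Q(x | k + 2) = Q(x | k) + (x/2)^(k/2) * e^(-x/2) / Gamma(k/2 + 1)
    while k < df:
        log_term = (k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0)
        tail += math.exp(log_term)
        k += 2
    return tail


for label, delta in (("CDMSE", 240.191), ("MSSE", 255.410)):
    p = chi2_sf(delta, 11)
    print(f"{label}: delta chi-square(11) = {delta}, p < .001: {p < 0.001}")
```

Both differences fall far beyond the .001 critical region, consistent with the conclusion that the two-factor solutions fit better than the one-factor solutions.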

CFA With Tests for Invariance

Using the items retained from the EFA, we conducted separate CFAs for the CDMSE and the MSSE, with two separate factors each. For the CDMSE, we named the factors “efficacy for self-appraisal” for Items 1–6 and “efficacy for exploring options” for Items 7 and 9–12. For the MSSE, we named the factors “math/science grades self-efficacy” for Items 1–3 and “math/science content self-efficacy” for Items 6–12. All items loaded significantly (see Tables 3 and 4), and there were no suggested constraints to be released for boys and girls. Neither model met acceptable criteria for good fit: a comparative fit index (CFI) ≥ .90, a Tucker–Lewis index (TLI) ≥ .90, a standardized root mean square residual (SRMSR) ≤ .08, a root mean square error of approximation (RMSEA) ≤ .05, and a nonsignificant χ² (Browne & Cudeck, 1992; Hu & Bentler, 1999; Kline, 2015): CDSE, χ²(86) = 229.788, p < .001, CFI = .804, TLI = .795, RMSEA = .134 (90% confidence interval [CI] = [.114, .156]), SRMSR = .099; MSSE, χ²(106) = 200.656, p < .001, CFI = .867, TLI = .862, RMSEA = .098 (90% CI [.077, .119]), SRMSR = .133. Additionally, the factor we labeled “self-appraisal” from the CDMSE was significantly higher for boys than for girls (standardized estimate of latent mean difference = 3.513, p < .001). Upon examination of each item’s contribution to solution uniqueness, we noticed that Items 1 and 6 from the CDMSE were somewhat low for girls (R² = .157, R² = .134, respectively), and Item 2 from the MSSE was low for boys (R² = .144) and girls (R² = .149). Although these items loaded significantly and were estimated as equivalent (i.e., invariant) between boys and girls, we wanted to use the best and most parsimonious set of items for our own study (which included only girls) and to describe our process if other researchers decide to use modified versions of the MSSES in the future (Worthington & Whittaker, 2006).
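The RMSEA values above can be approximately reproduced from the reported χ² and degrees of freedom. The sketch below uses the multiple-group form of the RMSEA point estimate, which scales the single-group value by the square root of the number of groups, and then applies the cutoffs listed in the text. Two things are assumptions made for this example rather than details stated in the article: that both CFAs used all 186 respondents in Random Sample B, and that boys and girls were modeled as two groups. Function names are illustrative.

```python
import math


def rmsea(chi2, df, n, groups=1):
    """Point estimate: sqrt(max((chi2 - df) / (df * n), 0)) * sqrt(groups)."""
    return math.sqrt(max((chi2 - df) / (df * n), 0.0)) * math.sqrt(groups)


def meets_cutoffs(cfi, tli, rmsea_value, srmr):
    """Apply the fit cutoffs listed in the text to one model."""
    return {
        "CFI": cfi >= 0.90,
        "TLI": tli >= 0.90,
        "RMSEA": rmsea_value <= 0.05,
        "SRMR": srmr <= 0.08,
    }


N_SAMPLE_B = 186  # assumption: both CFAs used all of Random Sample B

models = {
    "CDSE": dict(chi2=229.788, df=86, cfi=0.804, tli=0.795, srmr=0.099),
    "MSSE": dict(chi2=200.656, df=106, cfi=0.867, tli=0.862, srmr=0.133),
}

for name, m in models.items():
    value = rmsea(m["chi2"], m["df"], N_SAMPLE_B, groups=2)
    checks = meets_cutoffs(m["cfi"], m["tli"], value, m["srmr"])
    print(name, round(value, 3), checks)
```

Under these assumptions the computed values round to .134 and .098, matching the reported estimates, and both models miss every cutoff in the list.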

Deciding on Measures for Our Intervention Research

Because our related intervention study was conducted with only girls, we ran a single CFA with just the girls from the second sample, using a modified version of the “exploring options” factor of the CDMSE (retaining 5 items that had strong R² values) and a modified version of the “content self-efficacy” factor of the MSSE.¹ Results indicated that the model had good fit to the data (Hu & Bentler, 1999; Marsh, Hau, & Wen, 2004), and we were confident using this version of the measure in our intervention study of girls attending the same school as our validation sample: χ²(43) = 103.570, p < .001, CFI = .921, TLI = .900, RMSEA = .085 (90% CI [.064, .106]), SRMSR = .071.

Discussion

The American Educational Research Association, American Psychological Association, and National Council on Measurement in Education (2014) recommend that researchers who use an instrument outside of its original intention or with a different population provide a justification for validity. Although the MSSES demonstrated evidence of construct validity for middle school students when it was first developed (Fouad et al., 1997), it has been modified and/or used with dissimilar populations without evidence of validity in several studies. For our own purposes, we wanted to demonstrate that the subscales we used from the MSSES were valid for the diverse population in our intervention study (see Falco & Summers, 2017). By conducting a separate validity study, we hope to provide other researchers with an example of this process. Our results contribute to the career development literature by providing new information on how modifications of the widely used MSSES affect the psychometric qualities of scores derived from the measure, which has implications for its use in intervention studies. Our discussion is therefore organized into two main areas of focus. First, we address the general importance of evaluating and reporting psychometric properties of modified instruments in counseling outcome research. We then address the specific findings of the present study, including the implications for practice and further research.

As Zytowski and Betz (1972) point out, counseling research comprises a number of discrete elements including theory, hypothesis development, design, participants, controls, and measurement. Our need to use and modify the MSSES (Fouad et al., 1997) arose from the need to evaluate the effectiveness of a career development intervention to improve CDMSE and STEM self-efficacy for high school girls. The intervention design utilized SCCT (Lent et al., 1994), and we hypothesized that CDMSE and STEM self-efficacy would increase for girls who participated in the small group intervention compared with girls in a no-treatment control group. In order to evaluate the effectiveness of the intervention for improving these specific outcomes, it was important that we measure them as accurately as possible. We chose the MSSES because it is an established and widely used measure, specifically developed for assessing the effect of intervention programs designed to promote career decision-making and math and science career awareness. Given the need to measure the impact of such interventions in more precise and meaningful ways (Oliver, 1979; Whiston & Quinby, 2009; Zytowski & Betz, 1972), results of career intervention studies can neither advance theory nor improve practice unless they include a more complete picture of both the process of the intervention and the outcome(s) used to gauge its effectiveness.

Because validity is context specific (Switzer et al., 1999), instruments that are valid for one purpose in one context may not be valid in another context. As such, it stands to reason that validity should be viewed as a process of accumulating evidence that supports the meaningfulness of a measure rather than as a discrete end point at which validity is “proven” (Stewart & Ware, 1992). An important point made by Zytowski and Betz (1972) was that information on reliability and validity should be reported routinely on all instruments used in counseling research. It is most efficient to use already developed measures for which reliability and validity have been established, but researchers and practitioners must be aware that, in order to obtain comparable results, it is necessary to use the instrument in the same way and for the same populations as the original researcher(s). Frequently, career counseling outcome studies use only portions of established instruments, something we found in our own review of the MSSES (Fouad et al., 1997; see Table 1). Unless the reliability and validity of a modified instrument can be established, there is no assurance that it measures the same construct or is as reliable as the original instrument. There can also be no assurance that the intervention being evaluated is having the desired effect on the outcome of interest.

For our sample of high school students, we found that the structure of both the CDMSE and the MSSE differed considerably from the structure reported by Fouad et al. (1997). First, we conducted an EFA to see whether the factor structure proposed by Fouad et al. for the middle school sample was similar for our high school sample, even though Fouad et al. did not conduct an EFA in their study. Specifically, the CDMSE appeared to be measuring two constructs for our sample, which we labeled “self-appraisal” and “exploring options.” Perhaps this difference is evident among high school students because they are closer to making professional development decisions than middle school students and are able to make subtle distinctions between evaluating what they are capable of and what needs to be done. Also, the factor structure of the MSSE did not divide into science- and math-related items but rather divided into Performance and Content scales that contained both math and science items. These findings remain consistent with theory but suggest that, for our study sample, the items in the scale are measuring task specificity related to math and science. This has important implications because, as Betz (2000) notes, uncovering this distinction helps researchers and practitioners better understand the nature of people’s choices. It also sheds light on the nature of the decision-making process for those interested in designing career development interventions.

Second, once we discerned the appropriate structure of the CDMSE and MSSE for high school students, we continued with a CFA and examined structural invariance of the instruments between boys and girls in a second sample. Results showed that although all the factor loadings were significant and there were no constraints to be released, a few items had lower contributions to the overall solution uniqueness, particularly for girls. Also, the overall fit of both the CDMSE and MSSE models was below what is recommended as acceptable, but this might have been due to the small sample size (Byrne, 2012; Marsh et al., 2004). Additionally, the RMSEA may be considered acceptable for smaller data sets in educational settings (Browne & Cudeck, 1992), and χ² is known to be sensitive to larger correlations (Kline, 2015; Tanaka, 1993). Finally, we found latent mean differences between boys and girls for the “Self-Appraisal” subscale of the CDMSE, which would therefore be appropriate for examining differences between boys and girls in a future study. For our own purposes, we decided to run a second CFA with just girls to ensure that the subscales we wanted to use were valid and that the model had adequate fit.

Klassen and Usher (2010) have cited numerous measurement issues in self-efficacy research and have called for a stocktaking of the directions and domains of current self-efficacy research, so that measurement problems can be addressed. Evaluating established measures of self-efficacy for validity among different populations may be part of this stocktaking process (Switzer et al., 1999). We expect that the differences between Fouad et al.’s results and ours may arise because children’s ability-related beliefs and values become more negative in many ways as they get older, at least through early adolescence. This would explain why the high school students differentiated between different, more specific types of CDMSE and math–science self-efficacy beliefs than middle school students.

Older adolescents tend to believe they are less competent in many activities and often value those activities less. These differences are more pronounced in certain activity areas. The negative changes in adolescents’ achievement-related beliefs and values have been explained in two major ways. First, adolescents become much better at understanding and interpreting the evaluative feedback they receive and engage in more social comparison with their peers. As a result of these processes, many adolescents become more accurate or realistic in their self-assessments, so that their beliefs become relatively more negative (see Stipek & Mac Iver, 1989, for thorough discussion of how children’s processing of evaluative information changes). Second, the school environment changes in ways that make evaluation more salient and competition between students more likely, thus lowering some children’s achievement beliefs (Pajares, 1996; Urdan & Pajares, 2006).

Although Fouad and her colleagues were careful to design a measure congruent with self-efficacy theory, and were careful to validate scores derived from their new measure using original guidelines for establishing construct validity (Clark & Watson, 1995; Cronbach & Meehl, 1955), methods for invariance testing were not widely known or utilized at the time their measure was published. Furthermore, many researchers using the MSSES in their own investigations did not demonstrate validity evidence for their own samples (Table 1). This is particularly important when selecting items apart from the established subscale and/or using the measure for dissimilar samples. Future researchers and/or practitioners may prefer a more parsimonious survey based on our analysis of the Fouad et al. scale. However, we suggest that additional research be conducted on varied middle school and high school samples to further explore the psychometric properties of this measure for future use.

Conclusion

Research findings suggest that CDSE is an important variable associated with making and implementing career decisions, and this has pointed to the need for the development and evaluation of counseling interventions designed to increase CDSE and related behaviors (Bergeron & Romano, 1994; Betz & Luzzo, 1996). Hackett and Betz (1992) explained, “there is a compelling need to determine the usefulness of self-efficacy theory in enhancing career development and broadening career choices” (p. 241). Lent and Brown (2006) claim that adequate tests of any theory are predicated on the availability of reliable and valid measures of the theory’s core constructs. In other words, theorists, researchers, and practitioners each have a vested interest in utilizing measures that are precise and valid. Without sound measures, it is difficult if not impossible to establish whether theory-discrepant findings are due to problems with the theory, flaws in operationalizing it, or inadequacies in enacting it in the form of an intervention (or all of the above). Widely used measures of self-efficacy within the career development literature tend to utilize well-established measurement techniques for scale construction and initial validation. However, as our findings demonstrate, evaluating construct validity for established measures is critical when researchers and/or practitioners are interested in using such measures for dissimilar samples, particularly to assess the outcomes of an intervention.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Note

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

Arulmani, Van Laar, & Easton (2001). Career planning orientation of disadvantaged high school boys: A study of socioeconomic and social cognitive variables. Journal of the Indian Academy of Applied Psychology, 27, 7–17.

Bandura (1977). Self-efficacy: Toward a unifying theory of behavioral change. Psychological Review, 84, 191–215. doi:10.1037/0033-295X.84.2.191

Bandura (1986). Social foundations of thought and action: A social cognitive theory. Upper Saddle River, NJ: Prentice Hall.

Bandura (1997). Self-efficacy: The exercise of control. New York, NY: Freeman.

Bandura, Barbaranelli, Caprara, G. V., & Pastorelli (2001). Self-efficacy beliefs as shapers of children’s aspirations and career trajectories. Child Development, 72, 187–206.

Bergeron, L. M., & Romano, J. L. (1994). The relationships among career decision-making self-efficacy, educational indecision, vocational indecision, and gender. Journal of College Student Development, 35, 19–24.

Betz, N. E. (2000). Self-efficacy theory as a basis for career assessment. Journal of Career Assessment, 8, 205–222. doi:10.1177/106907270000800301

Betz, N. E. (2004). Contributions of self-efficacy theory to career counseling: A personal perspective. The Career Development Quarterly, 52, 340–353. doi:10.1002/j.2161-0045.2004.tb00950.x

Betz, N. E., & Hackett (1981). The relationship of career-related self-efficacy expectations to perceived career options in college women and men. Journal of Counseling Psychology, 28, 399. doi:10.1037/0022-0167.28.5.399

Betz, N. E., & Hackett (1983). The relationship of mathematics self-efficacy expectations to the selection of science-based college majors. Journal of Vocational Behavior, 23, 329–345. doi:10.1016/0001-8791(83)90046-5

Betz, N. E., & Hackett (2006). Career self-efficacy theory: Back to the future. Journal of Career Assessment, 14, 3–11. doi:10.1177/1069072705281347

Betz, N. E., Klein, K. L., & Taylor, K. M. (1996). Evaluation of a short form of the career decision-making self-efficacy scale. Journal of Career Assessment, 4, 47–57. doi:10.1177/106907279600400103

Betz, N. E., & Luzzo, D. A. (1996). Career assessment and the career decision-making self-efficacy scale. Journal of Career Assessment, 4, 413–428. doi:10.1177/106907279600400405

Betz, N. E., & Schifano, R. S. (2000). Evaluation of an intervention to increase realistic self-efficacy and interests in college women. Journal of Vocational Behavior, 56, 35–52. doi:10.1006/jvbe.1999.1690

Bohrnstedt, G. W. (2010). Measurement models for survey research. In P. V. Marsden & J. D. Wright (Eds.), Handbook of survey research (2nd ed., pp. 347–404). Somerville, MA: Emerald Group.

Bong (2001). Between- and within-domain relations of academic motivation among middle and high school students: Self-efficacy, task value, and achievement goals. Journal of Educational Psychology, 93, 23. doi:10.1037/0022-0663.93.1.23

Browne, M. W., & Cudeck (1992). Alternative ways of assessing model fit. Sociological Methods & Research, 21, 230–258. doi:10.1177/0049124192021002005

Byrne, B. M. (2012). Structural equation modeling with Mplus: Basic concepts, applications, and programming. New York, NY: Routledge.

Byrne, B. M., Shavelson, R. J., & Muthén (1989). Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement invariance. Psychological Bulletin, 105, 456. doi:10.1037/0033-2909.105.3.456

Clark, L. A., & Watson (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7, 309–319. doi:10.1037/1040-3590.7.3.309

Creed, Tilbury, Buys, & Crawford (2011). The career aspirations and action behaviours of Australian adolescents in out-of-home-care. Children and Youth Services Review, 33, 1720–1729.

Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302. doi:10.1037/h0040957

Eccles, J. S., Wigfield, Midgley, Reuman, Mac Iver, D. M., & Feldlaufer (1993). Negative effects of traditional middle schools on students’ motivation. The Elementary School Journal, 93, 553–574. doi:10.1086/461740

Falco, L. D., & Summers, J. J. (2017). Improving career decision self-efficacy and STEM self-efficacy in high school girls: Evaluation of an intervention. Journal of Career Development. doi:10.1177/0894845317721651

Ferry, T. R., Fouad, N. A., & Smith, P. L. (2000). The role of family context in a social cognitive model for career-related choice behavior: A math and science perspective. Journal of Vocational Behavior, 57, 348–364. doi:10.1006/jvbe.1999.1743

Fouad, N. A., & Guillen (2006). Outcome expectations: Looking to the past and potential future. Journal of Career Assessment, 14, 130–142. doi:10.1177/1069072705281370

Fouad, N. A., Smith, P. L., & Enochs (1997). Reliability and validity evidence for the middle school self-efficacy scale. Measurement and Evaluation in Counseling and Development, 30, 17–31.

Garcia (2012). Is the social cognitive career theory model plausible for Hispanic inner city high school biology students? (Unpublished doctoral dissertation). California State University, Fullerton.

Gullickson, A. R., & Howard, B. B. (2009). The personnel evaluation standards: How to assess systems for evaluating educators (2nd ed.). Thousand Oaks, CA: Corwin Press.

Hackett, & Betz, N. E. (1981). A self-efficacy approach to the career development of women. Journal of Vocational Behavior, 18, 326–339. doi:10.1016/0001-8791(81)90019-1

Hackett, & Betz, N. E. (1992). Self-efficacy expectations in the career choices of college students. In Schunk & Meece (Eds.), Student perceptions in the classroom: Causes and consequences (pp. 229–246). Hillsdale, NJ: Erlbaum.

Hamel (2014). Career camp: Elevating expectations for college-going and career self-efficacy in urban middle school students (Unpublished doctoral dissertation). Kansas State University, Manhattan.

Howard, K. A., Wendt, Hagness, Cramer, Diestelmann, & Huang, T. L. (2012, 10). Work in progress: Grand challenges for engineering in the middle school classroom: Preliminary results. Proceedings of the Frontiers in Education Conference (FIE), Seattle, WA. doi:10.1109/FIE.2012.6462362

Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6, 1–55. doi:10.1080/10705519909540118

Jackson, K. D. M. (2014). Understanding Latina adolescents’ science identities: A mixed methods study of socialization practices across contexts (Unpublished doctoral dissertation). University of Texas at Austin, Austin.

Keller, B. K., & Whiston, S. C. (2008). The role of parental influences on young adolescents’ career development. Journal of Career Assessment, 16, 198–217.

Kimberlin, C. L., & Winterstein, A. G. (2008). Validity and reliability of measurement instruments used in research. American Journal of Health-System Pharmacy, 65, 2276–2284.

Klassen, R. M., & Usher, E. L. (2010). Self-efficacy in educational settings: Recent research and emerging directions. In T. C. Urdan & S. A. Karabenick (Eds.), The decade ahead: Theoretical perspectives on motivation and achievement (pp. 1–33). Bingley, England: Emerald Group.

Kline, R. B. (2015). Principles and practice of structural equation modeling. New York, NY: Guilford.

Lent, R. W., & Brown, S. D. (2006). On conceptualizing and assessing social cognitive constructs in career research: A measurement guide. Journal of Career Assessment, 14, 12–35.

Lent, R. W., Brown, S. D., & Hackett (1994). Toward a unifying social cognitive theory of career and academic interest, choice, and performance. Journal of Vocational Behavior, 45, 79–122. doi:10.1006/jvbe.1994.1027

Lent, R. W., Brown, S. D., & Hackett (2000). Contextual supports and barriers to career choice: A social cognitive analysis. Journal of Counseling Psychology, 47, 36–49. doi:10.1037/0022-0167.47.1.36

Lent, R. W., Lopez, F. G., & Bieschke, K. J. (1993). Predicting mathematics-related choice and success behaviors: Test of an expanded social cognitive model. Journal of Vocational Behavior, 42, 223–236. doi:10.1006/jvbe.1993.1016

Luzzo, D. A., Hasper, Albert, K. A., Bibby, M. A., & Martinelli, E. A., Jr. (1999). Effects of self-efficacy-enhancing interventions on the math/science self-efficacy and career interests, goals, and actions of career undecided college students. Journal of Counseling Psychology, 46, 233. doi:10.1037/0022-0167.46.2.233

Macht Jantzer, Stalides, D. J., & Rottinghaus, P. J. (2009). An exploration of social cognitive mechanisms, gender, and vocational identity among eighth graders. Journal of Career Development, 36, 114–138. doi:10.1177/0894845309345841

Marsh, H. W., Craven, R. G., & Debus (1991). Self-concepts of young children 5 to 8 years of age: Measurement and multidimensional structure. Journal of Educational Psychology, 83, 377.

Marsh, H. W., Hau, & Wen (2004). In search of golden rules: Comment on hypothesis testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler’s (1999) findings. Structural Equation Modeling, 11, 320–341. doi:10.1207/s15328007sem1103_2

Messick (1995). Validity of psychological assessment: Validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741.

Mueller, C. E., Hall, A. L., & Miro, D. Z. (2015). Testing an adapted model of social cognitive career theory: Findings and implications for a self-selected, diverse middle-school sample. Journal of Research in STEM Education, 1, 142–155.

Muthén, L. K., & Muthén, B. O. (2012). Mplus user’s guide (7th ed.). Los Angeles, CA: Author.

Navarro, R. L., Flores, L. Y., & Worthington, R. L. (2007). Mexican American middle school students’ goal intentions in mathematics and science: A test of social cognitive career theory. Journal of Counseling Psychology, 54, 320–335. doi:10.1037/0022-0167.54.3.320

Ojeda, Piña-Watson, Castillo, L. G., Castillo, Khan, & Leigh (2012). Acculturation, enculturation, ethnic identity, and conscientiousness as predictors of Latino boys’ and girls’ career decision self-efficacy. Journal of Career Development, 39, 208–228. doi:10.1177/0894845311405321

Oliver, L. W. (1979). Outcome measurement in career counseling research. Journal of Counseling Psychology, 26, 217.

Olle, C. D., & Fouad, N. A. (2015). Parental support, critical consciousness, and agency in career decision making for urban students. Journal of Career Assessment, 23, 533–544. doi:10.1177/1069072714553074

Pajares (1996). Self-efficacy beliefs and mathematical problem-solving of gifted students. Contemporary Educational Psychology, 21, 325–344. doi:10.1006/ceps.1996.0025

Sass, D. A. (2011). Testing measurement invariance and comparing latent factor means within a confirmatory factor analysis framework. Journal of Psychoeducational Assessment, 29, 347–363. doi:10.1177/0734282911406661

Sawitri, D. R., Creed, P. A., & Zimmer-Gembeck, M. J. (2014). Parental influences and adolescent career behaviours in a collectivist cultural setting. International Journal for Educational and Vocational Guidance, 14, 161–180. doi:10.1007/s10775-013-9247-x

Sechrest (2005). Validity of measures is no simple matter. Health Services Research, 40, 1584–1604.

Sickinger, P. H. (2013). Social cognitive career theory and middle school student career exploration (Unpublished doctoral dissertation). Regent University, Virginia Beach, VA.

Song, H. D. (2004). The effects of goal-oriented contexts and peer group composition on intrinsic motivation and problem solving (Unpublished doctoral dissertation). The Pennsylvania State University, College Station.

Stewart, A. L., Thrasher, A. D., Goldberg, & Shea, J. A. (2012). A framework for understanding modifications to measures for diverse populations. Journal of Aging and Health, 24, 992–1017.

Stewart, A. L., & Ware, J. E. (Eds.). (1992). Measuring functioning and well-being: The medical outcomes study approach. Durham, NC: Duke University Press.

Stipek, & Mac Iver, D. M. (1989). Developmental change in children’s assessment of intellectual competence. Child Development, 60, 521–538.

Sullivan, K. R., & Mahalik, J. R. (2000). Increasing career self-efficacy for women: Evaluating a group intervention. Journal of Counseling & Development, 78, 54–62. doi:10.1002/j.1556-6676.2000.tb02560.x

Switzer, G. E., Wisniewski, S. R., Belle, S. H., Dew, M. A., & Schultz (1999). Selecting, developing, and evaluating research instruments. Social Psychiatry and Psychiatric Epidemiology, 34, 399–409.

Tanaka, J. S. (1993). Multifaceted conceptions of fit in structural equation models. Sage focus editions, 154, 10.

Taylor, K. M., & Betz, N. E. (1983). Applications of self-efficacy theory to the understanding and treatment of career indecision. Journal of Vocational Behavior, 22, 63–81. doi:10.1016/0001-8791(83)90006-4

Turner, S. L., Alliman-Brissett, Lapan, R. T., Udipi, & Ergun (2003). The career-related parent support scale. Measurement and Evaluation in Counseling and Development, 36, 83–95.

Turner, S. L., & Lapan, R. T. (2005). Evaluation of an intervention to increase non-traditional career interests and career-related self-efficacy among middle-school adolescents. Journal of Vocational Behavior, 66, 516–531. doi:10.1016/j.jvb.2004.02.005

Urdan, & Pajares (Eds.). (2006). Self-efficacy beliefs of adolescents. Greenwich, CT: Information Age Publishing.

Van de Schoot, Lugtig, & Hox (2012). A checklist for testing measurement invariance. European Journal of Developmental Psychology, 9, 486–492. doi:10.1080/17405629.2012.686740

Whiston, S. C., & Quinby, R. F. (2009). Review of school counseling outcome research. Psychology in the Schools, 46, 267–272. doi:10.1002/pits.20372

Widaman, K. F., & Reise, S. P. (1997). Exploring the measurement invariance of psychological instruments: Applications in the substance use domain. In K. J. Bryant, M. E. Windle, & S. G. West (Eds.), The science of prevention: Methodological advances from alcohol and substance abuse research (pp. 281–324). Washington, DC: American Psychological Association.

Worthington, R. L., & Whittaker, T. A. (2006). Scale development research: A content analysis and recommendations for best practices. The Counseling Psychologist, 34, 806–838. doi:10.1177/0011000006288127

Yarbrough, D. B., Shula, L. M., Hopson, R. K., & Caruthers, F. A. (2010). The program evaluation standards: A guide for evaluators and evaluation users (3rd ed.). Thousand Oaks, CA: Corwin Press.

Zumbo, B. D. (2006). 3 Validity: Foundational issues and statistical methodology. Handbook of Statistics, 26, 45–79. doi:10.1016/S0169-7161(06)26003-6

Zytowski, D. G., & Betz, E. L. (1972). Measurement in counseling research: A review. The Counseling Psychologist, 3, 72–81.