Measuring Urban Teachers’ Beliefs About African American Students

Abstract

Understanding urban teachers’ beliefs about African American students has become important because (a) many teachers are reluctant to teach students from other cultures, and (b) most teachers are European American. To construct a psychometrically sound measure of teacher beliefs, the authors investigate the measurement properties of a teacher beliefs factor. This factor was selected from an inventory of items that purported to measure urban teachers’ cultural awareness and beliefs. Measurement invariance of the teacher beliefs factor across European American, African American, and Hispanic American teachers addressed its construct validity. The authors examine the psychometric properties of these items using graded response multilevel analysis. The final 5-item factor showed the highest level of invariance for African American and European American teachers but did not fit Hispanic American teachers well. All five items had good psychometric properties. Analyses of latent means showed that African American teachers had more positive beliefs about African American students than European American teachers did. However, the latent scores were bimodally distributed for African American teachers, showing that one subgroup of African American teachers had beliefs similar to those of European American teachers while another subgroup had more positive beliefs.

Keywords

teacher beliefs, African American students, urban education, measurement invariance, graded response model, Bayesian item response theory

Many teachers are reluctant to work in culturally diverse settings (Bleicher, 2011; Futrell, Gomez, & Bedden, 2003; Terrill & Mark, 2000). Gay (2010) has noted that the reasons for this reluctance may be clarified by understanding teachers’ beliefs about students from diverse backgrounds. In general, teacher beliefs significantly influence teacher efficacy, behavior, perceptions, instructional judgments and decisions, and pedagogical practices (Bandura, 1986; Dewey, 1933; Pajares, 1992). In fact, Pajares (1992, p. 329) called teacher beliefs the “single most important construct in educational research.” Understanding teacher beliefs about students of color may help improve teachers’ willingness to work in diverse settings, increase teacher efficacy, and advance pedagogical practices.

Several studies on teachers’ beliefs about teaching students of color (e.g., Kea, Trent, & Davis, 2002; Love & Kruger, 2005; McDermott, Gormley, Rothenberg, & Hammer, 1995; Phuntsog, 2001) were conducted with small samples (fewer than 66 teachers), which is a disadvantage for performing advanced quantitative analyses. Most of these studies provided only limited evidence of score validity in the form of Cronbach’s α. Webb-Johnson and Carter (2005) constructed the Cultural Awareness and Beliefs Inventory (CABI) as an initial step in developing a quantitative measure of urban teachers’ cultural attitudes and beliefs. A principal component analysis (Natesan, Webb-Hasan, Carter, & Walter, 2012) yielded eight factors. The Teacher Beliefs factor (eight items) was purported to measure teachers’ beliefs about African American students. The psychometric properties of these items were not studied. Moreover, one of the items, “I believe that students in poverty are difficult to teach,” does not directly refer to African American students. Two more items, “I believe I would prefer to work with students and parents whose cultures are similar to mine,” and “I believe students from certain ethnic groups appear lazy when it comes to academic engagement,” focus on beliefs about students of different ethnic groups in general rather than on African American students specifically. Finally, one item that did specifically refer to families of African American students, “I believe my ISD families of African American students are supportive of our mission to effectively teach all students,” did not load on the Teacher Beliefs factor. These observations lead us to believe that a more thorough psychometric analysis of the Teacher Beliefs factor needs to be conducted before researchers can use these items to measure urban teachers’ beliefs about African American students.

The purpose of the present study is to conduct measurement invariance and psychometric analysis of a measure of urban teachers’ beliefs about African American students. In particular, we analyze (a) measurement invariance across ethnicity to support construct validity; (b) unidimensionality, local item independence, and item fit; and (c) latent mean differences between teachers from different ethnic groups. Establishing measurement invariance shows that the construct (Teacher Beliefs) is defined similarly across groups. This allows comparisons of teacher beliefs about African American students across different ethnic groups of teachers. A lack of local item independence or unidimensionality would indicate that the responses to the items depend on some other construct in addition to their teacher beliefs (de Ayala, 2009). This speaks directly to the discriminant validity of the construct. Local item dependence (LID) also leads to artificial inflation of test reliability, test information (Wainer & Thissen, 1996), and item and person parameters (Thissen, Steinberg, & Mooney, 1989). In the next section we briefly define the construct.

Teachers’ Beliefs About African American Students

Unconscious racial biases of teachers and lower expectations for students of color have been linked to lower academic achievement of students of color (Castro-Atwater, 2008). Teachers’ beliefs and expectations about the academic performance of students vary by students’ ethnicity (McCombs & Gay, 2001). For example, teachers expect Asian students to perform better (Cheng & Starks, 2002) and Hispanic students to perform worse than White students academically (McCombs & Gay, 2001). Combining all non-White ethnic groups may have a confounded effect when measuring teachers’ beliefs about students of color. Therefore, we considered teachers’ beliefs about African American students specifically.

In a cultural deficit perspective, the academic failure of students from disadvantaged backgrounds is blamed on the students’ cultural backgrounds (Nieto, 2004). We therefore define the construct, teachers’ beliefs about African American students, to include (a) common misconceptions of teachers about the attitudes of African American students toward academic achievement, (b) stereotypical diagnosis of African American classroom behavior as a disciplinary problem, (c) the belief that African American students do not bring enough strengths to the classroom, and (d) the belief that they lack the family support to do so. A narrative analysis using critical race theory of the open-ended items of the CABI also showed that deficit perspectives were dominant in the given sample (Natesan et al., 2012).

Research indicates that beliefs about teaching students from diverse settings may differ by the teacher’s ethnicity. Bakari (2003) found that African American teachers were more willing to teach African American students than European American teachers were. Whereas European American teachers used more negative adjectives to describe African American students, African American teachers used more positive adjectives (Gottlieb, 1964). Kea et al. (2002) reported that teachers feel more prepared to teach students from backgrounds similar to their own. Although these studies show that teacher beliefs may differ by ethnicity, no study has quantified this variation. Therefore, the present study sought to quantify differences in the beliefs of African American, European American, and Hispanic American urban teachers.

Method

Participants

In 2006, 54 campuses in a Houston metropolitan school district volunteered to participate in a professional development program that specifically addressed culturally responsive pedagogy. Out of the 3,731 in-service teachers who participated in the professional development program, 1,253 volunteered to complete the CABI. Despite the promise of anonymity, one fourth of the respondents did not report their ethnicity, and their responses were not included in this study. We included only African American, European American, and Hispanic American teachers because the total number of teachers in other ethnic groups was inadequate for multigroup confirmatory factor analysis (n < 50). Approximately 2% of the remaining responses were missing and therefore imputed using matching cases methodology in LISREL 8.80, followed by listwise deletion. Table 1 presents descriptive statistics of the students in the school district and the teachers in the initial (N = 1,253) and final samples (N = 860).

Table 1.

Descriptive Statistics of Participants

	Student population in 2006	Teacher population in 2006	Teachers in initial sample (n = 1,253)	Teachers in final sample (n = 860)
Gender (%)
Male		22.0	23.6	25
Female		78.0	74.9	75
Ethnicity (%)
African American	31.4	36.3	22.3	39.3
European American	4.2	44.2	25.6	47.7
Hispanic American	62.4	17.5	6.5	13.0
Asian American	2.0	1.9	0	0
Other	0.1	0.1	20.69	0

Note: In the initial sample, 24.91% of the teachers did not indicate their ethnicity.

Instrument

Each item in the inventory uses a 4-point Likert scale (strongly disagree, disagree, agree, and strongly agree). A principal component analysis with varimax rotation has indicated that the 36 items form 8 factors (Natesan et al., 2012). Based on item content, these factors were named Teacher Beliefs, School Climate, Home and Community Support, Cultural Awareness, Curriculum and Instruction, Culturally Responsive Classroom Management, Cultural Sensitivity, and Teacher Efficacy.

The Teacher Beliefs factor contained the most items (8) and explained the most variance in total scores (7.78%). These eight items plus one additional item (Family Support) that referred to African American student families were considered for analysis. Table 2 presents item wording, response frequencies, corrected item-total correlations, and Cronbach’s α if the item were to be deleted from the factor. De Vaus (2002) and Leong and Austin (2006) have suggested a minimal corrected item-total correlation (r_total) of .3 and .4, respectively. Based on these suggestions, Family Support (r_total = .24) was deleted. Although r_total = .38 for Similar Culture was lower than .4, the item was retained at this stage, until there was additional evidence for discarding the item from the factor. After dropping Family Support from the analysis, the internal consistency of scores was acceptable (α = .79, 95% CI [confidence interval] [.77, .81]). The internal consistency was comparable for African American (α = .79, 95% CI [.76, .82]) and European American (α = .76, 95% CI [.72, .79]) teachers, and slightly higher for Hispanic American teachers (α = .81, 95% CI [.75, .86]).
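The two item statistics used in this screening step can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the toy response matrix are hypothetical (they are not the CABI data), and complete responses scored 1 to 4 are assumed. The corrected item-total correlation correlates each item with the sum of the remaining items, and Cronbach’s α is computed from the item variances and the variance of the total score.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents (each a list of item scores)."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def corrected_item_total(rows, j):
    """Correlation of item j with the sum of all other items."""
    item = [r[j] for r in rows]
    rest = [sum(r) - r[j] for r in rows]
    return pearson(item, rest)


# Hypothetical toy data: 5 respondents, 3 items on a 1-4 scale.
toy = [[1, 2, 1], [2, 2, 3], [3, 3, 2], [4, 3, 4], [2, 1, 2]]
alpha = cronbach_alpha(toy)
# Flag items below the .3 minimum suggested by De Vaus (2002).
flagged = [j for j in range(3) if corrected_item_total(toy, j) < 0.3]
```

Raising the cutoff from .3 to .4 applies the stricter suggestion of Leong and Austin (2006) instead.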

Table 2.

Descriptive Statistics and Internal Consistency of Items

			Percentage in Category
Item	Description	Label	1	2	3	4	r_total	α if item dropped [95% CI]
1	ISD families of AA students supportive of mission to effectively teach all students	Family support	11.98	55.81	26.28	5.93	.24	.79 [.77, .81]
2	AA students consider performing well in school “acting White”	Acting White	32.44	44.19	17.67	5.70	.42	.77 [.75, .80]
3	AA students have more behavioral problems than other students	Behavioral problems	30.12	43.37	19.77	6.74	.59	.75 [.72, .77]
4	AA students not as eager to excel in school as White students	Eager to excel	33.72	48.26	15.35	2.67	.65	.74 [.71, .77]
5	Students who live in poverty are more difficult to teach	Poverty	23.95	41.28	29.42	5.35	.43	.77 [.75, .79]
6	AA students do not bring as many strengths to classroom as their White peers	Bringing strengths	39.88	44.88	13.02	2.09	.57	.75 [.73, .78]
7	I would prefer to work with students and parents whose cultures are similar to mine	Similar culture	26.16	56.86	13.72	3.26	.38	.78 [.75, .80]
8	I have experienced difficulty in getting families of AA communities involved in education	Family involvement	14.07	36.74	40.23	8.95	.48	.76 [.74, .79]
9	Students from certain ethnic groups appear lazy when it comes to academic engagement	Appearing lazy	31.16	46.67	18.37	2.79	.49	.76 [.74, .79]

Note: AA = African American; ISD = Independent School District; 95% CI = 95% confidence interval.

Procedures

We used structural equation modeling (Analysis 1) and item response modeling (Analysis 2) to analyze measurement invariance and psychometric properties of the items, respectively.

Analysis 1

We investigated measurement invariance across African American (AA), European American (EA), and Hispanic American (HA) teachers using multigroup confirmatory factor analysis (see, for example, Wu, Li, & Zumbo, 2009). Given the ordinal nature of the Likert-scaled data, we analyzed polychoric correlation and asymptotic correlation matrices with unweighted least squares estimation. Instead of fixing the factor variance, we set the factor loading of Acting White equal to 1 because we considered possible differences in factor variances between the ethnic groups of interest. Model fit was deemed acceptable when the comparative fit index (CFI) ≥ 0.95, the root mean square error of approximation (RMSEA) ≤ 0.05, and the standardized root mean square residual (SRMR) ≤ 0.08 (Hu & Bentler, 1999; MacCallum, Browne, & Sugawara, 1996).

First, a single-factor model was fit to the pooled data and AA, EA, and HA teachers separately. When a model did not fit a certain group well, the factor structure was modified and the new factor structure was fitted again. Errors between items were not allowed to covary even if modification indices suggested otherwise, because this would be an indication of LID, which is an undesirable psychometric property.

Second, we tested four nested models with increasing levels of measurement invariance: equivalence of factor structure (configural invariance, Model 1), factor loadings (metric invariance, Model 2), item intercepts (scalar invariance, Model 3), and error variances (error variance invariance, Model 4) across groups (see Meredith, 1993). Each level of invariance must be established before testing the next, more restrictive invariance model. Although earlier studies considered groups comparable once scalar invariance was established (e.g., Vandenberg & Lance, 2000), more recent research (e.g., Deshon, 2004) requires error variance invariance before latent means can be compared across groups. Chen (2007) has suggested that the configural invariance model be retained when ΔCFI ≥ -0.005 and ΔRMSEA < 0.01, or ΔSRMR < 0.025, and that the other models be retained when ΔCFI ≥ -0.005 and ΔRMSEA < 0.01 or ΔSRMR < 0.05. These differences in fit indices were computed by subtracting the fit index of the less restrictive model from the fit index of the more restrictive model. Finally, two structural invariance models, factor variance invariance (Model 5) and latent mean invariance (Model 6), were fitted to the data.

Analysis 2

We fitted a graded response multilevel model (GRMM) to the response data. A multilevel extension of Samejima’s (1969) graded response model (GRM), the GRMM models the response Y_pi of person p to item i using item-level and person-level parameters (Natesan, Limbers, & Varni, 2010). At the item level, each 4-point Likert-scaled item i was modeled by one discrimination parameter a_i and three threshold parameters b_i1 through b_i3. The probability that teacher p with belief level θ_p responds in category k to item i is given by

$$P(Y_{pi} = k \mid \theta_p, a_i, b_{i1}, \ldots, b_{i3}) = \frac{1}{1 + \exp(-a_i(\theta_p - b_{i(k-1)}))} - \frac{1}{1 + \exp(-a_i(\theta_p - b_{ik}))},$$

where b_i0 = -∞ and b_i4 = ∞. At the person level, the latent trait level θ_p was regressed on ethnicity using dummy coding. We estimated model parameters within a Bayesian framework (see, for example, Fox, 2010). A standard normal distribution N(0,1) was used as prior for person attitude parameters, where the variance of θ_p was set to 1 to identify the metric. A relatively noninformative normal distribution N(0,10) was used for item threshold parameters (subject to the ordering b₁ ≤ b₂ ≤ b₃) and regression coefficient β. A truncated normal distribution N(0,10) was used as prior for item discrimination parameters, subject to the restriction a > 0. Four chains of samples from the posterior distributions of the model parameters were drawn using JAGS (Plummer, 2003). The estimated potential scale reduction (Gelman, 1996) provided evidence that the chains had reached their stationary distribution after 10,000 iterations. Therefore, the first 10,000 iterations were discarded as burn-in and an additional 2,500 samples were drawn for each chain. Using item parameter estimates, we ordered the items according to the expected response they would elicit from the population.
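The category probabilities of the GRM can be illustrated with a short Python sketch. The function names are hypothetical, and the snippet covers only the item-level response function, not the Bayesian estimation carried out in JAGS. As an example it plugs in the posterior means reported for Item 4 (Eager to Excel) in Table 6 and evaluates the probabilities for a teacher at θ = 0.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def grm_category_probs(theta, a, thresholds):
    """Graded response model probabilities for categories 1..K.

    thresholds holds the ordered b_1..b_{K-1}; the boundary terms for
    b_0 = -infinity and b_K = +infinity are the constants 1 and 0.
    """
    cumulative = [logistic(a * (theta - b)) for b in thresholds]
    upper = [1.0] + cumulative   # term with b_{k-1}
    lower = cumulative + [0.0]   # term with b_k
    return [u - l for u, l in zip(upper, lower)]


# Posterior means for Item 4 (Eager to Excel) from Table 6.
probs = grm_category_probs(theta=0.0, a=3.26, thresholds=[-0.39, 1.06, 2.23])
most_likely = probs.index(max(probs)) + 1  # category number, 1 = strongly disagree
```

Because the thresholds are ordered, every category probability is nonnegative and the four probabilities sum to 1.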

Unidimensionality, LID, and item fit were examined using posterior predictive model checking (PPMC; see, for example, Sinharay, Johnson, & Stern, 2006). PPMC compares the observed data Y_obs with replicated data Y_rep using discrepancy measures D(Y,θ,ξ), where θ and ξ denote the person and item parameters, respectively. The posterior predictive p value (PPP) of a discrepancy measure D equals

$$p(D) = P(D(Y_{rep}, \theta, \xi) \geq D(Y_{obs}, \theta, \xi) \mid Y_{obs}).$$

PPP values near 0 or 1 indicate model misfit. We used three discrepancy measures, computed using replicated response sets based on the posterior draws. Yen’s Q₃ statistic (Yen, 1993) for items i and j is the correlation between residuals across persons. This statistic was used to detect violations of the unidimensionality and LID assumptions. Item fit (Q_i) was examined using sums of squared standardized Bayesian residuals (see Fox, 2010, for details).
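The two ingredients of this check can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, residuals (observed or replicated responses minus model-expected responses) are assumed to have been computed already for each posterior draw, and the PPP is approximated by the proportion of draws in which the replicated discrepancy is at least as large as the observed one.

```python
from statistics import mean


def q3(residuals_i, residuals_j):
    """Yen's Q3 for an item pair: correlation of their residuals across persons."""
    mi, mj = mean(residuals_i), mean(residuals_j)
    sij = sum((x - mi) * (y - mj) for x, y in zip(residuals_i, residuals_j))
    sii = sum((x - mi) ** 2 for x in residuals_i)
    sjj = sum((y - mj) ** 2 for y in residuals_j)
    return sij / (sii * sjj) ** 0.5


def posterior_predictive_p(d_rep, d_obs):
    """Share of posterior draws with D(Y_rep, .) >= D(Y_obs, .).

    d_rep[s] and d_obs[s] are the discrepancy values computed with the
    parameter values of posterior draw s.
    """
    if len(d_rep) != len(d_obs):
        raise ValueError("one replicated and one observed value per draw")
    return sum(r >= o for r, o in zip(d_rep, d_obs)) / len(d_rep)
```

For example, `posterior_predictive_p([1, 2, 3, 4], [2, 2, 2, 2])` gives 0.75, since three of the four replicated values are at least as large as their observed counterparts.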

Results

Analysis 1

The single-factor model with all eight items had adequate fit for the pooled data and each of the three ethnic groups (see Table 3). The configural invariance model fit adequately, but the metric invariance model did not (SRMR = 0.087). Modification indices suggested that the factor loadings of Poverty (Δχ² = 55.9 for EA, Δχ² = 46.25 for AA) and Similar Culture (Δχ² = 23.7 for EA, Δχ² = 16 for HA) were different across groups. Therefore, we deleted these two items and fitted a single-factor model with six items. This model did not fit the data for Hispanic American teachers adequately (RMSEA = 0.092). We decided to remove Appearing Lazy from the analysis. Although its factor pattern coefficient was sufficiently large (.74), the fact that this item refers to “certain ethnic groups” and not to African American students in particular made it the most appropriate candidate for deletion. Even with five items, the model did not have adequate fit for Hispanic American teachers (χ² = 9.16, CFI = 0.985, RMSEA = 0.087, 90% CI [.000, .174], SRMR = 0.055). It was unclear why the model did not fit HA teachers well. Although the sample was small (n = 112), it is adequate for fitting a 5-item factor model. We decided to drop this group from our analysis and discuss the limitations of this decision in the Discussion section. Our aim was subsequently changed to examining the measurement invariance of the 5-item factor across African American and European American teachers.

Table 3.

Model Fit of Measurement Invariance Models Across African American, European American, and Hispanic American Teachers for the 8-Item and 6-Item Factor

Model	Comparison	χ²	df	CFI	ΔCFI	RMSEA	ΔRMSEA	SRMR	ΔSRMR	90% CI RMSEA
Eight items
M0-pooled		35.853	20	.996		.030		.027		[.013, .046]
M0-AA		33.360	20	.993		.045		.043		[.014, .070]
M0-EA		34.990	20	.990		.043		.049		[.017, .066]
M0-HA		30.413	20	.984		.069		.064		[.000, .115]
M1		86.229	58	.994		.041		.064		[.021, .059]
M2	M2 vs. M1	121.301	72	.990	-.004	.049	.008	.087	.023	[.033, .064]
Six items
M0-pooled		19.606	9	.996		.037		.026		[.014, .060]
M0-AA		21.356	9	.990		.064		.044		[.029, .099]
M0-EA		10.285	9	.999		.019		.032		[.000, .061]
M0-HA		17.547	9	.980		.093		.064		[.018, .156]

Note: M0, M1, M2 = Model 0, Model 1, and Model 2, respectively; AA = African American; EA = European American; HA = Hispanic American.

The configural, metric, scalar, error variance, and factor variance invariance models fit the data, and each comparison passed Chen’s criteria (see Table 4). The factor variance invariance model was retained as the final model. The factor pattern coefficients for Acting White (.55), Behavioral Problems (.81), Eager to Excel (.85), Bringing Strengths (.71), and Family Involvement (.49) all exceeded .4. All models tested using multigroup confirmatory factor analysis (CFA) were statistically significant at the .05 level. The relatively large difference in fit between the latent mean invariance model and the factor variance invariance model suggests that the latent means are unequal for the two groups. The difference between the latent means of African American (M = 0 according to model specification, SD = 0.55) and European American (M = 0.31, SD = 0.55) teachers was medium (Cohen’s d = .57). Cohen’s d for scores obtained by summing the item scores was .4.

Table 4.

Model Fit of Measurement Invariance Models Across African American and European American Teachers for the 5-Item Factor

Model	Comparison	χ²	df	CFI	ΔCFI	RMSEA	ΔRMSEA	SRMR	ΔSRMR	90% CI RMSEA
M0-pooled		8.097	5	.998		.029		.019		[.000, .064]
M0-AA		7.160	5	.998		.036		.022		[.000, .089]
M0-EA		8.197	5	.996		.040		.034		[.000, .086]
M1		15.133	9	.997		.043		.034		[.000, .079]
M2	M2 vs. M1	20.426	13	.997	.000	.039	-.004	.038	.005	[.000, .070]
M3	M3 vs. M2	35.875	18	.992	-.005	.052	.013	.038	-.001	[.026, .076]
M4	M4 vs. M3	44.010	23	.990	-.002	.049	-.003	.048	.010	[.026, .071]
M5	M5 vs. M4	46.454	24	.990	-.001	.050	.001	.056	.008	[.028, .072]
M6	M6 vs. M5	70.040	25	.979	-.011	.070	.020	.058	.002	[.051, .089]

Note: M0, M1, M2, M3, M4, M5, M6 = Model 0, Model 1, Model 2, Model 3, Model 4, Model 5, and Model 6, respectively; AA = African American; EA = European American; CFI = comparative fit index; RMSEA = root mean square error of approximation; SRMR = standardized root mean square residual; 90% CI = 90% confidence interval.
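The retention rule attributed to Chen (2007) under Analysis 1 can be applied to the differences reported in Table 4 with a short Python sketch. Two points are assumptions of this sketch rather than statements from the analysis: the rule is read as ΔCFI ≥ -0.005 together with either ΔRMSEA < 0.01 or ΔSRMR below its cutoff, and the stricter SRMR cutoff of 0.025 is applied only to the first comparison. The function and variable names are hypothetical.

```python
def passes_chen(d_cfi, d_rmsea, d_srmr, srmr_cutoff=0.05):
    """Assumed reading of the rule: CFI must not drop by more than .005,
    and at least one of the RMSEA or SRMR changes must stay below its cutoff."""
    return d_cfi >= -0.005 and (d_rmsea < 0.01 or d_srmr < srmr_cutoff)


# (delta CFI, delta RMSEA, delta SRMR, SRMR cutoff) as reported in Table 4.
comparisons = {
    "M2 vs. M1": (0.000, -0.004, 0.005, 0.025),
    "M3 vs. M2": (-0.005, 0.013, -0.001, 0.05),
    "M4 vs. M3": (-0.002, -0.003, 0.010, 0.05),
    "M5 vs. M4": (-0.001, 0.001, 0.008, 0.05),
    "M6 vs. M5": (-0.011, 0.020, 0.002, 0.05),
}

retained = {name: passes_chen(*deltas) for name, deltas in comparisons.items()}
for name, ok in retained.items():
    print(name, "retained" if ok else "rejected")
```

Under this reading, every comparison up to M5 vs. M4 is retained and only M6 vs. M5 is rejected, which agrees with retaining the factor variance invariance model as the final model.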

Analysis 2

Table 5 shows posterior summaries of the Q₃ statistic. There is no indication that unidimensionality or local item independence is violated. Similarly, none of the item fit indices Q_i indicated that the GRM was inappropriate for analyzing these items (see Table 6). The posterior means and the lower limits of the 95% credibility intervals of all discrimination parameters were greater than 1, except for Item 8. Figure 1 orders the items according to the expected response they would elicit from the population, in decreasing order from Bringing Strengths to Family Involvement. The lightest gray rectangle indicates the strongly disagree category and the darkest gray indicates the strongly agree category, with intermediate categories represented by increasingly dark shades of gray. A person whose teacher beliefs value lies toward the left of the scale in the figure has more positive beliefs about African American students.

Table 5.

Posterior Predictive Model Checking for Unidimensionality and Local Item Independence

		Posterior $Q_{3}^{i, j}$
Item i	Item j	M	SD	95% CI	PPP
2	3	0.04	0.03	[-0.01, 0.09]	.81
	4	-0.05	0.03	[-0.11, 0.00]	.13
	6	-0.04	0.02	[-0.09, 0.00]	.17
	8	0.02	0.01	[-0.01, 0.05]	.73
3	4	-0.04	0.03	[-0.10, 0.03]	.25
	6	-0.03	0.03	[-0.09, 0.03]	.32
	8	0.02	0.02	[-0.03, 0.06]	.67
4	6	0.05	0.04	[-0.02, 0.12]	.80
	8	-0.03	0.03	[-0.08, 0.02]	.29
6	8	-0.04	0.02	[-0.07, 0.00]	.18

Note: 95% CI = approximate 95% confidence interval; PPP = posterior predictive p value.

Table 6.

Posterior Summaries of Item Parameters and Item Fit Indexes

	a			b₁			b₂			b₃			Q_i
Item	M	SD	95% CI	M	SD	95% CI	M	SD	95% CI	M	SD	95% CI	M	SD	95% CI	PPP
2	1.23	0.11	[1.01, 1.46]	-0.70	0.09	[-0.89, -0.53]	1.31	0.12	[1.09, 1.58]	2.83	0.25	[2.39, 3.38]	0.94	0.90	[0.05, 3.40]	.52
3	2.41	0.21	[2.03, 2.84]	-0.54	0.06	[-0.67, -0.42]	0.88	0.07	[0.75, 1.02]	1.91	0.12	[1.70, 2.18]	0.73	0.74	[0.04, 2.73]	.51
4	3.26	0.34	[2.65, 4.00]	-0.39	0.06	[-0.50, -0.28]	1.06	0.07	[0.93, 1.20]	2.23	0.14	[1.97, 2.52]	0.61	0.69	[0.02, 2.51]	.52
6	2.05	0.18	[1.72, 2.41]	-0.24	0.06	[-0.36, -0.11]	1.36	0.09	[1.19, 1.56]	2.71	0.20	[2.34, 3.13]	0.80	0.88	[0.03, 3.22]	.53
8	1.00	0.10	[0.81, 1.19]	-2.13	0.21	[-2.58, -1.78]	0.04	0.09	[-0.13, 0.21]	2.68	0.26	[2.24, 3.23]	1.09	1.05	[0.06, 3.95]	.55

Note: 95% CI = approximate 95% confidence interval; PPP = posterior predictive p value.

Figure 1.

Five items in decreasing order of expected response. Gray scales indicate the regions on the latent Teacher Beliefs factor where each category is the most likely response to an item.

A Welch’s t test showed a statistically significant difference in teacher beliefs (i.e., θ_p) between African American (M = -0.24, SD = 0.97) and European American teachers (M = 0.43, SD = 0.78), t(641.91) = -6.59, p < .001, with a large effect (Cohen’s d = .87). The posterior mean of the regression coefficient for the effect of being African American rather than European American was -.56, with a posterior 95% credibility interval of [-.73, -.40]. The plot of the kernel density estimates in Figure 2 shows a bimodal distribution for the teacher beliefs of the African American teachers in the sample. The higher mode of the African American teachers is around 0.33 and is approximately equal to the mode for European American teachers. The other mode is located at -1.19.
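For readers unfamiliar with Welch’s test, the statistic and its Welch-Satterthwaite degrees of freedom can be computed from group summary statistics, as in the Python sketch below. The function name and all numbers in the example are hypothetical; group sizes are not reported in this section, so the sketch is not meant to reproduce the values above.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2 mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical summary statistics for two groups with unequal variances.
t_stat, df = welch_t(mean1=-0.2, sd1=1.0, n1=120, mean2=0.3, sd2=0.7, n2=150)
```

Unlike the pooled-variance t test, the degrees of freedom are generally not an integer, which is why a value such as 641.91 can appear in a report.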

Figure 2.

Kernel density estimates of the distribution of teacher beliefs of African American teachers (solid) and European American teachers (dashed) in the sample.

Discussion

Teacher beliefs about students of color have an impact on the performance of these students (Castro-Atwater, 2008). Teacher beliefs vary both by the ethnicity of the students (McCombs & Gay, 2001) and by the ethnicity of the teacher (Bakari, 2003). A better understanding of teachers’ beliefs may clarify the reluctance of many teachers (Bleicher, 2011) to teach in a culturally diverse setting (Gay, 2010). The present study provided a detailed psychometric analysis of nine items from the CABI (Webb-Johnson & Carter, 2005) aimed at constructing a quantitative measure of teacher beliefs about African American students. We investigated (a) measurement invariance of Teacher Beliefs across African American, European American, and Hispanic American teachers; (b) unidimensionality, local item independence, and item fit of the retained items; and (c) the difference in latent means between African American and European American teachers.

Four of the original nine items were deleted from the analysis because they were not sufficiently correlated with the construct or because factor pattern coefficients were not equal across groups, indicating a lack of measurement invariance. The remaining 5-item factor did not fit adequately for Hispanic American teachers. This may indicate that the construct is defined differently for Hispanic American teachers. The question of how to measure the beliefs of Hispanic American teachers about African American students remains open.

Measurement invariance for the 5-item factor did hold across African American and European American teachers. This suggests that the construct is defined similarly for these two groups. Moreover, various model fit indices indicated that a unidimensional GRM was appropriate for the response data. The five items seem to form a quantitative measure that could be used to compare the beliefs of African American and European American teachers about African American students. Ordering these five items according to the expected response in the population provides some insight into the construct. Teachers are most likely to agree that they experience difficulties in involving African American families in education and that African American students have more behavioral problems than other students. Teachers are less likely to agree that African American students are not as eager to excel and do not bring as many strengths to the classroom as other students.

A latent means model showed that, on average, European American teachers hold less positive beliefs about African American students than African American teachers do. A similar conclusion could be drawn from the GRMM. The effect size of this difference was medium to large. In this way, the present study provides quantitative support for the qualitative results of Bakari (2003), who found that African American teachers were more willing to teach African American students than European American teachers were, and for Kea et al. (2002), who found that teachers feel more prepared to teach students from backgrounds similar to their own.

Although African American teachers held more positive beliefs than European American teachers on average, a closer look at the distribution of teacher beliefs reveals another picture. The distribution of the teacher beliefs of African American teachers in our sample was almost bimodal. The higher mode was located near the mode of the distribution of teacher beliefs of European American teachers. The lower mode corresponds to considerably more positive beliefs about African American students. Therefore, it would be inaccurate to conclude that all African American teachers have more positive beliefs than European American teachers. Instead, a subgroup of African American teachers holds considerably more positive beliefs than European American teachers. A possible explanation could be the determination of some African American teachers to debunk the stereotype about the lower academic achievement of African Americans (Walker nee Haynes, 2011). At the same time, another subgroup of African American teachers holds beliefs similar to those of European American teachers. Perhaps this is because most teachers are still trained in Eurocentric teaching models. Further research may be able to explain the reasons for this bifurcation in African American teachers’ beliefs about African American students.

Limitations of the Present Study

The 5-item factor shows promise as a quantitative measure due to the high level of item fit and measurement invariance across African American and European American teachers. The quantitative nature of the present study allowed for a larger sample compared to qualitative studies. This sample size may have facilitated detecting the bimodal distribution of the beliefs of African American teachers. On the other hand, a complex construct such as teacher beliefs about African American students is difficult to capture with only 5 Likert-scaled items. Although these particular items show desirable psychometric properties, more items should be developed to increase measurement precision and construct coverage before researchers can confidently measure teacher beliefs. The fact that measurement invariance could not be established across Hispanic American teachers is another limitation. More research is necessary before a quantitative measure of Hispanic American teachers’ beliefs about African American students can be constructed.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Bios

Prathiba Natesan is assistant professor of statistics, research methods, and measurement in the Department of Educational Psychology at the University of North Texas. Her research interests include Bayesian methods, psychometrics and educational statistics such as item response theory, structural equation modeling, and multilevel modeling, and urban education.

Vincent Kieftenbeld is assistant professor of mathematics in the Department of Mathematics & Statistics at Southern Illinois University Edwardsville. His research interests include educational assessment, Bayesian statistics, and the preparation of mathematics teachers.
