Abstract
Understanding urban teachers’ beliefs about African American students has become important because (a) many teachers are reluctant to teach students from other cultures, and (b) most teachers are European American. To construct a psychometrically sound measure of teacher beliefs, the authors investigate the measurement properties of a teacher beliefs factor. This factor was selected from an inventory of items that purported to measure urban teachers’ cultural awareness and beliefs. Measurement invariance of the teacher beliefs factor across European American, African American, and Hispanic American teachers addressed its construct validity. The authors examine the psychometric properties of these items using graded response multilevel analysis. The final 5-item factor showed the highest level of invariance for African American and European American teachers but did not fit Hispanic American teachers well. All five items had good psychometric properties. Analyses of latent means showed that African American teachers had more positive beliefs about African American students than European American teachers did. However, the latent scores were bimodally distributed for African American teachers, showing that one subgroup of African American teachers held beliefs similar to those of European American teachers while another subgroup held more positive beliefs.
Many teachers are reluctant to work in culturally diverse settings (Bleicher, 2011; Futrell, Gomez, & Bedden, 2003; Terrill & Mark, 2000). Gay (2010) has noted that the reasons for this reluctance may be clarified by understanding teachers’ beliefs about students from diverse backgrounds. In general, teacher beliefs significantly influence teacher efficacy, behavior, perceptions, instructional judgments and decisions, and pedagogical practices (Bandura, 1986; Dewey, 1933; Pajares, 1992). In fact, Pajares (1992, p. 329) called teacher beliefs the “single most important construct in educational research.” Understanding teacher beliefs about students of color may help improve teachers’ willingness to work in diverse settings, increase teacher efficacy, and advance pedagogical practices.
Several studies on teachers’ beliefs about teaching students of color (e.g., Kea, Trent, & Davis, 2002; Love & Kruger, 2005; McDermott, Gormley, Rothenberg, & Hammer, 1995; Phuntsog, 2001) were conducted on small samples (fewer than 66 teachers), which limits the use of advanced quantitative analyses. Most of these studies provided only limited evidence of score validity in the form of Cronbach’s α. Webb-Johnson and Carter (2005) constructed the Cultural Awareness and Beliefs Inventory (CABI) as an initial step in developing a quantitative measure of urban teachers’ cultural attitudes and beliefs. A principal component analysis (Natesan, Webb-Hasan, Carter, & Walter, 2012) yielded eight factors. The Teacher Beliefs factor (eight items) was purported to measure teachers’ beliefs about African American students. The psychometric properties of these items were not studied. Moreover, one of the items, “I believe that students in poverty are difficult to teach,” does not directly refer to African American students. Two more items, “I believe I would prefer to work with students and parents whose cultures are similar to mine,” and “I believe students from certain ethnic groups appear lazy when it comes to academic engagement,” focus on beliefs about students of different ethnic groups rather than on African American students specifically. Finally, one item that did specifically refer to families of African American students, “I believe my ISD families of African American students are supportive of our mission to effectively teach all students,” did not load on the Teacher Beliefs factor. These observations lead us to believe that a more thorough psychometric analysis of the Teacher Beliefs factor needs to be conducted before researchers can use these items to measure urban teachers’ beliefs about African American students.
The purpose of the present study is to conduct measurement invariance and psychometric analysis of a measure of urban teachers’ beliefs about African American students. In particular, we analyze (a) measurement invariance across ethnicity to support construct validity; (b) unidimensionality, local item independence, and item fit; and (c) latent mean differences between teachers from different ethnic groups. Establishing measurement invariance shows that the construct (Teacher Beliefs) is defined similarly across groups. This allows comparisons of teacher beliefs about African American students across different ethnic groups of teachers. A lack of local item independence or unidimensionality would indicate that the responses to the items depend on some other construct in addition to their teacher beliefs (de Ayala, 2009). This speaks directly to the discriminant validity of the construct. Local item dependence (LID) also leads to artificial inflation of test reliability, test information (Wainer & Thissen, 1996), and item and person parameters (Thissen, Steinberg, & Mooney, 1989). In the next section we briefly define the construct.
Teachers’ Beliefs About African American Students
Unconscious racial biases of teachers and lower expectations for students of color have been linked to lower academic achievement of students of color (Castro-Atwater, 2008). Teachers’ beliefs and expectations about the academic performance of students vary by students’ ethnicity (McCombs & Gay, 2001). For example, teachers expect Asian students to perform better (Cheng & Starks, 2002) and Hispanic students to perform worse than White students academically (McCombs & Gay, 2001). Combining all non-White ethnic groups may have a confounded effect when measuring teachers’ beliefs about students of color. Therefore, we considered teachers’ beliefs about African American students specifically.
In a cultural deficit perspective, the academic failure of students from disadvantaged backgrounds is blamed on the students’ cultural backgrounds (Nieto, 2004). We therefore define the construct, teachers’ beliefs about African American students, to include (a) common misconceptions of teachers about the attitudes of African American students toward academic achievement, (b) stereotypical diagnosis of African American classroom behavior as a disciplinary problem, (c) the belief that African American students do not bring enough strengths to the classroom, and (d) the belief that they lack the family support to do so. A narrative analysis using critical race theory of the open-ended items of the CABI also showed that deficit perspectives were dominant in the given sample (Natesan et al., 2012).
Research indicates that beliefs about teaching students from diverse settings may differ by the teacher’s ethnicity. Bakari (2003) found that African American teachers were more willing to teach African American students than European American teachers were. Where European American teachers used more negative adjectives to describe African American students, African American teachers used more positive adjectives (Gottlieb, 1964). Kea et al. (2002) reported that teachers feel more prepared to teach students from backgrounds similar to their own. Although these studies show that teacher beliefs may differ by ethnicity, no study has quantified this variation. Therefore, the present study sought to quantify differences in the beliefs of African American, European American, and Hispanic American urban teachers.
Method
Participants
In 2006, 54 campuses in a Houston metropolitan school district volunteered to participate in a professional development program that specifically addressed culturally responsive pedagogy. Out of the 3,731 in-service teachers who participated in the professional development program, 1,253 volunteered to complete the CABI. Despite the promise of anonymity, one fourth of the respondents did not report their ethnicity, and their responses were not included in this study. We included only African American, European American, and Hispanic American teachers because the total number of teachers in other ethnic groups was inadequate for multigroup confirmatory factor analysis (n < 50). Approximately 2% of the remaining responses were missing and therefore imputed using matching cases methodology in LISREL 8.80, followed by listwise deletion. Table 1 presents descriptive statistics of the students in the school district and the teachers in the initial (N = 1,253) and final samples (N = 860).
Table 1. Descriptive Statistics of Participants
Note: As much as 24.91% of the teachers in the initial sample did not indicate their ethnicity.
Instrument
Each item in the inventory uses a 4-point Likert scale (strongly disagree, disagree, agree, and strongly agree). A principal component analysis with varimax rotation has indicated that the 36 items form 8 factors (Natesan et al., 2012). Based on item content, these factors were named Teacher Beliefs, School Climate, Home and Community Support, Cultural Awareness, Curriculum and Instruction, Culturally Responsive Classroom Management, Cultural Sensitivity, and Teacher Efficacy.
The Teacher Beliefs factor contained the most items (8) and explained the most variance in total scores (7.78%). These eight items plus one additional item (Family Support) that referred to African American student families were considered for analysis. Table 2 presents item wording, response frequencies, corrected item-total correlations, and Cronbach’s α if the item were to be deleted from the factor. De Vaus (2002) and Leong and Austin (2006) have suggested minimal corrected item-total correlations (r_total) of .3 and .4, respectively. Based on these suggestions, Family Support (r_total = .24) was deleted. Although r_total = .38 for Similar Culture was lower than .4, the item was retained at this stage until there was additional evidence for discarding it from the factor. After dropping Family Support from the analysis, the internal consistency of scores was acceptable (α = .79, 95% confidence interval [CI] = [.77, .81]). The internal consistency was comparable for African American (α = .79, 95% CI [.76, .82]) and European American (α = .76, 95% CI [.72, .79]) teachers, and slightly higher for Hispanic American teachers (α = .81, 95% CI [.75, .86]).
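The two item statistics used for this screening can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' analysis: the function names `pearson`, `cronbach_alpha`, and `corrected_item_total` are hypothetical, and the `toy` responses are invented 4-point scores, not CABI data.

```python
from statistics import variance


def pearson(x, y):
    """Pearson correlation of two equally long score lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of respondent scores per item."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]  # sum score per respondent
    item_variance = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


def corrected_item_total(items, index):
    """Correlation between one item and the sum of the remaining items."""
    rest = [sum(row) - row[index] for row in zip(*items)]
    return pearson(items[index], rest)


# Hypothetical responses: 3 items (rows) by 6 respondents (columns).
toy = [
    [1, 2, 3, 4, 2, 3],
    [2, 2, 3, 4, 1, 3],
    [1, 3, 3, 3, 2, 4],
]
print(round(cronbach_alpha(toy), 2))
print([round(corrected_item_total(toy, i), 2) for i in range(len(toy))])
```

An item whose corrected item-total correlation falls below the chosen cutoff (.3 or .4 in the sources cited above) would be a candidate for removal, which is the rule applied to Family Support.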
Table 2. Descriptive Statistics and Internal Consistency of Items
Note: AA = African American; 95% CI = 95% confidence interval; ISD = Independent School District.
Procedures
We used structural equation modeling (Analysis 1) and item response modeling (Analysis 2) to analyze measurement invariance and psychometric properties of the items, respectively.
Analysis 1
We investigated measurement invariance across African American (AA), European American (EA), and Hispanic American (HA) teachers using multigroup confirmatory factor analysis (see, for example, Wu, Li, & Zumbo, 2009). Given the ordinal nature of the Likert-scaled data, we used polychoric correlation and asymptotic correlation matrices using unweighted least squares estimation. Instead of fixing the factor variance, we set the factor loading of Acting White equal to 1 because we considered possible differences in factor variances between ethnic groups of interest. Model fit was deemed acceptable when comparative fit index (CFI) ≥ 0.95, root mean square error of approximation (RMSEA) ≤ 0.05, and standardized root mean square residual (SRMR) ≤ 0.08 (Hu & Bentler, 1999; MacCallum, Browne, & Sugawara, 1996).
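The three cutoffs can be combined into a single check, sketched here in Python. The function name `acceptable_fit` is hypothetical; the thresholds are the ones stated in this paragraph, and the first call uses the fit indices reported in the Results for the 5-item model among Hispanic American teachers.

```python
def acceptable_fit(cfi, rmsea, srmr):
    """Return True when all three cutoffs used in this study are met."""
    return cfi >= 0.95 and rmsea <= 0.05 and srmr <= 0.08


# 5-item model, Hispanic American teachers (values reported in the Results).
print(acceptable_fit(cfi=0.985, rmsea=0.087, srmr=0.055))

# Hypothetical indices that satisfy every cutoff.
print(acceptable_fit(cfi=0.97, rmsea=0.04, srmr=0.05))
```

Requiring all three indices at once is a strict reading; a model with an excellent CFI can still be rejected on RMSEA alone, which is what happens for the Hispanic American group.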
First, a single-factor model was fit to the pooled data and to the AA, EA, and HA teachers separately. When a model did not fit a certain group well, the factor structure was modified and the new factor structure was fitted again. Errors between items were not allowed to covary even if modification indices suggested otherwise, because this would be an indication of LID, which is an undesirable psychometric property.
Second, we tested four nested models with increasing levels of measurement invariance: equivalence of factor structure (configural invariance, Model 1), factor loadings (metric invariance, Model 2), item intercepts (scalar invariance, Model 3), and error variances (error variance invariance, Model 4) across groups (see Meredith, 1993). Each level of invariance must be established before the next, more restrictive model is tested. Although earlier studies considered groups comparable once scalar invariance was established (e.g., Vandenberg & Lance, 2000), more recent research (e.g., Deshon, 2004) requires error variance invariance before latent means can be compared across groups. Chen (2007) has suggested that the configural invariance model be retained when ΔCFI ≥ -0.005 and ΔRMSEA < 0.01, or ΔSRMR < 0.025, and the other models be retained when ΔCFI ≥ -0.005 and ΔRMSEA < 0.01 or ΔSRMR < 0.05. These differences in fit indices were computed by subtracting the fit index of the less restrictive model from the fit index of the more restrictive model. Finally, two structural invariance models, factor variance invariance (Model 5) and latent mean invariance (Model 6), were fitted to the data.
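The comparison rule can be sketched in Python as follows. Two things here are assumptions rather than statements from the source: the sentence above is read as "ΔCFI ≥ -0.005 and (ΔRMSEA < 0.01 or ΔSRMR < limit)", and the names `fit_delta` and `retain_model` are hypothetical. The fit values in the demonstration are invented for illustration.

```python
def fit_delta(more_restrictive, less_restrictive):
    """Change in a fit index: more restrictive model minus less restrictive."""
    return more_restrictive - less_restrictive


def retain_model(d_cfi, d_rmsea, d_srmr, srmr_limit):
    """Apply the retention rule under the grouping assumed in the lead-in.

    srmr_limit is 0.025 or 0.05, depending on the comparison being made.
    """
    return d_cfi >= -0.005 and (d_rmsea < 0.01 or d_srmr < srmr_limit)


# Hypothetical fit indices for two nested models.
less = {"cfi": 0.990, "rmsea": 0.040, "srmr": 0.050}
more = {"cfi": 0.988, "rmsea": 0.043, "srmr": 0.060}

deltas = {name: fit_delta(more[name], less[name]) for name in less}
print(retain_model(deltas["cfi"], deltas["rmsea"], deltas["srmr"], srmr_limit=0.05))
```

Under the alternative grouping, "(ΔCFI ≥ -0.005 and ΔRMSEA < 0.01) or ΔSRMR < limit", a model could be retained on ΔSRMR alone, so the choice of grouping matters for borderline comparisons.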
Analysis 2
We fitted a graded response multilevel model (GRMM) to the response data. A multilevel extension of Samejima’s (1969) graded response model (GRM), the GRMM models the response $Y_{pi}$ of person $p$ to item $i$ using item-level and person-level parameters (Natesan, Limbers, & Varni, 2010). At the item level, each 4-point Likert-scaled item $i$ was modeled by one discrimination parameter $a_i$ and three threshold parameters $b_{i1}$ through $b_{i3}$. With $F$ denoting the cumulative link function of the GRM, the probability that teacher $p$ with belief level $\theta_p$ responds in category $k$ to item $i$ is given by
\[ P(Y_{pi} = k \mid \theta_p) = F\bigl(a_i(\theta_p - b_{i,k-1})\bigr) - F\bigl(a_i(\theta_p - b_{ik})\bigr), \qquad k = 1, \dots, 4, \]
where $b_{i0} = -\infty$ and $b_{i4} = \infty$. At the person level, the latent trait level $\theta_p$ was regressed on ethnicity using dummy coding. We estimated model parameters within a Bayesian framework (see, for example, Fox, 2010). A standard normal distribution
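The item-level part of the model can be sketched in Python. This is a minimal illustration, not the authors' estimation code: the logistic link is an assumption (a normal-ogive link is also common in Bayesian treatments), the names `logistic` and `grm_probs` are hypothetical, and the parameter values in the demonstration are invented.

```python
import math


def logistic(x):
    """Logistic cumulative distribution function (assumed link)."""
    return 1.0 / (1.0 + math.exp(-x))


def grm_probs(theta, a, thresholds):
    """Category probabilities for one item under a graded response model.

    thresholds holds the ordered values b_1 < ... < b_{K-1}; the outer
    boundaries -inf and +inf are handled by fixing the cumulative
    probabilities at 1 and 0.
    """
    cumulative = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Hypothetical item: discrimination 1.5, thresholds at -1, 0, and 1.
for theta in (-1.5, 0.0, 1.5):
    print(theta, [round(p, 3) for p in grm_probs(theta, 1.5, [-1.0, 0.0, 1.0])])
```

With the items worded as in this study, larger values of the latent trait shift probability toward the agree categories, so the mass moves from the first to the last entry of the returned list as `theta` increases.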
Unidimensionality, LID, and item fit were examined using posterior predictive model checking (PPMC; see, for example, Sinharay, Johnson, & Stern, 2006). PPMC compares the observed data $Y^{\mathrm{obs}}$ with replicated data $Y^{\mathrm{rep}}$ using discrepancy measures $D(Y, \theta, \xi)$, where $\theta$ and $\xi$ denote the person and item parameters, respectively. The posterior predictive p value (PPP) of a discrepancy measure $D$ equals
\[ \mathrm{PPP} = P\bigl(D(Y^{\mathrm{rep}}, \theta, \xi) \ge D(Y^{\mathrm{obs}}, \theta, \xi) \mid Y^{\mathrm{obs}}\bigr). \]
PPP values near 0 or 1 indicate model misfit. We used three discrepancy measures, computed using replicated response sets based on the posterior draws. Yen’s $Q_3$ statistic (Yen, 1993) for items $i$ and $j$ is the correlation between residuals across persons. This statistic was used to detect violations of the unidimensionality and LID assumptions. Item fit ($Q_i$) was examined using sums of squared standardized Bayesian residuals (see Fox, 2010, for details).
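Both quantities are simple to compute once the posterior draws are available, as this Python sketch shows. It is an illustration only: the names `ppp` and `q3` are hypothetical, the inputs are assumed to have been produced by an MCMC run that is not shown, and the numbers in the demonstration are invented.

```python
def ppp(replicated, observed):
    """Posterior predictive p value estimated from posterior draws.

    replicated[m] and observed[m] are the discrepancy measure evaluated on
    the replicated and the observed data at posterior draw m. Both vary by
    draw because the discrepancy depends on the sampled parameters.
    """
    hits = sum(rep >= obs for rep, obs in zip(replicated, observed))
    return hits / len(replicated)


def q3(residuals_i, residuals_j):
    """Yen's Q3: Pearson correlation of two items' residuals across persons."""
    n = len(residuals_i)
    mi, mj = sum(residuals_i) / n, sum(residuals_j) / n
    sij = sum((x - mi) * (y - mj) for x, y in zip(residuals_i, residuals_j))
    sii = sum((x - mi) ** 2 for x in residuals_i)
    sjj = sum((y - mj) ** 2 for y in residuals_j)
    return sij / (sii * sjj) ** 0.5


# Hypothetical discrepancies at four posterior draws.
print(ppp([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 2.0, 2.0]))

# Hypothetical residuals of two items for five persons.
print(round(q3([0.2, -0.1, 0.4, -0.3, 0.0], [0.1, -0.2, 0.3, -0.1, 0.1]), 2))
```

In practice `q3` would itself serve as the discrepancy measure, evaluated once on the observed and once on the replicated responses at every draw, and `ppp` would then summarize the comparison for each item pair.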
Results
Analysis 1
The single-factor model with all eight items had adequate fit for the pooled data and each of the three ethnic groups (see Table 3). The configural invariance model did fit adequately, but the metric invariance model did not (SRMR = 0.087). Modification indices suggested that the factor loadings of Poverty (Δχ² = 55.9 for EA, Δχ² = 46.25 for AA) and Similar Culture (Δχ² = 23.7 for EA, Δχ² = 16 for HA) were different across groups. Therefore, we deleted these two items and fitted a single-factor model with six items. This model did not fit the data for Hispanic American teachers adequately (RMSEA = 0.092). We decided to remove Appear Lazy from the analysis. Although its factor pattern coefficient was sufficiently large (.74), the fact that this item talks about “certain ethnic groups” and not African American students in particular made it the most appropriate candidate for deletion. Even with five items, the model did not have adequate fit for Hispanic American teachers (χ² = 9.16, CFI = 0.985, RMSEA = 0.087, 90% CI [.000, .174], SRMR = 0.055). It was unclear why the model did not fit HA teachers well. Although the sample was small (n = 112), this is adequate to fit a 5-item factor model. We decided to drop this group from our analysis and discuss the limitations of this decision in the Discussion section. Our aim was subsequently changed to examine the measurement invariance of the 5-item factor across African American and European American teachers.
Table 3. Model Fit of Measurement Invariance Models Across African American, European American, and Hispanic American Teachers for the 8-Item and 6-Item Factor
Note: M0, M1, M2 = Model 0, Model 1, and Model 2, respectively; AA = African American; EA = European American; HA = Hispanic American.
The configural, metric, scalar, error variance, and factor variance invariance models fit the data, and each comparison passed Chen’s criteria (see Table 4). The factor variance invariance model was retained as the final model. The factor pattern coefficients for Acting White (.55), Behavioral Problems (.81), Eager to Excel (.85), Bringing Strengths (.71), and Family Involvement (.49) all exceeded .4. All models tested using multigroup confirmatory factor analysis (CFA) were statistically significant at the .05 level. The relatively large difference in fit between the latent mean invariance and factor variance invariance models suggests that the latent means are unequal for the two groups. The difference between the means of African American (M = 0 according to model specification, SD = 0.55) and European American (M = 0.31, SD = 0.55) teachers was medium (Cohen’s d = .57). Cohen’s d for scores obtained by summing the item scores was .4.
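The latent effect size can be recomputed from the reported means and standard deviations, as in this Python sketch. The function name `cohens_d` is hypothetical, and the pooled standard deviation is taken as the root mean of the two variances, which is an assumption; because both groups share SD = 0.55 under the factor variance invariance model, any weighting of the groups gives the same pooled value.

```python
import math


def cohens_d(mean_1, sd_1, mean_2, sd_2):
    """Standardized mean difference with an equally weighted pooled SD."""
    pooled_sd = math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_2 - mean_1) / pooled_sd


# Latent means and SDs reported for African American (reference group)
# and European American teachers.
d = cohens_d(mean_1=0.0, sd_1=0.55, mean_2=0.31, sd_2=0.55)
print(round(d, 3))
```

The result is close to the reported value of .57; a small gap is to be expected when the inputs are the rounded means and standard deviations rather than the unrounded estimates.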
Table 4. Model Fit of Measurement Invariance Models Across African American and European American Teachers for the 5-Item Factor
Note: M0, M1, M2, M3, M4, M5, M6 = Model 0, Model 1, Model 2, Model 3, Model 4, Model 5, and Model 6, respectively; AA = African American; EA = European American; CFI = comparative fit index; RMSEA = root mean square error of approximation; SRMR = standardized root mean square residual; 95% CI = 95% confidence interval.
Analysis 2
Table 5 shows posterior summaries of the $Q_3$ statistic. There is no indication that unidimensionality or local item independence is violated. Similarly, none of the item fit indices $Q_i$ indicated that the GRM was inappropriate for analyzing these items (see Table 6). The posterior means and the lower limits of the 95% credibility intervals of all discrimination parameters were greater than 1, except for Item 8. Figure 1 orders the items according to the expected response they would elicit from the population, in decreasing order from Bringing Strengths to Family Involvement. The lightest gray rectangle indicates the strongly disagree category and the darkest gray indicates the strongly agree category, with intermediate categories represented by increasingly dark shades of gray. A person whose teacher beliefs value lies toward the left of the scale in the figure has more positive beliefs about African American students.
Table 5. Posterior Predictive Model Checking for Unidimensionality and Local Item Independence
Note: 95% CI = approximate 95% confidence interval; PPP = posterior predictive p value.
Table 6. Posterior Summaries of Item Parameters and Item Fit Indexes
Note: 95% CI = approximate 95% confidence interval; PPP = posterior predictive p value.

Figure 1. Five items in decreasing order of expected response. Gray scales indicate the regions on the latent Teacher Beliefs factor where each category is the most likely response to an item.
A Welch’s t test showed a statistically significant difference in teacher beliefs (i.e., $\theta_p$) between African American (M = -0.24, SD = 0.97) and European American teachers (M = 0.43, SD = 0.78), t(641.91) = -6.59, p < .001, with a large effect (Cohen’s d = .87). The posterior mean of the regression coefficient for the effect of African American relative to European American ethnicity was -.56, with a posterior 95% credibility interval of [-.73, -.40]. The plot of the kernel density estimates in Figure 2 shows a bimodal distribution for the teacher beliefs of the African American teachers in the sample. The higher mode for the African American teachers is around 0.33 and is approximately equal to the mode for European American teachers. The other mode is located at -1.19.

Figure 2. Kernel density estimates of the distribution of teacher beliefs of African American teachers (solid) and European American teachers (dashed) in the sample.
Discussion
Teacher beliefs about students of color have an impact on the performance of these students (Castro-Atwater, 2008). Teacher beliefs vary both by ethnicity of the students (McCombs & Gay, 2001) and ethnicity of the teacher (Bakari, 2003). A better understanding of teachers’ beliefs may clarify the reluctance of many teachers (Bleicher, 2011) to teach in a culturally diverse setting (Gay, 2010). The present study provided a detailed psychometric analysis of nine items from the CABI (Webb-Johnson & Carter, 2005) aimed at constructing a quantitative measure of teacher beliefs about African American students. We investigated (a) measurement invariance of Teacher Beliefs across African American, European American, and Hispanic American teachers; (b) unidimensionality, local item independence, and item fit of the retained items; and (c) the difference in latent means between African American and European American teachers.
Four of the original nine items were deleted from the analysis because they were not sufficiently correlated with the construct or because factor pattern coefficients were not equal across groups, indicating lack of measurement invariance. The remaining 5-item factor did not fit adequately for Hispanic American teachers. This may indicate that the construct is defined differently for Hispanic American teachers. The question of how to measure beliefs of Hispanic American teachers about African American students remains open.
Measurement invariance for the 5-item factor did hold across African American and European American teachers. This suggests that the construct is defined similarly for these two groups. Moreover, various model fit indices indicated that a unidimensional GRM was appropriate for the response data. The five items seem to form a quantitative measure that could be used to compare the beliefs of African American and European American teachers about African American students. Ordering these five items according to the expected response in the population provides some insight into the construct. Teachers are most likely to agree that they experience difficulties in involving African American families in education and that African American students have more behavioral problems than other students. Teachers are less likely to agree that African American students are not as eager to excel and do not bring as many strengths to the classroom as other students.
A latent means model showed that on average European American teachers hold less positive beliefs about African American students than African American teachers do. A similar conclusion could be drawn from the GRMM. The effect size of this difference was medium to large. In this way, the present study provides quantitative support for the qualitative results of Bakari (2003), who found that African American teachers were more willing to teach African American students than European American teachers were, and for Kea et al. (2002), who found that teachers feel more prepared to teach students from backgrounds similar to their own.
Although African American teachers held more positive beliefs than European American teachers on average, a closer look at the distribution of teacher beliefs reveals another picture. The distribution of the teacher beliefs of African American teachers in our sample was almost bimodal. The higher mode was located near the mode of the distribution of teacher beliefs of European American teachers. The lower mode corresponds to considerably more positive beliefs about African American students. Therefore, it would be inaccurate to conclude that all African American teachers have more positive beliefs than European American teachers. Instead, a subgroup of African American teachers holds considerably more positive beliefs than European American teachers. A possible explanation could be the determination of some African American teachers to debunk the stereotype about the lower academic achievement of African Americans (Walker nee Haynes, 2011). At the same time, another subgroup of African American teachers holds beliefs similar to those of European American teachers. Perhaps this is because most teachers are still trained in Eurocentric teaching models. Further research may be able to explain the reasons for this bifurcation in African American teachers’ beliefs about African American students.
Limitations of the Present Study
The 5-item factor shows promise as a quantitative measure due to the high level of item fit and measurement invariance across African American and European American teachers. The quantitative nature of the present study allowed for a larger sample compared to qualitative studies. This sample size may have facilitated detecting the bimodal distribution of the beliefs of African American teachers. On the other hand, a complex construct such as teacher beliefs about African American students is difficult to capture with only five Likert-scaled items. Although these particular items show desirable psychometric properties, more items should be developed to increase measurement precision and construct coverage before researchers can confidently measure teacher beliefs. The fact that measurement invariance could not be established for Hispanic American teachers is another limitation. More research is necessary before a quantitative measure of Hispanic American teachers’ beliefs about African American students can be constructed.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
