Abstract
This study aimed to develop and validate the Evaluator Competencies Assessment Tool (ECAT) Cultural Competencies Subscale, which measures cultural competence among evaluators. By addressing the scarcity of validated tools in this area, the study offers professionals a valuable resource to assess their strengths and areas for improvement. The subscale comprises 11 items rated on a seven-point Likert scale, derived from the American Evaluation Association (AEA) Evaluator Competencies. Validation involved a survey of 116 AEA members, employing multiple validity procedures. The subscale demonstrated excellent internal consistency (α=.96) and significant correlation with the Cultural Competence of Program Evaluators scale, confirming its convergent validity. However, the subscale's structural validity yielded mixed results, indicating the need for further exploration. Moreover, the demographic analysis highlighted underrepresentation of Asian and younger members. Overall, the ECAT Cultural Competencies Subscale shows promise for assessing cultural competency, but refinement of its factor structure and additional research are needed.
Keywords
Evaluators are cultural beings who bring their background and identities into their work. Likewise, evaluators need to account for the culture of the people and communities where they work. As Frierson et al. (2010) put it, “There are no culture-free evaluations, (and) there are no culture-free evaluators” (p. 77). Therefore, it is important for evaluators to be culturally competent to best serve the communities they work in. This is particularly significant in the United States, where evaluators serve diverse communities, with about 40% of the population being people of color according to the U.S. Census Bureau (2019).
The American Evaluation Association (AEA), the largest professional organization for evaluators across the United States, with over 6,000 members (American Evaluation Association, n.d.), has set out guidelines for cultural competence in its public statement on the topic. The statement defines culture as “the shared experience of people, including their language, values, customs, beliefs, and mores,” and cultural competence as the ability of evaluators to respect and be “prepared to engage with diverse segments of communities to include cultural and contextual dimensions important to the evaluation” (AEA, 2011, pp. 1–2). Moreover, it characterizes cultural competence as a long-term process of learning and relearning attitudes with the intention to be fair and equitable to stakeholders and warns that cultural insensitivity and stereotypes lead to systemic bias that threatens evaluation validity (AEA, 2011). Furthermore, the statement acknowledges that culture is embedded in all evaluation processes (AEA, 2011).
In 2018, the AEA introduced an official list of evaluation competencies, comprising 49 competencies grouped into five domains: professional practice, methodology, context, planning & management, and interpersonal (AEA, 2018). The list was developed through a task force composed of prominent scholars and practitioners, with engagement and endorsement from AEA members (AEA, 2018). The process involved aligning the competencies with other foundational AEA documents, including the statement on cultural competence (AEA, 2018). Therefore, cultural competence statements are integrated within all five domains. This study aimed to develop and validate a self-assessment tool for cultural competencies based on the AEA Evaluator Competencies, contributing to AEA's efforts to highlight the significance of cultural competence.
Cultural Competence in Evaluation Literature
The importance of cultural competence in evaluation is evident in the growing attention it has received in the evaluation literature (Chouinard & Cousins, 2009; Hood et al., 2005, 2014; Thomas & Parsons, 2017). This is demonstrated by the two books edited by Hood et al. (2005, 2014) on the intersection between culture and evaluation; they contend that the term “culturally competent evaluation” was mentioned in at least 283 articles or chapters between 1990 and 2013. Similarly, Chouinard and Cousins (2009) identified over 50 empirical studies on cultural topics within evaluation, including cultural competence, between 1991 and 2008; they thematized the findings of these studies into a framework with five dimensions for practitioners to consider in their practice: personal (acknowledging and accounting for subjectivity), ecological (considering the unique community context), relational (promoting interaction among evaluation actors for shared knowledge), organizational (balancing framework requirements and community needs), and methodological (critically evaluating evaluation theories, data collection, and analysis designs). Furthermore, the growing attention in the literature has been accompanied by more academic courses and professional development workshops dedicated to cultural competence in evaluation (Thomas & Parsons, 2017).
In addition to cultural competence, the discussion about cultural influence in evaluation extends to related concepts such as multicultural, cross-cultural, and culturally responsive evaluations (Samuels & Ryan, 2011). These concepts draw from academic and evaluation traditions that critically examine social power dynamics and advocate for social justice, including critical race theory, indigenous knowledge, and responsive evaluation (Hopson, 2009; Greene, 2006). The underlying theme shared by these concepts is the need to be attentive to the lived experiences of indigenous, minority, and marginalized communities throughout the evaluation process (Hopson, 2009).
To engage in culturally competent evaluation practice, evaluators can utilize a range of strategies. First, evaluators can enhance cultural competence by assembling diverse evaluation teams that reflect the communities they serve, either through including evaluators from those communities or utilizing cultural bridges to supplement cultural knowledge (Frierson et al., 2010). Second, employing qualitative and mixed methods designs allows for a more comprehensive understanding of cultural nuances and complexities (Frierson et al., 2010). Third, evaluators should strive for multicultural validity by using measures that are sensitive to different cultural norms and values (Kirkhart, 2010). Fourth, disaggregating data along cultural groups when feasible provides insights into variations within the population being evaluated (Frierson et al., 2010). Lastly, to effectively communicate evaluation findings, it is important to adjust the way information is presented, such as translating reports for people who do not speak English and using simple language that everyone can understand to reach a wide range of people (Frierson et al., 2010).
Measuring Cultural Competence in Evaluation
Despite the growing interest in cultural competence in evaluation literature, not much has been done to develop validated measures of cultural competence. One notable exception is the Cultural Competence of Program Evaluators (CCPE) scale, which is the only published and validated tool found that is specifically designed to measure cultural competence among program evaluators (Dunaway et al., 2012). The CCPE offers a valuable foundation for evaluators to evaluate and enhance their cultural competence by examining three key dimensions: cultural knowledge, cultural skills, and cultural awareness (Dunaway et al., 2012). By addressing these dimensions, the CCPE acknowledges the significance of understanding diverse cultural backgrounds, possessing cross-cultural engagement skills, and recognizing one's own cultural biases and assumptions (Dunaway et al., 2012).
The CCPE serves as a valuable reference point for the development of new measures of cultural competence, given its positive correlations with established tools such as the Multicultural Counseling Inventory that is widely used in the counseling field (Dunaway et al., 2012). Moreover, evaluators who have undergone cultural competence training were shown to achieve higher scores on the CCPE (Dunaway et al., 2012). Therefore, the CCPE scale was utilized in this study as a comparative measure to investigate its relationship with the statements on cultural competencies embedded in the 2018 AEA list of evaluation competencies.
Development of the ECAT Cultural Competencies Subscale
The study's subscale was derived from the Evaluator Competencies Assessment Tool (ECAT), which allows evaluators to self-assess their competencies based on the AEA list of evaluation competencies (Cho et al., 2022). The ECAT subdivides double-barreled statements, those addressing more than one issue, to create a list of 93 items from the original 49 competencies. The ECAT was validated through Confirmatory Factor Analysis (CFA), and more experienced evaluators were shown to score higher on it (Cho et al., 2022). The ECAT Cultural Competencies Subscale was developed for this study by selecting 11 items that measure cultural competence from all five domains of the ECAT scale. The subscale retained the ECAT's seven-point Likert scale, ranging from zero (entry/novice) to six (mastery/expert), as per the template provided by Ghere et al. (2006) and the Minnesota Evaluation Studies Institute (2018). The subscale is presented in Appendix A.
Method
The validity of the ECAT Cultural Competencies Subscale was evaluated through a survey administered to members of the AEA. A power analysis was performed using semPower to determine the necessary sample size for a one-factor CFA involving 11 observed variables and one latent variable. The analysis indicated that a sample size of 322 participants was required to detect a small effect size difference of 0.05 between the null and alternative models, with a power of 90% and a significance level of 0.05. However, considering the data collection method and timeframe, this study aimed to recruit a minimum of 110 participants, adhering to the general guideline of 10 participants per item (Kline, 2016). Issues arising from low power are addressed in the discussion section.
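The sample-size target described above can be approximated with a short noncentrality-based calculation. The sketch below is not semPower's code and its function names are illustrative. It assumes that the reported effect of 0.05 is an RMSEA value, that the one-factor model has 22 free parameters (11 loadings and 11 residual variances, with the factor variance fixed), and that power is read from a noncentral chi-square distribution with noncentrality (N − 1) × df × RMSEA².

```python
import math

def cfa_df(n_items):
    """df for a one-factor CFA: unique (co)variances minus free parameters."""
    moments = n_items * (n_items + 1) // 2   # 11 items -> 66
    free_params = 2 * n_items                # loadings + residual variances (assumed)
    return moments - free_params

def chi2_sf_even(x, df):
    """Upper-tail probability of a central chi-square; closed form for even df."""
    half, term, total = x / 2.0, 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def ncx2_sf_even(x, df, ncp, terms=150):
    """Noncentral upper tail as a Poisson-weighted mixture of central tails.

    150 mixture terms are ample for the small noncentrality values used here.
    """
    weight, total = math.exp(-ncp / 2.0), 0.0
    for j in range(terms):
        total += weight * chi2_sf_even(x, df + 2 * j)
        weight *= (ncp / 2.0) / (j + 1)
    return total

def chi2_critical_even(df, alpha):
    """Critical value found by bisection on the upper-tail probability."""
    lo, hi = 0.0, 10.0 * df + 100.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_sf_even(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return hi

def required_n(df, rmsea, alpha=0.05, power=0.90):
    """Smallest N whose power to reject exact fit reaches the target."""
    crit = chi2_critical_even(df, alpha)
    f0 = df * rmsea ** 2          # population discrepancy implied by the RMSEA
    n = 2
    while ncx2_sf_even(crit, df, (n - 1) * f0) < power:
        n += 1
    return n

df = cfa_df(11)
n_needed = required_n(df, rmsea=0.05)
print(df, n_needed)
```

If these assumptions match the settings actually used, the printed sample size should fall in the neighborhood of the 322 participants reported above; a different effect-size definition would give a different figure.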
Participants were recruited through the AEA Research Task Force, which provided the email addresses of 1,000 randomly selected AEA members. The sole inclusion criterion was AEA membership. The survey was open for four weeks in November and December 2020. Statistical analyses were conducted in SPSS and R.
Participants
The survey received responses from 152 participants, yielding a response rate of 15.2% based on the targeted sample of 1,000 AEA members. Although this response rate is low, previous studies focusing on AEA members as the target sample have also encountered similar challenges (Coryn et al., 2016; Dunaway et al., 2012). Among the participants, 44 had missing data, and 36 of them were excluded due to having more than three missing responses on the ECAT Cultural Competencies Subscale. For the nine participants with some missing responses, the missing values were imputed using the average of the items within the scale. Additionally, five participants who had missing data on the CCPE scale and years of experience were excluded from the correlation analyses.
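The exclusion and imputation rules described above can be sketched as follows. This is an illustration only, with hypothetical responses and a hypothetical function name. It assumes that "the average of the items within the scale" means each respondent's own mean across the items they answered; imputing with item means across respondents would be an equally plausible reading.

```python
def clean_responses(rows, max_missing=3):
    """Drop respondents with more than `max_missing` blanks, mean-impute the rest."""
    kept = []
    for row in rows:
        answered = [v for v in row if v is not None]
        if len(row) - len(answered) > max_missing:
            continue                       # excluded from the analysis
        person_mean = sum(answered) / len(answered)   # assumed imputation value
        kept.append([person_mean if v is None else v for v in row])
    return kept

# Hypothetical 11-item responses on the 0-6 rubric (None = skipped item).
rows = [
    [4, 5, 4, 3, 5, 4, 4, 5, 3, 4, 4],                 # complete
    [2, None, 3, 3, 2, None, 3, 2, 3, 3, 2],           # 2 missing -> imputed
    [5, None, None, None, None, 6, 5, 5, 6, 5, 5],     # 4 missing -> excluded
]
cleaned = clean_responses(rows)

response_rate = 152 / 1000          # respondents / invited members
final_n = 152 - 36                  # respondents minus exclusions
print(len(cleaned), response_rate, final_n)
```

The last two lines simply restate the arithmetic behind the reported 15.2% response rate and the final sample of 116.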
A final sample of 116 participants was included in the study. The gender distribution of the sample consisted of 73% female, 22% male, and 5% other. In terms of race and ethnicity, the sample comprised 68% White, 10% Black, 6% Mixed Race, 4% Latinx, 3% Native American, 2% Asian, and 7% other. Regarding educational attainment, 3% had a bachelor's degree, 47% had a master's degree, 45% had a doctorate, and 5% had other qualifications. The age range of participants was 25 to 76 years, with an average age of 49.3 years (SD = 12.5). The range of years of evaluation experience was 1 to 51 years, with an average of 15.5 years (SD = 11.0).
To assess the generalizability of the sample to the AEA membership, deidentified demographic data were obtained from the AEA, including information on gender identity, race, ethnicity, highest degree received, and year of birth for 5,173 members (AEA, 2021). However, unidentified data and data categorized as “other” were not included in the analysis. Chi-square tests were conducted to assess the representativeness of the sample in terms of gender identity, race and ethnic identity, and highest degree received. A z-test was performed to examine the between-group difference in age. A comparison of the demographic composition of the study sample and the AEA membership appears in Table 1.
Table 1. Sample Demographics Compared to AEA Membership (AEA, 2021).
Note. Unidentified data and data categorized as “other” were not included.
AEA = American Evaluation Association; SD = standard deviation.
The results indicated that the study sample was similar to the AEA membership in terms of gender identity and highest degree received. However, there was a difference in race and ethnic identity (χ²(5) = 11.12, p = .049). Further analysis revealed that Asians were underrepresented in the sample (2% compared to 8% in the AEA membership), and Native Americans were slightly overrepresented (3% in the sample compared to 1% in the AEA membership). The average age of the study sample also differed from that of the AEA membership (z = 4.73, p < .001), with the study sample older than the AEA membership.
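The p-values attached to the two test statistics above can be checked directly from the reported values. The sketch below uses the closed-form upper tail of a chi-square distribution with 5 degrees of freedom and a two-sided normal tail; the function names are illustrative, and the two-sided reading of the z-test is an assumption because the text does not state it.

```python
import math

def chi2_sf_df5(x):
    """Upper-tail probability of a chi-square with 5 df (closed form for odd df)."""
    root = math.sqrt(x)
    tail = math.erfc(math.sqrt(x / 2.0))
    density_part = math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0)
    return tail + density_part * (root + root ** 3 / 3.0)

def two_sided_z_p(z):
    """Two-sided p-value for a standard normal test statistic (assumed two-sided)."""
    return math.erfc(abs(z) / math.sqrt(2.0))

p_race = chi2_sf_df5(11.12)   # reported chi-square for race and ethnic identity
p_age = two_sided_z_p(4.73)   # reported z for the age comparison
print(round(p_race, 3), p_age < 0.001)
```

Under these formulas the results are consistent with the reported p = .049 and p < .001. The underlying counts are not reproduced here, so the test statistics themselves are taken as given.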
Measures
The survey utilized in this study was designed on Qualtrics and consisted of two main components. The first component was the ECAT Cultural Competencies Subscale, which comprised 11 items specifically developed for this study (see Appendix A). The second component included the 26-item CCPE scale, which was included with permission from the scale creators and adapted for use in Qualtrics. In addition to the scales, demographic information such as gender, race, education level, age, and years of evaluation experience was collected. A copy of the survey can be found online at https://cgu.co1.qualtrics.com/jfe/form/SV_4NMEqEirM7tHWGq.
Procedure
The validity of the subscale was evaluated through four methods. First, internal consistency, which examines the degree of agreement among items measuring the same construct, was assessed using Cronbach's alpha based on criteria established by George and Mallery (2003). Second, convergent validity, which examines whether different measures of the same construct yield similar results, was assessed through correlation analysis between the subscale and the CCPE scale (Bhattacherjee, 2012). Third, evidence of construct validity was further examined through correlation analysis between the subscale and years of evaluation experience, as previous research has indicated a positive relationship between experience and evaluation competency (Cho et al., 2022; Dewey et al., 2008). Fourth, the structural validity of the subscale, assessing whether the items align with the proposed structure of 11 items under one construct, was investigated through confirmatory factor analysis based on established criteria by Schreiber et al. (2006).
Results
The participants’ scores on the ECAT Cultural Competencies Subscale ranged from 21 to 77, with a mean of 54.6 (SD = 12.3), while scores on the CCPE scale ranged from 72 to 129, with a mean of 105.4 (SD = 11.6). Both measures exhibited a normal distribution, as indicated by their skewness and kurtosis values falling within the acceptable range.
First, the subscale demonstrated excellent internal consistency, with a Cronbach's alpha of .96, exceeding the threshold for acceptability under George and Mallery's (2003) criteria. The inter-item correlations further supported the subscale's internal consistency.
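For readers less familiar with the statistic, Cronbach's alpha compares the sum of the item variances with the variance of the total score. The sketch below uses hypothetical scores and illustrative function names, not the study data. The verbal labels follow the rule of thumb commonly attributed to George and Mallery (2003) and should be read as an assumption about the cutoffs applied in this study.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per item."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # per-respondent sums
    item_var = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

def label(alpha):
    """Rule-of-thumb labels (assumed cutoffs, attributed to George and Mallery)."""
    for cutoff, name in [(0.9, "excellent"), (0.8, "good"), (0.7, "acceptable"),
                         (0.6, "questionable"), (0.5, "poor")]:
        if alpha > cutoff:
            return name
    return "unacceptable"

# Hypothetical scores: three items answered by five respondents.
items = [
    [4, 5, 3, 6, 2],
    [4, 4, 3, 6, 1],
    [5, 5, 2, 6, 2],
]
alpha = cronbach_alpha(items)
print(round(alpha, 2), label(alpha), label(0.96))
```

Applied to the reported value of .96, the same rule of thumb returns "excellent", which matches the description above.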
Second, the moderate correlation (r = .58) between the subscale and the CCPE scale provided evidence of convergent validity, indicating that both measures captured the same construct of cultural competence among evaluators. This relationship is presented in Figure 1.

Figure 1. Relationship between the ECAT Cultural Competencies Subscale and the CCPE scale. Note. CCPE = Cultural Competence of Program Evaluators; ECAT = Evaluator Competencies Assessment Tool.
Third, a moderate correlation (r = .43) between the subscale and years of evaluation experience supported the construct validity, aligning with theoretical expectations that more experienced evaluators would exhibit higher evaluation competency (Cho et al., 2022; Dewey et al., 2008). This relationship is presented in Figure 2.

Figure 2. Relationship between the ECAT Cultural Competencies Subscale and Years of Experience. Note. ECAT = Evaluator Competencies Assessment Tool.
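The two correlations reported above are Pearson coefficients between total scores. The sketch below shows the calculation on hypothetical scores (not the study data) and then converts the two reported coefficients into the proportion of variance the paired measures share; the function name is illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical total scores for six respondents on the two measures.
ecat = [40, 52, 61, 47, 66, 55]
ccpe = [95, 101, 112, 104, 118, 99]
r_toy = pearson_r(ecat, ccpe)

# Shared variance implied by the reported coefficients (r squared).
shared_ccpe = 0.58 ** 2         # subscale with the CCPE scale
shared_experience = 0.43 ** 2   # subscale with years of experience
print(round(r_toy, 2), round(shared_ccpe, 2), round(shared_experience, 2))
```

Read this way, the subscale shares roughly a third of its variance with the CCPE scale and somewhat under a fifth with years of experience.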
Fourth, the results of the CFA yielded mixed findings: χ²(44) = 183.82, p < .001, root mean square error of approximation (RMSEA) = .17, comparative fit index (CFI) = .89, and standardized root mean square residual (SRMR) = .05. Based on the criteria set by Schreiber et al. (2006), the SRMR value met the required criterion, but the RMSEA and CFI values, while close, did not meet the set criteria. These findings are presented in Table 2.
Table 2. ECAT Cultural Competence CFA Results (N = 109).
Note. χ² = chi-square; df = degrees of freedom; CFA = Confirmatory Factor Analysis; CFI = comparative fit index; ECAT = Evaluator Competencies Assessment Tool; RMSEA = root mean square error of approximation; SRMR = standardized root mean square residual. ***p < .001.
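The reported RMSEA can be recovered from the chi-square statistic, its degrees of freedom, and the sample size. The sketch below uses the common point-estimate formula with N − 1 in the denominator and takes N = 109 from the Table 2 caption. It also assumes 22 free parameters (11 loadings and 11 residual variances), which reproduces the 44 degrees of freedom reported. Software that divides by N instead of N − 1 gives a slightly different value.

```python
import math

def rmsea(chi2, df, n):
    """Point estimate of RMSEA from the model chi-square, df, and sample size."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

n_items = 11
df = n_items * (n_items + 1) // 2 - 2 * n_items   # 66 moments - 22 parameters (assumed)
value = rmsea(183.82, df, 109)                    # N taken from the Table 2 caption
print(df, round(value, 2))
```

With these inputs the estimate rounds to the .17 reported above.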
Overall, the results present the ECAT Cultural Competencies Subscale as a promising tool for measuring cultural competence among evaluators. However, more research is needed to establish the subscale's structure.
Discussion
The findings of this study highlight the potential of the ECAT Cultural Competencies Subscale as a valuable tool for evaluating cultural competence among evaluators and researchers. Grounded in the AEA list of Evaluator Competencies, the subscale demonstrates reliability and construct validity, making it a useful instrument in the field. While the findings regarding structural validity were mixed, there are encouraging aspects worth considering. Although some values fell slightly short of the CFA criteria, it is important to acknowledge the possibility that the study may have been underpowered. These factors indicate the need for further research to confirm the ideal structure of the subscale, specifically whether the current format of 11 items capturing one construct is optimal for assessing cultural competence in evaluation. Nonetheless, the findings reveal potential.
The timeliness of the subscale is underscored by the scarcity of validated measures available for evaluating evaluators’ cultural competence, despite the growing interest in cultural competence and its impact on evaluation. This subscale aims to complement the CCPE scale, potentially the only published and accessible validated scale for evaluators (Dunaway et al., 2012). By providing additional measurement tools, the subscale addresses the need for evaluators to comprehensively assess their cultural competence. Both the CCPE scale and the ECAT Cultural Competencies Subscale offer distinct advantages. The CCPE scale evaluates cultural competence based on dimensions of cultural knowledge, skills, and awareness (Dunaway et al., 2012). Conversely, the ECAT Cultural Competencies Subscale, rooted in the AEA Evaluator Competencies framework, assesses cultural competence within the broader context of evaluation competencies across five domains: professional practice, methodology, context, planning & management, and interpersonal (AEA, 2018).
The practical implications of the ECAT Cultural Competencies Subscale extend to training and self-improvement for evaluators. It can effectively serve as an assessment tool in various forms of evaluation training, including courses, workshops, internships, and mentoring programs, which play a vital role in professional growth (Dewey et al., 2008; Dillman, 2013; LaVelle & Donaldson, 2010). Moreover, evaluators can utilize the subscale for self-measurement, monitoring, and reflective practices to enhance their cultural competence in real-world evaluations. After all, cultural competence is a continuous and long-term process that evaluators should actively engage in throughout their careers (AEA, 2011). Tools such as the ECAT Cultural Competencies Subscale can provide valuable support in this ongoing developmental journey.
Culturally competent evaluators will improve evaluation practice by enhancing the relevance, inclusivity, and effectiveness of evaluation in diverse communities. Having culturally competent evaluators will result in indigenous, minority, and marginalized communities having their values represented in the evaluation process (Hopson, 2009). This includes their inclusion in evaluation decision-making processes and the use of culturally appropriate methodologies, measures, data analysis, and reporting practices (Frierson et al., 2010; Kirkhart, 2010). Culturally competent evaluators ensure that evaluations effectively capture the unique perspectives, needs, and aspirations of these communities. Their presence fosters greater trust and engagement from community members, leading to evaluation outcomes that align more closely with their values and priorities.
Limitations and Future Directions
This study has several limitations. First, the sample size was relatively small, which may limit the generalizability of the findings. The power analysis estimated that 322 participants were needed for a one-factor CFA based on a small effect size difference. However, due to constraints in data collection and timeframe, the study achieved a final sample of only 116 participants, well below that target. Future studies should strive to obtain a larger and more diverse sample to ensure greater representativeness and increase the generalizability of the findings.
Second, while the sample was representative of the AEA membership in terms of gender and education level, care should be taken in generalizing the findings to all AEA members, given that Asian respondents were underrepresented and Native American and older respondents were overrepresented.
Third, this study focused on AEA members, who may not fully represent all evaluators. Therefore, caution should be exercised when generalizing the findings to evaluators outside of the AEA membership or outside the United States. Future research should explore measures of cultural competency for other geographic contexts.
Fourth, this study relied on self-assessment measures, which may be subject to social desirability bias. Participants may have provided responses that they perceived as socially desirable, potentially influencing the accuracy of their self-reported cultural competencies (Krumpal, 2013). Future research could consider incorporating objective measures or alternative methods of assessment to mitigate the impact of social desirability bias.
Future research should address these limitations by conducting additional reliability and validity analyses along with exploring cultural competencies from multiple perspectives. By addressing these limitations, the validity and generalizability of the findings can be further enhanced.
Conclusion
There is growing encouragement for evaluators to be culturally competent in the United States and around the world. This includes efforts by the AEA that culminated in a public statement on cultural competence. Despite the growing interest in this topic, little has been done to develop validated tools for evaluators to use to self-assess their cultural competence. This study developed and validated the ECAT Cultural Competencies Subscale to fill this gap. The subscale demonstrates promising potential as an assessment tool for measuring cultural competence among evaluators. The study yielded positive results in terms of the subscale's reliability and construct validity, highlighting its value for professionals in the field. Despite some mixed findings regarding the subscale's structural validity, its overall effectiveness remains promising. Moreover, the subscale has practical implications for evaluator training and self-improvement; evaluators can use it as a tool for professional growth that will improve their ability to effectively engage diverse cultural contexts.
Appendix A
The Evaluator Competencies Assessment Tool (ECAT) Cultural Competencies Subscale
Please use the following rubric to answer the following questions.
Appendix B
Crosswalk between the 2018 American Evaluation Association (AEA) Evaluator Competencies (in bold and italic) and the Evaluator Competencies Assessment Tool (ECAT) Cultural Competencies Subscale:
Subscale: Act ethically through evaluation practice that respects people from different cultural backgrounds.
Subscale: Act ethically through evaluation practice that respects people from indigenous groups.
Subscale: Identify how evaluation practice can promote social justice.
Subscale: Collect data using culturally appropriate procedures.
Subscale: Analyze data using culturally appropriate procedures.
Subscale: Clarify cultural assumptions.
Subscale: Address aspects of culture in planning an evaluation.
Subscale: Address aspects of culture in managing an evaluation.
Subscale: Attend to the ways power and privilege affect evaluation practice.
Subscale: Facilitate culturally responsive interaction throughout the evaluation.
Note. The 2018 AEA Evaluator Competencies that did not match the ECAT Cultural Competencies Subscale were not included in the crosswalk.
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
