Abstract
Research was conducted during the delivery of a series of workshops on language assessment with Haitian teachers in the spring of 2013. The final products of these workshops were several revised national English examinations presented to the Haitian Ministry of Education and Professional Training (MENFP). The research goal was to examine the language assessment literacy (LAL) development of both teachers and language assessment specialists during this collaboration. Data included the compiled feedback from Haitian teachers on draft examinations during the workshops, as well as survey and interview responses immediately following the workshops. Results reveal the complementary expertise of teachers and specialists, which facilitated LAL development by both parties. Results also identify challenges in collaborative decision making and consensus building to be addressed in future projects.
Language assessment in the Haitian context
This article reports on research conducted during a series of workshops on language assessment given by Canadian facilitators to secondary school teachers in the spring of 2013 in Haiti. There are numerous obstacles to educational success in Haiti: by many estimates (Prou, 2009; UNESCO, 2014; Tondreau, 2008), the literacy rate in rural areas is less than 50%, the lowest in the western hemisphere. Class sizes are often very large, and few teaching materials and other resources are available, especially in rural areas. Following from Haiti’s colonial history, French remains the dominant instructional language at the secondary level and throughout the extensive network of private schools, despite generally inadequate French language skills of both students and teachers (Dejean, 2010). An additional barrier to better educational outcomes in Haiti is a lack of qualified teachers. Many teachers have no formal teacher education, and most teachers have no training in educational assessment (Prou, 2009). According to reports by the Inter-American Development Bank (2007) and the Groupe de Travail sur l’Education et la Formation (2010), between 70% and 80% of Haitian teachers lack accreditation from the Haitian Ministry of Education and Professional Training (MENFP), and approximately one quarter has an educational level of less than the ninth grade.
Knowledge of assessment is essential for teachers in this context, as the Haitian education system is characterized by a series of very high-stakes national pass/fail examinations. Students must pass these exams in order to continue to each subsequent level of schooling. Exams are currently held at the end of primary school (sixth grade), two years later (at the end of what is termed l’école fondamentale), and during each of the final two years of secondary school, called Bac I (or “Rheto”) and Bac II (or “Philo”). Bac I and Bac II exams are held in English as a Foreign Language as well as several other subjects (Math, Science, Social Studies, French, and Spanish). The Bac II or Philo level is required for entrance to university. Only about 1–2% of Haiti’s population currently attends university.
Unfortunately, there are serious problems with the content and construction of these national examinations. We focus here on the English as a Foreign Language (EFL) national examinations currently in place (Baker & Riches, 2013). Some of the irregularities are as follows:
There is no evidence of overall progression in knowledge or competency development in the examinations – it is in fact very difficult to determine from the exam contents which exam is intended for which grade. This is not surprising, given that the national examinations office (BUNEXE) prepared these exams without a well-articulated national curriculum for EFL.
The grammar items of the exams are based on invented de-contextualized sentences only, and contain multiple errors. For example, a review of a recent exam at the Philo level revealed that students were asked to turn a sentence into the passive form when it was already in the passive form.
In the writing section, some versions require the production of a text, whereas others involve a translation task (suggesting that these tasks are viewed as equivalent). The translation tasks require students to translate from their L3 (English) to their L2 (French) or vice versa – there are no translation tasks involving the L1 (Creole).
The potential negative impact on educational attainment in Haiti from poorly constructed examinations cannot be overstated. Only 20–25% of Haitian students currently complete high school, and this is often due to students dropping out following exam failure (Locher, 2010). The current project is an attempt to develop teachers’ language assessment literacy (LAL) in this context, using these examinations as a starting point. As exam content is submitted to BUNEXE by committees of teachers, familiarizing teachers with concepts of language assessment and exam development would arguably put these teachers in a better position to critique the current EFL examinations and suggest improvements. In addition, working together to design exam tasks has the potential to engender discussions of the concepts and competencies that should be attained at each grade level, potentially leading to positive washback in the classroom as well as in curriculum development.
Language assessment literacy
This project therefore addresses LAL development (Fulcher, 2012; Malone, 2013; Inbar-Lourie, 2008, 2013; Taylor, 2009, 2013). LAL draws from the area of educational assessment literacy more generally (Deluca & Klinger, 2010; Stiggins, 1991; Willis, Adie, & Klenowski, 2013), and can be defined as the competence required in language assessment by various stakeholder groups, whether they be developers of an assessment, assessment candidates, users of the results of assessment, or the public. In a special issue of Language Testing dedicated to language assessment literacy, Taylor (2013) presented visual depictions of LAL for various stakeholder groups. Figure 1 shows Taylor’s depiction for language teachers in particular. The elements of LAL are placed around the spider graph, and the more an element is judged to be necessary, the further out on the “web” it is placed (from 0 to 4). The figure indicates, for example, that Taylor sees teacher LAL as characterized by greater emphasis on knowledge of assessment related to pedagogy than knowledge of language assessment theory. Figure 1 is a visual reminder of the importance of representing LAL as a complex social practice, and of resisting binary representations of LAL where stakeholders are viewed as either “literate” or “illiterate.”

Figure 1. The LAL required for language teachers (from Taylor, 2013).
Taylor (2013) explains that the components of LAL included in this figure were derived from the other papers in the special journal issue in which this publication appeared, and represent an attempt to “operationalize the range of key components making up the [assessment literacy/language assessment literacy] construct” (p. 411). The identification and definition of these key components remain very much a work in progress. A review of the LAL literature reveals that the elements to include can fall into three general categories: (1) assessment theory and/or conceptual knowledge; (2) skills in assessing; and (3) moral/ethical practice. Taylor (2013) refers to these three general categories, as does Davies (2008), who states that educational materials for teachers in particular should “not only include the knowledge and skills required to undertake language testing work but also would cover the principles and history that free practitioners need to make informed, ethical decisions” (p. 121). Fulcher (2012) also includes these three categories when he discusses how teachers need conceptual understanding to engage effectively in policy arguments, as well as “a range of strategies at their disposal to implement classroom assessment, and evaluate its success” (p. 114).
Taylor (2013) does not provide detailed definitions of the elements of Figure 1. Although the three general categories above are mentioned, it is difficult to assign all the elements of this visual depiction to one of these three categories, or to a single category exclusively. For example, the general category related to theoretical/conceptual knowledge could include the element “knowledge of theory,” but also that of “principles and concepts.” Know-how is a broad category, which can be represented by the element “technical skills” and conceivably “scores and decision making,” as well as “language pedagogy.” Ethics can possibly be related to “principles and concepts” as well as “sociocultural values.” The elements “language pedagogy” and “scores and decision making” may refer to underlying conceptual declarative knowledge or to procedural, technical know-how – or both. “Local practices” and “personal beliefs and attitudes” do not seem to fall into any of these three categories and seem to present an expansion of LAL to include self-knowledge. Elsewhere, Baker (2016) has argued that the element “sociocultural values” may best be conceptualized not as a component element but as something tacitly subsumed within all other elements. By bringing these elements from multiple studies together in the same figure, Taylor (2013) has highlighted the work yet to be done in creating a common language to describe the underlying construct of LAL.
Taylor (2013) explains that a figure like this is intended to emphasize the application of a knowledge base to our professional and educational endeavors; that is, in terms of active performance in the stakeholder role, rather than the acquiring of static knowledge. Therefore, in moving the definition of LAL forward there may be potential for the conceptualization of LAL as a professional competency by applying insights from research in professional workplace knowledge and development (see Baker, 2016). For example, Eraut and his colleagues (Eraut, 2004; McKee & Eraut, 2012) have worked to develop a “progression typology” (Eraut, 2004, p. 266) of professional knowledge – a socioculturally informed epistemology which includes elements related to propositional knowledge (i.e., discipline-based theoretical concepts and applied principles) as well as procedural knowledge (i.e., technical and decision-making skills). This typology comprises the following elements:
task performance in the domain (procedural);
awareness and understanding (propositional and related to the sociocultural context);
personal development (propositional self-knowledge);
teamwork (procedural and related to collaborative skills and strategies);
role performance (procedural, such as leadership within teams);
academic knowledge and skills (propositional, i.e., enacting research-based practice);
decision making and problem solving (procedural); and
judgment (procedural, such as setting priorities).
The usefulness of this categorization, as a complement to the depiction in Figure 1, will be explored in this study. Taylor (2013) has emphasized the importance of considering the LAL of all relevant stakeholders – including language testing experts. The majority of empirical work on LAL has investigated language teachers (Brown & Baily, 2008; Inbar-Lourie, 2008; Jeong, 2013; Jin, 2010; Malone, 2013; Vogt & Tsagari, 2014). Other studies have been performed with other users of language test scores, such as policy makers (Pill & Harding, 2013) and university admissions officers (Baker, 2016; Baker, Tsushima, & Wang, 2014; Ginther & Elder, 2013; Hyatt & Brooks, 2009; O’Loughlin, 2011, 2013; Rea-Dickins, Kiely, & Yu, 2007; Smith & Haslet, 2007). To the authors’ knowledge, there have been no studies that focus on the LAL of language testing professionals.
The current study
In this study we examine the LAL development of teachers and of language assessment professionals (the workshop facilitators) during workshop activities. This study is an opportunity to explore further the characterization of LAL. LAL development from this study will be discussed in terms of the depiction offered by Taylor (2013), as well as how it might be usefully expanded with reference to work in professional competence development.
An additional research focus involves a critical examination of the challenges associated with collaboration during professional development in developing countries – collaboration between external assessment experts and local teachers, as well as among the teachers themselves. Previous research (Burke, 2013; Hismanoglu, 2010) has indicated that teachers may not always have a positive orientation towards professional development activities with their peers. In addition, in developing contexts, because a lack of resources can make travelling and meeting difficult (Akello et al., 2016), teachers generally have little experience with collaborative in-service professional development. We therefore took this opportunity to learn more about the teacher participants’ attitudes towards collaboration.
Context for the study and research questions
This study was conducted during the delivery of a week-long series of language assessment workshops with Haitian English teachers in 2013 called “Cultivating collaboration: Foundations, EFL, and assessment.” On a previous professional development visit by the facilitators, these teachers had shared their frustrations regarding English national examinations – frustrations with the consequences of these problematic exams for their students, and with their own lack of expertise in evaluation. During this week, facilitators presented language assessment concepts and principles (e.g., the notions of reliability, construct validity, and practicality in language assessment). Teachers then participated in hands-on technical workshops on language assessment development (e.g., critiquing current national exams and producing their own exam tasks, including activities like the selection of reading texts and the creation of accompanying comprehension questions).
Teachers were then provided with drafts of potential new national examinations that had been prepared by the facilitators and their Haitian-Canadian team. These new exam drafts followed the same basic format as the current examinations (i.e., a reading text with comprehension questions, followed by a set of grammar items and a writing task), but addressed some of the major technical problems identified in those examinations. Instead of decontextualized exercises, exam content was integrated, with grammar items relating to the content and linguistic structures of the initial reading task. In addition, a vocabulary-in-context task was added, as well as two single-paragraph writing tasks, both of which also related to the content of the reading text. Teachers were invited to provide feedback on these new draft exams from a theoretical, technical, and ethical standpoint (details below).
Following the week of workshops, these draft exams were revised according to the teacher feedback, and then submitted to the Haitian Ministry of Education and Professional Training (MENFP) to be considered as replacements for the current exams. Developing exam drafts in the absence of curriculum is far from ideal but was unavoidable in this context. There was no other option in our situation but to begin by creating new examinations that address some of the most obvious problems of the existing official examinations, with the hope that these revised exams would be implemented and that later conversations could be held at the Ministry level regarding the curriculum.
The research questions for this study were the following:
How is language assessment literacy developed in teachers during the collaborative process of critiquing and revising Haitian EFL national examinations?
How is language assessment literacy developed in language assessment specialists during this process?
What successes and challenges arise for teachers during this process of collaboration?
Methodology
Qualitative research methods were used to address these research questions. Data consisted of artifacts resulting from the workshop experience, such as the comments made by teachers on the existing tests, as well as teachers’ opinions elicited through surveys and interviews.
Participants
The participants in the study were in-service teachers attending the “Cultivating collaboration” workshops: approximately 120 high school EFL teachers, representing each of the approximately 70 high schools in the Northern Region of Haiti. Specific data on their level of schooling were not available, but all teachers reported that they had never had any specific training in language teaching or language assessment. Their teaching experience ranged from 1 year to more than 20 years. Only three participants were female – indicative of the lower educational attainment of women in Haiti.
Data collection
Three sets of data were collected:
Feedback on drafts of revised exams: During the workshop sessions, drafts of the revised exams were provided to teachers with an accompanying feedback form, which contained a series of closed-ended questions related to the appropriateness of the topic and the reading texts for their students, as well as what grade level they felt the exam best represented. The form also contained a number of open-ended questions, where teachers were asked to discuss issues such as the construct they believed the new exams were addressing compared to the old ones. Teachers were also invited to comment on practical issues related to the delivery of these exams. The feedback form was designed to support the revision of the examinations, but also to address the first two research questions. For teachers, it offered the opportunity to demonstrate their knowledge of assessment principles and technical issues, as well as to voice any ethical concerns. For facilitators, it offered the possibility to learn more about the pedagogical context in which these exams are delivered, the sociocultural values and local practices of the teachers, and the teachers’ beliefs and attitudes about teaching and assessment. These secure exam drafts cannot be shared, but see Appendix A for an example practice exam which follows the same principles.
Teachers worked in groups of about ten to review each draft exam. Teachers were instructed to try to arrive at common answers on the feedback forms as a group, to the extent possible. Each exam was reviewed by five out of ten groups (there was insufficient time for each group to review all eight exams within the five-hour session). This session resulted in approximately 55 answers for each of the yes/no questions for each item on the exam. In addition, approximately 20 open-ended comments were made for each item on each exam, and approximately 20 overall comments on each exam, for a total of almost 1000 comments.
Survey with teachers: The second source of data was intended to address all three research questions, and consisted of an anonymous open-ended survey conducted with the teacher participants immediately following the week of workshops (N = 92). The survey included questions on the following: the construct teachers felt was represented by the revised exams; what they learned and felt they had yet to learn about language assessment; the process of collaboration with all parties; and if/how they anticipated their practices changing as a result of these workshops. The survey was bilingual (English/Creole) and respondents were encouraged to answer in any language they desired (English, Creole or French). A total of 52 surveys were completed in Creole, 37 in English and three in French. An excerpt of the survey (with only those questions relating to the present study) is presented in Appendix B.
Teacher interviews: The third source of data was also intended to address all three research questions. It consisted of individual interviews with 11 of the teacher participants who agreed to stay after the final workshop day. Interview research has proven fruitful in similar contexts (see Rosendahl & Rönnerman, 2006) where researcher-facilitators and local teachers collaborate on educational change. These semi-directed interviews included questions about perceived learning in the workshops (including any surprising information or new skills they learned), their impressions of collaboration with teammates, and how they felt they contributed to facilitators and teammates in terms of their own expertise. The teachers were also invited to comment on anything they wanted to add, such as topics for future workshops. Four teachers chose to be interviewed in Creole, four in English, and three in a mix of English and French. Creole interviews were conducted by a native Creole-speaking research assistant, and French and English interviews were conducted by the facilitators. Interview data were then transcribed; the Creole interview responses were translated by two other Creole-speaking research assistants into either English or French (depending on the strongest second language of the translator).
Data analysis
The first round of data analysis was to determine the consensus opinion on the revisions to be undertaken on the sample national exams in preparation for submission to and consideration by the government. This analysis is therefore not directly related to the research questions addressed in this study, but served to help us orient the study to the teachers’ perspective.
The second round of analysis was done on the responses on the feedback sheets combined with the survey and interview responses to address the first and second research questions (the LAL development of teachers and of facilitators). Regarding teacher LAL development, initial open coding was conducted independently by the two authors and a research assistant – all facilitators of the workshops – in order to identify all responses that represented claims or evidence of learning. Some of this evidence was more straightforward to identify, as teachers responded to direct questions in the survey and interview about their perceived learning, but all utterances were read to evaluate whether they demonstrated learning by the teachers. In an additional round of coding by the authors, each of the statements identified in the initial open coding was placed into categories of learning through an iterative process (Mills, 2011; Dornyei, 2007; Srivastava & Hopwood, 2009).
During this process, some decision-making procedures were established by the researchers in order to determine final categories. For example, sometimes comments were identical on the exam feedback sheets of all teachers in the same group – indicating that the groups had discussed and had all agreed on the point during the workshop. Sometimes, similar comments were reported independently by individuals in different groups. If similar comments were made by an entire group, or by more than 10 teachers overall across all sources of data, then they were deemed common enough to constitute a substantive independent category.
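The threshold rule described above can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the procedure the researchers actually ran: the function name `is_substantive`, the `(teacher_id, group_id)` data layout, and the idea that each comment can be linked to a teacher and a group are all hypothetical (the survey, for instance, was anonymous).

```python
from collections import defaultdict

# Threshold taken from the rule in the text: "more than 10 teachers overall".
MIN_TEACHERS = 10


def is_substantive(comments, group_sizes):
    """Decide whether one candidate category is common enough to keep.

    comments: list of (teacher_id, group_id) pairs, one per similar comment
        coded under the candidate category (hypothetical layout).
    group_sizes: dict mapping group_id to the number of teachers in that group.
    """
    teachers = set()
    members_by_group = defaultdict(set)
    for teacher_id, group_id in comments:
        teachers.add(teacher_id)
        members_by_group[group_id].add(teacher_id)

    # Condition 1: every member of at least one group made the comment.
    whole_group = any(
        len(members) == group_sizes.get(group_id)
        for group_id, members in members_by_group.items()
    )
    # Condition 2: more than 10 distinct teachers made it, across all sources.
    return whole_group or len(teachers) > MIN_TEACHERS
```

Counting distinct teachers rather than raw comments reflects one reading of the rule, in which a teacher who repeats a point in several data sources is counted once.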
During open coding, all concepts or information from all data sources that were identified as new for the facilitators were set apart. An iterative analysis was then performed on these data in order to categorize facilitator learning – addressing the second research question. For the third research question, the same procedure was repeated using only the survey and interview responses, focusing on the categorization of the teachers’ responses to direct questions on successes and challenges during collaboration.
Results
First, the categories related to learning (LAL development) of the teachers and facilitators will be presented to respond to the first and second research questions. Then, the third question will be addressed by a presentation of the categories of responses related to successes and challenges in collaboration – both between facilitators and teachers, and among teachers themselves. In the discussion, the authors will relate these categories both to the elements of Taylor’s (2013) framework and to elements of professional competence.
Language assessment literacy development of teachers
LAL development of teachers was evidenced by the comments provided on the draft examinations as well as survey and interview responses. Common categories are reported here from the analysis described above on the combination of all three data sources.
The first salient category was related to learning how to create reading comprehension questions, as observed in the survey and interview comments below:
“[One thing I learned was] how to formulate inferencing question. And I am very happy to learn about it.” (Survey E20). [In response to a question about how he applied knowledge from the workshops:] “Mwen wè seyans denfòmasyon an li benefik anpil … Egzamen sa amte kreye a sak fò nou wè ke reading comprehension la te trè, trè, trè, fò” (This [workshop] I see as being very beneficial … This exam I created, we see the reading comprehension was very, very, very strong) (Interview 301).
This new knowledge was also demonstrated in the exam feedback. For example, on one reading section, seven teachers commented (correctly) that only lower-level literal understanding questions had been asked, and some suggested multiple-choice questions that elicited more inferencing and higher-level thinking, even suggesting appropriate distractors. Therefore, LAL development was evidenced not only by teachers’ self-reports, but by this demonstration of their knowledge.
The second category concerns the importance of integrating vocabulary tasks into their own teaching and into assessment.
“What I have learned is the following: Without grammar we can make ourselves understood, but without vocabulary we can’t say nothing. That is to say vocabulary plays a vital role in language learning” (Survey E28).
“Èske atelye sa a chanje fason ou menm ou planifye pou anseye nan klas ou?” (Has this workshop changed the way you plan to teach in your class?)
“Chanjman ki gen pou kapab fèt se si paske sa mwen fin aprann lan, m pral mete annaplikasyon. Se fè elèv la konnen bokou plis mo vokabilè paske nou reyalize ke vokabilè se li ki baz la. San vokabilè, elèv la p ap ka ekri anyen, p ap ka di anyen … Paske Baz la menm vokabile. Nou pral ensiste bokou plis sou li” (Changes have to occur, because what I have learned I will put in application. It’s to make the student learn much more vocabulary because we realize that vocabulary is the foundation. Without vocabulary, the students won’t be able to write anything, won’t be able to say anything … The foundation is vocabulary. We will insist much more on it) (Interview 304).
In the third category, the teachers commented that they had become convinced of the value of basing all exam sections on the same topic (making use of the reading text to create a common context for all students for the grammar, vocabulary, and writing sections).
“Meyè pati a se kapab planifye yon egzamen ki baze sou yon sèl ide” (The best part of it is to be able to plan an exam based on one idea) (Survey C6). “I learned how to create an exam starting from a text, meaning all the grammar, vocabulary and writing questions are built from the text” (Survey E19).
Teachers also demonstrated development in this area by suggesting additional words and phrases that should be pre-taught or provided in a glossary on the exams, following the models provided in the workshops.
The fourth category was the teachers’ reports of their increased appreciation of the connection between teaching and assessment:
“[I learned that] we measure what we should be teaching” (Survey E19). “I learned to evaluate on the things that they have seen in class” (Survey E12).
The fifth category relates to a broadening of the teachers’ understanding of the construct of language ability relative to what they had previously held. Teachers identified elements of language proficiency in the revised examinations that were judged to be absent or insufficiently covered in the current examinations, based on what they learned in the workshops that week.
“[Our revised exam] measures students’ understanding of a text, grammar points that have been taught in class, vocabulary that they should know and their ability to think” (Survey E28).
The sixth category is related to the teachers’ beliefs concerning their role in supporting their students’ success on exams, rather than actively blocking their success. These comments suggest a re-orientation to a more student-centered approach:
“Mwen aprann evalye bagay ke yo wè nan klas. Mwen aprann ke li pa nesesè pou bay elèv egzamen difisil, men se pou n wè kòman nou ka ede yo reyesi nan egzamen yo” (I learned that it isn’t necessary to give difficult exams to the students, but instead to see what we can do to help them to be successful on the exams) (Survey C56). Interviewer: “Why are you pleased with the workshop?” “I am pleased because this is what I’ve always wanted for … most of the Haitian teachers to know, that you are not giving an English exam to students … to actually keep them from going to Philo or University … I think they should give students a fair exam” (Interview 201). “In the approach [of the workshops] we think the students would be very happy because the system isn’t a skull stuffing … it’s not … to trap the student but to find comprehension in the student.” (Interview 304).
A final category is included here, but it remains debatable as to whether it can be related to teacher learning with confidence. Teachers reported having learned about validity, reliability, and practicality (terms defined for them in the workshops). This demonstrated an awareness of the concepts, but there is no evidence of appropriate use of the terms in discussing the exam content. In the one case where these terms are applied, they are misused:
“For my students, the text is not appropriate because [students] lack the validity, reliability and practicality to do it” (Anonymous comment on draft examination).
Language assessment literacy development of facilitators
The categories of comments that emerged here, showing LAL development of the facilitators, are the following: knowledge about the cultural suitability of certain reading texts for the local context, the appropriate levels for the grammar items, the additional scaffolding needed for the writing tasks, political issues that surface during grading, practical constraints in exam delivery, and the level of student-centeredness in the average classroom, as well as levels of student motivation.
First, facilitators learned to become more sensitive to the type of topics to avoid in future materials creation. Three reading texts were determined by the majority of teachers to be unsuitable for the Haitian context, either because they were too difficult or because they related to culturally inappropriate topics. For example, a majority of teachers (70%) commented that a reading on allergic reactions in the body was not appropriate because its scientific nature was too intimidating, and a reading on the chemical creation of diamonds was culturally insensitive in a poor nation:
“Viewpoint: In Haiti, don’t know much about diamonds. Not good for the background of Haitian students” (Anonymous exam comment). “Topic and text are inappropriate for our students” (Anonymous exam comment).
Second, facilitators learned that for every grade level they were underestimating students’ grammatical competence and overestimating their writing competence. Teachers commented that they spent the majority of their classes on explicit grammar instruction and that they were most comfortable teaching grammar explicitly:
“For the grammar, I don’t think I have much problem [covering difficult material]” (Interview 202).
On average, half of all exam feedback comments regarding grammar were about how the items were too easy for the level. On the other hand, the teachers explained that the writing tasks we presented were too difficult and students would have difficulty generating sufficient language to respond:
“Very difficult for students. Students don’t have enough experience to answer the questions” (Anonymous exam comment). “We are unable to [write] about something or somebody without having knowledge about it … we can’t say more than we know” (Anonymous exam comment).
Therefore, facilitators learned of the need to support students more through the use of detailed writing prompts that provided suggestions on content. Through the interviews, facilitators also learned more about the problematic, highly political process by which these examinations are graded. They learned that because grading is a paying job for teachers, some teachers volunteer to grade (or are chosen to grade because of their connections), despite inadequate English skills. Facilitators learned that responses to writing questions have long been graded impressionistically (or “at their taste,” according to one teacher) and the use of a rubric may be met with a great deal of resistance by exam graders. These concerns were raised by the members of several groups, including the following by all 10 members of one of the participant groups:
“[The scoring guides are] not clear and practical to use because teachers have no background about using scoring guides” (Group comment on draft exam).
Facilitators therefore learned about adjustments to be made to writing prompts and grids, and the necessity of better supporting the grading process. Facilitators also learned of several very practical constraints relating to exam delivery in Haiti. For example, the examinations need to be able to fit on one page; otherwise, printing costs will be too high.
From teacher comments in the surveys and interviews, facilitators also learned about teachers’ attitudes towards student success in their classrooms. As previously mentioned, teachers commented that they had not always made a connection between their own teaching and their assessment activities. During the workshop week, facilitators learned that student success was not always a preoccupation for the teachers; indeed, teachers discussed a common practice in exams, whereby they would try to “trap” students with difficult material or material they were not likely to have seen before:
“[Before these workshops] we used to give the students hard questions or all difficult questions” (Interview 101).
Finally, facilitators learned that many students were not motivated to learn English or were very insecure about learning it:
“Most of them are not interested in English. Maybe they love English because they love the person who teaches English” (Interview 202). “When we speak about English [the students] are so afraid” (Interview 102).
Successes and challenges in collaboration
Regarding the third research question, teachers reported success in working with facilitators and satisfaction with the collaboration, but also occasional frustration with communication and cooperative decision making among their colleagues. First, teachers reported that they felt like equals with facilitators and valued in the exam creation process, as illustrated in the following survey and interview comments:
“[The exams we created] unlocked an awareness, because we recognize we are doing a work that has value. There’s a concerned body [the facilitators] that allows other groups of people to benefit from our work” (Interview 304). “Mwen te santi ke nou tap travay tankou frè ak sè” (I felt as if [facilitators and teachers] were working together like brothers and sisters) (Survey C16).
However, regarding collaboration with fellow teachers, two competing categories emerged. On one hand, the opportunity to work in teams with peers to share expertise was greatly appreciated, with about 90% of teachers speaking highly of the benefits of collaboration:
“Mwen te sezi wè kòman plizyè pwofesè ka travay an n ekip. Li te trè dinamik. Mwen santi m gen espwa toujou pou peyi m si n travay ansanm” (I was surprised at the extent to which many fellow professors were able to work as a team. It was very dynamic. I feel that there is still hope for my country if we work together) (Survey C56). “Despite the lack of agreement [in coming up with a main idea of a text] it allowed us to become stronger. The lack of comprehension didn’t bring division, they united us more, which allowed us to find ideas to work” (Interview 303).
Despite belief in the benefits of this team approach, most teachers also highlighted the difficulty of coming to an agreement with their peers, and their lack of experience in working to achieve consensus in a group setting. Many of these comments were very diplomatic in tone:
“Natirèlman, nou tout te gen diferan ide ak opinyon, men li te kreye ti konfli … Nou aprann kòman pou n fòme ak pataje ide n.” (Naturally, we all had different ideas and opinions, but it often created little conflicts … We learned how to better formulate and share our ideas) (Survey C4). “In our group sometimes we [had to] correct some [people] because we discovered very different answers” (Interview 204).
Some teachers frankly admitted to having trouble accepting the ideas of others during these sessions.
“Pi gwo defi pou mwen se te aksepte opinyon kolèg mwen yo” (The biggest challenge for me was to accept the opinions of my colleagues) (Survey C30). “The biggest challenge was to accept the point of view of a member of your group without anger” (Survey E20).
Other teachers expressed frustration that some of their peers were less willing to compromise, or only wanted to defend or impose their own opinions:
“Gen manm nan group lan ki awogan” (Some group members were arrogant) (Survey C38). “Some of my fellow teachers are very arrogant. Some of them think they speak and write better than Shakespeare” (Survey E35).
It is interesting to note that several of the teacher comments can be interpreted as evidence of the arrogant behavior that was mentioned. Two teachers clearly stated that they were the leaders of their respective groups, even though no leaders were assigned. Some teachers expected their opinions to be accepted by the group as a matter of course, and expressed frustration when they were not obeyed:
“Gen pwofesè ki te refize aksepte opinyon m, menm lè m te gen rezon” (Certain teachers refuse to accept my opinion even though I am right) (Survey C50). “Some of my collaborators found me too demanding as leader of the group. [One thing I would change would be] to have more tolerance for others next time I lead a group” (Survey E25).
Discussion
Synthesizing LAL development of teachers and facilitators
This week of workshops had as its practical objective the co-creation of revised national examinations that could address the major technical problems of the current examinations, in order to allow a fairer examination of secondary school students. The research objective during this week was to take this opportunity to examine the activities of the workshops for evidence of LAL development by all parties, as well as perceptions of the benefits and challenges of collaboration. These results can be used to inform future workshops with these groups, and have also provided food for thought regarding the identification and categorization of elements of LAL for future research.
As previously mentioned, the language assessment literature has referred to three general categories of LAL: assessment theory and/or conceptual knowledge; skills in assessing; and moral/ethical practice. Although most of the elements of LAL identified by Taylor (2013) in Figure 1 can conceivably fall into more than one of these three categories, two other elements do not seem to fall neatly into one of these categories: “local context” and “personal beliefs and attitudes.” We will therefore discuss our results in terms of these three general categories of LAL above, in addition to these two elements that do not fit – five categories in total. One additional level of precision to be added at the same time is whether these categories refer to propositional declarative knowledge (“what”) or procedural knowledge (skills; “how to”). It has also been suggested here that LAL may be usefully represented as a professional competency. Therefore, we will also discuss these same results in relation to the typology of professional competence developed by Eraut (2004) and his colleagues.
An analysis of the data reveals that teachers reported and demonstrated development in terms of their theoretical and conceptual knowledge in the following ways:
They improved their ability to describe a complex construct of reading comprehension that includes inferencing and vocabulary knowledge;
They demonstrated an awareness of principles of integrated exam creation through the drafting and discussion of exam items based on a common text.
In this study, this type of learning is categorized best as propositional, declarative knowledge. Teachers also reported learning new concepts related to assessment, such as reliability and validity, although this reported learning was not sufficiently verified in action. Future activities with this group should include collecting more information on teachers’ understanding of these concepts as applied to their teaching and assessment practice.
Teachers also reported and demonstrated improvement in terms of their skills in assessing. The workshops covered technical issues in the development and revision of reading comprehension questions, vocabulary and grammar items, and writing prompts with related grading rubrics. Of all these topics, teachers reported and demonstrated their learning of technical skills in the creation of reading comprehension items and vocabulary items in particular. Results found that while teachers reported a certain level of comfort with grammar items, they had never explicitly focused on vocabulary before, and they had never given much thought to the creation of reading comprehension items – which can best be classified here as procedural as opposed to propositional knowledge.
There was also evidence of development in areas relating to moral and ethical practice, in that teachers reported a shift to a more student-centered orientation. There was some evidence that teachers became more open to viewing exams as an opportunity for students to demonstrate their best work, rather than as intentional obstacles or as a way of trapping or catching students. Arguably, this awareness could also be related to a shift in their personal beliefs (Taylor, 2013) regarding their role as teachers in facilitating student success. This learning would best be described as propositional rather than procedural.
The LAL development of facilitators concerns primarily propositional knowledge, and consists of new information related to the local context, which includes local cultural concerns and teachers’ personal beliefs and common pedagogical practices, as well as attitudes towards collaboration. One example is that these teachers alerted facilitators to the cultural suitability of certain exam topics. Also, as almost all language instruction in Haiti is essentially grammar instruction, facilitators became aware that expectations are high for students to be able to demonstrate explicit grammatical knowledge. Despite this emphasis on grammatical accuracy in teaching, facilitators also learned that teachers considered content to be a more important criterion than language accuracy in writing tasks.
Facilitators also learned about the challenges to be faced in persuading graders to adopt a rubric for the grading of writing in the classroom as well as in formal examinations. Facilitators subsequently worked to revise the rubrics and create associated guidelines to support these particular teachers effectively. This demonstrates the facilitators’ acquisition of a certain type of procedural knowledge, that is, skills in assessing.
This characterization of the results is useful for identifying the elements not covered by the study. For example, data were not collected on how the personal beliefs of the facilitators were altered as a result of working with these teachers. Further introspective data would be useful in order to document the extent to which the facilitators’ own personal views and beliefs evolved after engagement with teachers on the ground.
Applying Eraut’s typology
Table 1 summarizes the major results of this study, categorized according to Taylor’s (2013) aspects of LAL as well as Eraut’s (2004) typology regarding the development of professional competency. We assert that by viewing both categorizations side by side as applied to this study, it is possible to see where the two overlap conceptually and where they may complement each other.
Table 1. Results of the study categorized in terms of LAL as well as professional competency development.
A review of this table reveals, for example, that the category technical skills has its parallel in what Eraut and his colleagues have termed task performance, and local practices (Taylor, 2013) overlaps with awareness and understanding (Eraut, 2004) – if we take local practices to mean awareness of local practices, as was done in this study. One can see how the enactment of local practices would instead overlap with role performance and be considered procedural rather than propositional, declarative knowledge.
Although some categories do overlap, in some cases the categories of either Taylor (2013) or Eraut (2004) do a better job of making sense of the results of this study. Some learning by the teachers is related to specific skill development in language assessment, which cannot be addressed satisfactorily in any model designed for general professional competency development. On the other hand, one result of this study – the challenge to develop more effective strategies for collaborative decision making – is categorized better within Eraut’s model for professional development because it is not directly associated with language pedagogy or assessment. We would argue that it is essential to include such elements. If language test stakeholders represent professional groups (which they often do), then it is worthwhile to include elements of effective professional competence for those stakeholders – such as collaborative decision making – in our conception of LAL.
Figure 2 represents a suggested alternate depiction of the LAL required for the two stakeholder groups represented in this study. This depiction differs from Figure 1 in the following ways:
the names personal beliefs/attitudes and local practices have been changed to awareness of personal beliefs/attitudes and awareness of local practices, in order to highlight their reference to propositional/declarative knowledge;
knowledge of theory and principles and concepts are folded together (and called theoretical and conceptual knowledge) to represent better the lack of any distinction between these types of knowledge found in the literature as well as in this study;
scores and decision making – this element is reduced to decision making, which includes the process of scoring and emphasizes the procedural nature of this element;
language pedagogy is defined as procedural knowledge (i.e., the enactment of language pedagogy) rather than knowledge of pedagogical principles (belonging to theoretical and conceptual knowledge);
the term technical skills is replaced with the term task performance, a broader term referring to all procedural knowledge related to the design, administration, and validation of language assessments;
collaboration is added, suggested by this study as necessary to consider as an element of this professional competency; and
sociocultural values is placed outside the figure to symbolize how it informs all the other elements.

Figure 2. An alternate characterization of the LAL required for language teachers and language assessment professionals.
Figure 2 is not intended to be a definitive depiction of the elements of LAL but was useful for engaging with the results of this study, and may represent a step towards a more systematic definition of LAL for research purposes. The names of these elements are tentative, and the level of each element required by these two groups was suggested based on our experience with the workshop and the results.
In comparing the suggested levels of expertise required of these two stakeholders on the same figure, one can observe that the expertise of the two groups is complementary. This underlines the importance of collaboration between the two groups. For example, in a team approach language testing professionals can contribute theoretical and conceptual knowledge and teachers can provide expertise in local practice and the sociocultural environment. Previous work in professional development activities in developing contexts has demonstrated that when no consideration is given to these social realities during training, the relative benefits of the activities are reduced (Kayaoglu, 2014). External language assessment specialists must certainly have some basic knowledge of the local political and educational context before any visit, but the highest level would not be necessary where a given project involves local teacher contributors and follows an inclusive research agenda (Nind, 2014), where the lived experiences of local participants are as relevant as research results stemming from a professional research perspective. In inclusive research, both the researcher and the “researched” are required to work together in order to co-create knowledge.
One of the most important discoveries during this week was the need for increased support of the collaborative process. Although most teachers perceived this cooperative decision-making experience as enriching and beneficial, there was evidence that teachers needed to be provided with more tools in order to manage conflict and build consensus. There are materials available that provide tools specifically for education professionals to resolve conflicts and work more collaboratively (see, e.g., Howden & Kopiec, 2001) and these could be made available and implemented in future visits.
Conclusions
In this study, we assumed that the LAL of both language assessment experts and language teachers would develop in tandem during a collaborative professional development session. We therefore examined their relative LAL development, as well as the extent to which a typology of professional competence might describe our results and help in working towards a useful operational definition of LAL for language assessment researchers.
This study responds to Taylor’s (2013) call for such work. She states that her visual characterization of LAL “merits further thought and exploration” and that “[h]opefully, some of these ideas can be refined more fully to reveal new insights and approaches in this important area for our field” (p. 411). We have discussed how LAL varies for these two groups, and how the knowledge of each group can nonetheless be viewed as complementary during a collaborative endeavor. This way of viewing LAL development resists casting any one group in a deficit light. As Willis et al. (2013) remind us, teacher assessment literacy constitutes “dynamic social practices which are context dependent and which involve teachers in articulating and negotiating classroom and cultural knowledges with one another and with learners, in the initiation, development and practice of assessment to achieve the learning goals of students” (p. 256). Envisioning LAL for teachers as a type of professional competency may help to expand upon previous work, in order to ensure that LAL continues to be represented as a complex social practice.
This project was not conducted in ideal pedagogical conditions. For example, we were obliged to engage in exam reviewing and revision in the absence of curricular documentation. Therefore, a well-grounded construct-driven process for exam creation was not possible. However, this procedure of retrospection on existing tests (see Pellegrino, 2016) was nonetheless beneficial as we examined the teachers’ perceptions of the instructional relevance of the exams and developed LAL. Also, as a group we were able to address some of the glaring problems that had been identified in previous examinations, for example, by improving accuracy and contextualization of grammatical items, and by including more precise writing prompts and accompanying scoring guides.
Many additional challenges remain in our continued work in Haiti, not least of which is the acceptance of these new examinations at a national level. The revised examinations were completed and presented at MENFP at an important national education round table in Port-au-Prince in April 2014, the “Assises nationales sur la qualité de l’éducation et l’enseignement supérieur.” They were again presented to the Education Minister in the fall of 2015. However, since that time there has been no news as to whether or not they have been adopted. These examinations are far from perfect, but teachers and facilitators alike agreed that they are an improvement over the exams currently used to decide the fate of thousands of high-school students every year.
Finally, there are limitations to a professional development model which consists of occasional workshops, not to mention one facilitated by external parties, no matter how collaborative the approach. The literature on professional development for teachers has suggested that while short-term interventions are useful for acquiring some knowledge and raising commitment to improvement, a more sustainable model – one where Haitian institutions are equal partners in both professional development and in related research activities – would provide Haitian teachers with greater self-determination as they work towards their students’ success.
Appendix A
Appendix B
Excerpt of survey with Haitian teachers following the workshops
Acknowledgements
We first and foremost want to thank the English teacher participants, who worked tirelessly with dedication and passion. We also warmly thank Gerald Jean Baptiste, the Director of the Institution Vision Nouvelle in Limbe, for hosting us and coordinating all elements of the workshop in Limbe. Without him, none of our activities in Haiti would be possible. We thank Pierre Lubin for his immense help in translation, in cultural consultation, in data collection in Haiti, and in analysis following the trip. We thank the BEd students for their contributions to the draft examinations: Hajjara Azam, Guillaume Fecteau, and Shannon Rea. We also thank Roselor Francois for cultural consultation on the draft examinations, and for translations of the Creole teacher responses; Noémi Leclerc for creating the reports of the teacher feedback on the draft examinations; and M. Duclona and Soeur Christianne Gervais of the MENFP for their collaboration in the recruiting and transport of the teachers in the Northern Department. This trip was made possible with the support of the Social Sciences and Humanities Research Council of Canada as well as the International Language Testing Association.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
