Defining assessment literacy: Is it different for language testers and non-language testers?

Abstract

Language assessment courses (LACs) are taught by professionals who have majored in the area of language testing (language testers or LTs), but also by others who come from different language-related majors (non-language testers, non-LTs). Different language assessment courses may be developed, depending on who teaches the course and the instructors’ understanding of assessment literacy. This study seeks to investigate the effect instructors bring in shaping the characteristics (i.e., content and structure) of language assessment courses. Findings from an online instructor survey (N = 140) and in-depth follow-up phone interviews (N = 13) show that there are significant differences in the content of the courses depending on the instructors’ background in six topic areas: test specifications, test theory, basic statistics, classroom assessment, rubric development, and test accommodation. Interview results confirm non-LTs are less confident in teaching technical assessment skills compared to LTs and have a tendency to focus more on classroom assessment issues. The paper ends by stressing the importance of possessing a common understanding of assessment literacy among stakeholders within the testing community, but also among non-LTs who teach language assessment courses so as to maintain course quality and to better meet student teachers’ needs.

Keywords

Assessment culture, assessment literacy, language assessment courses, language assessment literacy, language testers

Assessment literacy is being promoted to various groups of stakeholders (students, teachers, policy makers, administrators) who are impacted by language tests. There is a growing movement towards sharing assessment knowledge with these stakeholders, because it is vital that they base their actions on appropriate assessment knowledge.

One of the most common and widely known ways to promote assessment literacy is through language assessment courses. These courses, which are usually offered through MA TESOL programs, have been one of the first doorways to promote and execute assessment knowledge for language teachers. In the past, there have been a few studies that looked specifically into these courses and how they are taught, but none of the studies looked precisely at the instructors who come from a non-testing background. It is important to know where these courses stand in terms of teaching language assessment to pre-service/in-service teachers and how they differ in content and quality compared to the courses taught by language testers. Student teachers who take these courses will be making important decisions based on their assessment practices informed by the content of the LACs that they have studied; therefore instructors must guarantee the courses maintain high quality.

Literature review

The review of literature on assessment literacy is based on expanding the concept of assessment literacy and the structure and format of language assessment courses. The concept of assessment literacy, introduced in general education by scholars like Stiggins (1999a, 2001) and Popham (2008), began to appear in the language testing literature about ten years ago. The notion of educating stakeholders is strongly connected with LACs and a few empirical studies have been conducted to find out how such courses are taught.

Expanding assessment literacy

Stiggins (1999a) describes assessment literacy as involving teachers’ understanding of what assessment methods to use and when to use them so teachers can gather reliable information/data about students’ achievements. Assessment literacy also incorporates teachers’ ability to communicate assessment results effectively to students, parents, and other educational professionals (Stiggins, 1999a). Inbar-Lourie (2008, p. 389) writes of assessment literacy as ‘having the capacity to ask and answer critical questions about the purpose for assessment, about the fitness of the tool being used, about testing conditions, and about what is going to happen on the basis of the results.’

For language teachers, being assessment literate means possessing assessment literacy skills combined with language-specific competencies (Inbar-Lourie, 2008). In order for teachers to be assessment literate in their classroom practices, it is important that they are provided with the appropriate teacher training in assessment. Recently, with the increased influence of testing, assessment literacy is being perceived as knowledge not only required for teachers but also for other stakeholders (e.g. policy makers, examination boards, parents, and the general public) within the education testing culture (Taylor, 2009). Despite calls for expanding the knowledge of assessment, there has been hardly any discussion about the assessment literacy of non-LTs who teach language assessment courses. It is stressed that teachers should be assessment literate to correctly implement classroom assessment, to explain the results of standardized tests to stakeholders, and to follow the standards of assessment rules. But, is there a required level of assessment literacy to teach an LAC? For example, do the instructors who teach language assessment courses have a testing background? What kind of assessment knowledge do they possess? How do they teach the course and does it meet the expectations and needs of the student teachers? Former studies of LACs focused only on the overall instructor group rather than on differentiating LTs and non-LTs though, as Taylor (2009) has noted, many non-language-testing individuals work within the testing area.

According to a syllabi review conducted by Jeong (2009), 15 out of the 30 instructors who taught introductory language assessment courses came from a non-testing background. The purpose of this syllabi review (N = 30) was to find out the topics covered in language assessment courses and the external (department, position in the program, and geographic factors) and internal (instructor, target student population) factors that influence the content and structure of the course. It was found that the non-LTs’ diverse backgrounds (foreign languages such as Spanish, French, and German; special education; bilingual education; quantitative research methods; and rhetoric) translated into different course structures. Courses taught by LTs offered a variety of topics, such as computer adaptive testing, ethics in testing, statistics, and performance assessment. Courses taught by non-LTs focused more on delivering the basic theory of testing and emphasized classroom assessment. Statistics or measurement was taught by 13 instructors, 10 of whom had testing backgrounds. Only three non-language tester instructors covered statistics or measurement in their courses. Findings from this syllabi review identified the existence of two different instructor groups: language testers and non-language testers.

The findings from the syllabi review showed that, just as LTs can teach courses out of their disciplinary areas, such as material development and teaching methodology, non-LTs may end up teaching language assessment courses (Jeong, 2009). Concerns naturally arise when instructors teach a subject area they have limited knowledge or different understandings of. It is particularly a concern for language assessment courses since assessment often involves very high stakes and raises quality concerns of how non-LTs teach the courses compared to LTs, as essential topics that should be covered for a language assessment course can vary depending on the instructor’s background.

Studies in language assessment courses

The first study on language assessment courses was done by Bailey and Brown (1996, and repeated in 2008). The purpose of the study was to ‘investigate the instructors’ backgrounds, the topics they covered, and their students’ apparent attitudes toward those courses’ (p. 351). Brown and Bailey’s (2008) study was a starting point in the research of LACs. It gave information on how the courses were taught and which topics were covered and to what degree. Despite the insights gained from the study, it did not investigate the issue of non-LTs who teach LACs. Unlike Brown and Bailey’s survey, which reported that almost all of the respondents had experience in language testing, 50% (15 out of 30) of the instructors from Jeong’s (2009) syllabi review did not have experience related to the field. This difference may be due to the characteristics of the mailing list Bailey and Brown used for their study, the LTRC (Language Testing Research Colloquium) mailing list, as LTRC members are likely to come from a language-testing background.

Few studies about language assessment courses have been done by instructors who taught assessment courses. An important study by Kleinsasser (2005) covers challenging aspects of language assessment courses from the instructor’s perspective. Kleinsasser states that one of the major difficulties in teaching a language assessment course is connecting theory with practice: ‘The bridge between the (theoretical) class discussions and the final (practical) test/assessment product, however, was not well constructed. Challenges in getting the students to move from theoretical issues to practical ones often surfaced’ (p. 82). Kleinsasser reports that students felt the time spent defining constructs, developing and piloting assessment materials, and rewriting and rethinking the various assessment tasks and items was quite burdensome, since many felt this is not the typical process they go through in a real classroom situation. However, the group work process encouraged them to include various stakeholders’ perspectives in test development and widen their views of testing.

In general, a review of the literature points to the paucity of research studies on LACs, and there is currently no available research on the instructor’s language testing background. This article therefore reports on a study that explores how LACs are constructed and taught in various countries depending on the course instructor, using a mixed methods approach.

To explore these issues, the following research questions were posed, juxtaposing language testers versus non-language testers as course instructors:

Language testers vs. non-language testers: How are their language assessment courses similar?

Language testers vs. non-language testers: How are their language assessment courses different?

The mixed methods approach

A mixed methods approach is used in this study for purposes of development and complementarity. The purpose of using a blend of methods is to elaborate, enhance, illustrate, and clarify the results from one method (quantitative) with the results from another method (qualitative) to increase meaningfulness (Greene, 2007). In a complementarity study, two methods look into overlapping areas from different facets of a phenomenon. By collecting both qualitative and quantitative data, a more complete and contextual explanation for the research questions could be derived.

The type of mixed methods design employed is an integrative design (Greene, 2007) with the study taking the form of an iterative design where all phases interact with each other throughout the process. Although the implementation of each phase was done separately, the analysis was closely linked to the next phase of the study.

Method

The participants for the study consisted of two instructor groups: language testers (LTs) and non-language testers (non-LTs). Language testers are individuals or professionals whose primary research interest is in areas of language testing. Non-language testers are defined as those whose primary interest is in other areas of language teaching (e.g. second language acquisition), but who have had experience in language-assessment-related activities (e.g., developed standardized tests, worked with a testing agency, etc.).

A total of 140 instructors completed the survey. The survey data was entered in SPSS v. 17 for Windows. First, it was examined using descriptive statistics to organize and summarize the data. Next, to investigate differences between groups (LTs vs. non-LTs) regarding the time spent on teaching certain topics, t-tests were conducted. More detailed demographic information regarding the instructors will be presented in the findings section.
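The group comparison described above can be illustrated with a minimal sketch of an independent-samples t statistic. The ratings below are hypothetical values on an assumed 1 to 4 scale, not the study's data, and the pooled-variance (equal variances assumed) form is an assumption about which variant was run in SPSS. The function name `pooled_t` is introduced here for illustration only.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Return (t, df) for Student's independent-samples t-test,
    assuming equal variances in the two groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    # Standard error of the difference between the two group means.
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / std_err
    return t, df


# Hypothetical time-coverage ratings for a single topic (NOT the survey data).
lt_ratings = [4, 3, 4, 3, 4, 2, 3, 4]
non_lt_ratings = [3, 2, 3, 3, 2, 4, 2, 3]

t_stat, dof = pooled_t(lt_ratings, non_lt_ratings)
print(f"t({dof}) = {t_stat:.3f}")
```

A positive t here would indicate that the first group (LTs in this illustration) reported more time on the topic than the second. Obtaining a p value additionally requires the t distribution, which a statistics package such as SPSS supplies.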

The participants for the interviews (N = 13) were selected through a combination of convenience sampling and purposeful sampling. Key semi-structured interview questions are presented in Appendix B. Open coding was used for the initial coding of the interview data. Transcripts were unitized and concepts were highlighted and labeled. Subsequent coding took place by comparing the current transcript with the survey data. Additional topics emerged as the coding proceeded and eight course topics that received the most attention were identified: test theory, classroom assessment, performance assessment, test specifications, rubric development, statistics, ethics, and test accommodation.

Data collection process and analysis

The online survey was launched on July 2, 2010 and closed on August 20, 2010. The survey questions relevant to this study are presented in Appendix A. The survey participants were contacted via professional ESL/EFL organizations and language testing organizations (e.g., LTEST, MwALT, ECOLT, SCALAR, TESOL, TESL-L, AAAL, CALICO, NYTESOL, CATESOL) and through personal emails. Instructors and student teachers took the survey at a time and location convenient to them. The online survey was hosted on ‘Surveygizmo’ (www.surveygizmo.com/s/282990/language-assessment-courses) and consisted of a total of 29 items. Completing the survey took approximately 15–20 minutes. The survey consisted of demographic information and questions that asked how much class time was devoted to teaching certain topics and which topics instructors found important for the student teachers.

Findings

The findings of this study will be presented in two main sections: people and courses.

People

A total of 140 self-identified LAC instructors participated in the survey (see Figure 1). Of the instructors, 47% (n = 66) said that language testing was their primary research area, 52% (n = 73) said it was not, and one person did not provide this information. The non-LTs’ most common assessment-related experience was working with classroom teachers on testing (n = 77), having experience in developing standardized language tests (n = 47), and working as a rater (n = 46). In terms of their final degrees, there was little difference between the two groups. For both groups, applied linguistics (LT = 32, non-LT = 21) was the area most respondents reported having earned their final degrees in. The only notable difference was from the education and foreign language majors. Nine out of 10 education majors identified themselves as non-language testers, and all the foreign language majors (n = 3) were non-language testers. Education majors included education leadership, international development in education, and literacy education. Therefore, instructors’ majors cannot be an identifier in separating LTs from non-LTs.

Figure 1.

Survey instructors’ demographic overview.

The main target audience for the courses for both groups was student teachers (64.2%, n = 85). Some courses targeted regular undergraduates (20.44%, n = 30) and graduate students who were not student teachers (13%, n = 18). The majority (68.4%, n = 50) of the target audience of non-LTs was student teachers, a higher percentage than for LTs (53.8%, n = 35). This suggests that more than half of the student teachers (those in 50 of the 85 courses targeting them) took language assessment courses taught by a non-LT instructor.

Regarding teaching experience, 70% (n = 93) of all instructors were veterans in language teaching, with more than 10 years’ experience. Looking more closely at the instructors’ teaching experience, most (80.3%, n = 106) answered that the grade level they taught was college, and 19.7% (n = 26) taught K–12. More specifically, 13% (n = 20) of the instructors had experience teaching at the secondary grade level, and only 6% (n = 6) had taught elementary school. Non-LTs had more K–12 teaching experience compared to LTs. While 24.6% (n = 18) of the non-LTs had taught K–12, only 12.3% (n = 8) of LTs had experience teaching K–12.

Language assessment courses: Course structure

The results of the survey and interview show that the structure and organization of the courses were similar for the two groups. Both LTs and non-LTs began with an overview of the theory of the course and ended with an activity that required student teachers (STs) to use the content learned earlier in the course. The end product of the course differed but took similar forms. Some instructors asked STs to develop a classroom test either independently or collaboratively; others required them to pilot a test and report the results. The differences in how many activities were done depended more on the individual instructor’s characteristics than on his or her background. Thus, structure-wise (content and format), the courses had similar features regardless of being taught by LTs or non-LTs.

Topic time coverage: LTs vs. non-LTs

To identify the similarities and differences regarding what topics were covered in LACs and to what extent, instructors were asked to report how much time they spent on 14 commonly covered topics on a Likert-scale item (see Figure 2). Independent t-tests were used to identify if there were significant differences between the language tester group and the non-language tester group in this respect.

Figure 2.

Topic time coverage: LTs vs. non-LTs.

The top five areas in which instructors spent the most time teaching were test theory, classroom assessment, alternative performance assessment, test specifications, and rubric development (Figure 2; specific numbers in Table 1). Topics that received similar or same time coverage by both LTs and non-LTs were test critique, history of language testing and advanced statistics. There were six areas of significant differences between the two groups with the greatest area of disagreement being test accommodation. Language testers ranked this category towards the bottom (13th out of 14 categories, M = 1.64), while non-language-testers ranked it ninth (M = 2.39) out of the categories. Thus, compared to non-LTs, LTs spend little time covering test accommodation.

Table 1.

Different topic time coverage: LT vs. non-LT.

	Language testers		Non-language testers		T-test
	N	M	N	M	t	p
Test specifications	61	3.26	68	2.93	2.364	.020*
Test theory	63	3.48	70	3.19	2.251	.022*
Basic statistics	58	2.62	66	2.23	2.092	.023*
Classroom assessment	62	3.15	69	3.46	−2.300	.024*
Rubric development	61	2.74	66	3.09	−2.278	.020*
Test accommodation	53	1.64	66	2.39	−4.208	< .001*
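As a reading aid for Table 1, the following sketch recomputes the gap between the two group means for each topic and checks that its direction agrees with the sign of the reported t value. The means and t values are copied from the table; the variable names and the rounding to two decimals are illustrative choices, and no new statistics are estimated.

```python
# Rows copied from Table 1: topic -> (LT mean, non-LT mean, reported t).
table1 = {
    "Test specifications": (3.26, 2.93, 2.364),
    "Test theory": (3.48, 3.19, 2.251),
    "Basic statistics": (2.62, 2.23, 2.092),
    "Classroom assessment": (3.15, 3.46, -2.300),
    "Rubric development": (2.74, 3.09, -2.278),
    "Test accommodation": (1.64, 2.39, -4.208),
}

# Mean difference (LT minus non-LT); positive means LTs reported more time.
gaps = {topic: round(lt - non_lt, 2) for topic, (lt, non_lt, _t) in table1.items()}

# The direction of each gap should match the sign of the reported t value.
signs_agree = all((gaps[topic] > 0) == (row[2] > 0) for topic, row in table1.items())

# Topic with the largest absolute gap between the two groups.
largest = max(gaps, key=lambda topic: abs(gaps[topic]))

for topic, gap in gaps.items():
    leaning = "LTs" if gap > 0 else "non-LTs"
    print(f"{topic}: gap = {gap:+.2f} (more time reported by {leaning})")
print("Signs agree with reported t values:", signs_agree)
print("Largest gap:", largest)
```

Positive gaps mark the topics on which LTs reported more class time (test specifications, test theory, basic statistics), and negative gaps mark those on which non-LTs reported more (classroom assessment, rubric development, test accommodation).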

Out of the 14 topics, instructors spent the most time teaching test theory. For specific groups, LTs ranked this first (M = 3.48), and non-LTs ranked this third (M = 3.19). Both groups of instructors thought it was important to teach test theory, but results show that language testers value its importance more than non-LTs. Another important topic that was covered in language assessment courses was classroom assessment. From the instructor survey, classroom assessment ranked the second most covered topic (M = 3.31) in language assessment courses. Non-LTs responded that they spend the most time (M = 3.46) teaching classroom assessment above all other topics, and LTs ranked this third (M = 3.15). For test specifications, there was a significant difference (p = .020) between LTs and non-LTs. The overall ranking of this topic was fourth (M = 3.09), but for LTs, this was the second topic on which they spent the most time (M = 3.26) teaching. However, for non-LTs, it ranked sixth out of the 14 topics, with a mean of 2.93, which was significantly lower than the mean for the LTs.

Important topics for classroom teachers: LTs vs. non-LTs

In a second set of Likert-scale items, instructors were asked to identify how important the course topics were for classroom teachers (Appendix B). For all the instructors (N = 140), the five most important topics for classroom teachers were as follows: classroom assessment (M = 3.76), alternative assessment (M = 3.51), test specifications (M = 3.37), test theory (M = 3.36), and rubric development (M = 3.35).

Significant differences (p < 0.034) between LTs and non-LTs were found in five areas: test specifications (p < .012), test-taking skills and strategies (p < .042), alternative assessment (p < .020), rater training (p < .032), and test accommodation (p < .010). Out of these five, two topics (test specifications, alternative assessment) were placed among both the five most important topics for classroom teachers and the topics that received the most coverage in teaching. Similar to earlier results, test specifications received a significantly higher mean from language testers (M = 3.54) than from non-LTs (M = 3.23). Language testers ranked this the second most important topic for classroom teachers while non-LTs ranked it fifth. For alternative assessment, the results were the exact opposite. Non-LTs identified this as the second most important topic (M = 3.63), but LTs placed it fourth (M = 3.37).

Course materials: Language assessment course textbook usage

To obtain more insight into the courses, and especially into the course materials used in language assessment courses, follow-up interviews were conducted with 13 instructors from the original sample (six language testers and seven non-language testers). The first topic of inquiry was the textbooks used, as it was deemed that content analysis of the types of books used in language assessment courses could help in predicting the topics taught in the courses and add information on the characterizing features of the two instructor groups (LTs and non-LTs). However, the interview findings reveal that currently available textbooks for language assessment courses were thought to be difficult and not very useful to the student teachers or the non-LT instructors. For example, Smith, who is a non-language tester and head of a TESOL MA program in the UK, commented that every time she hears that a new textbook has been published, she gets very excited, but once she goes over the content, she sighs and quickly closes the book. She felt the language assessment course textbooks were targeted at large-scale assessment usage, so the content covered in these books was not directly related to language teachers’ needs.

Anderson, who is a non-LT and the program coordinator of a university language program, commented, ‘I’d like to see [textbooks] more in classroom assessment. Ultimately, that is for the population I work with; that’s what they’re going to need.’ She did understand the need for teachers to be aware of the issues surrounding standardized assessment but felt a lot of the content covered in standardized assessment is not applicable to the classroom context. Yet, one of the interviewees, a veteran language tester who has written several books in this field, commented that there is little difference in theory regarding large-scale or small-scale assessment. He stated, ‘Validity is validity, and reliability is reliability.’ Another opinion from a non-LT instructor was that textbooks seemed to be appropriate for PhD students majoring in language testing rather than MA, undergraduate pre-service or in-service teachers.

The instructors reported from their teaching experience that textbooks written by language testers covered more test-oriented topics such as test specifications, ethics, and a lot more on standardized assessment compared to classroom-assessment-oriented books. However, as was stated before, these books did not seem to be used as much by non-LTs in teaching their courses. Although Anderson appreciated new books in the field, she commented that they could be over the top for her student teachers. ‘I refer to it [a textbook] a lot; I read it out loud, and students borrow it and use it, [but it is] too dense and not practical enough [for my student teachers]. I do not think it would be a useful resource [for student teachers] to have them on their shelf. It’s a great [resource] for me, but not the students.’ Anderson liked the top-selling language assessment textbook on the market, not necessarily because it was the best in terms of the content, but because it was reader friendly and accessible to non-native speakers. As Smith mentioned in regard to the same textbook, ‘This textbook sells well, but not because anyone likes it; it’s the least worst one out there.’

To sum up this point, from the interviews it is apparent that there was a different level of satisfaction with regard to textbooks among the instructors. It appears that non-LTs were less satisfied with the textbooks than the LTs. This could be because the majority of language assessment textbooks are written by language testers who focus more on theoretical aspects.

In summary, the overall structure of the course was similar for both LTs and non-LTs, but the areas LTs focused on were more technical (e.g., test specifications, test theory, statistics) compared to non-LTs who spent more time on teaching classroom related topics (e.g., alternative assessment, classroom assessment). With regard to language assessment textbooks, similar to their preference in teaching, non-LTs thought textbooks that are more focused on day-to-day classroom activities were more helpful.

Discussion

The findings of this study show that language assessment courses may differ depending on the instructors’ background in the area of language testing. The following discussion considers possible reasons for this phenomenon and their implications.

Same name, different course: Different definition of assessment literacy?

As stated earlier in this paper, previous studies (Stiggins, 1999b, 2002) in assessment literacy have focused on and attempted to define the components of assessment literacy for teachers in general and language assessment literacy for language teachers. The findings of this study, however, raise questions as to whether instructors (both LTs and non-LTs) of language assessment courses share a common definition of language assessment literacy. When teaching a course, each instructor will bring his or her own uniqueness to the courses they teach. Even within the language testers’ group, the course taught by instructor A will be different from instructor B. However, more diversity arises when the training background of the instructors shows a weak connection to the courses they teach.

The research findings show that LTs spend significantly more time on test theory while non-LTs focus more on classroom assessment and test accommodations. This implies that the outcomes of the course will be different depending on the instructor, even though the course is offered under similar names (e.g., language testing, language assessment). For example, if a student teacher took a course from an LT, he or she would have a better chance of learning more about test specifications and statistics compared to student teachers who took the course from a non-LT.

According to the survey, half of the instructors come from a non-testing background. As there seem to be more language assessment courses taught by non-LTs compared to LTs, it is important that both instructor groups share a common knowledge of what should be covered in language assessment courses in general. Since many of the non-LTs come with little training or experience in language testing, the language testing community should offer a set of objectives and guidelines or even training to the non-LTs. LTs should also check if the course content is appropriate for the target student audience. Non-LTs are often suddenly asked to teach the course with little preparation on a needs basis (e.g., Olson). Others (e.g., Smith, Ryan in this study) self-developed the course since the curriculum did not have an assessment course and were asked to teach it because there was nobody else to do so. Past experience or education in language testing was not a crucial factor in making these decisions, and none of the non-LTs interviewed received additional training after being selected to teach the course.

The six language testers interviewed understood the reality that many courses were taught by non-language testers. All of them said that it would be preferable if assessment courses were taught by those who majored in the field, but due to the lack of personnel it was acceptable for a non-LT to teach the course. From the follow-up interview, Goldberg, a language tester, commented, ‘Something is better than nothing. They [non-LTs] will represent a different take on the knowledge that I would have. The kind of background they [instructors] have really makes a difference [in teaching the course], [but] somebody has to do it. There is a demand. There is a need [for language assessment courses]. It would be better than nothing.’ Goldberg thinks it is better for non-LTs to teach an assessment course than not to have a course at all, but she expects that the content of the course will be more related to classroom issues, which differs from her course.

As shown in the findings, there are similarities and differences in the content and structure of the courses between LTs and non-LTs. The reasons lie in differing notions of what constitutes assessment literacy. Taylor (2009, p. 27) writes, ‘Training for assessment literacy entails an appropriate balance of technical know-how, practical skills, theoretical knowledge, and understanding of principles.’ In addition, Taylor (2009, p. 27) states that ‘the level of literacy to be acquired may vary according to the context.’ However, in order to maintain course relevance, instructors who teach language assessment courses need a common understanding of assessment literacy. How, then, should this be acquired, and what can the language testing community do to promote sharing assessment literacy with non-LTs?

Characteristics of the language assessment community

Prior to going into detail about ways to increase collaboration between LTs and non-LTs, it is important to think of the unique characteristics of the language assessment community. Traditionally, the testing community has been portrayed as being ‘highly technical’ (Taylor, 2009, p. 21), which makes it difficult for outsiders to approach the group. This was a shared notion among the non-LTs who were interviewed in this study as well. Even though the non-LTs felt the need to expand their knowledge in the area, they expressed difficulty because, first, they did not know how to reach relevant sources, and second, they did not feel comfortable asking for guidance from the language testing community. As noted by a non-LT, Smith, attempting to gain technical knowledge, especially statistical knowledge, increased the feelings of eliteness of the LT group and the inferiority of the non-LTs. The non-LTs who expressed uneasiness and vulnerability in teaching technical topics also felt pressured by the high walls of the language testing community. This concern is aligned with Davies’ (2008) and Spolsky’s (2008) work on the risks of isolation of the language testing community due to its distinct nature. Although testing professionalism makes the community unique and separates it from other applied linguists, it also acts as a dividing wall. If the goal of LT is to build a ‘broader and inclusive community of good practice’ (Taylor, 2009, p. 29), one of the first things that should be done is to train non-LTs to become proficient in language assessment.

Attempts have been made over the years and more so recently to hold workshops to promote language assessment literacy at conferences such as the Language Testing Research Colloquium (LTRC) and EALTA (European Association for Language Testing and Assessment) (Taylor, 2009). However, these workshops were not well known to non-LTs. Only one of the non-LTs interviewed had attended LTRC. In addition to putting effort into widely advertising these workshops, LTs could link up with non-LTs by attending the conferences that the non-LTs are affiliated with. For example, increased workshops in TESOL or other regional conferences can be a stepping stone to enhance the literacy of non-LT instructors.

One practical way to do this is by offering a set of guidelines on what should be covered in language assessment courses. Agreeing, as a community, on an array of topics to be covered in an introductory course and in an advanced course could be the first step towards training LT professionals. The focus of the topics can vary depending on the target audience, but there are core topics, as well as areas that should receive more emphasis for particular target groups (e.g., K–12 vs. adult).

Identity of a language tester

Unlike other stakeholders (i.e., policy makers, examination boards, parents, and teachers) in the testing culture, non-LT instructors may be perceived differently depending on who the perceiver is. Within the language testing community, they may be perceived as non-language testers; however, outside this community, the non-LT instructors can be identified as language testing specialists. This is similar to the identity issue of native vs. non-native English teachers discussed by Inbar-Lourie (2005). A person who identifies himself or herself as a non-native speaker can be perceived to be a native or near-native speaker if the person is positioned in a context where he or she has the most knowledge of the content. The same reasoning applies to language testers; there is no clear criterion as to who is a language tester and who is not. Then what is the signifier for a language tester? What qualities or features make them LTs? For the sake of convenience in this study I gave a simple definition of who is and who is not a language tester; however, the identity issue is more complex than I expected. Even within the group of language testers, the group can be divided into sub-groups. Similar to the native/non-native EFL teacher debate (Inbar-Lourie, 2005), there are self- versus socially perceived language testers and non-language testers.

For language testers, the identity can begin with the meanings attributed by others, for example, the expectations society has of language testers (e.g., in order to qualify as an LT, one has to have taken certain courses and should know the basics of the field and beyond). As with the lack of clarity in what distinguishes a native speaker from a non-native speaker, the debate over who is a language tester and what enables one to be one is ongoing.

Conclusion

This study is not only about a language assessment course, but also about the language testing community. For the future of language testing, I suggest that it is time for language testers to move towards other applied linguists. Language testers gained power in the community along with the power of language tests, but this power also brought testers the fear of being isolated from other applied linguists, as the area of expertise is too technical and specialized (Spolsky, 2008). It is important for LTs to preserve their specialty, but it is also essential to share the knowledge and make it accessible to those who are part of the language assessment culture. It is the role of the LT community to make the field approachable to others.

Appendix A

Instructor survey

Appendix B

Instructor interview questions

Appendix C

Table A1.

Important topics for classroom teachers: Instructors’ perspective.

Topic	Instructors				Language testers				Non-language testers
	n	M	SD	Rank	n	M	SD	Rank	n	M	SD	Rank	t	p
1. Test specifications	138	3.37	.735	3	67	3.54	.611	2	71	3.23	.814	5	2.555	.012*
2. Test administration	137	2.93	.727	8	66	2.95	.711	9	71	2.92	.751	8	.312	.756
3. Test critiquing	136	3.18	.696	6	65	3.22	.718	6	71	3.14	.682	7	.621	.536
4. Test-taking skills or strategies	137	2.72	.781	11	66	2.58	.842	11	71	2.85	.690	10	−2.039	.042*
5. Test theory	136	3.36	.674	4	66	3.47	.661	3	70	3.27	.679	4	1.723	.087
6. Basic statistics	136	2.89	.863	9	66	2.92	.865	10	70	2.86	.873	9	.450	.653
7. Advanced statistics	136	1.74	.795	14	65	1.74	.853	14	71	1.76	.746	14	−.161	.872
8. Test ethics	135	3.12	.751	7	65	3.03	.684	7	70	3.19	.804	6	−1.209	.229
9. History of language testing	136	2.31	.753	13	66	2.38	.696	13	70	2.26	.793	13	.948	.345
10. Classroom assessment	138	3.76	.494	1	67	3.69	.556	1	71	3.82	.425	1	−1.541	.126
11. Alternative assessment	137	3.51	.642	2	67	3.37	.648	4	70	3.63	.618	2	−2.363	.020*
12. Rubric development	137	3.35	.751	5	67	3.31	.656	5	70	3.40	.824	3	−.679	.499
13. Rater training	135	2.89	.795	9	66	3.03	.744	7	69	2.74	.816	12	2.164	.032*
14. Test accommodations	136	2.60	.781	12	65	2.42	.659	12	71	2.76	.853	11	−2.624	.010*

Notes: (1) The response scale was as follows: 1 = hardly any time, 2 = a little time, 3 = some time, 4 = extensive time. (2) Multiple independent t-tests were used to identify significant differences between the two groups. Applying the Bonferroni adjustment for the 14 comparisons required significance at p < .0035. (3) * p < .05 before adjustment.
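The procedure described in note (2) can be illustrated with a minimal Python sketch. It recomputes a Bonferroni-adjusted threshold and an independent-samples t statistic from the summary statistics reported for Test specifications. The function names are hypothetical, and the pooled-variance (Student) form of the t-test is an assumption; the study does not state which variant was used, and the reported values were presumably computed from unrounded data, so the result should only approximate the tabled t.

```python
import math


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-comparison significance level after a Bonferroni adjustment."""
    return alpha / n_tests


def pooled_t(n1: int, m1: float, sd1: float,
             n2: int, m2: float, sd2: float) -> tuple[float, int]:
    """Independent-samples t statistic from summary statistics.

    Assumes equal variances (pooled form); this is an illustrative
    assumption, not necessarily the variant used in the study.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# 14 topics were compared at a nominal alpha of .05.
alpha_adj = bonferroni_alpha(0.05, 14)

# Test specifications: language testers vs. non-language testers (Table A1).
t_stat, df = pooled_t(67, 3.54, 0.611, 71, 3.23, 0.814)

print(f"adjusted alpha = {alpha_adj:.4f}")
print(f"t({df}) = {t_stat:.3f}")
```

Because the inputs are rounded to two or three decimals, small discrepancies from the tabled t values are to be expected.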

Acknowledgements

I would like to thank Fred Davidson for his guidance and feedback throughout the study. I would also like to thank Hongling Sun for her thoughtful suggestions and help.

Funding

Funding to conduct this study was provided by TOEFL Small Grants for Doctoral Research in Second or Foreign Language Assessment and the Hardie Dissertation Award from the University of Illinois at Urbana-Champaign.

References

Bailey, K. M., & Brown, J. D. (1996). Language testing courses: What are they? In Cumming & Berwick (Eds.), Validation in language testing (pp. 236–256). Philadelphia, PA: Multilingual Matters.

Brown, J. D., & Bailey, K. M. (2008). Language testing courses: What are they in 2007? Language Testing, 25(3), 349–383.

Davies (2008). Textbook trends in teaching language testing. Language Testing, 25(3), 327–347.

Greene, J. C. (2007). Mixed methods in social inquiry. San Francisco, CA: Jossey-Bass.

Inbar-Lourie (2005). Mind the gap: Self and perceived native speaker identities of EFL teachers. Educational Linguistics, 5, 265–281.

Inbar-Lourie (2008). Constructing a language assessment knowledge base: A focus on language assessment courses. Language Testing, 25(3), 385–402.

Jeong (2009, November). Syllabi review of language assessment courses. Poster presented at the Thirty First Language Testing Research Colloquium, Denver, Colorado.

Kleinsasser, R. C. (2005). Transforming a postgraduate level assessment course: A second language teacher educator’s narrative. Prospect, 20, 77–102.

Popham, W. J. (2008). Assessment literacy for teachers: Faddish or fundamental? Theory into Practice, 48(1), 4–11.

Spolsky (2008). Language testing at 25: Maturity and responsibility? Language Testing, 25(3), 297–305.

Stiggins, R. J. (1999a). Assessment, student confidence, and school success. Phi Delta Kappan, 81(3), 191–198.

Stiggins, R. J. (1999b). Evaluating classroom assessment training in teacher education programs. Educational Measurement: Issues and Practice, 18(1), 23–27.

Stiggins, R. J. (2001). The unfulfilled promise of classroom assessment. Educational Measurement: Issues and Practice, 20(3), 5–15.

Stiggins, R. J. (2002). Assessment crisis: The absence of assessment FOR learning. Phi Delta Kappan, 83(10), 758–766.

Taylor (2009). Developing assessment literacy. Annual Review of Applied Linguistics, 29, 21–36.