Book Review: Evaluating Language Assessments, by Kunnan, A. J.

Abstract

Published by Routledge in 2018 as a volume in the series “New Perspectives on Language Assessment,” Evaluating Language Assessments centers on an ethics-based approach to assessment evaluation grounded in fairness and justice.

The author of the book, Dr. Kunnan, is now Professor of Applied Linguistics at the University of Macau. He earned his doctoral degree in Applied Linguistics from the University of California, Los Angeles, in 1991 under the supervision of Prof. Bachman, the most influential founding father in the field of language testing. A prominent expert in the field, the author has published numerous books and articles, mainly on fairness and justice in language assessment, language policy, and statistical analysis. This book epitomizes his previous academic endeavors and distills them into a theoretical framework of ethics-oriented assessment evaluation.

The volume comprises 10 chapters, which can be grouped into three broad parts (Introduction, Body, and Summary) that address, in turn, the why, what, and how of the new approach to assessment evaluation.

To begin with, the Introduction (Chapters 1 and 2) presents two reasons for developing a new approach to assessment evaluation. First, Chapter 1 focuses on the necessity of regularly evaluating assessments, as exemplified by fairness problems in language assessments in a variety of contexts (e.g., civil service, literacy, immigration, and schooling), problems that were nevertheless never challenged by either test takers or the community. Second, Chapter 2 reviews two existing approaches to assessment evaluation (i.e., the Standards-based and the Argument-based approaches) and points out the following deficiencies: no intellectual foundation, fairness included only as one aspect of the evaluation, no attention paid to institutional justice, and a narrow scope of fairness investigation. These two chapters set the background for the new approach.

Next, the new approach (the ethics-based approach to evaluating fairness and justice in assessments), together with how to apply it, is covered mainly in the Body (Chapters 3–9), the most important part of the book. Chapter 3 elaborates on the theoretical (intellectual) basis of the new approach. Drawing on the views of justice proposed by two moral philosophers, Rawls and Sen, the author discusses what should be considered when applying ethics to language assessment: two fundamental issues (i.e., universal or contextual, fact-independent or fact-dependent), four fundamental aspects (i.e., transparency, equity, impartiality, and uniformity), and six fundamental questions related to individual rights, consequences of assessment, and public justification. Then, building on previous work on ethics-based principles in language testing (e.g., the International Language Testing Association [ILTA] Code of Ethics), he puts forward two general principles, fairness and justice, and their respective sub-principles.

Having elaborated on what the new approach is, the book goes on to explore the other important issue: how to apply it. One aspect of application is to build an argument for the principles of fairness and justice in the process of assessment evaluation. Chapter 4 presents the Toulmin argumentation model, which consists of articulating claims based on the fairness and justice principles, providing warrants, and supplying the backing. In addition to building an argument for fairness and justice, it is also necessary to evaluate its quality in terms of seven key components (p. 106) that are indispensable in an acceptable argument. Depending on the quality and quantity of the available evidence, the evaluated argument can be given reasoned full acceptance, reasoned partial acceptance, reasoned rejection, or deferred judgment.

Then, to illustrate the argumentation process proposed in Chapter 4, Chapters 5 to 8 each explore the articulation of one of four fairness and justice claims, together with its evaluation through warrants, backing, or rebuttal. Chapter 5 presents the first sub-principle of the principle of fairness, the opportunity to learn (OTL), which refers to the opportunities “in the classroom for learning content and skills with the help of teachers and textbooks and related activities” (p. 110). The corresponding claim can be articulated in terms of four aspects: (a) curriculums, materials, and feedback, (b) adequate time for preparation, (c) adequate practice with new technology, and (d) relevant social practices and embodied experiences. Four illustrative studies are provided. In addition, two related concepts are mentioned: the opportunity for success (OFS) in the assessment context and OTL after the assessment.

Chapter 6 examines the second sub-principle of the principle of fairness, meaningfulness, which refers to the validity (e.g., content, concurrent, or construct validity) of an assessment. Argumentation for it can be conducted from five aspects: (a) the blueprint and test specifications, (b) cognitive processes of test takers, (c) item or test consistency, (d) the construct underlying test performance, and (e) test consequences. The argumentation process from the first, third, and fourth aspects is illustrated by two empirical studies.

Chapter 7 discusses the third sub-principle of the principle of fairness, absence of bias. Test bias means that the test “yields scores that have a different meaning for members of one group from their meaning for members of another” (p. 167). There are three main sources of test bias: cognitive-irrelevant variance (e.g., dialects, grammatical structure, and vocabulary), affective-irrelevant variance (e.g., content, topics), and physical-irrelevant variance (e.g., visuals, media, and equipment). The claim for absence of bias can be articulated from the following five perspectives: (a) dialect, content, or topic across test-taker groups, (b) differential performance across gender, age, race, or L1 test-taker groups of similar ability, (c) score interpretations, standard-setting, and decision-making across test-taker groups, (d) appropriate accommodations for test takers with disabilities, and (e) cost, uniformity, and freedom from fraud. Three differential item functioning (DIF) studies are presented to illustrate the argumentation of the claim from the first three of these perspectives.

The two sub-principles of the principle of justice are discussed in Chapter 8. The general claim is that the assessment is just in that it fosters beneficial consequences for test takers and the community while promoting positive values. It can be operationalized through three sub-claims concerning the availability of administrative remedies, provision for legal challenges, and the correction of existing injustice. An illustrative example involving the U.S. Naturalization Test is provided to elaborate on the argumentation for the justice claim.

Another important aspect of applying the new approach is to advance the principles of fairness and justice among a variety of assessment stakeholders. Chapter 9 proposes two ways to do so. One is to foster ethical thinking, as illustrated by ethical decision-making practice in both hypothetical scenarios and language assessment scenarios. The other is to expand the training curriculum on language assessment to include courses on applied ethics and on responsible assessment development and use. To this end, an ethical-critique curriculum is proposed, which covers not only the knowledge, skills, and abilities needed in language testing practice and the principles for practice but also the historical, social, political, and philosophical contexts in which an assessment is situated.

Finally, summarizing the main content of the previous chapters, Chapter 10 ends the volume by highlighting the two key points of the new ethics-based approach to assessment evaluation. One is applying ethical thinking, which involves such key concepts as impartial evaluation, public justification, and global justice. The other is applying assessment standards. More importantly, once principles of fairness and justice are established on these two points, they are to be operationalized as claims and sub-claims that can be evaluated with the Toulmin argumentation model through warrants, backing, or rebuttal.

Overall, the major contribution of this book lies in the proposed ethics-based approach to assessment evaluation. The approach carries forward previous frameworks of assessment evaluation. Specifically, it follows Messick’s (1989) incorporation of consequences into the scope of validation and borrows the prevalent argument-based approach to the validation of language assessments, which was initiated by Kane (1992) and further developed by Bachman and Palmer (2010). Furthermore, extending Shohamy’s (2001) critical perspective on the use of language assessments, it innovates by introducing ethics into the validation framework and endowing assessment evaluation with philosophical underpinnings. Compared with Xi’s (2010) approach, in which the fairness investigation is embedded within the six-claim validity argument framework, this approach is unique in assigning a prominent and central status to fairness and justice, with each claim tailor-made for them, and thus in conducting fairness investigations in an all-round manner. Another merit of the book lies in its abundance of illustrative examples from various contexts, which not only drive home the points being made but also facilitate their application to practice. In this sense, these examples can well serve as materials for the ethical-critique curriculum discussed in Chapter 9 to advance the principles of fairness and justice.

Meanwhile, one deficiency of the book concerns its coverage: the first three claims under the principle of fairness are examined, yet the last claim is left untouched. In addition, the examples provided in Chapters 5 to 7 are limited in coverage, and it would be preferable to also exemplify some of the remaining sub-claims: adequate practice with new technology and relevant social practices and embodied experiences in the claim for OTL in Chapter 5; cognitive processes of test takers in the claim for meaningfulness in Chapter 6; and appropriate accommodations for test takers with disabilities, as well as cost, uniformity, and freedom from fraud, in the claim for absence of bias in Chapter 7.

All in all, crystallizing Dr. Kunnan’s years of scholarly exploration of fairness in language assessment, this book will serve as a crucial reference for the validation of language assessments from the perspective of fairness and justice. As such, it is a must-read not only for researchers and practitioners in the field of language testing but also for language policy makers, institutional administrators, and other stakeholders.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Shi Yali

References

Bachman, L. F., & Palmer, A. S. (2010). Language assessment in practice. Oxford University Press.

Kane, M. T. (1992). An argument-based approach to validity. Psychological Bulletin, 112, 527–535.

Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). Macmillan.

Shohamy, E. (2001). The power of tests: A critical perspective on the uses of language tests. Longman.

Xi, X. (2010). How do we go about investigating test fairness? Language Testing, 27(2), 147–170.