Abstract

There have been several single-volume surveys of the field of language testing and assessment published in recent years (Coombe, Davidson, O’Sullivan, & Stoynoff, 2012; Fulcher & Davidson, 2012), including one that forms part of a larger encyclopedia (Shohamy & Hornberger, 2008), but now here we have a four-volume reference set to top them all. Antony Kunnan has undertaken the massive task of eliciting and compiling 140 articles by 185 authors into an all-inclusive and authoritative account of the field. The scope of the work has allowed the editor not only to cover very systematically all of the conventional topics that the other books have dealt with but also to move into whole new areas which have received little attention in the reference literature to date.
The Companion is published both in print (hardback) and in electronic form through the Wiley Online Library. The print version is a handsome set of books with all of the advantages that reading a hard copy brings, especially for those of us who have known no other way of accessing reference works for most of our lives. The hardback edition is an expensive option at the time of writing, especially since the books can be bought only as a whole set, but the publisher plans to produce a paperback edition by the end of 2015, with the four volumes available for separate purchase. Realistically, though, most academics and graduate students are likely to work with the electronic version through their university library, once they have persuaded the accessions department to obtain it.
I found the e-book easier to navigate than other online encyclopedias I have accessed. For one thing, the full table of contents is provided, in the form of a sequence of drop-down menus, going from the four volumes to the part divisions in each volume and then to the individual chapters. The chapters are presented in both HTML and PDF formats, in the latter case with all the standard features and the same attractive look as the hard copy version.
Apart from your conscientious reviewer and other particularly dedicated readers, most users are not likely to read the volumes systematically from cover to cover, or file to file. This makes it important to be able to locate easily the topics or terms one is interested in. Obviously, the table of contents is the first port of call and, at the end of each chapter, there are cross-references to other related chapters. The hard copy version has a comprehensive author and subject index, whereas the online equivalent is the search window, which is not quite so convenient to use. First, one has to remember to click on “search in this book” at the bottom of the window; otherwise, the search results will come from the whole Wiley Online Library. The initial query produces just a list of chapters in which the search term occurs, and so a second search within each PDF document may be necessary to locate the pages on which the term is found.
The overall structure of the Companion represents a coherent way of organizing the major areas in the field, with mostly logical clusters of chapters within each section. There are a few chapters that do not quite seem to fit where they have been placed, but that is inevitable in a complex and multifaceted work like this, which has striven for maximum coverage of the subject matter. Similarly, although numerous chapters overlap in content, there is not a strong sense of unnecessary duplication. Both of these concerns are less of an issue in a reference work, which readers will access selectively, rather than reading from chapter to chapter.
Volume I opens with a historical survey of the field – at least over the last 50 years that it has developed into a formal academic discipline – by Alan Davies. He uses three earlier survey articles published in Language Teaching as a framework and adds a fresh account of work over the last decade. This leads into Part 2, which addresses the abilities we set out to test, covering the four macro-skills, the three components of language knowledge (pronunciation, grammar, and vocabulary), and more besides. Thus, in the area of written production, we find not only a chapter on assessing writing, but others on assessing literacy, assessing integrated skills, assessing responses to literature, and assessing translation. The latter two topics do not normally figure in the language testing literature. I particularly liked Catherine Doughty’s chapter on assessing aptitude, with its thorough treatment of the original work by pioneers such as Carroll, Sapon, and Pimsleur in the 1950s and 1960s, as well as recent developments, notably the Hi-LAB instrument that Doughty herself has worked on at Maryland. Incidentally, this chapter seems to contradict Davies’s statement in Chapter 1 that language aptitude testing has been “little researched since the 1960s” (p. 13).
Parts 3 and 4 of Volume I are titled respectively “Assessment Contexts” and “Assessing Learners”. The editor appears to have made some arbitrary decisions in assigning chapters to one part or the other. For instance, the chapter on assessing whether English learners in US schools are ready to exit from ESL programmes is assigned to Part 3, whereas the ones on young language learners and heritage language learners are found in Part 4. Similarly, in the area of language for specific purposes (LSP), the assessment of government and military personnel, court interpreters and translators, immigrants, and asylum seekers is dealt with in Part 3, while tests for language teachers, international aviation personnel and health professionals are discussed in Part 4. We have become much more aware in recent years of the use of language assessments outside of educational contexts and the ethical issues that arise when political and bureaucratic considerations dictate both policy and practice in these high-stakes situations. This means it was important for these LSP contexts to be well represented in the Companion. Perhaps educational versus non-educational contexts would have been a better basis for dividing the chapters in these two parts.
In Volume II the first part (Part 5) is concerned with broad approaches to assessment: norm-referenced, criterion-referenced, task-based, and computer-assisted. The chapter on performance assessment could also be seen as belonging here, rather than in the following Part 6, which has the title “Assessment and Learning”. In Part 6 we find chapters on the so-called alternative forms of assessment that figure prominently in the work of classroom teachers: portfolios, dynamic assessment, self-assessment, peer assessment, and diagnostic feedback. I was pleased to see the two chapters on monitoring learner progress and achievement, particularly the one by Michael Kieffer (whose academic background is in education) on measuring achievement and growth in the classroom, because I think a lot of language educators are not aware of how this should be done on a properly systematic basis.
This brings us to Part 7 of Volume II, which covers the whole process of assessment development, from defining constructs and writing test specifications all the way through to reporting scores and standard setting. One chapter that impressed me was the one by Anthony Green on adapting or developing source material for listening and reading tests, which combines a historical perspective with a stimulating discussion of current concerns. In addition, there are chapters on testwiseness strategy research in task development and on detecting plagiarism and cheating – both matters of great concern to publishers of large-scale, high-stakes tests. The other section in Volume II, “Technology and Assessment”, deals with the role of new media and computer-automated scoring of constructed responses, as well as three innovative technologies in language testing research: corpus analysis, eye-tracking, and acoustic and temporal analysis of speech.
Research in the field also figures prominently in Volume III, where the two central sections are devoted to quantitative analysis (Part 10) and qualitative and mixed method analysis (Part 11). The former section covers all of the mainstream statistical procedures from the classical theory to multifaceted Rasch (or is it structural equation modelling that is at the far end of the sophistication scale?). Part 11 includes content analysis, introspective methods, and spoken and written discourse analysis, and the authors in this section have illustrated their respective methodological approaches effectively with reference to specific research studies in language assessment. A couple of other chapters, on questionnaires (Part 10) and writing research reports (Part 11), are more generic in content and offer advice that would be of value to anyone in applied linguistics.
Volume III actually opens with Part 9, which has the bland title “Designing Evaluations”, but it contains four essential chapters on validation theory, fairness and justice, test accommodations, and consequences/impact/washback. At the other end of the volume is Part 12, dealing with interdisciplinary themes. Like other sections, it is a somewhat mixed bag in terms of coherence. The chapters on philosophy, cognitive theory, second language acquisition, and the law certainly involve interfaces with other disciplines, whereas this does not seem to apply to the chapters on classroom-based assessment and the ethics of “industrial” language testing. Although the chapter on bilingual assessment is reasonable in scope, I see this topic as important enough to have warranted another chapter, which could have covered translanguaging and cross-language mediation, for example. Perhaps these phenomena were still below the radar when the Companion was being planned.
The final chapter in Volume III, Lyle Bachman’s “Ongoing Challenges in Language Assessment”, reads like a closing piece to the whole work. Apparently, it was originally intended as such – but wait, there’s more to come! Volume IV arguably offers the most original contribution to the literature by taking an areal approach. Although there have been a few special issues of the journals that have presented work from particular regions in the world, such as Asia (Ross, 2008) and Australia and New Zealand (Phakiti & Roever, 2011), nothing on this scale has been done before. The amount of networking and follow-up of leads needed to locate the diverse group of authors for the chapters in this volume must have been formidable. The opening section (Part 13) surveys English language assessment in eight regions of the world, together with a chapter on assessing English as a lingua franca (a much-heralded topic whose time, it appears, is yet to come). As is to be expected, the major international proficiency tests loom large in these accounts, alongside more traditional domestic examining practices.
However, the bulk of Volume IV is devoted to 36 chapters, organized into six regional groupings on pragmatic grounds, discussing the assessment of languages other than English. This antidote to the dominance of English in the language assessment – and indeed the applied linguistic – literature is most welcome. The range includes the major world players (Chinese, Spanish, Arabic, etc.); a host of “less commonly taught” languages (as they are known in the United States); indigenous languages like Maori, Hawaiian, and 14 tribal languages in Taiwan; and American Sign Language. The authors were given a template to organize their chapter (Introduction, Description of the Language(s), Teaching–Learning Contexts, Assessment Practices, Challenges/Future Directions), which they could modify as they saw fit. A few chapters, notably the one on Nepali, devote too much space to the linguistic description, but generally speaking there is interesting and worthwhile discussion of the assessment issues, both in the home country and (where applicable) in the teaching of the languages in the United States and elsewhere. The chapters on Native American languages and Maori raise important questions about the cultural appropriateness of assessment practices. Another kind of challenge faced by those assessing languages like Arabic, Sinhala, and Tamil is the diglossic nature of the language, where the “high” variety taught in school is very different from the “low” variety acquired in the home and used in everyday life. Thus, there is much to ponder in these chapters, which take us well beyond the notion that the assessment of languages other than English is simply characterized by traditional, pre-scientific practices – widespread though they may still be.
All in all, this is a landmark publication in our field and a major achievement for the editor. I am so pleased to have had the opportunity as a reviewer to obtain this set of books to add to my bookshelf – when they are not open on my desk being referred to for a dependable, state-of-the-art account of work on this topic or that. It is also wonderful to know that it is all available online for ready access elsewhere. The challenge will be to maintain the value of the work through updates or periodic new editions as time goes by.
