Abstract

This book offers a lucid introduction to the principles of Rasch analysis. Through detailed examples and hands-on practice, the authors guide the reader through the whole process, from data coding and data entry to preliminary analyses, Rasch modeling, quality checks, and final reporting of the results. Clear instructions are also provided so that readers can run the analyses in Winsteps software (Linacre, 2012a).
The first chapter provides an overview of the objectives of the book and its structure. In addition, a number of common problems that can be easily handled in Rasch analysis, such as missing data and equating, are reviewed. In Chapter 2, the authors provide a detailed and clear discussion of what rating scales are and how data obtained from these instruments can be analyzed. Instructions are given as to how the data must be entered in a spreadsheet and how negatively worded items must be recoded. Chapter 3 describes clearly how to construct a control file and do a Rasch analysis in Winsteps software (Linacre, 2012a). It assumes no knowledge of Winsteps on the reader’s part.
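As a brief illustration of the recoding step mentioned above, the following Python sketch reverses the scoring of a negatively worded item. It is not taken from the book: the function name `reverse_code`, the 1 to 4 category range, and the sample responses are assumptions made purely for illustration.

```python
def reverse_code(responses, low=1, high=4):
    """Reverse-score ratings so that a high value always means more of the trait.

    Missing responses (None) are left untouched.
    """
    return [None if r is None else low + high - r for r in responses]


# Hypothetical responses to one negatively worded item on a 1-4 scale.
raw = [1, 4, 2, None, 3]
print(reverse_code(raw))  # [4, 1, 3, None, 2]
```

Leaving missing responses as they are, rather than assigning them a category, keeps the recoding step separate from any decision about how missing data should be treated.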
A unique feature of the Rasch model is that items and persons are both measured on the same scale. Hence, the output of a Rasch analysis offers a similar set of statistics for both items and persons. Chapters 4 and 5 discuss person and item measures, respectively. The relevant Winsteps tables are introduced, and their interpretation is explained in sufficient detail. A consequence of measuring items and persons on the same scale is that they can be presented on the same graph, known as the Wright Map or the item–person map. The authors draw on this feature of the Rasch model in Chapters 6 and 7 and delineate in detail the advantages that the map can offer in scale development and analysis.
The authors go on to explain fit analysis in Chapter 8. The chapter starts with a definition of fit analysis for both persons and items. The main fit indices are then introduced and, as in other chapters, readers are walked through the analyses in Winsteps. Chapter 9 is devoted to the analysis of rating scales. The discussion focuses primarily on checking whether the rating scale categories are functioning as expected (e.g., whether the average ability estimates for the categories increase in the expected direction, whether disordering occurs, etc.). Chapter 10 goes on to discuss various indices for examining test reliability, such as person and item reliabilities and separation indices. While the term “strata” appears in the title of one section, the concept is not explained in the text.
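For readers unfamiliar with these indices, the sketch below shows the conventional relations among reliability, separation, and strata in the Rasch literature: separation G = sqrt(R / (1 - R)) and strata H = (4G + 1) / 3. It is a minimal illustration and is not drawn from the book; the function names and the reliability value of 0.80 are hypothetical.

```python
import math


def separation_from_reliability(reliability):
    """Separation index G = sqrt(R / (1 - R)) for a reliability 0 <= R < 1."""
    if not 0 <= reliability < 1:
        raise ValueError("reliability must be in [0, 1)")
    return math.sqrt(reliability / (1 - reliability))


def strata(separation):
    """Number of statistically distinct levels, H = (4G + 1) / 3."""
    return (4 * separation + 1) / 3


g = separation_from_reliability(0.80)  # hypothetical person reliability
print(round(g, 2), round(strata(g), 2))  # 2.0 3.0
```

Under these formulas, a person reliability of 0.80 corresponds to a separation of 2 and to about three statistically distinguishable levels of performance in the sample.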
Chapter 11 provides an interesting discussion of the logistic ogive function and its implications for interpreting raw scores. Through detailed examples, the authors show how the nonlinearity of raw scores may mask real differences among individuals and how gain scores may be misinterpreted when they are based on raw scores. Chapter 12 explains how the probability of success may be changed from the default 0.5 to other values, such as the 0.62 value used by PISA, and discusses reasons for such a practice. Chapter 13 deals with differential item functioning (DIF) and what it means in the context of Rasch modeling (i.e., DIF is defined as the failure of invariant measurement). Issues such as the anchoring of parameters are also briefly discussed in this chapter. Chapter 14 discusses the logic and process of linking. Although the chapter is titled “Linking Surveys and Tests,” the discussion is limited to the more difficult task of linking rating scales with polytomous items.
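The two ideas in Chapters 11 and 12 can be made concrete with a short sketch based on the dichotomous Rasch model, in which the probability of success depends only on the difference between person ability and item difficulty in logits. The code is illustrative only and does not come from the book; the five item difficulties and all function names are assumptions.

```python
import math


def p_success(ability, difficulty):
    """Dichotomous Rasch model: probability of success, measures in logits."""
    return 1 / (1 + math.exp(-(ability - difficulty)))


def expected_raw_score(ability, difficulties):
    """Expected raw score on a test: the sum of item success probabilities."""
    return sum(p_success(ability, d) for d in difficulties)


def logit_offset(p):
    """Logits above an item's difficulty needed to succeed with probability p."""
    return math.log(p / (1 - p))


# Hypothetical 5-item test with difficulties in logits.
items = [-2.0, -1.0, 0.0, 1.0, 2.0]

# The same 1-logit gain yields different raw-score gains along the scale.
mid_gain = expected_raw_score(0.5, items) - expected_raw_score(-0.5, items)
high_gain = expected_raw_score(3.5, items) - expected_raw_score(2.5, items)
print(round(mid_gain, 2), round(high_gain, 2))

# Offset implied by a 0.5 versus a 0.62 success criterion.
print(round(logit_offset(0.5), 2), round(logit_offset(0.62), 2))  # 0.0 0.49
```

In this hypothetical test, an identical one-logit improvement produces a smaller raw-score gain near the top of the scale than in the middle, which is the kind of distortion the chapter warns about. Moving the success criterion from 0.5 to 0.62 corresponds to a shift of roughly half a logit.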
The next two chapters are devoted to standard setting. Chapter 15 presents a succinct overview of the use of Wright Maps to set cut scores. Chapter 16, on the other hand, presents a more extended discussion of how to set multiple cut points, as is typically done when defining band scores. Again, the Wright Map is instrumental, and the procedure for doing the analysis in Winsteps is described. The authors go on to discuss the sample size requirements for a Rasch analysis in Chapter 17. The chapter is roughly divided into two sections: the issues relevant to sample size and how it affects measurement, and the recommendations or “rules of thumb” in opting for a sample size.
Chapter 18 addresses the problem of missing data and ways of handling it. The chapter focuses for the most part on how different ways of coding missing responses affect the estimated parameters, rather than presenting a more rigorous argument for the advantages of Rasch analysis in dealing with missing data. Chapter 19 discusses the partial credit model. Three examples show contexts where a partial credit model may be more plausible than a rating scale model. The examples vividly demonstrate the justifications for, and the additional benefits of, partial credit modeling. The authors go on to explicate many-facet Rasch measurement (MFRM) in Chapter 20. Example outputs from the Facets software (Linacre, 2012b) are provided to make the discussion more concrete.
The remaining four chapters concisely go over some other issues in Rasch analysis. The title of Chapter 21 (“The Rasch Model and Item Response Theory Models: Identical, Similar, or Unique?”) induces the reader to think that there will be a discussion of the differences between the Rasch model and other IRT models, but the entire comparison falls within two paragraphs. The rest of the chapter is devoted to other topics, such as the definition of measurement and the links to K-12 science teacher education. The next chapter tells the reader which tables in Winsteps will give the relevant information as to whether the analysis has been properly done. The penultimate chapter provides a list of further resources for learning about Rasch modeling, and the final chapter reiterates that Rasch measurement is “high-quality measurement” and presents some quotations from the authors’ students that reflect their views on the Rasch model.
The book has a number of unique features. The explanation of the Rasch model is mostly clear, especially in the early chapters, and assumes little knowledge of either psychometrics or Rasch modeling on the reader’s part. In each chapter, after a review of the relevant theory, detailed instructions are provided for running the analyses in Winsteps. I know of no other book that offers the same blend of Rasch theory and detailed software instructions.
A useful feature of the book is the extensive set of exercises provided both within the text and at the end of each chapter. The authors have provided detailed answers for the majority of exercises. Interspersed within the text are interesting discussions by two fictional characters that provide still another opportunity for the readers to grasp the ideas. Another strong point of the book is the detailed captions provided for each table/figure, which help the readers to interpret the information they present.
Each chapter ends with a suggested list of references that could be very helpful to readers, although the recommended readings for some chapters do not seem to be closely related to the main theme of the chapter. The suggested references for Chapter 11, which discusses the logistic ogive function, are an example. The authors’ description of the two suggested papers there reads: “A very good comparison of the Rasch model and the 2P and 3P IRT models,” while the comparison of the models is actually presented in Chapter 21. In other chapters, only one or two references are provided. For instance, in Chapter 20 (“Multifaceted Rasch Measurement”), I expected to see such references as McNamara (1996), Eckes (2011), and Engelhard (2013), yet only one paper is suggested.
There are some important omissions in the book. The Rasch model rests on strong theoretical assumptions. Although an explication of all the relevant issues might be considered to be beyond the scope of the book, some fundamental concepts must be explained. Two such concepts are dimensionality and local independence. Although the authors repeatedly refer to the importance of measuring a single “variable” throughout the text, it is not explained how readers may check to see whether unidimensionality holds for a given data set. Local independence receives no treatment whatsoever. Although the lack of discussion of these assumptions may be justified on the grounds that they are too technical for the intended audience of the book, these concepts are certainly not more difficult than DIF analysis, to which an entire chapter is devoted. Moreover, dimensionality and local independence can easily be checked in Winsteps, which should make the explication of the analyses relatively easy. It is, in fact, acknowledged in Chapter 10 (p. 218) that an examination of test reliability will necessarily include an evaluation of test dimensionality and that this can be easily done in Winsteps.
Another omission pertains to the limited set of references provided in Chapter 23 (“Key Resources for Continued Expansion of Your Understanding of Rasch Measurement”). The chapter is intended to give readers an idea of how they can further their knowledge of Rasch analysis. Only the following three books are referred to: Bond & Fox, 2007; Smith & Smith, 2004; Wright & Stone, 1979. Other pertinent introductory and more advanced books could easily have been included (e.g., Eckes, 2011; Engelhard, 2013; Wilson, 2005). In addition, under the Software Manuals section, only the Winsteps manual is referred to; no reference is made to other Rasch analysis software. Even the Facets software, which was referred to in Chapter 20, is not included here. Hence, future revisions of the book could benefit from broader coverage of the sources on Rasch analysis. Bond and Fox (2007) provide a useful section of this kind at the end of their book.
All in all, I agree with Boone, Staver, and Yale’s advice to the readers: “we suggest readers digest our book, practice with our data sets, and then move onto other books” (p. 5). The problems noted here notwithstanding, the book offers beginners an invaluable opportunity to get an idea of the merits of a Rasch modeling approach to data analysis. Even absolute beginners in Rasch analysis will find the book informative. It is a welcome addition to the literature on psychometrics in general and Rasch modeling in particular.
