Whither Replication?

Abstract

In this issue of Human Factors, we are proud to present the first of what we anticipate will be many articles arising from the journal’s replication initiative. The article “Validation of a Driving Simulator Study on Driver Behavior at Passive Rail Level Crossings” by Larue, Wullems, Sheldrake, and Rakotonirainy addresses replicability by carrying out two studies on driver behavior at level crossings, one using a driving simulator and one using a real level crossing. The authors note that although studies on driver behavior at level crossings are typically carried out using simulators (for obvious reasons), no work has thus far been conducted on whether the use of driving simulators is appropriate for this significant real-world problem.

Clearly this question should be addressed in the research literature, and addressing it raises the issue of replicability. Fortunately, the authors conclude that simulators (or at least the simulator used in this study) are indeed suitable for use, as their data from a real level crossing and a simulated level crossing are appropriately similar. Such convergence of results from simulators and real-world environments has also been established in the driving domain (Bédard, Parkkari, Weaver, Riendeau, & Dahlquist, 2010; Lee, Cameron, & Lee, 2003; Shechtman, 2010).

The first published replication study under our new initiative is an indirect replication: the authors attempt to replicate a study in two different environments, with setups as similar as can be achieved. Simulators need to demonstrate both physical and psychological validity (Kantowitz, 1992), and to show absolute validity, relative validity, or both. For simulator research more than some other modes of research, the issues of validity, generalizability, and replicability are bound up with one another in a way that is important not only scientifically but also pragmatically, because the typical aim of research conducted with simulators is to use the results to derive recommendations for real environments.

Replication Is All Around Us

Replicability is, of course, the cornerstone of good science. If a finding can be replicated, this improves its generalizability and its validity. The replicability debate has been rumbling on for some time, one of the highlights being the Open Science Collaboration project (2015), which attempted to directly replicate 100 studies published in top psychology journals. Notoriously, most of the studies could not be replicated: Only 36% of the replication studies found statistically significant effects, whereas 97% of the original studies did so.

What this might mean is open to many interpretations, from statistical arguments, through methodological arguments, to the possibility that human behavior may have changed in some way since the study was originally carried out. The last of these is a particular problem for human factors research, because the field is intimately tied to people’s relationship with technology. If a 30-year-old study fails to replicate, even though we have followed exactly the same procedure, what would this mean? And could we even replicate the exact procedure of a study done years ago, when technology has advanced since that time?

Direct replication requires a rigorous set of conditions, such as prepublication (i.e., registration) of the study, appropriate numbers of participants, and adequate effect size and significance predictions. Aside from direct replication, researchers engage in partial and conceptual replication work (sometimes without consciously acknowledging this fact; Jones, Derby, & Schmidlin, 2010) in order to strengthen the generalizability and validity of their findings, which in turn moves the discipline forward. As Munafò and Davey Smith (2018) point out, triangulation of more than one line of evidence is important in research, and replicability is one part of the richness of our evidence. Human Factors aims to contribute to this important type of evidence and encourages authors to engage in our replication initiative.

Judy Reed Edworthy, Replications Editor

Patricia R. DeLucia, Editor in Chief

References

Bédard, Parkkari, Weaver, Riendeau, & Dahlquist (2010). Assessment of driving performance using a simulator protocol: Validity and reproducibility. American Journal of Occupational Therapy, 64(2), 336–340.

Jones, K. S., Derby, P. L., & Schmidlin, E. A. (2010). An investigation of the prevalence of replication research in human factors. Human Factors, 52, 586–595.

Kantowitz, B. H. (1992). Selecting measures for human factors research. Human Factors, 34, 387–398.

Lee, H. C., Cameron, & Lee, A. H. (2003). Assessing the driving performance of older adult drivers: On-road versus simulated driving. Accident Analysis and Prevention, 35, 797–803.

Munafò, M. R., & Davey Smith (2018). Robust research needs many lines of evidence. Nature, 553, 399–401.

Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.

Shechtman (2010). Validation of driving simulators. Advances in Transportation Studies, 53–62 (special issue).