Abstract
The author of this commentary argues that physical scientists are attempting to advance knowledge in the so-called hard sciences, whereas education researchers are laboring to increase knowledge and understanding in an “extremely hard” but softer domain. Drawing on the work of Popper and Dewey, this commentary highlights the relative similarities between hard sciences and education research in their rhetorical nature, while acknowledging the divergent paths of these two fields of inquiry with regard to prediction and generalizability. The author suggests that given the highly contextualized nature of educational processes, embedded in shifting complex social settings, and the relevance of all variables, very little education research is able to pursue predictive power.
I have been invited to reply to the prompt crafted by the editors of Educational Researcher—a prompt that, hopefully mischievously, embodies the sickness known as “physics envy.” To my mind, it raises the question of why education researchers should envy physical scientists and would want to be like them. Apart from higher status, higher salaries, fancier-looking lab equipment, and sparkling white lab coats, they are just like us! But yes, of course there is a pertinent difference: Physical scientists are attempting (with notable success) to advance knowledge in the so-called hard sciences, whereas we in education research are laboring (with possibly indifferent success) to increase knowledge and understanding in an “extremely hard” but softer domain. And in the course of doing this work, we face great difficulties—epistemological, methodological, and practical, liberally seasoned from time to time by the economic hardship of underfunding and by uninformed interference by governmental agencies issuing ideologically based methodological strictures. Given these travails, it makes perfect sense that we would never ask a group of physicists to respond to a prompt that suggests their field should be more like education research!
Personally, I regard this general issue as moot—the issue of whose field should be taken as the benchmark and whose should be a copy—and it is moot because (with an important proviso) all competent inquiries (to use John Dewey’s [1938] felicitous expression from Logic: The Theory of Inquiry) have the same features. Leo Tolstoy famously remarked, in the opening sentence of Anna Karenina, that all happy families are alike—and they probably are, providing that one looks at them from afar, from a sufficient level of abstraction where individualizing details have become indiscernible. The same is true of the many families of happy (competently pursued) research or inquiry, spanning the physical, biological, and social sciences; education research; and possibly the humanities as well.¹ The philosopher Dagfinn Follesdal (1979) showed that the hypothetico-deductive method (often argued to be the logical pattern underlying the “scientific method”) is used in the far-off field of hermeneutical or interpretive literary inquiry (the examples he analyzed in depth were commentaries on Peer Gynt). John Dewey (1916) and Karl Popper (1972) argued (in surprisingly parallel analyses) that productive inquiry always followed the same general logical pattern, one that started with engagement with a problem and eventually resulted in the trying out, the testing, of a hypothesis about the likely solution to the initial perplexity (which might end the inquiry or result in a further bout of activity; see also Dewey, 1933). Popper was specifically discussing research across the sciences, whereas Dewey had his eyes on all problem-solving inquiry (his examples included homespun scientific inquiries, such as a child investigating the production of soap bubbles, and everyday problem solving, such as deciding on the mode of transport to use in order to be on time for a cross-town meeting).
Dewey even made the point that Popper became famous for canvassing (although Popper probably does deserve the credit as he made it in an extremely powerful way)—namely, researchers must ensure that they do not focus their efforts on proving that they are right; they must not, in the terminology sometimes used in the educational methodology literature, adopt a “confirmatory orientation.” Popper argued that it always is possible to find some evidence that one’s favored hypothesis is right, but this counts for little—what is crucial is that one attempts to find evidence that it is wrong.
The features discussed so far barely make inroads into the long list of similarities that philosophers, and empirical researchers themselves, have seen across the vast domain of competently pursued human inquiry (I provide a slightly less “hit-and-run” discussion in chapter 5 of my The Expanded Social Scientist’s Bestiary; Phillips, 2000). I will mention only one other—one that I have been particularly taken with since I pointed to it in various papers about a decade ago. I gave it the label “the platinum standard” to mark its superiority over the “gold standard” (the use of randomized controlled experiments), which was supposed to be the distinguishing feature possessed by all rigorous, scientifically oriented research. In brief, I have argued that research across many if not all fields can be thought of as attempting to make a compelling case for a hypothesis, by marshaling evidence of various types and crafting arguments, which taken as a whole warrant a conclusion about this hypothesis; the case, of course, has to be able to withstand critical scrutiny. In short, research is an exercise in argumentation, or in rhetoric (in the traditional and not the modern sense of the term, in which it implies nefariousness). Charles Darwin (1859) put it well when in the concluding chapter of Origin of Species—almost certainly the most important book ever written in the field of biology—he pointed out that the preceding chapters constituted a complex argument, and in this concluding chapter, he was setting about recapitulating the case that he had made. In similar vein, I argue that it is fruitful to regard inquiries into gender differences in moral development, the authorship of Hamlet, the relationship between time on task and learning, whether or not Richard III was a monster who murdered his own nephews in order to safeguard his grasp on the English throne, and so on as all being attempts to make a compelling case one way or the other.
So much for similarities between the hard and the very hard fields of inquiry. A brief comment needs to be made about differences between the fields, ones that refuse to vanish even when we look at the domains from afar. Actually, I will mention only one of these that I regard as particularly significant and especially challenging because—like Baroness Orczy’s Scarlet Pimpernel—it has the capacity to crop up in various disguises (prediction, generalizability, and contextualization). To start with prediction.
Perhaps the feature that has been most responsible for the astounding progress of the physical sciences over the past few centuries is their ability to put their hypotheses (and the cases that were made to support them) to the test, by making precise predictions that can then be subjected to empirical verification or refutation. In this way, the justificatory story goes, the truth is preserved and error is eliminated (faulty cases are discarded, or at least recalled for major revision). The soft field of education research is held by many to fare poorly here, for the making of precise predictions is vanishingly rare, and as a result, progress and error elimination are occurring extremely slowly (if at all).
The issue here, of course, concerns predictive power, or—what is a closely related, if not exactly the same, thing—the issue of generalizability of the findings of a research study. This issue is of enormous complexity and obviously cannot be pursued satisfactorily in the current venue. But it is undoubtedly the case that very little research in education can be regarded as being of high quality if the making of precise predictions (comparable to those made in the lab of a Nobel laureate in physics) is a key criterion. Some very fine pieces of education research aim to achieve in-depth understanding of a single, specific context of educational significance, with all of its relevant particular, individualizing features taken into account. (Anthropological or ethnographic research comes to mind here, as do mixed-method in-depth classroom studies and the like.) Such work can be rigorous and not only can shed light on the situation that is the focus but also can illuminate other phenomena in the same social/cultural setting and also situations in other more remote contexts. However, rarely if ever do researchers operating in this mode aspire to make predictions.
In other forms of research, large data sets are collected from large samples of subjects (who often have been randomly assigned to treatments), and statistical analysis is applied to produce findings that are probabilistic in nature and that apply to the group or population but not to individuals (findings such as “the average gain of the treatment/experimental group, compared with the average score of the control group, is statistically significant at the 5% level”). This type of finding provides no valid basis for predicting what the treatment would achieve with a single individual, nor indeed is it straightforward to generalize to a different population—one, for example, that differs with respect to gender and ethnic distribution, socioeconomic profile, quality of educational facilities that have been available, number of incidents involving firearms, and so on. In the hard physical sciences, confounding variables can eventually be controlled, but in research in educational settings, these factors are not nuisances but are of great human and educational significance—control here removes all semblance of ecological validity.
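The gap between a group-level finding and an individual prediction can be made concrete with a small simulation. What follows is only a minimal sketch under stated assumptions: the scores are hypothetical draws from normal distributions, the treatment effect of 0.2 standard deviations is invented for illustration, and the function name is not taken from any study or library.

```python
import random
import statistics


def simulate_group_vs_individual(n=2000, effect=0.2, seed=0):
    """Compare a group-level result with individual outcomes.

    Every number here is hypothetical: scores come from normal
    distributions with standard deviation 1, and the treatment is
    assumed to shift the mean by `effect`.
    """
    rng = random.Random(seed)
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treated = [rng.gauss(effect, 1.0) for _ in range(n)]

    mean_control = statistics.fmean(control)
    mean_gain = statistics.fmean(treated) - mean_control

    # Standard error of the difference between two independent means.
    se = (statistics.variance(control) / n
          + statistics.variance(treated) / n) ** 0.5
    z = mean_gain / se  # |z| > 1.96 corresponds to the 5% level (two-sided)

    # Share of treated individuals who still score below the control mean.
    share_below = sum(score < mean_control for score in treated) / n

    return {
        "mean_gain": mean_gain,
        "z": z,
        "share_below_control_mean": share_below,
    }


if __name__ == "__main__":
    for name, value in simulate_group_vs_individual().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the average gain is comfortably significant at the 5% level, yet a large minority of treated individuals still score below the control-group average. The group-level finding is sound, but it says little about what the treatment will do for any one learner.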
Learning is a phenomenon that involves real people who live in real, complex social contexts from which they cannot be abstracted in any meaningful way. Difficult as it is for researchers to deal with (especially if they are suffering from physics envy), learners are contextualized. They do have a gender, a sexual orientation, a socioeconomic status, an ethnicity, a home culture; they have interests—and things that bore them; they have or have not consumed breakfast; they live in neighborhoods with or without frequent gun violence or earthquakes; they are attracted by (or clash with) the personality of their teacher; and so on.² It probably is the case, as some physical scientists have noted, that in a mature field of research, like physics, “it is obvious which variables are important and how to control them”; the problem is that in education, just about all the variables are relevant, and controlling them (even if possible, let alone desirable) yields results that are difficult or impossible to generalize to the almost infinite number of other settings where these variables do, indeed, vary. I am not denying, of course, that physical scientists have to struggle to determine which variables to control, and to find out how actually to control them. But dealing with temperature, pressure, magnetic fields, and the like is one thing; dealing with culture, gender, socioeconomic status, human interests, and the like is quite another! This is why, while physics is a “hard science,” education research is a very hard—an extremely hard—one.
