Abstract
Psychological science is increasingly diverse in the tools available for research and the questions it is able to ask. But this potential is seriously limited by a lack of diversity in study populations, in the situations and contexts explored, and in the researchers themselves. The current situation is problematic and difficult to change because of niche construction processes that favor the status quo. Systems-level changes are needed to support a healthy psychological science.
I have always loved data, even as a freshman at Moorhead State College in Moorhead, Minnesota. Thanks to the then Psychology Department Chair, Robert Solso, a friend (who will go nameless since he flunked out of school) and I were able to get access to the laboratory late at night to run a study on approach–avoidance conflict learning in rats based on our (the two of us, not Solso) very naive reading of O. Hobart Mowrer’s (1960) Learning Theory and Behavior. Generally speaking, Solso let us do whatever we wished, which led to bad experiments but good learning opportunities.
The idea of being able to do something no one else had ever done was also of great appeal. My high school science teacher told me that I had a “gift for chemistry,” but I remember asking myself just how long it would take to get to the frontiers of chemistry research as I performed a lab “experiment” that hundreds of thousands of other students had probably done. Contrast that with psychology, where the first study I designed in my experimental psychology lab course was mediocre but novel.
The lure of the frontier and data never wore off, through 3 years of graduate school at the University of South Dakota where I worked in Roger Davis’s primate lab, 10 years at Rockefeller University as part of the W. K. Estes mathematical psychology lab (also offering the opportunity to mingle with stellar scholars like George Miller, Neal Miller, and Michael Cole), 11 years on the faculty at the University of Illinois (where other faculty were a wonderful, inexhaustible resource), a brief 3-year stint at the University of Michigan (where the highlight was the joint lab meetings with Ed Smith’s group, AKA “the Ed and Doug show”), and finally multiple decades at Northwestern University (coinciding with a shift toward including field studies that took me out of the lab and toward cultural and developmental research). My Northwestern experience has taught me to value diversity as a critical source of novelty, data, ways of looking at the world, and ways of asking questions about it. This forms the basis of my comments on our field and its progress, which I will present in the form of a report card.
The question of whether psychology is headed in the right direction contains interesting presuppositions in its very formulation. Among them are the potentially contentious ideas that psychology is an identifiable entity, that it is capable of movement, and that the movement has directionality with respect to some reference point or goal. A second-order presupposition is that we can do something if psychology is not headed in the right direction. To see just how bold that supposition is, consider whether you think climate change is headed in the right direction (obviously not) and whether we can do something about it (not so obvious).
By now you’re probably wondering when I’m going to stop dancing around this question and write something constructive; but presuppositions count. First, I don’t think that psychology is an identifiable entity (but this may be a good thing). It seems to follow that it cannot move, rendering the questions of directionality and veridicality moot.
Imagine, as an alternative, that psychology is a complex system, interacting in various ways within and between other complex systems. For better or worse (better, I think), the research conducted under the label “psychological sciences” is diverse in method and content and in its interfacing with other disciplines such as neuroscience, economics, educational sciences, linguistics, and anthropology. We have a broad range of theories (from framework theories that are judged for their usefulness to fine-grained models that may be able to predict reaction time distributions), methods, and theories about those same methods. That’s the good news. The not-so-good news is that psychology is not very diverse in other critically important ways.
Complex system or not, questions about relevance, current practices, and progress cannot be dismissed on grounds that our field is too dynamic to be evaluated at any one point in time. Congress won’t buy this, NSF and NIH won’t buy this, and we shouldn’t either. Although psychology is not monolithic, it does have identifiable trends, and its various areas have sets of consensual practices. With respect to the trends and practices that currently are in place, I would give our field a mixed review based on undeniable progress paired with prominent points of consternation. Here are my grades in a few areas of particular consternation, along with some commentary.
1. Diversity of study populations (grade = D). I believe the goal of our field is to identify and understand the range of human potential in interactions with physical, biological, and social environments. The Henrich, Heine, and Norenzayan (2010) critique of our field’s focus on samples from WEIRD (Western, educated, industrialized, rich, and democratic) societies noted that WEIRD participants are particularly unrepresentative of the world at large. This paper is widely cited and its implications are acknowledged, but there is little if any evidence that psychological science has changed in response to this acknowledgment (not to mention earlier critiques that have been met with the same apparent indifference). One could take encouragement from the fact that Henrich et al. were able to find sufficient cross-cultural studies to make a strong argument, but I see those studies as exceptions that stand in contrast to strong default norms. Why has change been so elusive? One guess is that the forces responsible are the same ones (needing more and more publications for tenure, etc.) that have led to the replicability crisis.
And as we all know, WEIRD isn’t the half of it. If you’re an infant or young child and your parents are middle class and live within 10 km of a major research university, your chances of being recruited for a developmental study are far better than if either condition is not satisfied. If you are not a college student at a major research university, your chances of being recruited for a study are very substantially diminished. Community college students need not apply. Even if you are attending a major research university, your odds of being in a psychology experiment vary with your chosen field of study, with a psychology major a far better guarantee than an engineering or English major. On the rare occasions when fields of undergraduate study have been examined in psychological studies, the results suggest that majors matter (e.g., Frank, Gilovich, & Regan, 1993, 1996; Lehman & Nisbett, 1990; Wang, Malhotra, & Murnighan, 2011). Overall, our discipline often acts as if variability is something to be avoided rather than a raison d’être. It may not be our explicit goal to maximize nondiversity, but our cultural practices often have that consequence.
The increasing use of Internet studies (e.g., with Amazon’s Mechanical Turk) is a potential palliative (this led to my giving us a D rather than a D−), but they have been conducted primarily for convenience and only rarely as a tool to explore sample variation. Narrow sampling in medical research is transparently immoral and inexcusable—is psychological science any different?
2. Contextual diversity (grade = C−/D+). I use contextual diversity as a broad category of variation that includes research procedures, environments and situations, and the use of multiple converging measures. Although there have been periodic critiques of psychological science’s narrow choice of study populations, to my knowledge there have not been corresponding concerns expressed about diversity in procedures, measures, and settings (but see Ceci, Kahan, & Braman, 2010, for one example). These forms of variation are critical for establishing robustness of findings and for revealing patterned variation as a function of contexts and forms of assessment.
Multiple generations of social scientists have been brought up on Donald Campbell and Julian Stanley’s Experimental and Quasi-Experimental Designs for Research, first published in 1963 but still important enough to have a 2015 edition (see also Campbell & Fiske, 1959). Among its many nuggets of wisdom is the discussion of the importance of convergent (and sometimes divergent) measures. It is ironic that the current response to the so-called “replicability crisis” favors exact replications over the kind of robustness that can only be reinforced by converging measures.
Ideally, technology should facilitate our ability to instantiate methodological diversity. The advent of the personal computer dramatically increased the range of materials and stimulus presentation methods that could be employed and the ease with which this could be done, but it also led to greater methodological conformity, as the modal “social context” became a participant interacting on a computer in a cubicle. This is an extremely narrow sample of social contexts and situations (see Baumeister, Vohs, & Funder, 2007, for a critique of psychology as the science largely limited to finger movements and self-reports).
A positive example of employing technology to assess external validity is the Hofmann, Wisneski, Brandt, and Skitka (2014) study using smartphones to text participants randomly five times a day (between 9 a.m. and 9 p.m.) to probe whether they had witnessed, heard about, read about, or experienced a moral or immoral incident within the prior hour. Among their findings was evidence for moral self-licensing (engaging in a moral act made a subsequent immoral act more likely and a subsequent moral act less likely). This parallels lab findings and suggests that licensing effects have considerable generality. Hofmann et al. also found that being the target of a moral act made it more likely that the person would engage in a later moral act. Just a few years ago, this type of study would have been either technologically impossible or prohibitively expensive to conduct (see also Hofmann & Patel, 2015). We need more positive examples.
Few, if any, social scientists would doubt that behavior in a given context is partly a function of the actor and partly a function of the situation. One of the delights of being at the University of Illinois was lunches with many fellow faculty. I remember vividly one such lunch with Phil Teitelbaum, who was conducting ground-breaking research on motivational systems and the brain. Phil told me the story of how he had done studies on rats pressing bars for food in a Skinner box and almost had concluded that the brain area he was studying using brain stimulation was the site of food motivation. Almost, but not quite. I forget just why he did it, but he put nesting materials in the box and then found that stimulation to the same area triggered nest building (and introducing another rat elicited mating behaviors). The clear lesson was that it is important to vary context.
A constructive example of attention to context comes from research on Attention Deficit/Hyperactivity Disorder (ADHD), which could have been a field of research that focused exclusively on treating individuals (e.g., with drugs like Ritalin). But there is an accumulating body of work, often from school settings, looking for interactions with learning contexts. For example, there is evidence that children with ADHD spend more time on task with a small group arrangement than they do with either individual or larger group work (Hart, Massetti, Fabiano, Pariseau, & Pelham, 2010; Imeraj et al., 2013; see Harrison, Bunford, Evans, & Owens, 2013, and Van der Kolk et al., 2015, for reviews). Unfortunately, the ADHD example of attention to situations is the exception to the rule. We need more such exceptions.
3. Researcher diversity, niche construction, and biased defaults (grade = solid D). Given that psychological science is a complex system interacting with other complex systems, it almost surely has evolved in a way that is adaptive for its practitioners. When our field’s history includes an overrepresentation of (middle class) White males, these sorts of niche construction processes work to reinforce White male intuitions, White male values, and White male practices. (These are big boxes—in addition to social class, political diversity may be important; Duarte et al., 2015.) Each of these factors may be a source of discord for would-be researchers who do not fall into this privileged (including privileged by precedent) group. Privilege includes who gets to decide what research questions are important (early research on primate reproductive behaviors focused on males and saw female primates as bystanders; this imbalance was countered when women entered the field), as well as who gets to become a research participant. It may be that the quickest route to study population diversity is researcher diversity.
In my opinion, progress in our field is linked to researcher diversity, especially if diverse perspectives are encouraged and appreciated. In the absence of diversity, we may be hindered in realizing that default assumptions are often cultural assumptions associated with one particular culture. These default assumptions may lead to limited procedural exploration, a focus on WEIRD study populations, and deficit thinking on those occasions when non-WEIRD populations are studied. I’ll offer one example of the latter.
The “Word Gap”
There is a considerable and somewhat controversial literature on the relation between language input conditions and children’s vocabulary development. Much of this was stimulated and perhaps sensationalized in a paper by B. Hart and Risley (1995) estimating that the children of middle-class parents were exposed to 30 million more words than children of poor or working parents by the time they enter school (the so-called “30 million word gap”). These differences in exposure are correlated with socioeconomic status (SES) differences in children’s vocabulary. There is also evidence that “vocabulary interventions” affect young children’s vocabulary learning (Marulis & Neuman, 2013; Roberts & Kaiser, 2011; see also Goldin-Meadow et al., 2014), so the data linking language input and vocabulary size are not merely correlational.
What makes this literature controversial? If you’re interested in language learning you could hardly be faulted for studying variables related to language learning. The problems come when intervention effects become policy prescriptions. The word gap observations could be seen as suggesting that the reason that lower SES children are not as successful in school as their middle-class counterparts lies with their parents who are not providing a rich language environment. This is a strong conclusion given that the original Hart and Risley study found no correlation between rate of vocabulary growth and children’s third grade scores in reading, writing, spelling, or arithmetic (B. Hart & Risley, 1995, p. 161). But the issues are deeper than that.
A Cultural Perspective on the Word Gap
One facet of culture is shared norms, and it is easy to slip from norms into notions about what is normal or even “natural.” Middle-class parents in the United States talk to their infants from the moment they are born (sometimes even before birth). It doesn’t occur to them to not talk to their babies, even though babies don’t speak in response. In other cultures talking to infants is less frequent (Bornstein, Putnick, Cote, Haynes, & Suwalsky, 2015; Schieffelin & Ochs, 1986; Ward, 1971). From the perspective of some cultures, Americans talk incessantly and often needlessly, and in some of these cultural communities children “are to be seen and not heard.” Navajo adults, for example, may view a talkative child as self-centered and discourteous (Freedman & DeBoer, 1979).
It is important to recognize that a child-focused, steady stream of speech is just one of many approaches to child rearing. There may be many other values that are important in different cultural communities rather than a single “best way.” For example, an important concept not addressed by most word gap research is opportunity costs: the observation that any one choice leads to the potential loss associated with not exercising alternative choices. To illustrate, there is some evidence that adults showing preschoolers the function of a toy limits children’s exploratory play and thus limits the opportunity for discovery (Bonawitz et al., 2011; see also Gweon, Pelton, Konopka, & Schulz, 2014).
Applying the concept of opportunity cost to the word gap domain, note that children are not just sitting around passively waiting for adults to speak with them. They may be engaged in creative play on their own, interacting with peers, and acquiring cultural norms through observing and engaging in family and community activities (e.g., Lareau, 2011; Rogoff, 2003), each of which may lead to the development of other important skills. To see the narrowness of focusing on vocabulary even within the domain of language, ask a friend whether they think that it’s better to be able to speak using a large vocabulary or to be able to formulate an effective argument or tell a good story.
Just as there are distinct types of music, there are culturally distinct narrative practices. Some cultural communities favor “getting to the point” quickly and narratives organized as linear sequences. But other cultural communities focus more on creating a rich context for listeners and employing more nonlinear or cyclical narratives. For example, when we asked rural Wisconsin Native American and European American adults to tell us about their last encounter with fish, the median number of words before fish were mentioned was 27 for European Americans and 83 for Native Americans (Medin & Bang, 2014). As has been noted so often, differences are not deficits, and we need diverse perspectives to remind us of this. “Diverse perspectives” might well also include power sharing with and the perspectives of the “researched” in the form of partnerships with communities.
Summing Up
This report card for psychological science is less than impressive. In his APA Presidential address George Miller (1969) said, “In my opinion, scientific psychology is potentially one of the most revolutionary intellectual enterprises ever conceived by the mind of man.” I think George Miller was and is correct but perhaps he should have italicized “potentially.”
It will take systems-level changes to encourage and support study sample diversity, methodological diversity, and perhaps most importantly, researcher diversity. In complex systems, nonlinear processes are common, and sometimes small interventions have surprisingly large effects (e.g., Stephens, Fryberg, Markus, Johnson, & Covarrubias, 2012; Stephens, Hamedani, & Destin, 2014). Therefore, a failing report card need not lead to complete pessimism about the future. Of course, achieving these various forms of diversity likely would require training psychological scientists to understand culture and contexts that cultural groups find meaningful (and not just the lab). But if you like data, what could be more fun?
Acknowledgements
Daniel Hruschka, Sandra Waxman, bethany ojalehto, Natalie Gallagher, Sylvia Perry, Kalonji Nzinga, and Alissa Baker-Oglesbee provided constructive suggestions that improved this article.
Declaration of Conflicting Interests
The author declared no conflicts of interest with respect to the authorship or the publication of this article.
