Abstract
It is commonplace, when discussing the subject of psychological theory, to write articles from the assumption that psychology differs from the physical sciences in that we have no theories that would support cumulative, incremental science. In this brief article I discuss one counterexample: Shepard’s law of generalization and the various Bayesian extensions that it inspired over the past 3 decades. Using Shepard’s law as a running example, I argue that psychological theory building is not a statistical problem, mathematical formalism is beneficial to theory, measurement and theory have a complex relationship, rewriting old theory can yield new insights, and theory growth can drive empirical work. Although I generally suggest that the tools of mathematical psychology are valuable to psychological theorists, I also comment on some limitations to this approach.
In 1987 Roger Shepard published a brief article in Science with the ambitious title “Toward a Universal Law of Generalization for Psychological Science” (Shepard, 1987). Drawing on the empirical literature on stimulus generalization in several domains and species, he asserted that any stimulus-generalization function should be approximately exponential in form when measured with respect to an appropriately formulated stimulus representation. His article begins with the following remark:

The tercentenary of the publication, in 1687, of Newton’s “Principia” prompts the question of whether psychological science has any hope of achieving a law that is comparable in generality (if not in predictive accuracy) to Newton’s universal law of gravitation. Exploring the direction that currently seems most favorable for an affirmative answer, I outline empirical evidence and a theoretical rationale in support of a tentative candidate for a universal law of generalization. (p. 1317)
Shepard’s claim was remarkable in scope. He drew on data from multiple species (e.g., humans, pigeons, rats) and stimulus domains (e.g., visual, auditory), data that had until that point been assumed to be quite different from one another. To spot the invariance that holds across these data sets, Shepard used statistical insights from research on similarity modeling. He noted that the apparent noninvariance of observed stimulus-generalization functions stemmed largely from the fact that response data had previously been analyzed with respect to the physical dissimilarities between stimuli. When the same responses were plotted as a function of distance in a psychological space constructed by multidimensional scaling, he found that the stimulus-generalization function was remarkably regular in shape.
Taken by itself, Shepard’s reanalysis would have been impressive. However, Shepard went on to provide a theoretical explanation for why we should expect to find this invariance. The theory was surprisingly simple: The learner presumes there exists some unknown consequential region of the stimulus space across which roughly the same properties hold (e.g., things that look like apples will probably taste the same as one another). When encountering a single stimulus that entails a particular consequence, the learner’s task is to infer the location, shape, and size of the consequential region itself. This is naturally an underconstrained problem, as there are an infinite number of possible regions that might correspond to the true consequential region. Nevertheless, Shepard showed that under a range of assumptions that the learner might make about the nature of consequential regions, the shape of the generalization function across the stimulus space ends up being approximately exponential. A visual illustration of this idea is depicted in Figure 1.

Figure 1. A schematic depiction of Shepard’s (1987) theory of stimulus generalization. The main panel depicts a two-dimensional psychological space, in which possible stimuli can vary along two stimulus dimensions (e.g., brightness, orientation). The black marker shows the location of a “consequential stimulus” (e.g., a fruit with an unpleasant taste), and each of the gray rectangles represents one possible hypothesis about the set of possible stimuli that might also have this consequence (e.g., taste unpleasant). Not knowing which of these hypotheses represents the true extension of the “region of unpleasant fruits,” the learner “averages” across his or her uncertainty, leading to the approximately exponential generalization gradients plotted above and to the right. Note that the curves shown in this figure are jagged rather than smooth because only a sample of possible regions is depicted and that for ease of exposition this figure represents a simplified version of Shepard’s theory.
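The averaging step can be made concrete with a small numerical sketch. The version below is a deliberate simplification and not Shepard’s own derivation: it assumes a one-dimensional psychological space, treats every candidate consequential region as an interval, and places an exponential prior on interval size. The function name, the prior, and the integration settings are all illustrative assumptions. For each candidate size, the hypothesis is weighted by its prior and by the number of placements of that interval that contain the consequential stimulus; the code then asks what fraction of those placements also contain a test stimulus at a given distance.

```python
import math


def generalization_gradient(distance, mean_size=1.0, n_steps=20000):
    """Probability that a test stimulus at `distance` shares the consequence,
    averaged over interval-shaped regions (1-D sketch, assumed exponential prior)."""
    rate = 1.0 / mean_size
    max_size = 40.0 * mean_size  # truncation point; prior mass beyond it is negligible
    step = max_size / n_steps
    shared = 0.0  # weight of regions containing both stimuli
    total = 0.0   # weight of regions containing the consequential stimulus
    for i in range(n_steps):
        size = (i + 0.5) * step                 # midpoint rule over region sizes
        prior = rate * math.exp(-rate * size)   # assumed prior density on size
        placements = size                       # intervals of this size that cover the stimulus
        weight = prior * placements
        # Fraction of those placements that also cover the test stimulus.
        overlap = max(0.0, 1.0 - distance / size)
        shared += weight * overlap * step
        total += weight * step
    return shared / total


# Gradient at a few illustrative distances (arbitrary units).
gradient = [generalization_gradient(d) for d in (0.0, 0.5, 1.0, 2.0)]
```

Under these particular assumptions the integral can be solved by hand and equals exp(-distance / mean_size), so the numerical values should track an exponential decay closely. Different priors on region size would change the details of the curve; the claim in the text is only that a range of such assumptions yields approximately the same shape.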
Although brief, Shepard’s article has been influential in the cognitive-science literature. It presented no new empirical data, and in substance it is mostly devoted to the derivation of a formal relation between one unobservable quantity (psychological distance) and another (stimulus generalizability). The universal law featured prominently in a special issue of Behavioral and Brain Sciences in 2001 and a first-person retrospective (Shepard, 2004), and for my contribution to this special issue I use it as an example of theory building in psychology. The decision to focus on a single contribution to theory is motivated by a desire to look at the particulars rather than speak solely in the abstract, and my decision to ignore disciplines outside of cognitive psychology is motivated by a desire to work toward what Flis (2019) calls “an indigenous epistemology.” If psychology is to make theoretical progress we must do so on our own terms. There are limits to what we can learn from the physical sciences.
Some desiderata for scientific theories seem easy to list. A scientific theory should be independent of its creator, for instance. It is difficult to make much use of a theory otherwise. In practice this typically means a theory is mathematical or computational in nature. Likewise, psychological theories should of course make some connection with empirical data, giving an account of the generative mechanism that gave rise to those data. Theories should be usable in the sense of providing other scientists guidance for future research. Other criteria could also be named, including falsifiability, simplicity, compatibility with existing literature, generalizability, predictive ability, and so on. However, although it is easy to list desiderata and even easier to debate which elements of such lists are the most important, such “discussions in the abstract” rarely provide much guidance to the would-be theoretician. From the perspective of the working scientist, it is perhaps more useful to give concrete examples, and to that end I return to an examination of Shepard’s (1987) article and the mathematical-psychology literature to which it belongs. I make the following claims: (a) Theory building is not a statistical problem, (b) mathematical formalism is beneficial to theory, (c) measurement and theory have a complex relationship, (d) rewriting old theory can yield new insights, and (e) theoretical growth can drive empirical work that might not otherwise have been considered worthwhile.
Theory Building Is Not a Statistical Problem
When reading Shepard’s original 1987 article and the 2004 retrospective, some surprising characteristics of his work on theory stand out. First, the theory development was largely post hoc. The original article does not collect new data, and indeed the main empirical results reported in the article were based on a reanalysis of existing data. Second, the article reports no hypothesis tests. There are no p values, Bayes factors, or any confidence intervals or their Bayesian equivalents. Third, the article does not outline any specific predictions about future experiments. It makes a strong claim that the exponential law should hold broadly but does not prescribe how tests of this prediction should be constructed.
Viewed through the lens of the methodological reform culture documented by Flis (2019) these properties might seem strange and might even amount to a form of “questionable research practice.” For instance, in the current zeitgeist it is sometimes argued with considerable vigor (especially on informal forums such as academic Twitter) that strong inferential claims cannot be justified without preregistered confirmatory tests. Shepard’s (1987) article does not present any such tests but makes sweeping claims nonetheless. Likewise, one might wonder whether his post hoc theorizing is a form of hypothesizing after results are known (i.e., HARKing). The unwary reader might conclude that Shepard’s work is of questionable value: Perhaps cognitive scientists have erred by according this article such high status?
Something seems awry in this description, and few researchers familiar with Shepard’s work would endorse it. The problem, I suggest, arises from a subtle way in which the preceding paragraph misrepresents the inferential problems scientists face. Methodological prescriptions relating to confirmatory tests (e.g., Wagenmakers et al., 2012) or post hoc hypotheses (e.g., Kerr, 1998) are narrow in scope: They have been developed to guide statistical inferences about empirical data, and as I have argued before (Navarro, 2019), it is an error to presume that the same logic can be applied to the evaluation of scientific theories. To put it another way, the success of Shepard’s theoretical work despite the (apparent) failure to meet these statistical prescriptions tells us something about what a theory is not. In my view neither empirical data nor statistical tests can be called a theoretical contribution, and prescriptions deemed sensible for empirical research or data analysis should not be considered suitable for the evaluation of psychological theory.
I suggest that the value for theory in Shepard’s article was not the discovery of an exponential law but rather the explanation proposed for it, and theories need to be evaluated (in part) in terms of their explanatory value. For example, Shepard’s article did not merely summarize data—it systematized an existing body of empirical findings. It separated aspects of the data that are invariant across studies from those that are not, sifting the wheat from the chaff so to speak. The sieve that enabled this was a mathematical theory describing regularities in stimulus generalization in terms of simpler primitives. Thus, although Shepard’s theory asserts that the form of a generalization curve should be exponential, this exponential form is an entailment of his theory and not its substance.
From the perspective of theory, this is important: If an exponential law were observed in a few terrestrial species with no deeper explanation provided, there would be little reason to believe that such a law might hold with any generality. Such an inference would be statistically unjustifiable, even as a “tentative suggestion.” What Shepard does instead is note that an exponential law emerges as an entailment of sufficiently primitive rules that could be reasonably expected to hold in vastly different environments: “I tentatively suggest that because these regularities reflect universal principles of natural kinds and of probabilistic geometry, natural selection may favor their increasingly close approximation in sentient organisms wherever they evolve” (p. 1323).
In other words, his claim to generality arises not from any statistical quantification of the strength of evidence but from the formal structure of the theory. Statistical evidence and theoretical generality are quite different from one another. Statistical tools can tell us what we might expect to happen were an experiment to be precisely replicated in precisely the same context; theoretical tools exist to tell us how to generalize from one context to another. Insofar as all meaningful inferences that a practical scientist cares about are to some extent an act of generalization across contexts, statistical inferences are insufficient to guide scientific judgment. Theory-based inferences are a necessity, not a luxury.
Mathematical Formalism Is Beneficial for Theory
It is perhaps trite to say so, but the defining property of mathematical psychology is the emphasis on formal descriptions of human thought and behavior, either in the form of an abstract mathematical specification or a clearly defined computational model. To many psychologists it might seem strange that such a discipline even exists, but as Luce (1995) puts it, “mathematics becomes relevant to science whenever we uncover structure in what we are studying” (p. 2). If we believe that our empirical results have structure, we should attempt to articulate what that structure is as precisely as we can. It is with this task that mathematical psychology is concerned.
There are a number of reasons why formality is useful to the would-be theoretician, but first among them (in my view) is precision. Consider how Shepard’s law of generalization might have looked had he not sought the precision that mathematics affords. My attempt to describe the law itself verbally using ordinary English language and not substituting any mathematical words is as follows: If an intelligent agent encounters one thing that has a particular property and encounters another thing and is uncertain whether it possesses that property, then all else being equal the agent will tend to treat those things similarly in regard to the unknown property to the extent that those two things are similar in regard to their known properties, and this tendency will fall away very quickly as this similarity decreases.
Except for that last part, which forms the substantive part of the exponential law, this seems intuitive, but in the stated form it also sounds vacuous and perilously close to tautological. What precisely do I mean when I use the word “similarity”? As philosophers (Goodman, 1972) and psychologists (Medin et al., 1993) alike have noted, the term similarity is not well defined and requires additional constraint to be psychologically meaningful. To make the theory workable, I must elaborate on this verbal definition and try to pin down what I mean by similarity. I also need to pin down what I mean when I refer to the “tendency” to act a certain way. Very quickly one finds that it is difficult to work out what underlying theoretical claim is being made if these claims are stated only in everyday language. Even if the theoretical claim is not entirely vacuous (in this case, if there is something of substance buried within my claim that “the tendency falls away very quickly”), I cannot work out what the substance may be when my theory is stated in this fashion. In other words, without precision it is hard to know what tests and what inferences are licensed by the theory.
Escaping this trap of vagueness is hard, and to illustrate how mathematical formalism can help, it is necessary to introduce some. In this article I use g(x, y) to refer to the generalization function: Specifically, g(x, y) is the probability that a newly encountered stimulus y shares a property that is already known to be possessed by a different stimulus x. Using this notation, Shepard’s claim can be written in the following form:

$$g(x, y) = e^{-\lambda\, d(x, y)}$$

where the constant e is approximately 2.718 and λ is an unknown parameter of little theoretical interest. The quantity of interest here is d(x, y), the “psychological distance” between stimulus x and stimulus y. Written like this, the theory’s claim starts to become clearer: If it is possible to measure both the psychological distance d(x, y) and the strength of generalization g(x, y) in a defensible way, then we should expect a very specific nonlinear relationship to emerge between the two. Already some of the value of the theory should be clear. It tells us which measurement problems we need to solve.
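To see what this “very specific nonlinear relationship” amounts to in practice, note that an exponential law implies that the negative logarithm of g(x, y) is proportional to d(x, y), with λ as the constant of proportionality. The sketch below is illustrative only: the function names are hypothetical, the distances are arbitrary, and the “observed” generalization values are synthetic, generated from the law itself rather than taken from any experiment.

```python
import math


def generalization(distance, decay):
    """Exponential law: generalization as a function of psychological distance."""
    return math.exp(-decay * distance)


def estimate_decay(distances, generalizations):
    """Least-squares slope (through the origin) of -log(g) regressed on d."""
    numerator = sum(d * -math.log(g) for d, g in zip(distances, generalizations))
    denominator = sum(d * d for d in distances)
    return numerator / denominator


# Synthetic illustration: values generated from the law, not real data.
distances = [0.5, 1.0, 1.5, 2.0, 3.0]
true_decay = 1.3  # hypothetical value of lambda
observed = [generalization(d, true_decay) for d in distances]
estimated_decay = estimate_decay(distances, observed)
```

With noise-free synthetic values the estimate recovers the generating parameter. With real measurements, systematic curvature in a plot of -log(g) against d would count against the exponential form, which is one way the formal statement tells the researcher what to look for.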
The value of this should not be understated: Knowing what quantities need to be measured is of considerable importance to psychologists, and knowing when approximate measurements are “good enough” is similarly critical. In the generalization context, if the researcher can obtain only ordinal-scale information about psychological distances, then Shepard’s law can yield only ordinal predictions about the corresponding generalizations, because any order-preserving rescaling of distance fits the same data while changing the shape of the curve. Indeed, to the extent that one goal in methodological reform is to encourage researchers to be more precise in stating the contexts to which they believe their results may generalize (Simons et al., 2017), it is advantageous to have precisely stated theory to guide them. To comment sensibly on how an empirical result might be expected to generalize (or not) beyond the original context, one needs to know which properties of the sample or the study can be deemed inductively relevant to the new context. Formal theory helps by providing the researcher with guidance as to what matters and what does not. Indeed, Shepard’s description of the generalization problem facing every learner seems pointedly appropriate to the generality problem facing psychological scientists:

We generalize from one situation to another not because we cannot tell the difference between the two situations but because we judge that they are likely to belong to a set of situations having the same consequence. Generalization, which stems from uncertainty about the distribution of consequential stimuli in psychological space, is thus to be distinguished from failure of discrimination, which stems from uncertainty about the relative locations of individual stimuli in that space. (p. 1322)
If we hope to make sound generalizations as scientists, we must know what theoretical space attaches to our empirical work: My modest suggestion is that formal mathematical theories are the method by which we can do so.
Measurement and Theory Have a Complicated Relationship
Let us turn next to the question of measurement and its relation to theory. If one hopes to obtain empirical support for a theoretical claim, it must be tethered in some way to observational or experimental data. To accomplish this, one must have an appropriate measurement tool. For example, one of the key insights in Shepard’s (1987) article is the recognition that although stimulus-generalization functions can be extremely irregular in form when we measure distance in “objective” terms, they are often very smooth when measured in more subjective terms: Color generalizations are predictable with respect to the appropriate color space (e.g., Ekman, 1954), tones are regular when described in an appropriate perceptual space, and so on. In retrospect this seems obvious, but at the time Shepard developed the theory he was faced with a substantive problem of how to extract the appropriate stimulus representation to which the theory might be applied. Setting aside the justifications for his choices, nonmetric multidimensional scaling (MDS; Kruskal, 1964) served as a measurement model for Shepard in 1987, and his analyses all use MDS-estimated psychological spaces to supply the relevant measure of distance.
As this discussion illustrates, the measurement instrument and development of theory were tightly linked. Without MDS as a measurement tool Shepard would have found it almost impossible to formulate the empirical regularity of interest with any confidence. However, it is equally clear that MDS is merely a tool used to help define the phenomenon to be explained. It can be used to supply an approximate measure of psychological distance d(x, y) between two stimuli, but it does not itself explain why a measure of stimulus generalization g(x, y) should diminish exponentially as a function of this distance. Although MDS and other latent variable models (e.g., factor analysis) can be useful tools for organizing our measurements in a statistically meaningful way, we should not mistake them for psychological theory.
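The role of a scaling procedure as a measurement tool can be illustrated with a small sketch. Note the substitution: the code below implements classical (Torgerson) metric scaling in one dimension, not the nonmetric procedure of Kruskal (1964) that Shepard relied on, and it is chosen only because it fits in a few lines of standard-library Python. The function name, the power-iteration shortcut, and the toy dissimilarities are assumptions made for illustration. The point is limited to this: given pairwise dissimilarities, such a procedure returns coordinates from which an approximate d(x, y) can be read off, and nothing in it says how generalization should depend on that distance.

```python
import math


def classical_mds_1d(dist, n_iter=100):
    """One-dimensional classical MDS via power iteration (illustrative sketch).

    `dist` is a symmetric matrix of pairwise dissimilarities. Returns one
    coordinate per stimulus, defined only up to reflection and translation.
    """
    n = len(dist)
    squared = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in squared]
    grand_mean = sum(row_mean) / n
    # Double-centre the squared dissimilarities (assumes a symmetric matrix).
    b = [[-0.5 * (squared[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    def times_b(v):
        return [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]

    # Power iteration for the leading eigenvector; the fixed start vector
    # keeps the sketch deterministic but is not robust for every input.
    v = [float(i + 1) for i in range(n)]
    for _ in range(n_iter):
        w = times_b(v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n  # all stimuli coincide
        v = [x / norm for x in w]
    eigenvalue = sum(vi * wi for vi, wi in zip(v, times_b(v)))
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [scale * x for x in v]


# Toy check: dissimilarities generated from known positions on a line.
positions = [0.0, 1.0, 3.0, 6.0]
dist = [[abs(p - q) for q in positions] for p in positions]
coords = classical_mds_1d(dist)
recovered = [[abs(p - q) for q in coords] for p in coords]
```

In this toy case the recovered inter-point distances match the input dissimilarities, which is all that a measurement model of this kind is asked to do.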
To illustrate the latter point, it is notable that in the stimulus-generalization literature it quickly became apparent that Shepard’s law applies even in situations in which MDS does not: Shortly after the publication of Shepard’s original article, Russell (1988) demonstrated that the same law holds for stimuli defined in terms of discrete features as well as for the continuous spaces for which Shepard’s work was defined, a connection that was later extended by Tenenbaum and Griffiths (2001). Although the theoretical framework could not have come into existence without the scaffolding provided by the MDS measurement model, it quickly outgrew any need for this support. Many of the generalization problems discussed by Tenenbaum and Griffiths cannot be described with respect to any metric space extracted by MDS but are nevertheless consistent with Shepard’s theory. In other words, although the measurement model supplied by MDS played a central role in developing theories of generalization, those theories are no longer dependent on MDS in any meaningful sense.
Rewriting Old Theory Can Provide New Insight
The specific mathematical form that Shepard used to implement his ideas is not unique, and the theory can be rewritten in a different notation. Cooper and Guest (2014) argued that work on theory need not be constrained to a particular “implementation” (or formalism) but is better captured by a more abstract notion of a “specification.” As a concrete example, it is worth considering the manner in which Shepard’s law was later reformulated by Tenenbaum and Griffiths as an (explicitly) Bayesian model and the effect this rewriting had on how the theory could be applied.
To illustrate what I mean here, it is worth considering how Bayesian cognitive models are typically described in the cognitive-science literature. It is fairly typical now to introduce such a model by first saying “we propose to treat [psychological problem of interest] as a Bayesian inference problem” and then introduce the formula for Bayes’s rule:

$$P(h \mid x) = \frac{P(x \mid h)\, P(h)}{P(x)}$$

It would then be explained that P(h) defines the learner’s prior degree of belief in some hypothesis h about the world, whereas P(h|x) is the posterior belief in that hypothesis after the learner encounters the information embodied by x, whatever x may happen to be in the specific application at hand. Next it would be noted that the likelihood term P(x|h) denotes the probability of the learner observing x if hypothesis h were true. The normalizing constant P(x) is also explained, additional context is filled in, and the end result is an abstract specification for a mathematical model.
One finds nothing of the kind in Shepard’s (1987) article. None of the “standard” notation is used, and there is no explicit appeal to Bayes’s rule in the text. Instead, all that one finds is a discussion of “consequential regions” of unknown size, probability measures that are not entirely easy to understand for the casual reader, and so on. It does not look like a Bayesian model in the sense that cognitive modelers would easily recognize 30 years later. I can certainly attest to the fact that I did not perceive the connection to Bayesian learning until Tenenbaum and Griffiths (2001) recast Shepard’s formalism using a different notation, expressing the same ideas rather differently.
The contribution to theory of Tenenbaum and Griffiths (2001) is worth expanding on because I think it was instrumental in allowing Shepard’s theory to be extended beyond the original stimulus-generalization context. Whereas Shepard referred to the notion of a consequential region located within a psychological space—with all of the geometric connotations that this space entails—Tenenbaum and Griffiths took a more general view and framed their analysis in terms of “consequential sets.” Moreover, any specific candidate for the true consequential set was labeled a “hypothesis” h and considered part of a broader “hypothesis space” H and the underlying problem of generalizing from one stimulus to another could be recast as Bayesian reasoning about (collections of) such hypotheses.
The Bayesian reformulation of Shepard’s theory that Tenenbaum and Griffiths presented allowed them to generalize Shepard’s theory in three distinct ways. First, as mentioned earlier, they showed (much like Russell, 1988) that Shepard’s theory could encompass stimuli that were not representable as points in a geometric space: In their notation, this is accomplished by substituting a new hypothesis space H. Second, this formulation allowed the theory to naturally accommodate inductive-generalization problems in which the learner has encountered more than one consequential stimulus. Earlier approaches for allowing the model to account for multi-item generalization (e.g., Shepard & Kannappan, 1991) were not quite so adaptable.
Finally, this formalism called attention to a potentially limiting assumption in Shepard (1987). Shepard argued that “in the absence of any information to the contrary, an individual might best assume that nature selects the consequential region and the first stimulus independently” (p. 1321). This so-called weak sampling assumption places strong constraints on the inferences that the learner can make, and when formally instantiated within the model it leads to a situation in which the learner necessarily behaves like a naive falsificationist: The only role that observed stimuli x can play is indicating which hypotheses h are consistent with the observations and which are not. Nevertheless, this is by no means the only assumption a sensible reasoner might make, and by highlighting Shepard’s assumption more clearly, Tenenbaum and Griffiths (2001) allowed later work to explore alternative sampling models that allow the reasoner to use the stimulus information in a more sophisticated manner (e.g., Hayes et al., 2019; Shafto et al., 2014). Each of these insights has led to new empirical and theoretical work, a point that I expand on in the next section.
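A small sketch may help show how a sampling assumption enters a model of this kind. In the Bayesian formulation, the probability that a new stimulus y shares the consequence is the total posterior probability of the hypotheses that contain y. The code below assumes a toy hypothesis space (all integer intervals between 1 and 10) with a uniform prior, and compares two likelihoods: weak sampling, in which an observation only rules hypotheses in or out, and strong sampling, a commonly studied alternative in this literature in which each observation is assumed to be drawn uniformly from the true consequential set. All names and the toy hypothesis space are illustrative assumptions, not a reconstruction of any published model.

```python
def interval_hypotheses(lowest, highest):
    """All integer intervals (a, b) with lowest <= a <= b <= highest."""
    return [(a, b) for a in range(lowest, highest + 1)
            for b in range(a, highest + 1)]


def likelihood(observations, hypothesis, sampling):
    """P(observations | hypothesis) under the named sampling assumption."""
    a, b = hypothesis
    if not all(a <= x <= b for x in observations):
        return 0.0  # hypothesis is inconsistent with the observations
    if sampling == "weak":
        return 1.0  # consistency is all that matters
    if sampling == "strong":
        size = b - a + 1
        return (1.0 / size) ** len(observations)  # smaller sets are favored
    raise ValueError("sampling must be 'weak' or 'strong'")


def generalize(y, observations, hypotheses, sampling="weak"):
    """Posterior probability that y lies in the consequential set.

    A uniform prior over hypotheses is assumed, so it cancels in the ratio.
    """
    weights = [likelihood(observations, h, sampling) for h in hypotheses]
    total = sum(weights)
    inside = sum(w for w, (a, b) in zip(weights, hypotheses) if a <= y <= b)
    return inside / total


hypotheses = interval_hypotheses(1, 10)
weak = generalize(7, [5], hypotheses, "weak")
strong = generalize(7, [5], hypotheses, "strong")
```

In this toy setting the weak-sampling learner generalizes from 5 to 7 more readily than the strong-sampling learner, because strong sampling shifts belief toward smaller hypotheses. Swapping the list of hypotheses for a non-geometric one (for example, sets of discrete features) requires no change to `generalize`, which is the sense in which the reformulation is extensible.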
Theory Growth Can Drive Experimental Innovation
The final point I want to make pertains to the relationship between theoretical growth and empirical innovation. I have heard it suggested on occasion that psychology needs to solve its empirical problems first and only then consider how to construct good theory. I am less than convinced by such claims and hope to illustrate in this section why the two problems go hand in hand, again using the stimulus-generalization theories introduced by Shepard (1987) and Tenenbaum and Griffiths (2001) as my example.
One of the most important contributions made by the Bayesian formulation adopted by Tenenbaum and Griffiths (2001) is that it allowed the underlying theory to be applied to a much broader range of inductive problems. Shepard’s original construction, although purported to be a very general law itself, was formulated with respect to a narrow class of psychological problems: inductive generalization from a single observation. Moreover, because the origins of his work lay in the study of human perception and the animal-learning literature, it was not immediately clear—at least it was not clear to me—how the theory should be extended to higher-order cognition. The reformulation offered by Tenenbaum and Griffiths made it quite apparent that Shepard’s original theory is a special case of a broader class of Bayesian generalization models. By abstracting away from the specific problem Shepard’s theory sought to explain and casting it in a language (Bayesian inference) that is naturally extensible to new problems, I was able to see how I could extend Shepard’s theory on my own.
Perhaps the cleanest example of this interplay in my own research is the work presented by Hayes et al. (2019), which was motivated by a puzzling finding presented by Lawson and Kalish (2009) in which people appeared to solve inductive-reasoning problems differently depending on how the information in the reasoning problem was selected. At the time the original work was presented, no clear explanation for why people would do this was available. Tenenbaum and Griffiths had observed that, from a statistical-learning perspective, inductive generalization should depend on the learner’s beliefs about how information is selected, so we considered the possibility that the earlier results by Lawson and Kalish (2009) represented the same kind of effect.
The process I followed when adapting the theory to a new context may be informative. In my first pass at adapting the theory (Hayes et al., 2017), I constructed a model that was only very slightly different from the Tenenbaum and Griffiths version and used it to derive qualitative predictions regarding what kind of empirical manipulations should be expected to modulate the effect reported by Lawson and Kalish (2009). My colleagues and I then undertook a series of experimental tests, reported in Hayes et al. (2019), showing that under some circumstances (not all) the effects predicted by (my trivial adaptation of) the Tenenbaum and Griffiths model occur almost exactly as expected. However, from my perspective this initial work was unsatisfying: Because the new experimental results involved a very different design from the kind of “stimulus-generalization” tasks with which Shepard was originally concerned, it was difficult to be certain which aspects of our data could be explained as a “sampling effect” and which could not. This led me to develop a more substantive modification of the Tenenbaum and Griffiths model, and following the model-evaluation procedure outlined in Navarro (2019), I was able to resolve much of this uncertainty: Most of our experimental findings were indeed consistent with the theory, but some were emphatically not. By adopting a mathematically precise, theory-motivated approach to exploring this phenomenon, my colleagues and I were able to obtain clarity about what we were seeing in our empirical data. I know of no other process that would have allowed me to do so.
A Word of Warning
In this article I have argued that the toolkit provided by mathematical psychology can be a powerful aid to those seeking to build psychological theories. I would be remiss, however, if I did not comment on the limitations to this approach. As a mathematical psychologist studying human inductive reasoning, what I want is a “mathematical theory of human reason” that explains the entire psychological process of human reasoning about underconstrained problems. However, my skill and knowledge are both limited, and I cannot fathom what class of theoretical models might be applicable to the entire psychological process at hand. Nor can I think of a way to circumscribe the scientific problem in a fashion that allows me to render the entire domain of human reason subject to any kind of direct measurement. This limitation has consequences. My experiment is a measurement tool that captures some aspects of human reasoning but inevitably confounds them with the measurement of some other phenomena. If I try to account theoretically for all things in my data I must provide an account of these unknown things as well as the thing I am trying to study. But if my experiment is too complex then these unknown things will themselves become quite complex, leading to the risk that any theoretical explanation I construct is little more than wild speculation.
When faced with this concern, a sensible but potentially dangerous strategy is to make the task simpler. Make the task so small and so simple that we actually can write down models that specify precise assumptions about every aspect of the task. This may lead to better theoretical models, but it may come at the price of limiting their theoretical scope to an unreasonable extent. It is inconvenient, perhaps, but it remains true that our theoretical models are defined with respect to simplified “toy worlds”; humans, however, must occupy the real one. If we emphasize formal rigor too much (and adapt all of our measurements to let us satisfy these demands) the experimental paradigms may become ossified and highly restricted, adapted to suit only those phenomena that we know how to model in full. This can be dangerous insofar as it provides an illusion of explanatory power, one that falls apart once we step outside the narrow confines of our paradigm. This is not a novel observation: For example, Hacking (1992) argued that over time the laboratory sciences can create a self-vindicating system by building theories and methods that are “mutually adjusted to each other” and cannot be falsified, quite irrespective of their real-world utility: “The theories of the laboratory sciences are not directly compared to ‘the world’; they persist because they are true to phenomena produced or even created by apparatus in the laboratory and are measured by instruments we have engineered” (p. 30).
Theory-inclined psychologists should not shy away from the concerns this raises. When seeking to develop theories, one should take care to reflect on how a theoretical perspective may circumscribe the problem at hand too narrowly. Precisely because mathematical models are hard to build and experimental paradigms are easy to simplify, those of us who advocate formal theory building must, I suggest, be especially wary of this trap.
Conclusion
Mathematical psychology is something of an oddity in the discipline. It does not eschew empirical research, but neither does it view the goal of psychological science as the accrual of empirical effects. Quite unlike researchers in most areas of psychology with which I am familiar, mathematical psychologists place a high value on theory development, particularly when such theories can be stated in a formal manner. My goal in this article was to highlight the manner in which cumulative work on theory has developed in this discipline, using Shepard’s law as an example. From its origins in associative learning and stimulus generalization to its reformulation as a Bayesian model and its extension to a variety of novel contexts, a single theoretical claim can be shown to connect to a variety of empirical findings in superficially distinct domains.
Although I have focused on Shepard’s law and its extensions in this article, I suspect that the underlying pattern is quite general. I could have chosen the Rescorla-Wagner model of associative learning as the basis for this discussion (Rescorla & Wagner, 1972) or the generalized context model of human categorization (Nosofsky, 1986). I could have chosen to focus on models such as ALCOVE (attention learning covering map) that sought to unify associative learning and categorization (Kruschke, 1992) or models such as the hierarchical Dirichlet process that sought to unify various category-learning models within a common theoretical language (Griffiths et al., 2007). I could have revisited Ebbinghaus’s work on memory (Ebbinghaus, 1885/1913). I could have examined sequential sampling models of choice reaction time (Luce, 1986) and the rich theoretical tradition that mathematical psychologists have developed in that domain. In each of these areas psychologists have been slowly and carefully building psychological theories. The work is painstaking and slow and the articles are often difficult to read, but I would argue that the development of theory in this domain has been genuinely cumulative.
These advances in theory have something in common. In each of these areas psychological scientists have built up a considerable body of theoretical knowledge that is instantiated in formal models of psychological processes. In every case the underlying theoretical models are more than mere summaries of empirical results and more substantive than a mere statistical model. In all cases the formalism can be used to generate novel predictions in experimental paradigms that differ markedly from the experimental contexts used to develop the model (and, remarkably, some of those predictions have even turned out to be correct). By judiciously combining abstraction and formalism, mathematical psychologists have been able to develop a toolkit that allows anyone to derive theory predictions in completely novel paradigms. If it is indeed the case that psychology suffers from a kind of “theoretical amnesia” (Borsboom, 2013), perhaps the machinery of mathematical psychology can aid its memory. Perhaps fittingly, the words of Shepard (1987) seem an appropriate way to conclude: Undoubtedly, psychological science has lagged behind physical science by at least 300 years. Undoubtedly, too, prediction of behavior can never attain the precision for animate that it has for celestial bodies. Yet psychology may not be inherently limited merely to the descriptive characterization of the behaviors of particular terrestrial species. Possibly, behind the diverse behaviors of humans and animals, as behind the various motions of planets and stars, we may discern the operation of universal laws. (p. 1323)
Acknowledgements
This article grew out of numerous conversations with several people, most notably Berna Devezer, to whom I am deeply indebted and without whose thoughtful contribution this article would not exist. I would also like to thank Richard Morey, Olivia Guest, and an anonymous reviewer for thoughtful (and kind) comments on the initial version of the manuscript, which was submitted in a less-than-polished form because of the outbreak of COVID-19. Source material associated with this article is available on GitHub (https://github.com/djnavarro/shepard-theory) and OSF.
Transparency
Action Editors: Travis Proulx and Richard Morey
Advisory Editor: Richard Lucas
Editor: Laura A. King
