Abstract
In their article on theory-based measurement, Borgstede and Eggert (2023) argue that a substantive formal psychological theory that is capable of predicting expected measurement outcomes for the theoretical objects of measurement it posits is both necessary and sufficient for psychological measurement. They reveal that measurement in psychology mostly concerns the estimation of latent variables and compares unfavorably to the development of measurement in the history of physics. They do not, however, include a comparison with the great advances in theory-based measurement achieved in modern physics. In this commentary, I describe how measurement is formalized in classical physics and examine what would be required to formalize the physical measurement of psychological phenomena. I conclude that, without an examination of the theoretical assumptions underlying current measurement procedures and a formal notion of psychological measurement, it is unlikely that psychological science will be able to generate the substantive theories suggested by Borgstede and Eggert.
A difficulty of much psychological theorizing is vagueness in the terms employed.
In their article, Borgstede and Eggert (2023) argue that a substantive formal psychological theory is both necessary and sufficient for psychological measurement. Crucial to understanding their claim is the distinction between the instrumental and the scientific value of measurement procedures. When the outcomes of a measurement procedure can be associated with the content of a formal theory, the procedure can be said to have scientific value because it allows us to decide between the divergent predictions of competing theories. Absent such theories, latent variable models, as well as psychological tests and standardized experimental procedures, can have instrumental value, but they lack the surplus interpretability that many physical measurements possess. For example, “the law of equipartition links the values obtained from a mercury thermometer to the abstract concept of energy” (Borgstede & Eggert, 2023, pp. 131–132).
Borgstede and Eggert (2023) compare the current state of psychological measurement to the development of measurement in physics and conclude that psychological measurement is still in the prescientific stage. Although I agree with their conclusion, it is rather curious that the authors never discuss what constitutes a physical measurement in the formal sense. Based on the examples presented in their article, their history of measurement in physics appears to end with the emergence of classical mechanics. I therefore assume that Borgstede and Eggert are referring to classical physical measurement whenever they use the term. Restricting the comparison in this way represents a missed opportunity to learn about the highly successful formal approaches to measurement that have been developed in modern physics. For example, physical theories about the quantum world (specifically, quantum electrodynamics) provide the most precise and accurate predictions of measurement outcomes ever observed in the history of science (Aoyama et al., 2008; Bennett et al., 2006). One of the reasons for this success has been the formalization of every aspect of the measurement process, which can yield different theories of measurement for different physical domains (e.g., classical physics, general relativity, quantum physics) but remains relatively independent from the specific content of the theories that operate within the domains (see Ludwig, 1985). Borgstede and Eggert (2023) seem to assume that the measurement problem will be resolved as soon as psychological science is able to produce substantive formal theories: “formal theory does not only provide meaning to the concepts used in a theory; it also provides rules about how to apply the theory, rendering a specific measurement theory obsolete” (p. 131).
This strengthens the assumption that, to the authors, measurement concerns ideal classical physical measurement, which has arguably been the default interpretation of psychological measurement throughout the history of the discipline (Luce & Narens, 1983).
The first claim I will defend in this commentary is that even if a revolution happens in psychological theorizing that will finally provide us with substantive formal theories, without a formalization of the process of measurement of psychological phenomena, the discipline will be back to square one: inferring the best-fitting parameters of statistical (latent variable) models from noisy data but still lacking a clear notion of how to incorporate the measurement context and the act of measurement into the description of the psychological phenomenon (Hasselman et al., 2019). The second claim I will defend concerns the following statement: “Our critique of LVM [latent variable modeling] is independent of the specific assumptions made in latent variable models, like quantitative structure, ergodicity, local independence, and so forth” (Borgstede & Eggert, 2023, p. 127). Analogous to Dennett’s (1995) famous dictum about philosophy-free science, I argue that there is no such thing as theory-free measurement; there is only measurement whose theoretical baggage is taken on board without examination. In what follows, I first describe how the measurement procedure has been formalized for classical physical theories and subsequently explore whether this approach can be used to describe psychological measurement. Finally, I argue that psychological science, perhaps inadvertently, has been testing a formal theory for many decades, a theory that hinges on one question: Can psychological phenomena be considered to originate from an ergodic system?
Formalizing physical measurement
In classical physics, a measurement brings about a correlation between a quantity
As an example of a nontrivial interaction between
To summarize, irrespective of the specifics of the phenomenon of interest, the measurement process in physics concerns two aspects—state preparation and measurement—and the important questions to resolve are whether the measurement procedure is disturbing or not, and whether it is possible to predict the effects of the disturbance, which requires knowledge about the nature of the interaction between
Psychological measurement is disturbing
Can the formalization of the measurement process in physics be a model for psychological measurement? Psychological measurement generally concerns the quantification of internal physiological, emotional, or psychological states, or mental phenomena that are not directly perceivable by an outside observer. This is different from measuring the position of a planet but may be similar to measuring the temperature of water or the mass of your body, which are also quantities that are not directly observable. I suggest that it is reasonable to assume that all psychological measurement concerns at least disturbed measurement—that is, the interaction between
The sobering conclusion must be that psychological science does not have a formal conceptualization of exactly what measurement entails in this context. Even if we had a formal theory of happiness that would yield numerical predictions for
Theoretical baggage in psychological methods and models
The statistical models used in contemporary psychology require measurement outcomes to be stationary, homogeneous, and independent; that is, beyond a sufficiently short timescale, repeated observations should no longer be correlated. The latter assumption is called the memorylessness property (Ramachandran, 1979) and, together with stationarity and the homogeneity of central moments, puts very specific constraints on the data-generating process and the kind of physical system that can produce such data. A system that satisfies these constraints is called ergodic (Molenaar, 2004, 2008). Ergodicity refers to the condition in which ensemble averages of variables observed in samples of sufficiently many systems of the same identity are expected to be arbitrarily similar to the time averages of variables evolving over a sufficiently long interval of time in a single system, irrespective of the set of possible initial conditions. For example, if we were able to throw 1,000 fair dice all at once, the observed distribution of values would be expected to be arbitrarily similar to the distribution we would obtain by throwing a single fair die 1,000 times in a row. The assumption of ergodicity constitutes the basis of all statistical models, including latent variable models. These assumptions are theory-laden and, as such, make specific theoretical claims about the object of measurement and inform the measurement procedure (random sample, factorial design, random assignment, etc.) and interpretation (group-to-individual generalization). Is it really the case that psychological science studies the behavior of ergodic systems?
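The dice example can be made concrete with a small simulation. The following is a minimal sketch, not part of the original argument: the function names, the seed, and the use of Python's pseudorandom generator as a stand-in for fair dice are all illustrative assumptions. It compares the distribution of faces from one throw of 1,000 dice (the ensemble) with the distribution from 1,000 throws of one die (the time series).

```python
import random
from collections import Counter


def face_proportions(rolls):
    """Relative frequency of each face 1..6 in a list of rolls."""
    counts = Counter(rolls)
    n = len(rolls)
    return [counts.get(face, 0) / n for face in range(1, 7)]


def throw_many_dice_once(n_dice, rng):
    """Ensemble sample: n_dice different fair dice, each thrown once."""
    return [rng.randint(1, 6) for _ in range(n_dice)]


def throw_one_die_repeatedly(n_throws, rng):
    """Time series: a single fair die thrown n_throws times in a row."""
    return [rng.randint(1, 6) for _ in range(n_throws)]


rng = random.Random(2023)  # arbitrary seed, chosen only for reproducibility

ensemble = face_proportions(throw_many_dice_once(1000, rng))
single = face_proportions(throw_one_die_repeatedly(1000, rng))

# Largest difference between the two empirical distributions.
max_gap = max(abs(e - s) for e, s in zip(ensemble, single))

print("ensemble   :", [round(p, 3) for p in ensemble])
print("single die :", [round(p, 3) for p in single])
print("largest gap:", round(max_gap, 3))
```

In this idealized setting the two sampling functions are literally the same computation, which is the point: for independent, identically distributed, memoryless throws, it makes no difference whether observations are collected across systems or across time. The question raised in the text is whether psychological data-generating processes can be assumed to behave this way.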
Recent observations of discrepancies between properties inferred at the ensemble level (interindividual) and at the individual level (intra-individual) have contributed to an emerging consensus that ergodicity does not apply to psychological measurements (Fisher et al., 2018; Wolfers et al., 2018). Olthof et al. (2020) showed that multivariate time series of self-reports of human experience in the context of psychopathology violate all of the ergodicity assumptions and more likely reflect the properties of the nonlinear dynamics of complex adaptive systems. A prominent result was the nonstationarity of the autocorrelation function, which implies a temporal structure in the data that is likely multifractal in nature, a phenomenon that has been suggested to play a role in the reproducibility crisis in psychology (Wallot & Kelty-Stephen, 2017).
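A toy simulation can illustrate how ensemble-level and individual-level properties come apart once homogeneity is violated. The sketch below is an assumption-laden illustration and does not reproduce the analyses in the cited studies: each simulated person fluctuates around a personal mean, and all names and parameter values are hypothetical. It does not model the nonstationary or multifractal dynamics discussed above; it violates only the homogeneity assumption.

```python
import random
import statistics


def simulate_person(person_mean, n_steps, rng, noise_sd=1.0):
    """Hypothetical self-report series: noise around a stable personal mean."""
    return [rng.gauss(person_mean, noise_sd) for _ in range(n_steps)]


rng = random.Random(7)  # arbitrary seed, chosen only for reproducibility
n_people, n_steps = 200, 500

# Heterogeneous ensemble: personal means differ widely between people.
person_means = [rng.gauss(0.0, 3.0) for _ in range(n_people)]
data = [simulate_person(m, n_steps, rng) for m in person_means]

# Interindividual variance: one cross-sectional snapshot across all people.
between_var = statistics.pvariance([series[0] for series in data])

# Intra-individual variance: variability over time within each person.
within_vars = [statistics.pvariance(series) for series in data]
mean_within_var = statistics.mean(within_vars)

print("interindividual variance (snapshot):", round(between_var, 2))
print("mean intra-individual variance     :", round(mean_within_var, 2))
```

Under these assumptions the cross-sectional variance is dominated by differences between people and greatly exceeds the variability of any one person over time, so a statistic estimated from the group does not describe the individuals who compose it.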
Contrary to what Borgstede and Eggert (2023) suggest, there is an abundance of theory-based measurement in psychological science; it just happens to be based on erroneous theoretical assumptions about the object of measurement. If a formalism for the physical measurement of psychological phenomena is defined, the substantive theories will follow.
Acknowledgements
I would like to thank Ralf Cox for introducing me to the ideas behind modern physical measurement, as well as for his efforts to translate the formal concepts so that they can be applied to psychological measurement. This commentary would not have been possible without our discussions.
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
