One small step towards a metatheory of evidence and proof

Abstract

Many challenges to a probabilistic account of evidence fail to distinguish two separate questions. One is the question of whether the evaluation of individual items of evidence is probabilistic or non-probabilistic. And the other is whether the evaluation of multiple items of evidence is sequential (Bayesian) or holistic. This paper seeks to distinguish these issues, to argue that challenges to probabilistic accounts have not yet made their case, and to attempt to clarify just what a theory of evidence and proof is a theory of.

Keywords

Bayesianism, evidence, explanationism, probabilism, theory of proof

In ‘Relative plausibility and its critics’ (Allen and Pardo, 2019), Ronald Allen and Michael Pardo offer and defend against multiple critics a theory of, in their words, ‘juridical proof’ (Allen and Pardo, 2019: 7, fn 7). But what is a theory of juridical proof? And what would be necessary to defend one, to attack one, to prove one or to falsify one? Allen and Pardo’s contribution, including its assumptions, its gaps and indeterminacies as well as its insights, provides a valuable opportunity to address these issues. And consequently this essay, although focused on Allen and Pardo, is also an attempt to clarify what theorists can be understood as doing when they theorise about the nature of legal evidence and the process of legal proof.

Is there a paradigm, and is it shifting?

Allen and Pardo describe their account as part of a ‘paradigm shift’ in thinking about the nature of juridical proof, a shift from what they describe as the ‘probabilistic paradigm’ to what they label as ‘explanationism’. They acknowledge the tendency in legal scholarship to overclaim contributions as paradigm shifts, but it is not clear that they have avoided the pitfall they have so admirably recognised.

As introduced by Thomas Kuhn and then subsequently explained, embellished and expanded by legions of others, a claim that there has been, or is now in progress, a paradigm shift is an empirical sociological claim.¹ What makes a paradigm sociological, and accordingly different from a theory (even a sound one), an account or an explanation, is that the idea of a paradigm embeds the empirical and sociological claim that this is how the bulk of working scientists, researchers or participants of any kind in some enterprise understand and go about their work (see Barnes, 1982; Bird, 2018; Bloor, 2016). The key ideas are consensus and agreement, both sociological ideas. A paradigm establishes the agreed-upon examples, and the agreed-upon approaches of the field. In normal science, to use Kuhn’s term, those working in the field agree upon the central examples, and upon the central methods or approaches that participants will use. When working in normal mode, those in the field may debate about penumbral cases or alternative explanations, but there is a consensus about the basic phenomena they are trying to explain and on what would make an explanation successful or unsuccessful.

A paradigm is thus an empirical and sociological description of (most of) the behaviour of (most of) a field’s major participants. What follows is that a paradigm shift occurs when these participants produce a new consensus – when they replace what were formerly the agreed-upon exemplars of a phenomenon with new and different exemplars, and so too with a replacement of agreed-upon methods with new and different ones. In order to identify a paradigm shift, therefore, we must first identify the relevant cohort of practitioners, and then identify the examples, methods and accounts that all or most of these practitioners formerly shared, and finally identify the different examples, methods and accounts that this cohort of practitioners now shares or is in the process of beginning to share.

Pace Allen and Pardo, there seems little empirical evidence that such a paradigm shift has taken place or is even now in progress. Even if probabilism is the kind of thing that can constitute a paradigm, which it likely is, and even if Allen and Pardo are not the only ones offering challenges to probabilism, which they are plainly not, the core of the empirical sociological claim of a paradigm shift requires evidence of a change in the practices or assumptions of the central participants in the enterprise. That some of the participants adopt different assumptions and explanations, and that those participants hope that other participants will agree with them, is, at best, a wish for a paradigm shift, but the claim that there has been or is now in progress a paradigm shift requires far more of an empirical demonstration of an actual change in the (participants in the) field’s assumptions, examples and methods than Allen and Pardo have been able to provide. And although this complaint about the use (or misuse) of the idea of a paradigm shift may seem trivial, it goes to the authoritativeness of the views and methods of the field as currently constituted. If what is generally done by the participants is entitled to some deference just because it is generally done – that is, if what is generally done has authority just because of its provenance as being generally done – then we must be careful about the sociological claim. Allen and Pardo may be correct on the merits, but it is premature, at least on the evidence they provide, to claim the authority of a widely accepted mass shift in the consensus within the relevant field.

Is there a dispute, and, if so, what is it about?

Allen and Pardo offer what they variously describe as explanationism and relative plausibility as an alternative to probabilism. Embedded in this contrast between two accounts, however, are two different issues, and thus two different potential disputes, which it is important to distinguish.

First is the question whether the evaluation of individual items of evidence is necessarily a probabilistic process, or whether instead a probabilistic understanding of evidentiary evaluation misconceives the process of evaluating such individual items. And so although it is commonly claimed – perhaps only by probabilists, Allen and Pardo might maintain – that all evidence is probabilistic,² there may be an alternative explanation of how people in general, or lay jurors or legally-trained judges operating as triers of fact,³ evaluate pieces of evidence, an alternative explanation that cannot be reduced to probabilities or explained in terms of probabilities. As a result, one dispute is between the view that all evidence is probabilistic, whether numbers are assigned or not, and the view that there are non-probabilistic understandings of individual pieces of knowledge and individual items of evidence.⁴

Without for now saying anything more about this dispute between probabilistic and non-probabilistic evaluations of individual pieces of evidence, it is important to recognise that this dispute is different from a dispute about the way in which some population of decision-makers evaluates multiple items of evidence thought to bear upon some conclusion or judgement. Under one account – call it Bayesian, for want of a better term – multiple items of evidence are evaluated individually – atomistically – and sequentially. The decision-maker begins with a prior probability of the likely truth of some conclusion, and then updates this judgement up or down as additional pieces of evidence are offered.⁵ But under an alternative account – call it holism – the decision-maker suspends judgement⁶ until all the evidence is in, and then determines which of opposing explanations is better supported by the totality of the evidence, where the choice between opposing explanations is to be made, in part, on the extent to which an explanation offers a coherent account of the individual pieces of evidence.⁷
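The sequential account can be made concrete with a small sketch. The Python below is an illustration only, and is not drawn from Allen and Pardo or from the literature on Bayesian updating: the function names `update` and `sequential_assessment`, the prior of 0.5 and the three likelihood ratios are all hypothetical, and the sketch assumes that each item of evidence can be summarised by a single likelihood ratio and that the items are conditionally independent of one another.

```python
def update(prior, likelihood_ratio):
    """Return the posterior probability after one item of evidence.

    likelihood_ratio is P(evidence | claim true) / P(evidence | claim false).
    Values above 1 favour the claim; values below 1 count against it.
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


def sequential_assessment(prior, likelihood_ratios):
    """Apply `update` once per item, in the order the evidence arrives."""
    credences = [prior]
    for ratio in likelihood_ratios:
        credences.append(update(credences[-1], ratio))
    return credences


# Hypothetical trial: three items of evidence, two of them favouring the claim.
credences = sequential_assessment(0.5, [4.0, 0.5, 3.0])
for step, credence in enumerate(credences):
    print(f"after {step} item(s): {credence:.3f}")
```

On this picture the decision-maker holds a definite degree of belief after every item of evidence. The holistic account denies exactly that, since on that account judgement is suspended until the evidence is complete.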

Allen and Pardo are plainly holists, but they appear to discount the possibility that holistic evaluation might be probabilistic and that atomistic sequential evaluation – the alternative to holism – might be non-probabilistic. But this possibility ought to be taken more seriously than Allen and Pardo take it. A holistic evaluation might still attempt to determine, based on all of the evidence considered as a whole, and evaluated in terms of internal coherence and in a non-comparative way, whether one claim was, say, 10% likely, or 51% likely, or 99% likely. And a non-holistic evaluation might, conversely, attempt to determine, at each of multiple sequential stages, which among competing alternative explanations for that single piece of evidence was more plausible or more coherent with the already received pieces of evidence. Thus, part of the reason to be careful to separate the claims of holism from those of non-probabilistic reasoning is that the two, although they often travel together, remain analytically distinct.

Although the probabilistic/non-probabilistic and sequential/holistic contrasts are thus orthogonal to each other, Allen and Pardo advocate both the non-probabilistic and the holistic sides of these divides.⁸ Their claim is that juridical proof is holistic rather than sequential, and that holistic evaluation of multiple pieces of evidence cannot be explained in probabilistic terms. Without yet addressing the question of how to evaluate the conjunction of these two claims, it remains important to emphasise that these are two claims and not one, and that the two claims need not be conjoined in the way in which Allen and Pardo conjoin them.

Must probability be about numbers?

Many analyses over the years have offered numerical equivalents for the beyond-a-reasonable-doubt standard routinely applied to criminal prosecutions in the United States and elsewhere.⁹ As such efforts indicate, it is at least possible to translate verbal formulas for degrees of confidence into numerical levels.

But although it is possible to translate beyond-a-reasonable-doubt into, say, .95, and clear-and-convincing into .75, and preponderance-of-the-evidence to .51, possibility is not the same as necessity. Most people most of the time are comfortable with non-numerical judgments of preference, and they are comfortable, without numbers, in saying that they are, for example, ‘pretty sure’, ‘sure’, ‘really sure’ and ‘positive’ of some conclusion,¹⁰ or that one thing is in some way better than another thing even if there is no common metric between the two.¹¹ What this observation tells us, however, is not that a probabilistic account of individual assessments is flawed, but only that the probabilistic account, which is about grading or comparing degrees of probability, or degrees of credence, does not require attaching hard (or even soft) numbers to the probabilities. Allen and Pardo are correct in asserting that American trials rarely, if ever, use numbers to describe degrees of confidence, and they are correct in concluding that attempting to do so would produce only additional confusion, at least for lay jurors and likely for judges as well. But the conclusion that probabilities are rarely couched in numerical terms in the legal system only challenges the probabilistic account if so-called probabilism is inextricably tied to numericalising the relevant probabilities. If the heart of the probabilistic account is instead simply the ability to describe or compare degrees of confidence, and the capacity to recognise that the necessary degree of confidence might vary with the consequences of a judgment, then more than just the current and inevitable non-numerical description of probabilistic judgments is necessary to rebut the most plausible versions of the probabilistic account. Those most plausible versions divorce determinations of probability from assigning numbers, just as the law does when it requires judges to determine ‘probable cause’, and just as ordinary people do when they say that they will probably go to the movies on Saturday.
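The point that degrees of confidence can be graded and compared without being given numerical values can be illustrated with a short sketch. Everything in the Python below is hypothetical and offered only by way of illustration: the names `DEGREES`, `REQUIRED`, `at_least` and `standard_met` are invented, and the pairing of each standard of proof with a verbal degree of confidence is an assumption made for the example, not a claim about what the law requires or about what Allen and Pardo maintain.

```python
# Verbal degrees of confidence, ordered from weakest to strongest.
# Only the ordering matters; no probability is attached to any label.
DEGREES = ["pretty sure", "sure", "really sure", "positive"]

# Hypothetical pairing of standards of proof with the degree each requires.
# This mapping is an assumption made purely for illustration.
REQUIRED = {
    "preponderance of the evidence": "pretty sure",
    "clear and convincing evidence": "really sure",
    "beyond a reasonable doubt": "positive",
}


def at_least(degree, required):
    """True if `degree` is as strong as, or stronger than, `required`.

    List positions are used only to compare rank, not as probabilities.
    """
    return DEGREES.index(degree) >= DEGREES.index(required)


def standard_met(degree, standard):
    """True if the fact-finder's degree of confidence satisfies the standard."""
    return at_least(degree, REQUIRED[standard])


# The same degree of confidence can satisfy one standard and fail another.
print(standard_met("really sure", "clear and convincing evidence"))
print(standard_met("really sure", "beyond a reasonable doubt"))
```

The sketch grades and compares confidence, and lets the required degree vary with the standard in play, without ever numericalising the degrees themselves.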

But this is a side issue, because at bottom the Allen and Pardo claim is less about the use or non-use of numbers and more about the use or non-use of sequential processing – the dispute between the sequential process of Bayesian updating with the addition of each additional piece of evidence, as opposed to holistic evaluation of all the evidence together in order that the trier of fact may determine which side has offered the more plausible explanation of the disputed events.

Stories as explanations

In making their case for a holistic account of fact-finding at trial, Allen and Pardo rely on the research of Nancy Pennington and Reid Hastie, who in many studies have produced the well-known ‘story model’ of jury decision-making.¹² But although Allen and Pardo rely almost exclusively on Pennington and Hastie for the empirical support that they maintain, correctly, is an essential component of their larger claim, they nevertheless stress the distance between themselves and Pennington and Hastie because they, Allen and Pardo, believe that their claims about the plausibility of competing explanations extend beyond mere stories.

Here, however, the issue seems largely a semantic dispute about the word ‘story’. Allen and Pardo understand a story as a chronological narrative, and maintain that the explanations whose comparative plausibility is what fact-finders are tasked to determine are often not stories in this narrative and chronological sense. Rather, they argue, explanations need not be narrative and need not describe a chronologically sequenced array of events, in the way in which historical accounts, novels, plays, movies and bedtime stories report chronological events and offer explanations of those events.

But here Allen and Pardo seem too eager to create a dispute with their principal allies. Pennington and Hastie explicitly, and often, use the word ‘explanation’ to describe what jurors do, and equally explicitly describe juror decision-making as choosing between alternative explanations, their equally frequent use of the word ‘story’ notwithstanding (see especially Pennington and Hastie, 1992). Moreover, there is a sense in which ‘story’, in ordinary English, is simply a synonym for ‘explanation’, as when a parent, encountering a messy room, says to a child, ‘What’s the story here?’, or when, whether truthfully or apocryphally, the negligent student says that the dog ate her homework, and the teacher responds by saying that he doesn’t believe the story. In such cases we could substitute ‘explanation’ for ‘story’ and nothing in the conclusion would be lost. There are, as Allen and Pardo observe, factual disputes in which the competing accounts are not really stories or narratives, as with their example of patent disputes, but nothing in the Pennington and Hastie research suggests that their research, which is narrowly about chronological narratives, would not be applicable, the word ‘story’ aside, to such controversies.

This is not to say that there might not be other and important differences between Allen and Pardo, on the one hand, and Pennington and Hastie, on the other. But those differences cannot be captured in any alleged differences between stories and explanations, and perhaps Pennington and Hastie would have been better off, at least for some purposes, had they always used the word ‘explanation’, and left the stories for bedtime.

But what is being explained?

The relative plausibility account offered by Allen and Pardo is a theory, but what precisely is it, as Dan Simon worries in this Symposium (Simon, 2019), a theory of? Allen and Pardo describe their theory as being descriptive and empirical, but that does not answer the question of which phenomena they seek to offer an account, theory, or explanation of. They say that the phenomenon is ‘juridical proof’, but are they seeking a theory of the concept of judicial proof, or instead of the practice of proving facts in courts of law? And if the latter, which appears to be the most plausible explanation of their enterprise, and if the inquiry is, as they say, chiefly empirical, then we would want a more precise specification of the units of analysis, or, more broadly, of the actual acts or events or phenomena for which we are seeking an explanation in order to know whether the relevant decision-makers whose decision-making practices are being studied are judges or magistrates acting as triers of fact, or, more likely, lay persons deciding collectively as members of a jury.

This question about the identity of the relevant factual decision-makers may again seem to be nit-picking, but it exposes a centrally important question about Allen and Pardo’s entire enterprise. For if Allen and Pardo seek to explain the decision-making processes of lay jurors, and their explicit reliance on Pennington and Hastie’s jury-focused empirical research makes this clear, then there is a potential misfit between their goal of seeking an empirical explanation of decision-maker behaviour and their particular interest, at least in this article, in the question of burdens of proof. And this distinction is potentially a misfit because it invites the question whether jurors actually do use notions of burden of proof in making their decisions.

Of course, as Allen and Pardo stress, the instructions that are given to jurors do typically employ and explain the idea of a burden of proof, or of different standards of proof, and these instructions vary depending on whether the standard of proof is preponderance of the evidence, clear and convincing evidence, proof beyond a reasonable doubt, or something else.¹³ But these instructions would be a substantial part of what an explanation of the practice of judicial proof needs to explain only if the instructions are themselves significant factors in jury decision-making. However, it turns out that that may not be the case. The same kind of empirical research employed by, for example, Pennington and Hastie, tends to be inconclusive on the question of whether the standards of proof that jurors are instructed to use have any effect on the conclusions that those jurors reach.¹⁴ No two studies use the same experimental design, and no two use the same legal context, but it is still safe to conclude that varying standards of proof have not yet been shown to play, or not to play, a substantial role in influencing juror verdicts. The existing research is simply inconclusive.

The actual or potential variance among legal doctrine, instructions to jurors and juror verdicts exposes an issue of even greater generality – an issue that we might label the Legal Realist challenge. A central tenet of Legal Realism, exemplified in the writings of Jerome Frank (1930, 1949), most flamboyantly, and Karl Llewellyn, most influentially,¹⁵ is that the behaviour of legal decision-makers, whether they be judges, jurors or administrative officials, is only loosely predicted by the formal legal doctrine that those decision-makers are officially charged with applying. Rather, either the decision-makers reach conclusions that are dominated by the particular facts of particular cases (Frank) or make decisions according to rules that vary from the rules that can be extracted from formal legal sources (Llewellyn).¹⁶

This is not the place to engage in even a partial assessment of the full Realist challenge, even in the context of the rules of evidence and the practice and principles of juridical proof.¹⁷ Nevertheless, the extent to which Allen and Pardo rely on jury instructions and various facets of legal doctrine suggests, their use of Pennington and Hastie aside, that their empirical research project presupposes the falsity of the Legal Realist hypotheses, and thus presupposes that one can explain what actually goes on at trials by close scrutiny of the formal legal doctrine, including the rules of evidence, that seeks to constitute, organise and regulate the trial process.¹⁸ But this, as Mark Spottswood notes in his contribution to this Symposium (Spottswood, 2019), may be a more extravagant presupposition than Allen and Pardo suppose, and may require more in the way of empirical ‘proof’ than they seem willing to offer.¹⁹

Alternatively, we might understand their project as seeking to explain the legal doctrine of evidence and proof, leaving the degree of fit between that doctrine and the actual practice of lawyers, judges and jurors for other times or other researchers. Such internal doctrinal analysis is a respectable project, and indeed it is, to the annoyance of classical and modern Legal Realists, one that dominates legal scholarship in much of the common law world. But if that is their project, then we might wish for more consideration of those items of American legal doctrine that seem inconsistent with the holism of the comparative plausibility model, such as the sequential structuring of trials, the Bayesian relevance standards of Rule 401 of the Federal Rules of Evidence, and the fact that judges determining relevance make those determinations with respect to individual pieces of evidence and not after all of the evidence has been presented. But if explaining legal doctrine and formal legal practice characterises the nature of Allen and Pardo’s project and the form of support they wish to employ in their empirical enterprise, then reliance on Pennington and Hastie becomes even more curious. If the Pennington and Hastie research provides almost the entire empirical support for the claim that decision-makers reach their factual conclusions holistically rather than sequentially, then it appears that what Allen and Pardo seek to explain is not so much legal doctrine as actual juridical practice. But if that is what they are doing, then one would expect to see more attention to the panoply of empirical and non-doctrinal research on the practices of courts (and lawyers who appear before them) and the decision-making behaviour of jurors. That research might well support a conclusion that the practice of proof in American trial courts is best explained in more holistic and less sequentially Bayesian terms, and that factual decision-makers in reality evaluate the comparative plausibility of competing explanations of all of the evidence, but reaching that conclusion will require leaving the doctrine more behind than Allen and Pardo seem willing to accept, and relying less on formal rules and reported judicial opinions to support empirical claims and conclusions about actual proof practices than we have thus far observed.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

Allen and Pardo (2019) Relative plausibility and its critics. International Journal of Evidence and Proof 23(1-2): 5–59.

Amaya (2015) The Tapestry of Reason: An Inquiry into the Nature of Coherence and Its Role in Legal Argument. Oxford: Hart.

Barnes (1982) TS Kuhn and Social Science. London: Macmillan.

Bird (2018) Thomas Kuhn. In: Stanford Encyclopedia of Philosophy. Available at: https://plato.stanford.edu/entries/thomas-kuhn (accessed 11 January 2019).

Bloor (2016) The pendulum as a social institution: TS Kuhn and the social sciences. In: Blum, Gavroglu, Joas and Renn (eds) Shifting Paradigms: T.S. Kuhn and the History of Science, 235–252.

Chang (2002) Making Comparisons Count. London: Routledge.

Chisholm (1989) Theory of Knowledge. 3rd edn. Englewood Cliffs, NJ: Prentice-Hall.

Da Silva (2011) Comparing the incommensurables: Constitutional principles, balancing and rational decision. Oxford Journal of Legal Studies 31(2): 273–302.

Frank (1930) Law and the Modern Mind. New York: Brentano’s.

Frank (1949) Courts on Trial. Princeton, NJ: Princeton University Press.

Howson (1996) Bayesian rules of updating. Erkenntnis 45(2–3): 195–208.

Kuhn (1970) The Structure of Scientific Revolutions. 2nd edn. Chicago, IL: University of Chicago Press.

Kuhn (1977) Second thoughts on paradigms. In: Suppe (ed) The Essential Tension. Chicago, IL: University of Chicago Press, 293–319.

Kyburg (1987) Bayesian and non-Bayesian evidential updating. Artificial Intelligence 31(3): 271–293.

Laudan (2006) Truth, Error, and Criminal Law: Essays in Legal Epistemology. Cambridge: Cambridge University Press.

Laudan (2008) The elementary epistemic arithmetic of criminal justice. Episteme 5(3): 282–294.

Lavie, Ganor and Feldman (2018) Adjusting legal standards. European Journal of Law and Economics. doi:10.1007/s10657-018-9597-4.

Llewellyn [1938–1939] (2011) (Schauer (ed)) The Theory of Rules. Chicago, IL: University of Chicago Press.

Loevinger (1958) Facts, evidence and legal proof. Case Western Reserve Law Review 9(2): 154–175.

Moss (2018) Probabilistic Knowledge. Oxford: Oxford University Press.

Okasha (2013) The evolution of Bayesian updating. Philosophy of Science 80(1): 745–757.

Pennington and Hastie (1988) Explanation-based decision making: Effects of memory structure on judgment. Journal of Experimental Psychology: Learning, Memory, and Cognition 14: 521.

Pennington and Hastie (1991) A cognitive theory of juror decision making: The story model. Cardozo Law Review 13: 519.

Pennington and Hastie (1992) Explaining the evidence: Tests of the story model for juror decision making. Journal of Personality and Social Psychology 62: 189.

Pettigrew (2016) Accuracy and the Laws of Credence. Oxford: Oxford University Press.

Regan (1997) Value, comparability, and choice. In: Chang (ed) Incommensurability, Incomparability, and Practical Reason. Cambridge, MA: Harvard University Press, 129–150.

Saks and Spellman (2015) Psychological Foundations of Evidence Law. New York: New York University Press.

Schauer (2003) Profiles, Probabilities, and Stereotypes. Cambridge, MA: Harvard University Press.

Schauer (2006) On the supposed jury-dependence of evidence law. University of Pennsylvania Law Review 155: 165–202.

Schauer (2013) Legal realism untamed. Texas Law Review 91: 749–780.

Schauer and Zeckhauser (1996) On the degree of confidence for adverse decisions. Journal of Legal Studies 25(1): 27–52.

Shaviro (1989) Statistical-probability evidence and the appearance of justice. Harvard Law Review 103: 530–554.

Simon (2019) Thin empirics. International Journal of Evidence and Proof 23(1-2): 82–89.

Spottswood (2019) On the limitations of a unitary model of the proof process. International Journal of Evidence and Proof 23(1-2): 75–81.

Twining (1973) Karl Llewellyn and the Realist Movement. London: Weidenfeld and Nicolson.

Wistrich, Guthrie and Rachlinski (2005) Can judges ignore inadmissible information: The difficulty of deliberately disregarding. University of Pennsylvania Law Review 153: 1251–1345.

Woody and Green (2012) Jurors’ use of standards of proof in decisions about punitive damages. Behavioral Sciences and the Law 30(6): 856–872.