Abstract
In this article I address a foundational question in evidence law: how should judges and jurors reason with evidence? According to a widely accepted approach, legal fact-finding should involve a determination of whether each cause of action is proven to a specific probability. In most civil cases, the party carrying the burden of persuasion is said to need to persuade triers that the facts she needs to prevail are “more likely than not” true. The problem is that this approach is an inadequate account, both descriptively and normatively, of reasoning with evidence in law. It does not offer a plausible picture of how people in general, and legal fact-finders in particular, reason with evidence. And it turns out that if we try to do what the approach tells us, we end up with absurd results. Faced with these difficulties, a group of evidence scholars has proposed an alternative. According to them, legal fact-finding should involve a determination of which hypothesis best explains the admitted evidence, rather than whether each cause of action is proven to a specific probability. My main contributions in this article are twofold. First, I elaborate on the many descriptive, normative and explanatory considerations in support of an explanation-based approach to standards. Second, I offer novel replies to pressing objections against that same approach.
Introduction
A party who bears the burden of persuasion can win only if the evidence persuades the triers of the existence of the facts she needs to prevail under the requisite standard of proof (Mueller and Kirkpatrick, 2011: 645). Exactly how to understand the different standards of proof is controversial, however. This article contributes to this debate by offering a new defence of an explanation-based approach to standards. It does so in two main ways. First, it replies to a potentially powerful objection. Second, it offers a revised formulation of standards in explanatory terms to clarify the confusion.
According to a widely accepted approach, standards of proof should be understood as probability thresholds that correspond to a numerical degree of subjective confidence that the judge or juror must reach on each element of the claim to justify a verdict for the party carrying the burden of persuasion (see, for example, Finkelstein and Fairley, 1970; Hamer, 2004; Kaplan, 1968; Kaye, 1999; Koehler and Shaviro, 1990; Lempert, 1977; Redmayne, 2008). For instance, this interpretation assumes that the preponderance of the evidence standard applicable in most of the civil cases in the United States simply means ‘proven by a probability higher than 0.5’.
In recent decades, however, commentators have raised several objections to probability approaches to standards of proof (see, for example, Allen, 1986, 1991; Allen and Leiter, 2001; Cohen, 1977; Nesson, 1979, 1985; Pardo, 2013; Tribe, 1971). We can divide these objections into two camps. The first group suggests that the most common subjective probabilistic frameworks that evidence scholars relied upon are descriptively and normatively inadequate. These probabilistic frameworks do not offer a plausible picture of how people in general, and legal fact-finders in particular, reason with evidence. Moreover, following this approach leads to absurd results. The second group of objections focuses on internal limitations to probability approaches to standards of proof. These include so-called ‘proof-paradoxes’. These objections make a strong case against a probability-based approach to standards.
A group of evidence law scholars has proposed an alternative. According to them, legal fact-finding should involve determining the best explanation of the admitted evidence, rather than proving each element to a specific probability (see, for example, Allen, 2008, 2014; Allen and Jehl, 2003; Allen and Pardo, 2007; Pardo and Allen, 2008). The idea, in a nutshell, is that fact-finders should infer the truth of a given hypothesis (out of the set of hypotheses offered by the parties or constructed by the fact-finders) from the fact that it ‘best explains’ (as defined below) the admitted evidence and then decide the case based on that inference.
Under this alternative view of standards, inference to the best explanation (hereinafter ‘IBE’) becomes the characteristic mode of inference in legal fact-finding. IBE can be characterised as follows. 1 First, when faced with an unexplained phenomenon, we select a group of plausible hypotheses that we judge capable of explaining that phenomenon. We never begin with a full menu of all possible potential explanations. We also do not consider a randomly generated class of explanations. Not only are the plausible hypotheses compatible with many of our background beliefs, but we also believe that, if true, they would explain an aspect of the phenomenon being addressed. We then infer the (probable) truth of the hypothesis that provides the best explanation.
A difficulty with such explanation-based approaches to standards of proof is that the epistemology and philosophy of science literature are filled with attacks and defences of IBE as an epistemically justified mode of inference. While some authors praise IBE as an indispensable mode of inference for science, others claim that it is unclear how explanatory value can get us closer to the truth. 2 That is, it might be difficult to see why the mere fact that a given hypothesis best explains a phenomenon makes that hypothesis more likely to be true.
Proponents of explanation-based approaches to standards have rebutted similar objections to IBE by claiming that the same objections also apply to all forms of inductive reasoning (Pardo and Allen, 2008). Unless we are willing to accept skepticism about induction in general, we ought to reject these objections. The problem here is that, unless we are already committed to a non-skeptic position, this reply is not very convincing. We then need a better defence of IBE. This article provides just that. First, it offers a reply to this objection. In a nutshell, the reply notes that the elements of explanatory virtues and of inferential virtues coincide, which means that if the latter are truth-conducive (and, therefore, allow us to infer epistemically justified propositions), so are the former.
This article’s second contribution relates to the specific formulations of standards of proof and other evidentiary mechanisms that should follow from adopting an explanatory framework. How exactly should triers reason under an explanation-based approach? What does it mean to ‘select the best explanation’ in legal settings? Commentators suggest that fact-finders should decide cases based on the relative plausibility of the stories put forth by the parties or by the triers themselves (Pardo and Allen, 2008). The best way to understand this prescription is that fact-finders should decide cases based on the relative explanatory value of the evidentiary hypotheses presented by the parties or the triers themselves. Fact-finders should accept the hypothesis that best explains the evidence and, based on this accepted hypothesis, find for the party that substantive law supports. Under this formulation, standards of proof are more clearly designed to set up different thresholds of explanatory value that the party carrying the burden of persuasion needs to meet. The higher the standard, the more valuable the explanation (for the best evidentiary hypothesis available to triers) must be.
Descriptive, normative and explanatory considerations support an explanation-based approach to standards. Proponents of the explanation-based approach highlight how the approach better describes jurors’ reasoning (Allen, 2008, 2014; Allen and Jehl, 2003; Allen and Pardo, 2007; Pardo and Allen, 2008). To support this conclusion, proponents offer evidence that what the approach asks of jurors is compatible with psychological research on jury decision-making (see, for example, Pennington and Hastie, 1988, 1991, 1992, 1993a, 1993b). The primary finding of that research is that individuals do not usually reason with separate pieces of evidence, which is contrary to what the probabilistic approach to standards of proof says of jurors. Instead, people tend to construct an entire story that will fit the evidence and then make decisions based on the explanatory virtue of that story. There are also important normative considerations that support an explanation-based approach to standards. Not only do explanation-based approaches avoid some of the problems associated with probabilistic approaches, but they also improve on those approaches by prioritising evidential support over subjective credence. 3 Moreover, explanatory considerations also favour explanation-based approaches. In particular, these approaches allow for a better understanding of different evidentiary mechanisms, including the concepts of ‘relevance’ and ‘probative value’.
At the same time, there are pressing objections against explanation-based approaches to standards. This article addresses some of the most significant ones. For example, some argue that an explanationist approach incentivises parties to overload courts with evidence. This is because parties would view their role as one of presenting overreaching narratives, and so would be more inclined to offer evidence that speaks to parts of the narratives but is only marginally (if at all) related to controversial facts in the case. This ‘evidence saturation’ could, in turn, make legal decision-making more complex such that the explanation-based approach ultimately becomes too costly or unworkable. Even if we concede the point to the objector and assume that the approach does, in fact, create these incentives (an assumption that is not entirely convincing), we still have reason to believe that the approach will not turn out to be unworkable. Specifically, other evidentiary mechanisms can combat evidence saturation. Admissibility and exclusionary rules are the best examples. Not only are there myriad rules that make evidence inadmissible, but there is also a catch-all exclusionary clause in Federal Rule of Evidence 403. 4
Significant practical implications turn on this debate. The goal of achieving a sufficient degree of factual accuracy is a fundamental part of all legal systems. Any assignment of entitlements and their correlative disablements depends on sufficiently accurate fact-finding. 5 In a contractual dispute, a sufficiently accurate factual finding is essential to determine whether the act that constitutes an alleged breach took place. The legitimacy of any assignment of entitlements and disablements also depends on accurate fact-finding. 6 If, however, decisions about empirical facts are not epistemically justified, then such decisions also have no legitimate legal justification. 7
This article is divided into two parts, besides the introduction and conclusion. The first part discusses existing formulations of probability approaches to standards of proof and explores the most powerful objections against them. The next part discusses the moving parts of an explanation-based approach to standards of proof. It starts by explaining the characteristic mode of inference according to this approach (IBE) and how it translates to legal reasoning. This second part also identifies different arguments for explanation-based approaches to standards, objections to such approaches, and counterarguments to those objections.
Probability-based approach to standards of proof
Exactly how to understand standards of proof has been the object of much controversy in evidence scholarship. 8 Statistically-minded authors argue that classical probability theory provides the best model for interpreting standards of proof (see, for example, Finkelstein and Fairley, 1970; Hamer, 2004; Kaplan, 1968; Kaye, 1999; Koehler and Shaviro, 1990; Lempert, 1977; Redmayne, 2008). According to these authors, standards of proof should be understood as probability thresholds ranging from 0 to 1. People who subscribe to this interpretation commonly assume that preponderance of the evidence means ‘proven by a probability greater than 0.5’, while beyond a reasonable doubt would require a much greater probability of guilt, with intuitive estimates such as 0.9, 0.95 or 0.99. 9
As previously stated, the probabilistic approach is vague: it does not specify whether the probabilities should be understood as causal propensities, relative frequencies or degrees of subjective belief (i.e., as objective or subjective probability). One might think that we should interpret these numbers as representing relative frequencies. 10 This would raise serious difficulties, however. Frequency interpretations of probability are not especially useful to reason about unique events, which, most of the time, are the objects of legal trials (for example, X allegedly killed Y or A allegedly breached his contract with B). 11 Very rarely is evidence presented at trial in a relative frequency format. Moreover, even when it is (for example, DNA evidence), it is often combined with evidence that is not (for example, the glove found at the crime scene does not fit the defendant) (Allen, 2008). Imagine what it would mean to ask a party to establish her case based on a certain relative frequency threshold. That party would have to account for every conceivable way the world might have been and show that a certain number of those scenarios favour her claim (Allen, 2008). Such a task seems impossible.
In light of these problems with objective forms of probability, 12 proponents of the probability approach usually turn to some form of the Bayes rule for rescue (see, for example, Finkelstein and Fairley, 1970; Hamer, 2004; Kaplan, 1968; Kaye, 1999; Koehler and Shaviro, 1990; Lempert, 1977; Redmayne, 2008; Tillers and Green, 1988). One can express this rule in many different ways. In this article, I focus on subjective versions of the Bayes rule, given their greater popularity within the relevant literature. Bayesian probability theory provides a powerful mathematical framework for updating our beliefs in light of new information. The foundations of Bayesianism were laid down about 200 years ago, but it was only in the mid-20th century that it became widely popular across many different disciplines, from astrophysics to neuroscience to political science and evidence law (see, for example, Bayes, 1764; Carnap, 1950; Earman, 1992; Horwich, 1982). In Bayesian language, the relationship between a particular hypothesis we are interested in (‘H’) and the available evidence we have access to (‘E’) can be portrayed by the following mathematical equation:
P(H|E) = P(E|H) × P(H) / P(E)
P(H|E), also referred to as the posterior, is the relative probability of H after having taken E into consideration. This is what we want to know. P(E|H) gives us the probability of the evidence arising from the hypothesis, or how we expect the evidence to look given that the hypothesis is true. P(E) plays the role of an ignorable normalising constant (obtained by summing P(E|H)P(H) over all the hypotheses under consideration). P(H) is called the prior. This element, P(H), captures our knowledge of the world before we encountered the evidence, E. Stating the prior precisely is perhaps the most difficult aspect of Bayesianism. It is also the most subjective, and thus the most open to manipulation, aspect of reasoning according to the Bayes rule. Being subject to manipulation is not, however, an exclusive feature of the Bayes rule. All other traditional statistical methods are also potential victims of tweaking. Perhaps an advantage of Bayesianism is that it forces us to state clearly our assumptions concerning the prior.
One can also think of the Bayes rule as representing a learning function. It gives us the transformation from the prior, P(H), to the posterior, P(H|E), given what we learned from the evidence, E. To simplify, under a Bayesian framework, the fact-finder is presented with competing hypotheses about the disputed facts. She is expected to update, in the manner described above, the prior probability she assigns to the hypotheses in the light of new evidence. In the end, the fact-finder selects the hypothesis to which she assigned the highest posterior probability as the grounds for the legal decision, provided that the applicable evidentiary threshold has been met.
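The updating step can be made concrete with a minimal sketch. It assumes the simplest possible setting, in which the only competing hypotheses are H and its negation, so that P(E) = P(E|H)P(H) + P(E|not-H)P(not-H). The function name and all numerical values are hypothetical and chosen only for illustration; updating on a second piece of evidence in this way further assumes that the two pieces of evidence are independent given each hypothesis.

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(H|E) when the only alternatives are H and not-H."""
    # P(E): sum of P(E|hypothesis) * P(hypothesis) over both alternatives.
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1.0 - prior_h)
    return p_e_given_h * prior_h / p_e


# Hypothetical figures, for illustration only.
prior = 0.5
# First item of evidence: four times as expected under H as under not-H.
after_first = posterior(prior, p_e_given_h=0.8, p_e_given_not_h=0.2)
# Second item: the earlier posterior serves as the new prior
# (assumes the two items are independent given each hypothesis).
after_second = posterior(after_first, p_e_given_h=0.6, p_e_given_not_h=0.3)

print(round(after_first, 3), round(after_second, 3))
```

Even in this toy setting, the fact-finder must supply three numbers for every item of evidence, which gives a sense of what the framework demands in a real trial.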
Bayesianism has monumentally advanced the understanding of how experts and lay people reason with evidence in science, law, and everyday life. Nonetheless, it is not free of criticism. Although critiques do not refute Bayesianism or probabilistic approaches to standards of proof, they highlight their limitations as successful models of factual inferences for law. In the next two sections, I elaborate on what I perceive to be the most important problems and why these provide a good reason to move away from probabilistic approaches to standards of proof.
Descriptive and normative inadequacies
One important objection against probability-based approaches to standards of proof concerns the challenges involved in estimating one’s degree of subjective confidence in a given proposition (or in a given belief). 13 Imagine you are on a jury. To sharpen intuitions, imagine it is a hotly debated criminal trial. Maybe a former hall of fame athlete has been accused of murdering his spouse in cold-blood. After hearing all the evidence from the prosecution and the defence and deliberating with your fellow jurors, you form a belief about the defendant’s guilt. The subjective-probabilistic interpretation of standards seems to ask that you now determine your precise degree of confidence in that belief. Assume that you believe the defendant is guilty. Now ask yourself: what is the precise degree of confidence I assign to my belief? More simply, how confident am I that the defendant is guilty?
One problem here is that we are not used to answering such questions precisely and reliably. In fact, most of us would have a hard time answering that question precisely for most of our current or past beliefs. It is a difficult exercise to determine the precise degree of confidence that I assign to my belief, say, that it will rain tomorrow. Do I believe that it will rain tomorrow with 80% certainty? 90%? 95%? It seems hard to defend a precise number without running the risk of arbitrariness. And if this is the case for simple, seemingly harmless beliefs about the weather, we can suspect that the difficulties will be exacerbated for beliefs regarding criminal defendants’ guilt or innocence, which involve sophisticated interdependent nuances and with so much at stake.
Economically minded scholars have a ready-made answer to the difficulties involved in assessing one’s degree of confidence in a given belief. 14 They tell us that we need only to investigate the individual’s behavior towards different wagers in that belief. For instance, imagine we know that an individual is willing to stake $10 for the prospect of getting back $11 (his $10 stake plus a $1 gain) if his belief turns out to be true (and of losing his $10 if his belief turns out to be false). We can, then, infer that that individual assigns a degree of confidence of roughly 90% to that belief.
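The arithmetic behind this inference can be sketched as follows. The sketch assumes a risk-neutral bettor who accepts a wager only when its expected value is non-negative, which is exactly the assumption questioned below. On that assumption, the wager is acceptable only if one's confidence p satisfies p × gain ≥ (1 − p) × stake. The function name is hypothetical. The 90% figure follows if the $11 is read as the total returned (a $1 net gain); if the $11 were a net gain, the implied threshold would be much lower.

```python
def break_even_confidence(stake, net_gain):
    """Lowest confidence at which a risk-neutral bettor accepts the wager."""
    # Expected value is zero when p * net_gain == (1 - p) * stake.
    return stake / (stake + net_gain)


# Reading 1: $11 is the total returned, so the net gain is $1.
total_payout_reading = break_even_confidence(stake=10, net_gain=1)
# Reading 2: $11 is the net gain on top of the returned stake.
net_gain_reading = break_even_confidence(stake=10, net_gain=11)

print(round(total_payout_reading, 3), round(net_gain_reading, 3))
```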
It is hard to see how this strategy helps resolve the problem. First of all, if I am at a loss as to how precisely to determine my degree of confidence in a given belief, we should expect that I will be equally at a loss if asked to bet on different probabilities. Second, and more importantly, the choices of which bets to accept or decline are not exclusively functions of one’s degree of confidence in a given belief. Whether I am risk-averse or risk-seeking (and to what extent) also significantly impacts such decisions. So do an individual’s budget constraints, indifference, and myriad beliefs and legal rules operating in the background. 15 Most jurors would find it difficult to state precisely and non-arbitrarily their respective levels of confidence in the plaintiff’s case.
Moreover, subjective probabilistic frameworks anchored in Bayesianism demand a storage capacity and computational power of decision-makers that seems to go beyond most individuals’ ‘hardware’ capacities. To update the probability of specific propositions, fact-finders have to assign probabilities to various conjunctions formed by those initial, and other, propositions pertinent to the evidence and the factual hypotheses under consideration. This results in an explosion in the number of combinations that fact-finders have to consider, since the number of conjunctions grows exponentially as the number of evidentiary propositions increases. This is particularly disturbing in light of a substantial body of psychological research showing that people’s reasoning processes do not conform to probability rules. 16
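A rough count illustrates the scale of the problem. The sketch below assumes that each evidentiary proposition is simply true or false and that the fact-finder would need a probability for every complete combination of truth values; this is a simplification adopted here for illustration, not a claim about any particular Bayesian model. The function name is hypothetical.

```python
def joint_combinations(n_propositions):
    """Number of complete true/false assignments over n binary propositions."""
    return 2 ** n_propositions


# The count doubles with every additional proposition.
for n in (5, 10, 20, 30):
    print(n, joint_combinations(n))
```

Under these assumptions, thirty propositions already yield more than a billion combinations.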
Another problem with this framework is that Bayesianism says almost nothing about which initial assignment of probabilities one should endorse. It places very few limits on individuals’ assignments, such as that one should not contradict the rules of probability. But other than this ‘soft’ constraint, fact-finders can freely adopt whatever distribution of prior probabilities they deem fit, no matter how strange that distribution is. Defenders reply that strange priors get ‘washed out’ by incoming evidence, as long as individuals update their priors according to the rules of probability and traditional canons of rationality. The weakness in this defence is that, depending on the situation, this ‘washing out’ can take a very long time and require a prohibitive amount of evidence. This is particularly problematic for the legal setting, where time and resource constraints are likely more severe.
Furthermore, fact-finders in legal settings often do not know much of the information required to set the priors until the end of the trial. But by that time, all evidence is already ‘past’. There is nothing to update (see, for example, Allen and Jehl, 2003; Amaya, 2007: 10–24). All of these problems speak to the descriptive and normative inaccuracy of probabilistic approaches to standards of proof. It turns out that people are especially bad at doing exactly what these approaches have them doing—understand standards of proof as probability thresholds.
There is still another objection against approaches to standards based on subjective probability that is worth addressing. Working within this approach, courts have repeatedly defined standards with reference to the targeted mental state of jurors. For instance, the Supreme Court has decided multiple times that the right way to characterise whether there still exists reasonable doubt in criminal cases is in terms of the subjective state of mind that jurors should be in if they are to convict or acquit the defendant. 17
The problem here is that whether a party has successfully met the applicable standard of proof should not depend solely, or even primarily, on the fact-finders’ subjective confidence. 18 The standard is what tells us whether the decision-maker’s subjective confidence in a given belief is sufficiently justified to assert that a given proposition is proven for some particular purpose (for example, a criminal conviction or the acceptance of a scientific theory) (Laudan, 2006). The subjective-probability approach to standards of proof seems to be at odds with this function of standards—to serve as a decision-making threshold. This conflict arises because the approach appears to focus too heavily on the fact-finders’ mental states about factual hypotheses at the expense of the degree of evidential support that the evidence provides for the hypotheses. What distinguishes a reasonable doubt from an unreasonable one is not the degree of confidence, but rather the level of evidential support for the underlying belief.
In asking whether the party with the burden of persuasion has carried it successfully, we are not simply asking whether judges or jurors are convinced by her version of the facts full stop. The key question we are asking (or should be asking) fact-finders is: how strongly does the admitted evidence support a conclusion about liability, or, more precisely, how strongly does the evidence support the case of the party carrying the burden of persuasion? If the level of evidentiary support reaches a certain threshold, then the juror should decide in that party’s favour. If lower, then the juror should decide in favour of the opposing party. If you ask jurors and judges to answer that question—the strength of the evidence buttressing the party with the burden of persuasion—it seems to be of little help to tell them that they must first have a certain degree of confidence about a particular factual proposition.
Consider the case of a racist jury, which is always ‘convinced’ of the guilt of African-American defendants regardless of what the available evidence indicates. It is absurd to say that the prosecution has fulfilled the applicable standard merely because this particularly objectionable jury has a high degree of confidence. It is not enough that fact-finders reach a certain degree of subjective confidence concerning the truth of a particular factual proposition. That proposition needs to have some degree of evidential support, because a primary function of standards of proof is to set the minimum required level of evidential support. 19 It therefore makes more sense to have standards of proof that depend on the strength of the support that the admitted evidence provides to the hypothesis under consideration, instead of standards that depend merely on jurors’ subjective confidence.
One might object to this point by arguing that a ‘focus on the relevant degree of confidence does not displace the idea that the judgment of guilt or innocence must take the evidence into account in some reasonable way’ (Walen, 2005: 409 n 232). However, shifting the focus from subjective confidence to evidential support better emphasises that the decision-makers’ conviction is only rational insofar as it appropriately reflects the evidence presented. This does not mean that different judges and jurors will reach the same decision when presented with the same evidence. Jurors, in particular, refer to their life experiences, background knowledge, common sense and intuitions. 20 The issue, therefore, can be seen as one of focus. This fact, however, does not make it any less significant.
It might be the case that some jurors are unable to state the precise level of evidential support that the admitted evidence lends to the parties’ hypotheses. And reasonable jurors might even disagree about what that level is. The fact that disagreements might happen, however, is no reason to avoid the topic of evidential support when informing the jury about the standards of proof. We can improve jurors’ inferential practices and reduce disagreement with well-crafted jury instructions, among other tools. But most importantly, discussions about the level of evidential support are exactly the kinds of discussions we should expect from triers of fact. The fact that these debates come up should assure us that we are on the right track. A seemingly serious problem, given the current structure of the jury-trial system, is that jurors are not required to justify their decisions. But, not having to announce what influenced a juror to decide for or against one party does not contradict the fact that the epistemic justification, and therefore the legal legitimacy, of that juror’s verdict relies on his making a decision based on how strongly the admitted evidence supports the case of the party carrying the burden of persuasion.
In sum, probabilistic approaches seem not to offer a plausible picture of how people generally, and legal fact-finders in particular, reason with evidence. And it turns out that if we try to do what the approach tells us, we end up with problematic results. We need instead an approach to standards of proof that asks jurors to focus on the relations of evidential support between the admitted evidence and the factual hypothesis available. In slogan form: standards of proof are primarily a matter of evidential support, not simply degree of confidence. As we will see in the next section, an alternative approach might be able to deliver just what we need while avoiding many of the difficulties associated with the probability-based approach.
Internal limitations: the ‘proof-paradoxes’
I would also like to discuss briefly a different kind of objection to the probabilistic approach to standards. This objection is not related to its inaccuracy as a description of how we reason about facts or its inadequacy as a prescription of how triers of fact should reason, but rather it concerns the inherent logical limitations of the approach. This kind of objection takes the form of so-called ‘proof-paradoxes’. As these paradoxes are widely discussed in the literature, I do not intend to belabor them here (see, for example, Cohen, 1977; Ho, 2008; Redmayne, 2003, 2007; Schauer, 2003). Instead, I will briefly address two such paradoxes to elucidate the discussion.
The first proof-paradox worth mentioning is the ‘conjunction paradox’ (see, for example, Allen, 1986; Cohen, 1977; Ho, 2008; Stein, 2005: chapter 3). This paradox results from the rule of classical probability that specifies that the probability of two independent events occurring is the product of their separate probabilities. That is, if event X has probability 0.3 of occurring and independent event Y has probability 0.4, then the probability of both event X and event Y happening is 0.3 × 0.4 = 0.12. In order to see how this poses a problem for the probabilistic interpretation of standards of proof, assume that in a torts case, A, there are only two elements of the cause of action, such as breach of duty and causation, and that both elements are independent of each other (two assumptions by no means trivial). 21 Assume further that each element is proven to a probability of 0.6. The probabilistic interpretation would demand a verdict for the plaintiff, since both elements were proven by the preponderance of the evidence (i.e.,> 0.5), even though the probability of both elements being true is actually only 0.36 (0.6 × 0.6), which means that the probability that the defendant did not negligently harm the plaintiff is 0.64 (1.0 − 0.36). Now assume that in another torts case B, breach of duty and causation are proven to 0.9 and 0.4, respectively. Here, the probabilistic interpretation demands a verdict for the defendant, since causation was not established by a preponderance of the evidence. In both cases A and B, the probability of the defendant having negligently harmed the plaintiff is exactly the same (0.9 × 0.4 = 0.6 × 0.6 = 0.36). Yet, the probabilistic interpretation demands a verdict for the plaintiff in one case and the defendant in the other.
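The two torts cases can be checked with a short sketch. It relies on the same assumptions as the text: two independent elements and a preponderance threshold of 0.5 applied to each element separately. The function names are hypothetical labels for the two ways of applying the threshold.

```python
from math import prod


def element_by_element(probabilities, threshold=0.5):
    """Plaintiff wins only if every element separately exceeds the threshold."""
    return all(p > threshold for p in probabilities)


def conjunction_probability(probabilities):
    """Probability that all elements are true, assuming independence."""
    return prod(probabilities)


case_a = [0.6, 0.6]  # breach of duty, causation
case_b = [0.9, 0.4]

for name, case in (("A", case_a), ("B", case_b)):
    verdict = "plaintiff" if element_by_element(case) else "defendant"
    print(name, verdict, round(conjunction_probability(case), 2))
```

Both cases yield the same conjunction probability, yet the element-by-element rule returns opposite verdicts.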
The second paradox is the ‘gatecrasher paradox’ (see, for example, Cohen, 1977, 1981; Enoch et al., 2012; Kaye, 1979; Rhee, 2007; Thomson, 1986). This paradox is derived from the rule of conventional probability which says that the probability of any fact plus the probability of its negation must equal 1.0. In other words, the probabilities of an exhaustive list of mutually exclusive events must add up to 1.0. Therefore, if the probability of the plaintiff’s case being true is 0.50001, then the probability of the plaintiff’s case being false must equal 0.49999. Now imagine a rodeo where only 499 people paid for admission, but there are, in fact, 1,000 attendees. The rodeo organiser randomly picks subject S out of all the people at the rodeo and sues S for non-payment. There is a 0.501 probability that S did not pay. Assume that no tickets were issued and that there is no testimony available as to whether S paid for admission or not. It seems that a strictly probabilistic model is committed to saying that the rodeo organiser is entitled to a verdict against S. But this seems manifestly unjust.
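The figures in the rodeo example can be verified directly. The sketch assumes, as the example does, that S is drawn uniformly at random from the attendees and that no other evidence bears on whether S paid. The variable names are hypothetical.

```python
attendees = 1000
paid = 499

# Probability that a randomly selected attendee did not pay.
p_not_paid = (attendees - paid) / attendees
# Complement rule: the two probabilities must sum to 1.0.
p_paid = 1.0 - p_not_paid

# A strictly probabilistic preponderance rule compares p_not_paid to 0.5.
organiser_wins = p_not_paid > 0.5

print(round(p_not_paid, 3), round(p_paid, 3), organiser_wins)
```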
There are two primary reactions one can have when one’s theory is confronted with paradoxes and puzzles like these. One can try to rescue the theory either by tweaking it to avoid the paradoxes or by arguing that the alleged paradoxes are only solvable puzzles. 22 For instance, we might think that the proof paradoxes might vanish if we considered the standards of proof as applying to the entire story each party puts forth, instead of applying only to each element. However, this would have disastrous effects. Rather than proving each element by a certain numerical threshold, the party carrying the burden of persuasion would have to prove that the conjunction of all elements exceeds that threshold, and this task becomes increasingly difficult as the number of elements in a cause of action increases. Here, I am interested in a proposal that interprets standards of proof in a very different way. 23
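To see how quickly the burden would grow if the threshold applied to the conjunction of all elements, consider a minimal sketch. It assumes, purely for illustration, that the elements are independent and that each is proven to the same probability p, so that the conjunction has probability p raised to the number of elements. The function name is hypothetical.

```python
def required_per_element(n_elements, threshold=0.5):
    """Per-element probability at which the conjunction equals the threshold.

    Assumes independent elements, each proven to the same probability p,
    so that p ** n_elements == threshold.
    """
    return threshold ** (1.0 / n_elements)


# Each element must be proven to more than this value for the conjunction
# to exceed the 0.5 threshold.
for n in (1, 2, 5, 10):
    print(n, round(required_per_element(n), 3))
```

On these assumptions, a cause of action with ten elements would require each element to be proven to a probability above roughly 0.93 merely to satisfy a preponderance standard.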
Explanation-based approach to standards of proof
A group of evidence law scholars has proposed that, rather than a determination of whether each element is proven to a specific probability, legal fact-finding should involve a determination of the best explanation of the admitted evidence (see, for example, Allen, 2008, 2014; Allen and Jehl, 2003; Allen and Pardo, 2007; Pardo and Allen, 2008). The idea in a nutshell is that fact-finders should infer the (probable) truth of a given hypothesis (out of the set of hypotheses offered by the parties or constructed by the fact-finders) which best explains the admitted evidence, and then reach a conclusion concerning the facts of the case based on that inference. This approach makes IBE the characteristic mode of inference in legal fact-finding. 24 Given the prominent role reserved for IBE, I will first elucidate a few key points regarding the general structure and defining features of this mode of inference before exploring whether it can offer us valuable insights to think about standards of proof. I will also address the consequences for the explanation-based approach to standards of proof of objections targeted against IBE. Then, in more detail, I discuss how an explanation-based approach can give us a better account of standards, as well as help us elucidate other evidentiary mechanisms.
Inference to the best explanation
IBE can be characterised as follows. Faced with a phenomenon to be explained, we first select a group of plausible hypotheses that we deem explain a given event. We never begin with a full menu of all potential explanations. Such a menu would be too large to generate and handle given our cognitive and time constraints. Similarly, the class of explanations we consider is not generated randomly. Not only are these explanations compatible with our background beliefs, but we also believe that, if true, they would explain an aspect of the phenomenon we are interested in. From that narrow menu of plausible hypotheses, we select that hypothesis which provides the best explanation of the phenomenon. 25 We then infer the (probable) truth of that selected hypothesis. Formally, the premises of an IBE consist, first, of one or more propositions that describe some unexplained phenomenon; second, of one or more propositions to the effect that, if explanatory hypothesis X were true, or otherwise warranted, the phenomenon in question would be explained sufficiently for the reasoner given his interests; and, third, of the proposition that no other potential explanatory hypothesis explains the phenomenon as well as hypothesis X. The conclusion of that argument is explanatory hypothesis X.
Explanatory inferences like these are extremely common in everyday life, as well as in expert domains such as science and law. 26 In many different settings, we often infer that some proposition, P, is true because, if true, it would best explain the evidence available to us. Suppose you are my doctor and I come to your office complaining of a pain in my toe. You ask yourself: what might explain the pain? You first examine me and collect evidence about my clinical state. You ask me questions about when the pain started, how bad it feels, where precisely the pain is located, etc. Maybe you also conduct more sophisticated exams, such as a tomography or a blood test. You then start generating possible hypotheses that you believe, if true, explain the pain in my toe. Maybe I hit my toe and broke it. Or maybe I have an ingrown nail or athlete’s foot. You could continue generating hypotheses, but your experience tells you that these are common causes for discomfort in your patients’ toes. To add to this example, say I told you that the pain started immediately after I hit the corner of my bed and that the tomography shows a small bone fracture. You then hypothesise that the pain in my toe is best explained by the fact that I broke it when I hit the corner of my bed. Finally, you infer and consequently come to believe that the fact that I hit and broke my toe is the cause of my pain. You see this explanation as adequate given your interests. As such, you do not need to get into the details about the anatomy of toes or their biomolecular composition to prescribe a course of treatment.
This brief sketch of IBE raises the question of what it takes for a hypothesis to be explanatory. Or, more simply, what is an explanation? Defining the concept of ‘explanation’ is a particularly difficult and highly controversial topic in philosophy (see, for example, Hempel, 1965; Kitcher, 1989; Salmon, 1984; Van Fraassen, 1980; Woodward, 2003). It does not serve our purposes to enter too deeply into this debate. In this article, I posit that to explain is to say why something is the way it is from a particular point of view. The concept of a ‘point of view’ is central here. 27 A point of view flags that what qualifies as an explanation, and, perhaps most importantly, what counts as a good explanation, is relative to a particular enterprise, where specific (and presumably reliable) methods of analysis are chosen and used to serve cognitive goals and interests unique to members of that enterprise. Examples include a scientific, economic, religious and legal point of view. For instance, an expert witness might attempt to explain to a jury or a judge some scientific fact or theory germane to a specific case from the point of view of a biologist, chemist, physician, ballistician or psychiatrist. Each different point of view might lead to different explanations for the same phenomenon.
For our purposes, the important takeaway is that even if we are unable to provide a complete account of all the conditions for something to explain something else, we might still be able to make relative judgments regarding better or worse explanations based on a set of criteria. 28 We can, for instance, evaluate explanatory inferences based on judgments about their abilities to provide a potential understanding of the phenomenon we are interested in (Barnes, 1995; Lipton, 2004). The extent to which an explanation improves our potential understanding of a phenomenon depends on various criteria, such as whether we can link the explanation to some articulated mechanism, the number of precise details of a phenomenon the explanation entails, whether the explanation posits fundamentally new types of phenomena, or even the explanation’s elegance and simplicity. It is not my goal in this article to scrutinise these criteria. Spelling out notions such as elegance and simplicity is a notoriously complex task (not to mention the challenge of solving potential conflicts between elegance and simplicity). The important point for our purposes here is that a hypothesis must fulfill several criteria if it is to qualify as a good explanation. And to the extent that fulfilling these criteria is a matter of degree, some explanations can be better than others.
An important point to acknowledge is that whether a hypothesis is considered explanatory depends, in part, on our interests (Lipton, 2004). For example, our interests help determine which aspects of the phenomenon we are trying to explain. One hypothesis might explain one aspect of a particular phenomenon, but not other aspects. Another way our interests influence our explanatory inferences relates to the contrastive feature of explanations. We usually do not want to explain merely why something happened, full stop. We want to explain why one thing rather than something else happened. 29 Our interests determine what we substitute for ‘something else’. Depending on the aspects of the phenomenon we are interested in, we might end up with different potential explanatory hypotheses. 30 In other words, different aspects of the same phenomenon might call for different explanations, while the same aspect of a phenomenon might call for different explanations depending on what we contrast it with. Not only is determining our interests useful in establishing the best explanation, but by specifying which contrasts are relevant, we can also assess the adequacy of different explanations. 31
This description of IBE helps clarify the fact that explanatory considerations guide many of our everyday inferences. That might seem like a difficult claim to prove. Alternatively, it might be the case that when it appears that explanatory considerations are guiding our inferences, it is actually something else doing the work, something that is completely distinct from the explanation. At the same time, we want an account of IBE that gives us a descriptively accurate model of our inferential practices. This is particularly important for our purposes because we need an account of standards of proof that is compatible with legal fact-finders’ inferential practices. Remember, this was a major weakness of the probability-based approach. 32 Luckily, we have good reason to believe that IBE provides a descriptively accurate model of some of our inferential practices (Lipton, 1998: 55–62). By allowing us to make relative judgments regarding better or worse explanations, IBE offers a convincing perspective about those important aspects of our current inferential practices. Moreover, the hypothesis that explanatory considerations guide our inferences is itself the best explanation for the otherwise disproportionate role that explanatory thinking plays in our cognitive economy (Lipton, 1998). It explains, for instance, our tendency to account for features that fit into an explanatory story and to ignore those that do not, as illustrated by people’s propensity to disregard base rate information in the presence of some salient information (see, for example, Bar-Hillel, 1980; Kahneman et al., 1982; but see Stein, 2013). These aspects of our inferential practices suggest that explanatory considerations guide many of our inferences—mostly for the better, even if occasionally for the worse. Lastly, our inclination to solve inferential problems by constructing causal models also leads us to proceed in explanatory terms. 
We often find it easier to reason about causes than in terms of logical relations. Explanatory considerations fit this way of reasoning since the appeal to causal relations is a major component of many explanations of empirical phenomena (although not all explanations, as illustrated by examples in logic or mathematics). All of this does not conclusively prove that explanatory considerations guide many of our inferences, but it does make a compelling case for this claim.
Besides evaluating a model for its descriptive value, we can also evaluate its normative value. That is, we can judge whether the model correctly portrays our actual inferences and we can assess whether what the model tells us to do is justified. Note that here we are dealing with epistemic justification, which is significantly different from other kinds of justification, such as moral justification. Traditionally, epistemic justification has been understood as correlated to truth-conduciveness. 33 Briefly, a mode of inference is epistemically justified if it leads us to form true beliefs and avoid false beliefs. The question that we need to ask now is: are inferences guided by explanatory considerations epistemically justified? Is IBE truth-conducive?
Epistemic worries
The epistemology and philosophy of science literatures are filled with attacks on IBE as an epistemically justified mode of inference. While some authors praise IBE as an important mode of inference for science (Kelly, 2001), others claim that it is unclear how explanatory value can be truth-conducive (see Fumerton, 1992; Kelly, 2001; Lipton, 2004; Nozick, 1994; Van Fraassen, 1989). That is, it might be difficult to see why the mere fact that a given hypothesis is the best explanation of something makes that hypothesis more likely to be true. In this section, I present what I take to be the most pressing objections against the epistemic status of IBE.
One objection against IBE points out that the assessment of an explanation’s quality is simply too subjective to give us a suitably truth-conducive mode of inference. If we expect judgments about how good an explanation is to vary from person to person, how can we expect IBE to get us closer to the truth? Let us refer to this argument as the ‘Subjectivity Objection’. A possible reply to this objection is to point to the fact that, in one important sense, warranted inferences themselves are audience-relative (Lipton, 2004). Warranted inferences depend on available evidence, and different people have different evidence. The epistemic statuses of beliefs also depend strongly on background beliefs, which vary from person to person. In the previous section, we noticed how many of our explanatory inferences are, in one important sense, interest-relative. People’s different interests make them focus on different aspects of the same phenomenon, which in turn might call for different explanations. We saw how it is no threat to the objectivity of an account of explanation that different people interested in explaining different facts may infer different explanations. Likewise, this is no threat to the objectivity of a mode of inference.
But this response seems to miss the point of the objection. The concern is not merely about discrepancies in available evidence, background beliefs or interests among reasoners. The concern is greater. It is the possibility that epistemic peers (i.e., individuals with equal levels of cognitive capacities, with access to the same body of evidence and with similar background beliefs) might reasonably disagree about how good competing explanatory hypotheses are about the same aspects of the same phenomenon. 34 The possibility of reasonable disagreement among epistemic peers, however, is not a feature particular to IBE. In fact, such a possibility exists in other modes of inference. Take, for instance, the case of analogy. Epistemic peers might reasonably disagree about what the appropriate analogy is, and so might disagree about what important features of one group should be extrapolated to a different group, given that they share some set of features. 35 This shows that the ‘Subjectivity Objection’ fails to raise a challenge that is specific to IBE.
Another proffered concern about the truth-conduciveness of IBE focuses on the fact that IBE tells us to infer the likely truth of the hypothesis which would, if true, best explain our evidence. But for all we know, the objection goes, we might not inhabit a world that obeys the traditional criteria used to assess explanatory goodness, such as simplicity or coherence. That is, we might not inhabit a simple or coherent world. And unless defenders of IBE can conclusively prove that we do inhabit such a world, we cannot justify the belief that how good an explanation is positively correlates to truth-conduciveness. I refer to this objection as the ‘Humean Objection’. Again, this objection is not unique to IBE. The Humean Objection is no worse news for IBE than it is for any other account of non-deductive modes of inference. Induction, for instance, also relies on critical assumptions about how our universe works that cannot be proven in a non-question-begging way. 36
But this does not make us irrational when we believe, for instance, what science tells us about nature. It might make us dogmatic, but not irrational. As Lipton (2004: 145) puts it: Whatever account one gives of our non-deductive inferences, there is no way to show a priori that they will be successful, because to say that they are non-deductive is just to say that there are possible worlds in which they might fail. Nor is there any way to show this a posteriori since, given only our evidence to this point and all a priori truth, the claim that our inference will be successful is a claim that could only be the conclusion of a non-deductive argument and so would beg the question.
Consider also the so-called ‘Bad Lot Objection’ to IBE (Van Fraassen, 1989). According to this objection, even the best explanation out of a set of possible explanations might still not be good enough, since the set considered might contain only bad explanations. This might suggest that a more epistemically justified rule of inference is not merely ‘inference to the best explanation’, but rather something like ‘inference to the best explanation provided that the best explanation is good enough’. 37 But even this might not suffice. Here is a simplified version of the rest of the argument. Let us assume that one plausible interpretation of what it means for an explanation to be ‘good enough’ is for it to be considerably more likely than any other potential explanation. (If the probabilities are close enough, it might be rational for one to simply withhold judgment.) The problem is that we would then have good reason to believe that in many cases of non-trivial inquiry, the probability that there exists some other explanation that is, in fact, the actual explanation will be quite high.
Two reasons are usually given to support this concern. First, the set of potential explanations might contain explanations not yet considered (Van Fraassen, 1989). Second, the set might contain an infinite number of potential explanations (Toulmin, 1961). So, even if the best explanation we know of is considerably better than the second best explanation we know of, there well might be an unexamined explanation that is even better. And, the objection continues, this unexamined explanation might, first, be incompatible with what we currently take to be the best explanation and, second, have stronger inductive support.
Answering the epistemic worries
The objections raised in the section above concerning the truth-conduciveness of IBE are worrisome. Any philosophically sound account of IBE must address them. Most importantly for our purposes, the case in favour of an explanation-based approach to standards turns on our capacity to reply convincingly to those objections. Otherwise, we will be left with an approach to standards that asks us to rely on an objectionable mode of inference. And this might affect the epistemic justification of legal decisions arrived at via IBE, which in turn may affect their legal justification. 38
So far, nothing new. Proponents of an explanation-based approach have rejected objections similar to those raised above (Pardo and Allen, 2008). However, their rejection might have been too quick. The objections that IBE is not an epistemically justified norm of inference have been dismissed with claims that the same objections also apply to all forms of inductive reasoning. And so, unless we are willing to accept skepticism about induction in general, we ought to reject these objections. 39 But, unless we are already committed to a non-skeptic position, this is not a very convincing argument. We then need a better defence of IBE against the objections raised above. To that end, I will briefly mention one reply that, although potentially attractive, ultimately fails. I will then present another answer that, even though left partially incomplete, represents a more promising path.
A first possible response points to the idea that when dealing with the possible propositional attitudes of triers toward evidentiary hypotheses, belief is not the only option. Acceptance is an alternative. L. Jonathan Cohen distinguishes between belief and acceptance: [A] belief that p is a disposition, when one is attending to issues raised, or items referred to, by the proposition that p, normally to feel it true that p and false that not-p, whether or not one is willing to act, speak or reason accordingly. But to accept the proposition or rule of inference that p is to treat as given that p. More precisely, to accept that p is to have or adopt a policy of deeming, positing, or postulating that p[.] 40
These considerations suggest that it is a better interpretation of our evidentiary system that verdicts be regarded as expressing acceptance rather than belief. Here, we might be tempted to think that we have a plausible answer to the worries about the epistemic status of IBE. If triers need only accept the conclusions of IBE-type arguments, why should we care about epistemic justification, which seems to be only relevant for beliefs? While it might be interesting to explore the idea that the correct propositional attitude to expect from jurors is acceptance, shifting the focus away from belief is not a good way to answer the worries about the epistemic status of IBE. Regardless of whether jurors are required to believe in, or ‘merely’ accept, the conclusions of IBE-type arguments, we still want jurors to accept epistemically justified propositions. Otherwise, as said above, we risk ending up with epistemically and legally unjustified decisions.
The challenges raised above pressure us to find an account of IBE that shows that explanatory value is truth-conducive. That way, we can refute the claim that IBE is an epistemically unjustified mode of inference. Drawing on Lipton’s work, one strategy to claim that explanatory value is truth-conducive is to show that explanatory virtues and inferential virtues coincide (Lipton, 2004). That is, to show that the features identified as increasing the explanatory value of a potential hypothesis also happen to be the features that increase the hypothesis’ level of evidentiary support. I mentioned above that we usually assess the explanatory value of hypotheses based on several factors, such as coherence, simplicity, not positing new entities, etc. 41
Coincidentally, these and other criteria used to judge explanatory value seem very similar to the criteria used to evaluate the strength of our inferences. Hypotheses that accurately explain many observed phenomena tend to be better supported by evidence than hypotheses that do not. The same seems to be true of simpler and unifying explanations that refer only to known entities and mechanisms. 42
Alternatively, consider the contrastive aspect of explanations I alluded to above. The choice of which contrast to focus on helps decision-makers select the likely causal history of a given phenomenon that provides a good explanation using what Lipton calls a ‘causal triangulation’. In his words (2004: 73): Since looking for residual differences in similar histories…is a good way of determining a likely cause…and contrastive explanation depends on just such differences, looking for potential contrastive explanations can be a guide to causal inference. Given contrastive data, the search for explanation is an effective way of determining just what sort of causal hypothesis the evidence supports.
Some steadfast skeptics might remain unconvinced. They warn that unless we can offer a consensual account of the epistemic justification of IBE, we are doomed to epistemically unjustified (or worse: arbitrary) verdicts. My (tentative and quick) reply is threefold. First, different evidentiary rules already contribute to truth-conduciveness in legal fact-finding. For instance, many exclusionary rules are justified given the fear that jurors will mistakenly over- or under-weigh particular types of evidence, such as character evidence. 45 Second, it is not surprising that the legal system has goals other than the purely epistemic objective of securing rational beliefs at all costs. Other considerations besides accuracy, such as fairness, also drive our evidentiary system. 46 Again, most (albeit not all) exclusionary rules exemplify this fact. Third, one can still attempt to render a decision based on a conclusion arrived at via IBE epistemically justified by searching for additional evidence that further supports that conclusion (Kelly, 2001). Just as scientists can search until they find additional evidence that supports their explanatory hypotheses, legal fact-finders could also request additional evidence that would provide further evidentiary support for hypotheses derived from explanatory inferences.
A burden of the best explanation
Legal scholars have proposed that legal fact-finding should involve a determination of the best explanation of the admitted evidence, rather than a determination of whether each element is proven according to a numerical degree of subjective confidence (see Allen, 2008, 2014; Allen and Jehl, 2003; Allen and Pardo, 2007; Pardo and Allen, 2008). They claim that from the fact that a given hypothesis (out of the set of hypotheses offered by the parties or constructed by the fact-finders) best explains the admitted evidence, fact-finders should infer the (probable) truth of that hypothesis and then decide the case based on that inference.
But what exactly does it mean to ‘select the best explanation’ in legal settings? How do we formulate the different standards of proof in explanatory terms? This is where the rubber meets the road. Some commentators have suggested that fact-finders should decide based on the relative plausibility of the evidentiary hypotheses put forth by the parties or the triers themselves (Pardo and Allen, 2008). The easy case is that of civil trials, where the standard of proof is preponderance of the evidence. In those cases, fact-finders should infer that the most plausible explanation is the true explanation. How to formulate other standards besides preponderance of the evidence poses greater challenges for those who try to formulate standards by reference to plausibility. 47 Some authors have suggested that in criminal cases, with a ‘beyond a reasonable doubt’ proof standard, fact-finders should infer the defendant’s innocence whenever there is a ‘sufficiently plausible explanation of the evidence consistent with innocence (and ought to convict when there is no plausible explanation consistent with innocence, assuming there is a plausible explanation consistent with guilt)’ (see Allen, 2008, 2014; Allen and Jehl, 2003; Allen and Pardo, 2007; Pardo and Allen, 2008). Analogous suggestions are given for cases in which the applicable civil standard is clear and convincing evidence. In those cases, ‘fact-finders would have to infer a conclusion for the party carrying the burden of persuasion when there is an explanation that is sufficiently more plausible than those that favour the other side’ (Pardo and Allen, 2008: 239 (emphasis added)).
One initial problem with this way of formulating standards is that it is not entirely clear what these commentators mean by an explanation being ‘more plausible’ than another explanation. There are at least two possible interpretations. According to one interpretation, by ‘plausible’, commentators simply mean ‘probable’ or ‘likely’. To say that hypothesis X is more plausible or that it has a higher relative plausibility than another competing hypothesis Y means that X is more likely to be true than Y. Under this interpretation, proponents of the explanation-based approach would be asserting that in civil cases, fact-finders should decide for the party with the most likely explanation.
Regardless of whether this is the interpretation that most proponents of an explanation-based approach want to put forward, the proposal to equate plausibility with likelihood or probability when formulating standards of proof is misguided. Likelihood is only one among many different criteria that guide explanatory inferences (Lipton, 2004). As we saw in the last section, many criteria are relevant to defining the explanatory value of a given hypothesis. We also examined the reasons why we should strive for explanations that, if correct, would provide the most understanding of the phenomenon we are interested in. It might turn out that the explanation that provides the most understanding is also the most likely one. This is not necessarily the case, however (remember the case of opium and its dormitive principle). Under this logic, triers might still be warranted in believing in a given explanation that happens not to be the most likely, provided it gives them the best understanding of the relevant phenomena. The point is rather that likelihood is not a sufficient condition for a good explanation. 48 A hypothesis needs to fulfill other requirements before it can be considered a good explanation. This is compatible with maintaining that a certain level of likelihood might still be a necessary condition for a hypothesis to count as a good explanation. An extremely unlikely hypothesis is probably not a good explanation. Such a hypothesis might, for instance, be incompatible with our background beliefs or past observations, or it might posit fundamentally new types of phenomena, etc. Just as there is a minimum to how likely a hypothesis has to be to qualify as a potential good explanation, there also is a minimum to how likely a party’s evidentiary hypothesis has to be for her to meet her burden of persuasion.
This leads us to a second possible interpretation of what plausibility might mean to explanation-approach proponents. Under this other interpretation, ‘plausible’ means ‘explanatorily valuable’ or ‘having explanatory value’. This more expansive interpretation of plausibility explicitly recognises that likelihood is not the only factor that should be considered when formulating standards. It clearly departs from the probabilistic approach. 49 Under this second interpretation, standards of proof are designed to set up different thresholds of explanatory values that the party carrying the burden of persuasion needs to meet. The higher the standard, the more explanatorily valuable the best evidentiary hypothesis available to triers must be.
In civil cases with the preponderance of the evidence standard, a verdict should be rendered for the party favoured by the explanation with the highest explanatory value. 50 If the available explanations are equally good (or bad), or if none are good enough, then the verdict should be rendered against the party carrying the burden of persuasion. 51 When the standard is not preponderance of the evidence, the explanatory value of the best explanation that favours the party carrying the burden of persuasion would have to be higher or lower, depending on the applicable standard. If the standard is mere reasonable suspicion, then that explanation can have a relatively low explanatory value. A similar suggestion applies to cases in which the applicable standard is clear-and-convincing evidence. In these cases, the fact-finder should only find for the party carrying the burden of persuasion if an available explanation favouring that party has substantial explanatory value. 52 In criminal cases, when the standard is beyond reasonable doubt, the explanation favouring the prosecution must have an extremely high explanatory value 53 (so high that any available explanation favouring the defendant would not be a reasonable explanation at all). 54
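The ordering of standards just described can be sketched schematically. The sketch below is only an illustration under loudly stated assumptions: the article assigns no numbers to explanatory value, so the 0-to-1 scale, the threshold figures, and the names `Explanation`, `THRESHOLDS` and `burden_met` are all hypothetical and serve only to make the structure of the rule concrete.

```python
from dataclasses import dataclass

# Hypothetical thresholds on an assumed 0-1 "explanatory value" scale.
# The article gives no numbers; only the ordering of standards matters here.
THRESHOLDS = {
    "reasonable_suspicion": 0.2,
    "preponderance": 0.4,  # assumed floor for an explanation to be "good enough"
    "clear_and_convincing": 0.7,
    "beyond_reasonable_doubt": 0.95,
}


@dataclass(frozen=True)
class Explanation:
    label: str
    favours_burdened_party: bool
    value: float  # assumed explanatory-value score in [0, 1]


def burden_met(standard, explanations):
    """Return True if the party carrying the burden of persuasion prevails."""
    floor = THRESHOLDS[standard]
    best_for = max(
        (e.value for e in explanations if e.favours_burdened_party),
        default=None,
    )
    # No favourable explanation, or none good enough for this standard.
    if best_for is None or best_for < floor:
        return False
    if standard == "preponderance":
        best_against = max(
            (e.value for e in explanations if not e.favours_burdened_party),
            default=0.0,
        )
        # Ties go against the party carrying the burden.
        return best_for > best_against
    return True


stories = [
    Explanation("plaintiff's account", True, 0.6),
    Explanation("defendant's account", False, 0.5),
]
print(burden_met("preponderance", stories))
print(burden_met("clear_and_convincing", stories))
```

The design choice to make only the preponderance branch comparative follows the text above, where the other standards are described in terms of how high the favourable explanation's value must be; a different reading of those standards would call for a different rule.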
Just as the assessment of explanatory value depends on reasoners’ interests, so too does legal decision-makers’ assessment of whether the party carrying the burden of persuasion has met her burden (Allen, 2008; Pardo and Allen, 2008: 236–237). For instance, how fine-grained the explanation must be depends on what features of the disputed facts the fact-finders are focusing on. If fact-finders are deciding a homicide case, the explanation ‘The defendant killed the victim’ may not be fine-grained enough; while the explanation ‘The defendant killed the victim using a knife, which caused a series of physiological effects in the victim’s tissues and organs, such as X and Y, which can be observed by their biological composition Z, etc.’ may be too fine-grained. The points of contrast between the explanations the parties put forth also influence how good those explanations are (Allen, 2008). Remember that explanations are contrastive. 55 We usually do not want simply to explain why something happened, full stop. We want to explain why one thing rather than another happened. Depending on what the specific disagreements between the parties are, fact-finders will choose to value some explanations over others. If the defendant asserts he never incurred the alleged debt with the plaintiff, then the appropriate contrast to focus on is whether he incurred the debt or not; not on whether he incurred the debt freely or was coerced into doing so. Different aspects of the same phenomenon might call for different explanations. Moreover, the same dimensions of a phenomenon might call for different explanations depending on what we compare them to. These differences make the fact-finders’ interests and the points of contrast between the parties highly relevant in determining the explanatory value of the hypotheses available at trial.
Lastly, the law also guides what kind of explanations the fact-finder should be after (Allen, 2008). More comprehensive explanations may be required depending on how many elements of causes of action are necessary for a case to move forward. For instance, in a criminal action, explanations such as ‘the defendant did something wrong’ would be inadequate. By contrast, under the doctrine of res ipsa loquitur, a plaintiff might be able to recover damages even by offering explanations as simple as ‘my damages were caused by the defendant’s actions’. 56 Moreover, which contrasts are at stake limits the types of explanations jurors and judges can come up with, other than the explanations that the parties put forth. The pertinent contrasts might also be a function of substantive law and other formalities such as the charges included in the complaint or indictment. Our focus on the proper contrasting explanations also changes based on which facts are disputed, conceded or judicially noticed. For example, if the defendant concedes that he was driving the car involved in the accident, then the appropriate contrast might be limited to whether there was negligence or not (instead of also including whether he was driving the car or doing something else when the accident occurred).
Descriptive adequacy
The following sections consider, in turn, the descriptive, normative and explanatory considerations that support an explanation-based approach to standards. Let us start with descriptive adequacy. Other proponents of the explanation-based approach highlight how the approach more accurately describes jurors’ reasoning (see Allen, 2008, 2014; Allen and Jehl, 2003; Allen and Pardo, 2007; Pardo and Allen, 2008; Spottswood, 2014). To support this conclusion, proponents offer evidence that what the approach asks of jurors is compatible with psychological research on jury decision-making (see Allen, 2008, 2014; Allen and Jehl, 2003; Allen and Pardo, 2007; Pardo and Allen, 2008). The main finding of the research is that, contrary to what the probabilistic approach to standards says of jurors, individuals do not usually reason with separate pieces of evidence. That is, we usually do not update our beliefs based on each piece of new information we receive. Instead, we tend to construct entire stories that will fit the evidence. We then make decisions based on the explanatory virtues of those stories.
The descriptive value of the explanation-based approach goes beyond juror decision-making. In the section ‘Inference to the best explanation’ we saw that there is good reason to believe that explanatory considerations guide many of our inferences. This is true for both triers of fact and lay people. We use explanatory value in our everyday inferences and in many expert domains, including science and law. The descriptive value of the explanation-based approach is enhanced to the extent that the approach says triers are already making legal inferences based on considerations that guide not only jurors’ reasoning, but also that of other triers and individuals.
Normative adequacy
Important normative considerations also support an explanation-based approach to standards. First, the explanationist view seemingly avoids many of the paradoxes associated with the probabilistic model (Allen and Jehl, 2003). Consider the so-called proof-paradoxes discussed above. The probabilistic interpretation requires that in a case with two elements, the probability of both elements must exceed the applicable proof standard. Thus, as the number of elements increases, the probability each element needs to meet the burden of persuasion also increases. These counterintuitive results pose a serious challenge to a probabilistic interpretation of standards. By requiring fact-finders to decide based on what they believe to be the best explanation of the available evidence, not merely on the probabilities of individual elements, the explanation-based approach avoids the seemingly paradoxical result that an individual might prevail because he was deemed to have proven each of the elements of the cause of action to a given probability, even though the probability that his entire story is true is lower than the applicable standard. 57 Likewise, the explanation-based approach also avoids the unfair result that any unlucky, randomly-chosen attendee is found liable for not paying for the ticket when the majority of attendees did not pay for entrance to the rodeo. Such a hypothetical case would simply not include an explanation good enough to justify a verdict against the randomly-selected attendee (because, among other reasons, it fails to distinguish among different attendees) (Tribe, 1971).
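The arithmetic behind the conjunction problem can be made concrete with a short sketch. It is only an illustration, not part of any legal test: it assumes that the elements are probabilistically independent, and the figures used (two elements at 0.6, a 0.5 threshold) are hypothetical.

```python
from math import prod


def conjunction_probability(element_probs):
    """Probability that every element is true, ASSUMING the elements are independent."""
    return prod(element_probs)


def per_element_threshold(standard, n_elements):
    """Probability each of n independent, equally probable elements must reach
    for the conjunction of all elements to reach the given standard."""
    return standard ** (1.0 / n_elements)


# Hypothetical civil claim with two elements, each proven to 0.6.
# Each element clears a 0.5 threshold, but the whole story does not.
whole_story = conjunction_probability([0.6, 0.6])
print(f"each element: 0.60, whole story: {whole_story:.2f}")

# The per-element probability needed for the whole story to reach 0.5
# rises as the number of elements grows.
for n in (1, 2, 3, 4):
    print(f"{n} element(s): each must reach {per_element_threshold(0.5, n):.3f}")
```

On these assumptions a party could prove each element to 0.6 and still tell a story that is, as a whole, less likely than not, which is the result the explanation-based approach is meant to avoid.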
A second important normative consideration for the explanation-based approach is that it prioritises evidential support over subjective credence. We saw above how we need an account of standards of proof that does not make them dependent merely on jurors’ subjective confidence. Rather, we need an account that makes standards rely on the support that the admitted evidence provides to the evidentiary hypothesis under consideration. This is how we ensure that standards of proof serve their function as decision-making thresholds, telling when the decision-maker’s subjective confidence sufficiently justifies a verdict. We also saw how the probability-based approach gets this wrong by prioritising subjective confidence over relations of evidential support.
The explanation-based approach does far better here as well. It asks triers to reason based primarily on the explanatory value of the evidentiary hypothesis raised at trial. And, as we explained above, it is reasonable to think that that explanatory value incorporates relations of evidentiary support (Lipton, 2004). In other words, the explanation-based approach does not ask triers to focus on their subjective confidence, but rather it (indirectly) asks them to concentrate on the evidence and the level of support it provides to the hypothesis under consideration. That is what we should expect from standards of proof.
Another normative consideration for explanationism is that this approach promotes explanatory value more clearly than the probability approach. The reason for this should be apparent. IBE offers triers a mechanism to promote explanatory value, since it allows them to premise their legal arguments on evidentiary hypotheses that best explain the evidence that the parties brought to them. So one approach uses explanatory value as the main criterion in reasoning with standards, while the other does not mention it. We must be careful, though. We need to be able to provide independent reasons for the idea that our current evidentiary system should promote explanatory value.
While this suggestion might surprise some, we can find support for this conclusion from a multitude of legal material. 58
Judicial opinions explicitly reference the importance of explanatory value in legal reasoning. For instance, the Supreme Court employed an explanatory approach to summary judgment: ‘Neither the Court of Appeals, nor the respondents, nor the dissent provides any reason to question the city’s theory. In particular, they do not offer a competing theory, let alone data, that explains why the elevated crime rates in neighborhoods with a concentration of adult establishments can be attributed entirely to the presence of walls between, and separate entrances to, each individual adult operation.’ 59
Different evidentiary mechanisms are better understood if we assume that explanatory value is worth promoting (Allen, 1986: 413–420). Rules that are meant to allow fact-finders to have access to evidence that portrays a fuller image of the evidentiary hypothesis in consideration are good examples. Consider Federal Rule of Evidence 106, which establishes that material relevant to specific testimony is admissible. 62 Similarly, according to Federal Rule of Evidence 612, if a witness relies on a writing while or before testifying, that writing is admissible regardless of other exclusionary rules. 63 Another example found in different jurisdictions is a provision that permits evidence of matters that form the background to the litigated question. The standard practice of trying conspirators together provides yet another example of a rule that encourages giving fact-finders a ‘fuller picture’. 64 Also, the fact that counsel make opening and closing statements suggests that we want juries to hear a complete narrative of the evidence.
However convincingly the examples above indicate that we should promote explanatory value, they refer only to the American legal system. We have not yet offered support for the broader claim that explanatory value is something that we should promote in legal reasoning generally. We must be able to make this last claim if we want to apply the importance of explanatory value to other legal systems. How can we achieve this? One (fairly ambitious) route would be to argue that whenever we promote explanatory value, we are also indirectly promoting other, more fundamental values, such as ascertaining the truth or encouraging procedural fairness. 65 This route would take us through a lengthy detour, which is beyond the scope of this article. 66
Explanatory adequacy
Explanatory considerations also favour the explanation-based approach to standards. Particularly, this approach allows for a better understanding of different mechanisms in our evidentiary system. Here, I will only discuss the suggestion that explanationism helps us understand ‘relevance’ and ‘probative value’.
The probability-based approach has a straightforward definition for relevance. Under this view, evidence is relevant if it has a likelihood ratio of anything other than 1:1. Probative value, in turn, means how far from 1:1 the likelihood ratio of the evidence is. 67 The fact that the probability approach offers a clear account of key evidentiary mechanisms such as relevancy and probative value counts as a benefit. Thus, the explanation-based approach needs to respond in turn by offering a defensible account of those same mechanisms.
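The probabilistic definitions of relevance and probative value can be sketched as follows. This is only an illustration of the view under discussion. The input probabilities are hypothetical, and measuring the distance from 1:1 by the absolute value of the logarithm of the ratio is an assumption of the sketch rather than something the definition itself prescribes.

```python
from math import isclose, log


def likelihood_ratio(p_e_given_h, p_e_given_not_h):
    """Ratio of the probability of evidence E if hypothesis H is true
    to the probability of E if H is false."""
    if p_e_given_not_h <= 0:
        raise ValueError("P(E | not H) must be positive for the ratio to be defined")
    return p_e_given_h / p_e_given_not_h


def is_relevant(ratio):
    """On the probabilistic view, evidence is relevant when the ratio differs from 1:1."""
    return not isclose(ratio, 1.0)


def probative_value(ratio):
    """Distance of the ratio from 1:1, measured here (by assumption) as |log ratio|,
    so that 4:1 and 1:4 count as equally probative in opposite directions."""
    if ratio <= 0:
        raise ValueError("the ratio must be positive")
    return abs(log(ratio))


# Hypothetical figures: E is four times more likely if H is true than if it is false.
supporting = likelihood_ratio(0.8, 0.2)
# Hypothetical figures: E is equally likely whether or not H is true.
neutral = likelihood_ratio(0.3, 0.3)

print(is_relevant(supporting), probative_value(supporting))
print(is_relevant(neutral), probative_value(neutral))
```

Evidence with a ratio below 1:1 is relevant on this view as well, since it makes the hypothesis less probable.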
Let us start with relevancy. Roughly, for Pardo and Allen, a piece of evidence is relevant if it is explained by an account put forward by the party presenting the evidence. 68 This is a good starting point for a defensible account of relevancy under an explanation-based approach. There are at least two problems with this formulation, however. First, it cannot provide a necessary or sufficient condition for relevancy. Second, it seems to leave out one important aspect of the American concept of relevancy.
Pardo and Allen’s account does not seem to provide a necessary condition for relevancy because a piece of evidence, E, may still be relevant to hypothesis H even if H does not correctly explain E. For instance, the evidence that John has a chest wound may be relevant evidence for the hypothesis that John will die, even though the hypothesis that he will die does not explain why John has the chest wound (Achinstein, 1978). Nor can this account provide a sufficient condition for relevancy. Suppose my scooter will not start on a cold winter morning in Massachusetts (Achinstein, 1978). The hypothesis H that ‘At precisely 3:05 AM this morning, two boys removed the gas remaining in my tank and substituted it with water,’ would, if true, correctly explain why my scooter will not start. Yet it seems that my evidence (that my scooter will not start that winter morning) is not evidence that H is true. Given what we know about motor engines, batteries and cold weather, my evidence is not a good reason to believe in H. 69 Otherwise, we would be forced to claim that the fact that my scooter will not start on a cold winter morning in Massachusetts is evidence for a vast range of hypotheses closely related to H, such as: ‘At precisely 5:03 AM, three monkeys removed the gallon of gas remaining in my tank and substituted crushed bananas.’ The fact that my scooter will not start does not give us a good reason to believe in either of these far-fetched hypotheses.
The second problem with Pardo and Allen’s concept of relevancy is that it apparently leaves out one important aspect of the American concept of relevancy. According to the test for relevancy under Federal Rule of Evidence (FRE) 401, evidence is relevant if it has any tendency to make a fact more or less probable. 70 Pardo and Allen’s proposal does not seem to allow for the possibility that a piece of evidence can make a fact less probable. That is, they seem to say, if evidence is explained by an evidentiary hypothesis available at the trial (and therefore makes that hypothesis more probable), then the evidence is relevant. They seem to suggest that being explained by a hypothesis is both a necessary and a sufficient condition for relevancy. This entails the following proposition: if a piece of evidence is not explained by an evidentiary hypothesis available at the trial, then that evidence is not relevant. But this is at odds with FRE 401. Under the rule, if an evidentiary hypothesis does not explain a piece of evidence, the hypothesis is less probable. Therefore, that evidence is relevant under FRE 401.
We can build upon these objections to develop a revised formulation of relevancy that is compatible with both an explanationist approach and FRE 401. Here is one suggestion: evidence is relevant (and therefore should be admitted) if it increases or decreases the explanatory value of an evidentiary hypothesis. Another way to articulate the same idea: evidence is relevant (and therefore should be admitted) if it makes an evidentiary hypothesis a better or worse explanation. This relevancy formulation not only respects the FRE 401 test, but it also places explanatory value as the central element of the test, in line with the explanation-based approach to standards.
Another concept that an explanation-based approach elucidates is that of probative value. According to Pardo and Allen, ‘[p]robative value refers to the strength of the explanation; the more the evidence is explained by…the party’s explanation of the evidence, the greater the probative value’. 71 While this formulation is on the right track, it could benefit from a slight modification. Probative value can be better explained under the explanationist approach as a function of how much the evidence makes a particular explanation better or worse, not as a function of how well the explanation accounts for the evidence. 72
The difference between these two formulations might not be immediately obvious. It is true that if a hypothesis explains any additional piece of evidence, that makes it a better explanation (because it can explain more things). But a piece of evidence can still make an explanation better without being explained by it. For instance, mathematical theorems can improve the explanatory value of meteorological models, making the models better explanations of the weather. It does not follow, however, that the mathematical theorems are themselves explained by the models. Actually, the explanation seems to run in the opposite direction (although this might not necessarily be the case). Likewise, that the defendant had strong personal financial interests tied to the victim’s death makes explanations where the defendant participated in the crime better explanations. Again, however, the inverse does not hold. A hypothesis about who committed the crime and why does not explain the financial interests. These examples show that many facts offered as evidence in trials contribute to the explanatory value of explanations (i.e., they can make explanations better or worse) without themselves being explained by the hypothesis. Because such facts influence the strength of explanations, they also affect the strength of our explanatory inferences. Moreover, to the extent that they do so, they should be thought to have probative value under an explanation-based approach.
Let us look closely at an example brought by Pardo and Allen to defend their formulation of relevancy and probative value (Pardo and Allen, 2008). Imagine that an expensive necklace is found in a housekeeper’s pocket. Suppose the owner testifies that she found the necklace in the employee’s pocket. Pardo and Allen say that this testimony is relevant ‘because the fact that the [employee] stole the necklace explains the testimony’ (Pardo and Allen, 2008: 242). What I am suggesting is that the testimony is relevant because it makes the hypothesis that the employee stole the necklace a better explanation, not because the evidence is explained by this hypothesis. According to Pardo and Allen, the testimony’s probative value depends on how well the testimony is explained by the hypothesis that the employee stole the necklace. I suggest that the testimony’s probative value should be measured by how much better or worse it makes the explanation that the employee stole the necklace.
Pardo and Allen (2008) make an additional important point when addressing the probative value of the testimony in this example: ‘[i]f there is other evidence that someone stole the necklace, then the testimony has greater probative value; if there is other evidence that the owner gave the [employee] the necklace as a gift, then the testimony has less probative value’.
This means that to determine how the testimony’s probative value is affected by any evidence that the owner gave the employee the necklace as a gift, we need to assess the testimony’s probative value with relation to this gift hypothesis. Now here is the tricky part. If there is already evidence available to support the explanation that the owner gifted the necklace to the employee, then the testimony does not have less probative value, as Pardo and Allen suggest. Rather, the testimony would be more probative. Why? Simple—the testimony does not support that explanation. Instead, the testimony weakens it (i.e., makes it a worse explanation). So, there is no overlap of evidentiary support, which would occur if there were evidence available, before the testimony, for the explanation that the employee stole the necklace. In other words, with relation to the explanation that the owner gifted the necklace, the fact that there is available evidence to support this explanation gives the testimony more probative value.
Potential objections
Let us now consider potential objections to explanation-based approaches to standards, in general, and to the specific proposals in this article, in particular. Some may react to the explanation approach by claiming that it might make things worse. ‘With the probabilistic interpretation, we at least had numbers we could hold onto in the attempt to make things less arbitrary. This talk of “worse” and “better” explanations gives free rein to jurors’ and judges’ subjectivity. How is that a more effective interpretation of standards of proof?’
Talk of ‘best explanation’ and ‘substantially better explanation’ might seem to risk being as nebulous as old formulations of standards invoking obscure concepts such as ‘moral certainty’. 73 But this is not the case. We have seen how different external criteria help triers to make a decision regarding the comparative explanatory value of various explanations, even though we do not have a definitive account of explanation and explanatory value. When the explanation-based approach tells triers to decide based on the degree of explanatory value of the available evidentiary hypotheses, it is actually (indirectly) asking triers to look at those different criteria. Importantly, this guides triers to reason about facts.
Moreover, the idea that probability approaches give more precise formulations of standards because they associate numerical thresholds with specific standards is illusory. We discussed all of the problems connected to formulating standards in terms of degrees of subjective confidence. 74 These issues include the difficulty individuals have articulating their precise levels of confidence, as well as the mistake of prioritising subjective confidence over relations of evidential support. This suggests that the vagueness that worries critics of formulating standards in terms of explanatory value is inherent in the standards themselves, not in the specific formulation. 75 Thus, lack of precision may be a general critique of standards, and less a specific critique of the explanation-based approach.
Another initial objection to explanation-based approaches is that they seem to require the party not carrying the burden of persuasion to put forth its own explanation. This concern is particularly prevalent in criminal cases, where a well-known element of the presumption of innocence is that criminal defendants do not have to produce evidence against themselves. This objection misses the mark, however. Nothing in the explanationist approach demands that both parties present explanations of the evidence, or any evidence at all for that matter. Every explanation brings with it its denial as a necessary contrastive explanation. When my doctor says that the fact that I hit my toe explains my pain, we can judge that reasoning by reference to an explanation that ‘something else’ explains my pain. Still, it is highly unlikely that in any given case triers will end up with only one explanation put forth by the party carrying the burden of persuasion. Not only are triers free to craft their own explanations of the admitted evidence (even though the law might limit which kind of explanation they are allowed to consider), but it is also likely that the opposing party will present her explanation of the evidence. Just because that party is not legally required to do so does not mean that declining to do so is the best legal strategy.
A stronger objection against the explanationist approach is that it might incentivise parties to overload courts with evidence. This is because parties (and their counsel) would see their function as constructing overarching narratives. Therefore, they would be more inclined to offer evidence that speaks to parts of the narratives, but is only marginally (if at all) related to controversial facts in the case. This ‘evidence saturation’ could, in turn, make legal decision-making more complex to the point of making the entire explanation-based approach ultimately unworkable. Even if we concede that the approach creates these incentives, other evidentiary mechanisms can help avoid evidence saturation, so that the approach will not become unworkable. Admissibility and exclusionary rules are the best examples. Not only do we have a vast range of rules that render evidence inadmissible for myriad different reasons, we also have a catch-all exclusionary clause included in FRE 403. 76
In a recent article, Larry Laudan raised a more serious objection (Laudan, 2006). Laudan suggested that explanationist standards have little to do with the defining features of IBE. Consider ‘beyond a reasonable doubt’, for which commentators have suggested that in criminal cases fact-finders should infer the defendant’s guilt only when there is no plausible explanation consistent with innocence (assuming there is a plausible explanation consistent with guilt) (see Allen, 2008, 2014; Allen and Jehl, 2003; Allen and Pardo, 2007; Pardo and Allen, 2008). To Laudan, however, this sounds nothing like IBE. It sounds instead like an ‘inference to the only plausible explanation’ (Laudan, 2006).
I believe that this provocation is unfair to proponents of explanation-based approaches. While the reasoning that an explanation approach asks of triers with regard to some standards is not, technically speaking, IBE as professional philosophers use the expression, we should not fetishise terminology. Call it inference to [fill in the blank]. The significant contribution of explanation-based approaches is to draw our attention to the role that explanatory inferences play in legal reasoning in general, and how these inferences can shed light on the way we think about standards of proof, in particular. Laudan seems to miss the bigger picture (see Allen, 2008: 327 and Pardo and Allen, 2008: 230 n 21, 239 n 45 (defending a similar view)).
If standards are understood as establishing different explanatory value thresholds, then we are always instructing triers to infer the best explanation (instead of asking them to infer ‘the only plausible explanation’). What differs between standards (say, from preponderance of the evidence to beyond a reasonable doubt) is how much better a party’s best explanation needs to be (compared to the opposing party’s best explanation) for that party to meet her burden of persuasion. Or should we say: her burden of the best explanation.
Conclusion
This article makes a twofold contribution to the literature on standards of proof. First, I elaborate on the many descriptive, normative and explanatory considerations in support of an explanation-based approach to standards. Second, I offer novel replies to pressing objections against that same approach.
Several questions remain unanswered. What is the best way to translate standards formulated in terms of explanatory value into jury instructions? How can standards expressed in terms of explanatory value fulfill a function commonly attributed to standards, that of serving as a mechanism to distribute errors? How is explanatory value connected to other more fundamental values, such as ascertaining the truth or achieving procedural fairness? These are important and pressing questions, and I hope to help answer them shortly.
From where we stand, a move away from probabilistic approaches to standards helps avoid much confusion. We should instead think of standards of proof as involving explanatory inferences. In that sense, the party carrying the burden of persuasion is, in fact, carrying the burden of the best explanation.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
