Abstract
When psychological evaluators are asked to provide their expert opinions in legal proceedings, they are expected to do so in an objective and unbiased way. The statutory requirements regarding the admissibility of expert testimony in many countries often cite objectivity and reliability as standards. However, as is true in many realms of human decision-making, the field of forensic psychological assessment is fraught with bias. In this article, we discuss several lines of research that have investigated bias in forensic psychological evaluations. We also discuss emerging lines of research involving methods to measure and reduce bias. We conclude with a call for structured self-monitoring as an important strategy for forensic evaluators to mitigate bias in their work.
Criminal courts are predicated on the review of reliable, objective evidence for fact-finding and decision-making. Forensic psychological evaluations are important components in cases involving mental health issues, and they are therefore subject to strict judicial assumptions of evidentiary reliability and objectivity. Forensic evaluators are required to collect and provide neutral and scientifically credible evidence to the court (in this case, forensic information and opinions). In the United States, when determining whether to admit expert testimony, judges typically adhere to the Daubert standard and thus must make an assessment of whether the testimony is “scientifically valid and can be properly applied to the facts at issue” (Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579, 1993). In these decisions, courts consider a variety of Daubert-related factors such as “whether the theory or technique can be (and has been) tested” and the “known or potential rate of error.” In South Africa, standards of expert testimony were outlined in National Justice Compania Naviera S.A v Prudential Assurance Co Ltd. (Reynard, 2018). These standards specifically state that expert evidence “should be, and should be seen to be, the independent product of the expert uninfluenced as to form or content by the exigencies of litigation.” Furthermore, the standards state that the expert should state all facts and assumptions underlying his or her opinion and should properly qualify that opinion if there is insufficient evidence.
Soon after the Daubert decision, Bergeron (1994) argued that psychiatric testimony did not meet this standard, as it was “neither science nor medicine” and could not provide information in absolute terms, which the law so often demands (p. 224). Bergeron further argued that the most serious flaw in psychiatric testimony was the lack of available base rates, which are “integral to the process of establishing the validity and reliability of any given scientific observation” (p. 225). Faust and Ziskin (1988) also voiced similar criticism of the value of psychological and psychiatric testimony in legal proceedings, citing a lack of uniform methodologies, an absence of base rates, and the presence of common errors in clinical judgment in forensic mental health evaluations. Despite these criticisms, testimony from psychiatrists and psychologists continued to be highly valued by courts and attorneys alike.
Beyond courtroom admissibility standards, ethical guidelines in forensic psychology mandate that evaluators “strive for accuracy, impartiality, fairness, and independence” and, when conducting forensic examinations, “strive to be unbiased and impartial and avoid partisan presentation of unrepresentative, incomplete, or inaccurate evidence that might mislead finders of fact” (American Psychological Association, 2013, pp. 8–9). Similarly, according to the Ethical Rules of Conduct for Practitioners Registered under the Health Professions Act (Health Professions Council of South Africa, 2006), which speak specifically of psycholegal activities, South African evaluators should qualify expert opinions and “clarify the effect of his or her limited information on the reliability and validity of his or her reports and testimony, and limit the nature and extent of his or her findings accordingly” (Annexure 12, s. 69). In lay terms, forensic opinions should rest on the facts of the case, not on which evaluator has conducted the evaluation. Biased evaluators compromise the validity of their opinions as well as the ability of judicial decision-makers to make truly informed, objective decisions. Clearly this can cause dire, unfair consequences for defendants and courtrooms alike.
Threats to objectivity
External bias
Despite this clear reliance on objective and unbiased experts, the criminal court process is fraught with threats to reliability. Several studies have shown that evaluator opinions are influenced simply by which side (defense or prosecution) retains their services (Boccaccini, Chevalier, Murrie, & Varela, 2017; Chevalier, Boccaccini, Murrie, & Varela, 2015; Murrie, Boccaccini, Guarnera, & Rufino, 2013). Even when evaluators use objective structured risk assessments designed to mitigate subjective bias, evaluators retained by the prosecution tend to assign higher scores on these measures (Murrie et al., 2009). Other authors have compiled a wealth of data indicating that experts’ opinions are substantially influenced by the fees those experts earn, giving rise to the concept of the “hired gun” in the courtroom (Bergeron, 1994; Faust & Ziskin, 1988; Hagen, 1997).
Furthermore, Mossman (2013) found individual differences in decision thresholds between evaluators in competency to stand trial evaluations. Although he acknowledged that individual differences in feelings and beliefs may contribute to differences in decision thresholds, he also discussed several other influential variables, such as internal and external expectations and conventions in specific agencies, knowledge of local judicial decision-making trends, and various understandings of constructs underlying adjudicative competence. While these threats are plausible, the impact of these broader systemic variables has yet to be sufficiently studied.
Internal bias
In addition to these external threats to objectivity, evaluators are faced with several internal threats as well. Previous research has found that individual evaluators show significant variability in their base rates of specific decisions or opinions in forensic evaluations. In a sample of 59 evaluators who conducted a total of 4498 evaluations of legal sanity, 7 evaluators opined the individual was sane in 100% of their evaluations, whereas 3 evaluators opined the individual was sane in 50% of their evaluations (Murrie & Warren, 2005). In a sample of 15 evaluators who each completed at least 100 competency to stand trial evaluations, rates of incompetency findings across evaluators ranged from 1.7% to 27.9% (Murrie, Boccaccini, Zapf, Warren, & Henderson, 2008). Similar discrepancies among evaluators have also been found in the use of the Psychopathy Checklist, Revised (PCL-R; Hare, 2003), a measure commonly used in many risk assessments. Specifically, Boccaccini, Turner, and Murrie (2008) found that some evaluators assigned consistently higher scores on the PCL-R compared with other evaluators. These studies suggest that an evaluator’s forensic opinion of a defendant may indeed partially depend on who provides it.
The causes of differential base rates among individual evaluators have been studied from various angles. Miller and colleagues found that evaluators who were more agreeable, as measured by a self-report personality questionnaire, were more likely to assign lower scores on the PCL-R (Miller, Rufino, Boccaccini, Jackson, & Murrie, 2011). More recently, Vera and colleagues found that evaluators instructed to use expressive empathy techniques were more likely to rate examinees as less psychopathic, more conscientious, and more genuine (Vera et al., 2018). It seems that personality characteristics of the evaluators themselves may also influence their forensic opinions.
Emerging research suggests that race and ethnicity may also be quite influential in evaluator decision-making. Many forensic evaluators determining insanity acquittees’ readiness for hospital discharge relied predominantly on the racial and ethnic status of the person requesting release (Callahan & Silver, 1998). In the state of Hawaii, McCallum, MacLean, and Gowensmith (2015) found that evaluators opined Asian misdemeanant defendants to be incompetent at significantly higher rates than other ethnic groups, even after the effects of language deficits and inequitable arrest base rates were removed. Others have found differences in rates of opinions between Black and White defendants when tracking their own opinions on past forensic evaluations (Parker, 2016). Of course, these studies are not necessarily evidence of implicit racism or sexism on the part of evaluators. Broader issues—rates of arrest and referrals, responsiveness to mental health treatment, access to care—likely influence these results as well. Still, it is important to be mindful that these potential biases are inherent in the work of forensic psychological evaluation.
Reasons for bias
It should not be surprising that the context of a forensic assessment begets bias. As mentioned, the nature of the adversarial system naturally results in either implicit or explicit bias. In addition, the forensic evaluation often requires evaluators to formulate an opinion with incomplete data in a limited amount of time and with imperfect assessment techniques (Neal & Grisso, 2014; Otto, 2013). When arriving at a relatively quick decision with incomplete information, the human mind takes unconscious shortcuts called heuristics (Tversky & Kahneman, 1974). Neal and Grisso (2014) describe three types of heuristics common in forensic evaluations that lead to biased decision-making: representativeness, availability, and anchoring. The representativeness heuristic leads evaluators to overvalue evidence that resembles a prototype. The availability heuristic leads evaluators to overestimate the probability of an outcome based on other cases that are easily recalled. The anchoring heuristic leads evaluators to be more influenced by information they gather first than information gathered later. Despite these heuristics operating at a generally unconscious level, they can nonetheless influence evaluator opinions in significant ways.
The difficulty of recognizing bias in ourselves, coupled with the relative ease with which we are able to identify bias in others, is described by Pronin and Kugler (2007) as the “bias blind-spot.” When forensic evaluators have been surveyed about the possibility of bias in their work, most indicate they are free of bias or able to correct for any bias they might have (Commons, Miller, & Gutheil, 2004). A more recent survey revealed that almost all of the responding forensic evaluators believed they had the ability to be objective and unbiased in their work, although they varied in their acknowledgment of potential threats to their own objectivity. Conversely, every respondent in the survey described the existence of bias in the work of others, but not in their own (Neal & Brodsky, 2016). A similar study reported that forensic evaluators found fault in other evaluators’ ratings of psychopathy scores, but eschewed the potential for bias within their own opinions (Murrie, 2017).
Pronin and Kugler (2007) found that a potential explanation for this type of cognitive error is that we are more likely to use introspection, rather than behavioral outcomes, to evaluate our own biases. In their study of a sample of 247 undergraduate students, they found that participants asked to evaluate their own bias reported considering their own thoughts and motivations rather than their behavior. However, when asked to suggest how others should evaluate their potential biases, respondents indicated the opposite. The authors suggest that introspection is not only a poor strategy for mitigating one’s own bias, but likely exacerbates the bias blind spot. They recommend the use of behavioral and observable anchors to more accurately gauge and mitigate the presence of bias, rather than introspection.
Emerging data from personal practice examples
The bias blind spot data suggest that the ability of evaluators in the field to adequately monitor their own potential for bias in their personal work is extremely poor. A number of authors have therefore recommended that evaluators keep a record of their own evaluations and outcomes (Bergeron, 1994; Murrie & Warren, 2005; Parker, 2016). Such a record allows for an objective accounting of the potential for bias. Some early data from practitioners illustrate the importance of this approach. After tracking his own forensic evaluations over a 5-year period, Parker (2016) found that he had been “more likely to find white defendants competent to stand trial than black defendants, more likely to find black men competent to stand trial than black women, and more likely to find white women competent to stand trial than black women” (p. 412). Parker also discovered that he had been more likely to find female defendants insane than male defendants and more likely to find White women insane than White men.
A second example comes from the primary author’s own database. Gowensmith has maintained a database that has included more than 25 independent variables and 3 primary outcome variables across more than 100 forensic evaluations. Independent variables are organized into three different categories. These categories represent defendant demographic information (e.g., defendant age, ethnicity, gender, severity and nature of charges, diagnoses), evaluator information (e.g., employer, number of evaluators), and evaluation information (e.g., fee paid, referral source, type of evaluation). Outcome variables include the ultimate forensic opinion, whether the opinion was favorable to the side retaining the expert, and whether the opinion was favorable to the defense counsel. The database was personally constructed using widely available spreadsheet software (Microsoft Excel) as part of the author’s private practice, illustrating how easily such a database can be developed even by practicing professionals. The database includes cases from multiple referral sources.
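A tracking log of this kind can also be kept in a simple scripted form. The following is a minimal sketch in Python, not a reproduction of the author’s spreadsheet: the record layout only mirrors the three variable categories and the outcome variables described above, and every field name and the example case are hypothetical.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class EvaluationRecord:
    """One row of a personal evaluation log (all field names are hypothetical)."""
    # Defendant information
    defendant_age: int
    defendant_ethnicity: str
    defendant_gender: str
    charge_severity: str
    diagnosis: str
    # Evaluator information
    evaluator_employer: str
    number_of_evaluators: int
    # Evaluation information
    evaluation_type: str
    referral_source: str
    fee_paid: bool
    # Outcome variables
    forensic_opinion: str
    favorable_to_retaining_side: bool
    favorable_to_defense: bool


def records_to_csv(records):
    """Serialize records to CSV text that spreadsheet software can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(EvaluationRecord)]
    )
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# A single made-up case, for illustration only.
example = EvaluationRecord(
    defendant_age=34,
    defendant_ethnicity="not recorded",
    defendant_gender="male",
    charge_severity="felony",
    diagnosis="schizophrenia",
    evaluator_employer="private practice",
    number_of_evaluators=1,
    evaluation_type="competency to stand trial",
    referral_source="defense",
    fee_paid=True,
    forensic_opinion="incompetent",
    favorable_to_retaining_side=True,
    favorable_to_defense=True,
)
csv_text = records_to_csv([example])
```

Keeping one row per evaluation, with outcomes stored alongside the case variables, is what later makes it possible to compare opinion rates across any of the recorded variables.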
A 2018 analysis (Gowensmith, Smith, & Yeager, 2018) found that most independent variables did not differ across outcome variables (i.e., there were no significant differences in forensic opinions when comparing defendant ethnicities, genders, severity of charges, etc.). However, some significant differences did arise. Defendants diagnosed with personality disorders and/or substance-related disorders were significantly more likely to receive an outcome that was unfavorable to the defense counsel. Also, defendants with a personality disorder were more likely to be ruled as competent to proceed than those without. Finally, final forensic opinions were at an increased likelihood of being favorable to the retaining attorney in cases in which a fee was paid to the evaluator. Cases in which the evaluator worked for a salary (and therefore did not personally receive payment for a case) did not show these differentially favorable opinions. Further analyses indicated that the actual difference in favorable opinion between these two types of cases was small, resulting in an 18% increase in favorable opinions in those cases in which a fee was paid.
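To illustrate how a difference of this kind might be examined in a small personal dataset, the sketch below compares the proportion of favorable opinions in fee-paid versus salaried cases and applies a two-sided Fisher exact test. This is an assumption made for illustration: the source does not state which statistical test the 2018 analysis used, and the counts shown are hypothetical rather than taken from that study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def table_probability(x):
        # Hypergeometric probability of x "favorable" cases in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    observed = table_probability(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical counts, NOT taken from the study.
fee_favorable, fee_unfavorable = 24, 16          # fee-paid cases
salary_favorable, salary_unfavorable = 18, 22    # salaried cases

fee_rate = fee_favorable / (fee_favorable + fee_unfavorable)
salary_rate = salary_favorable / (salary_favorable + salary_unfavorable)
difference = fee_rate - salary_rate
p_value = fisher_exact_two_sided(
    fee_favorable, fee_unfavorable, salary_favorable, salary_unfavorable
)
print(f"fee-paid: {fee_rate:.2f}, salaried: {salary_rate:.2f}, "
      f"difference: {difference:.2f}, p = {p_value:.3f}")
```

An exact test is used here only because personal databases tend to contain few cases per cell; other approaches may suit larger datasets better.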
The database was expanded to analyze similar variables in cases from a forensic mental health evaluation agency (Gowensmith, McCallum, Johnson, & Jennings, 2019). This study expanded the database to include cases from four evaluators across the agency (n = 220 evaluations). The results were largely similar to those found above; again, most independent variables showed no influence on the outcome variables. Opinions were not affected by variables such as defendant ethnicity, gender, level of charges, evaluation setting, or most others. However, although the presence of a fee was no longer significantly related to favorable opinions, other significant evaluator and evaluation factors were. For example, evaluators were more likely to return favorable findings to attorneys with whom they had worked before (regardless of whether the attorney was defense counsel or a prosecutor).
To the best of our knowledge, the aforementioned self-investigative studies are the only such projects to be published in the field. Other forensic psychologists have mentioned anecdotally that their evaluations and opinions are tracked independently, but it is unknown just how pervasive this practice is among evaluators. It seems to be an important practice for any forensic practitioner, even if only for self-reflection and development.
Furthermore, monitoring potential areas and impacts of bias has added value in international settings. From a research perspective, it is important to better understand how biases may be shaped by the unique cultural histories of other countries and communities. Countries vary widely in histories of racial prejudice, segregation, sexism, and other social factors. As a result, practitioners’ evaluations may be influenced by these cultural mores or beliefs. Understanding how practitioners fit (and do not fit) with the larger cultural norms around them could provide a great deal of important information. As such, the self-investigative work discussed above is important both domestically and internationally.
Practical solutions for minimizing and correcting bias
In their survey of forensic psychologists, Neal and Brodsky (2016) asked respondents to describe their current strategies, if any, to correct or avoid potential bias in their forensic work. Endorsed strategies included taking pride in one’s professional identity, receiving didactic training on objectivity, staying abreast of professional literature and ethical guidelines, seeking feedback from supervisors or colleagues, collecting multiple sources of data when forming opinions, spreading the evaluation over time, critically examining one’s conclusions, seeking disconfirming evidence, being aware of common sources of bias, maintaining emotional disengagement, limiting contact with the retaining party, and limiting the scope of one’s opinions. A troubling finding of the survey was that respondents also rated introspection as a primary strategy to correct for bias. As was argued by Pronin and Kugler (2007) in their bias blind-spot study, introspection may not only be a poor strategy but may actually exacerbate bias. However, another strategy mentioned was tracking one’s own decision-making trends to evaluate one’s own base rates. This strategy involves evaluating one’s own behaviors (decisions and opinions in forensic evaluations), rather than one’s own beliefs and motives (introspection), and it may be a strong strategy to mitigate the bias blind spot.
To be able to evaluate one’s own behavioral outcomes accurately, it is necessary to have normative comparisons. In other words, known base rates are critical. Earlier critiques of psychiatric and psychological testimony cited a lack of base rates as a primary reason such testimony did not meet admissibility standards (Bergeron, 1994; Faust & Ziskin, 1988). Knowing and referencing base rates is a strategy suggested by Daniel Kahneman, a renowned expert in the field of cognitive biases and heuristics (Kahneman, 2011). Murrie and Warren (2005) also urge evaluators to maintain a record of forensic evaluations and opinions and regularly review it. Both Parker and Gowensmith have echoed this recommendation, encouraging practitioners to maintain their own databases with both independent and dependent variables related to base rates and bias. Keeping such records allows clinicians to calculate their own base rates and compare them with base rates of their geographical area (if the information is available). Practitioners can also easily calculate differential proportions for particular analyses of interest (such as the differential proportion of cases opining White vs Black defendants as competent vs incompetent).
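The arithmetic involved is modest. As a minimal sketch, assuming each evaluation is stored as a simple record, a personal base rate and a differential proportion between two groups of cases could be computed as follows. The field names (`setting`, `opinion`) and the six log entries are hypothetical and serve only to show the calculation.

```python
from collections import Counter


def opinion_rate_by_group(records, group_field, opinion_field, target_opinion):
    """Return {group: share of that group's cases with the target opinion}."""
    totals, hits = Counter(), Counter()
    for record in records:
        group = record[group_field]
        totals[group] += 1
        if record[opinion_field] == target_opinion:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Made-up log entries for illustration only; field names are hypothetical.
log = [
    {"setting": "jail", "opinion": "competent"},
    {"setting": "jail", "opinion": "competent"},
    {"setting": "jail", "opinion": "competent"},
    {"setting": "jail", "opinion": "incompetent"},
    {"setting": "hospital", "opinion": "incompetent"},
    {"setting": "hospital", "opinion": "competent"},
]

# Overall personal base rate of incompetency opinions.
overall_rate = sum(r["opinion"] == "incompetent" for r in log) / len(log)

# Rate within each group, and the differential proportion between groups.
rates = opinion_rate_by_group(log, "setting", "opinion", "incompetent")
differential = rates["hospital"] - rates["jail"]
```

The same function can be pointed at any recorded variable (referral source, fee status, defendant demographics) by changing `group_field`. Interpreting the resulting difference still requires the normative comparisons discussed above.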
These types of databases also allow evaluators to calculate clinician-attorney agreement rates, or what Brodsky (1991, 1999) described as an “objectivity quotient.” The objectivity quotient is a measure of the percentage of cases in which the evaluator’s opinion aligns with, or is contrary to, the position of the retaining party. Murrie and Warren (2005) offered the possibility of several contextual factors that could skew results of this quotient between evaluators, such as professional setting, professional niche, or the presence of attorneys that cherry-pick evaluators likely to return favorable opinions. Nevertheless, calculating and monitoring one’s own base rates allows evaluators to consider firm behavioral outcome data rather than relying on anecdotal, fallible self-reflection. And as Brodsky (1999) pointed out, having such data could prove helpful to an evaluator on the witness stand when asked about their own bias.
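One way such an agreement rate might be computed from a case log is sketched below. This is one reading of the measure rather than Brodsky’s own formula, which is not reproduced here: it counts only cases retained by the defense or the prosecution and reports the percentage in which the opinion favored the retaining side. The field names and cases are hypothetical.

```python
def objectivity_quotient(cases):
    """Percentage of retained cases whose opinion favored the retaining party."""
    # Court-appointed cases have no retaining side, so they are excluded.
    retained = [c for c in cases if c["retained_by"] in ("defense", "prosecution")]
    if not retained:
        raise ValueError("no cases retained by defense or prosecution")
    agreeing = sum(1 for c in retained if c["opinion_favors"] == c["retained_by"])
    return 100.0 * agreeing / len(retained)


# Made-up cases for illustration only.
cases = [
    {"retained_by": "defense", "opinion_favors": "defense"},
    {"retained_by": "defense", "opinion_favors": "prosecution"},
    {"retained_by": "prosecution", "opinion_favors": "prosecution"},
    {"retained_by": "prosecution", "opinion_favors": "prosecution"},
    {"retained_by": "court", "opinion_favors": "defense"},
]
quotient = objectivity_quotient(cases)
```

A value near 100 would mean the evaluator almost always agrees with the retaining party. As noted above, contextual factors such as attorney selection of cases can inflate the figure without any bias on the evaluator’s part.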
If such strategies become widely practiced and the information is made public, the resulting data could double as a quality control system. For example, if a specific evaluator truly is a “hired gun,” the evidence of bias could become clear in his or her opinion base rates and the court judge could weigh the opinion of the expert accordingly. While this idea might make forensic evaluators uncomfortable, the practice could provide an extremely useful method to promote accountability in the field and strive to adhere to a high ethical standard. This approach could even be expanded to the courts themselves; at a meta-level, courts or attorneys could be analyzed to determine any inequities in the experts that they retain (i.e., guarding against courts or lawyers seeking those same “hired guns”).
Finally, Neal and Grisso (2014) describe several tenets of social psychology that guard against bias that are applicable to forensic psychology as well. One such “debiasing strategy” is to “think like a statistician,” defined as knowing the relevant base rates and subsequently critically evaluating the strength of the evidence for the evaluator’s clinical opinion (Neal & Grisso, 2014). Another strategy discussed is to “consider the opposite,” a strategy found to mitigate one’s bias to seek confirming information (Koehler, 1991; Lord, Lepper, & Preston, 1985; Neal & Grisso, 2014). A third strategy is to use structured and systematic methods, such as a structured assessment instrument. Although even these measures are subject to bias, as discussed above, it may be better to utilize them than to rely on clinical judgment alone.
Discussion
Evidence of bias may be difficult to accept for the evaluators involved. Still, evidence that has been collected and disseminated by evaluators clearly shows potential concerns regarding bias. However, several issues must be explored before demonizing specific professionals.
Evaluator bias or system injustice?
First, more sophisticated analyses and interpretations should be considered before taking the above practitioners’ results at face value. These are preliminary datasets and do not comprise the large numbers of evaluations and variables necessary to make conclusive determinations. Furthermore, some discretion is urged when interpreting these results. Bayesian approaches should also be considered in these sorts of circumstances. In Parker’s (2016) defense, it is certainly possible that differential rates of “likely positive” cases (i.e., defendants who are truly mentally ill and incompetent to stand trial) may have been referred to him by the criminal justice system. The United States has a troubling history of differential access to mental healthcare as well as criminal prosecution along color lines. The problem may then lie less with the evaluator and more with the system within which he or she works: if significantly more truly incompetent Black defendants are sent for evaluation, then the resulting differential rates of incompetency opinions would be a reflection of professional acumen, not bias. In a similar vein, although fees and previous attorney relationships initially appear troubling, it is entirely plausible in professional practice that attorneys send a higher proportion of “likely positive” cases to professionals for private practice evaluation. Marshaling reimbursement and other resources for private forensic evaluations is not an easy undertaking; attorneys may legitimately believe that many of these cases are more likely to be found incompetent or legally insane than the run-of-the-mill cases offered more cavalierly to government-employed evaluators.
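The referral argument can be made concrete with a short Bayesian calculation. The sketch below assumes a hypothetical evaluator whose accuracy is identical for every defendant and varies only the proportion of truly incompetent defendants in two referral streams. All of the numbers are invented for illustration and are not estimates from any study.

```python
def expected_positive_rate(prevalence, sensitivity, specificity):
    """Share of referrals expected to receive an 'incompetent' opinion."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives + false_positives


def posterior_truly_incompetent(prevalence, sensitivity, specificity):
    """Bayes' rule: P(truly incompetent | evaluator opines incompetent)."""
    return prevalence * sensitivity / expected_positive_rate(
        prevalence, sensitivity, specificity
    )


# All numbers are hypothetical and chosen only to illustrate the logic.
SENSITIVITY = 0.90   # evaluator detects 90% of truly incompetent defendants
SPECIFICITY = 0.90   # evaluator clears 90% of truly competent defendants

for label, prevalence in [("referral stream A", 0.20), ("referral stream B", 0.40)]:
    rate = expected_positive_rate(prevalence, SENSITIVITY, SPECIFICITY)
    posterior = posterior_truly_incompetent(prevalence, SENSITIVITY, SPECIFICITY)
    print(f"{label}: opinion rate {rate:.2f}, posterior {posterior:.2f}")
```

Under these assumed values, the same evaluator would opine incompetence in 26% of referrals from stream A and 42% of referrals from stream B. Differential opinion rates alone therefore cannot distinguish evaluator bias from differences in who is referred.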
However, it is also important to consider that the above explanations may not be accurate. The most insidious part of bias is rationalizing unwanted results. While the above may be true, it is also possible that some true bias may indeed exist—whether it stems from racism, financial gain, adversarial allegiance, or some other source. This is the literal definition of the bias blind spot. It applies to everyone, and it cannot be ignored because other explanations make us “feel better” about ourselves.
Understanding emerging base rates
The only factual data we have in the above studies is that base rates differ among certain independent variables. However, the causes of those differences are difficult to discern. Some differences could stem from bias, while some could stem from legitimate sources. The existence of comparative norms would aid the accurate interpretation of these differences. We turn to the differential outcomes of defendants with personality disorders as an example. The findings from the Gowensmith et al. (2018) and Gowensmith et al. (2019) studies suggest that individuals diagnosed with personality disorders were significantly more likely to be found competent to stand trial and more likely to receive opinions unfavorable to the defense attorney. Is this surprising, or even concerning? Some could argue that these results are unremarkable. Attorneys may mistake personality-disordered individuals for those with an acute mental illness; after all, many are impulsive, emotionally labile, confrontational, dismissive, suspicious, and/or grandiose. However, those ingrained personality characteristics are not typically grounds for findings of incompetency. Therefore, one might expect the “hit rate” for defendants with personality disorders to be significantly lower than that of the larger pool of referrals. The same logic could apply to other variables.
Without broader comparative norms, however, these interpretations are speculative. Unfortunately, normative base rates for these sorts of variables are simply not being collected or disseminated. At this point, forensic psychology is operating in relative ignorance about how these different variables typically affect forensic outcomes. Having these norms is critical in interpreting the variables from a particular evaluator or agency (Bergeron, 1994). For example, what are the normative differential base rates for opinions of competency or sanity in evaluations conducted face-to-face versus those conducted via tele-video? What are the normative differential base rates for forensic opinions for evaluations conducted in jails versus hospitals versus community settings? Or those conducted for private pay versus those conducted for salaried evaluators? Do high-profile cases, or cases containing significant public outrage, affect evaluator opinions in unexpected ways? Each of us can draw preliminary conclusions about expected base rates in these different situations, but without normative base rates, certainty is replaced by anecdotal assumptions.
Moreover, it is possible that some seemingly “biased” findings may indeed be reflective of acceptable base rates. Some cases are likely to have significantly high or low opinion rates. In cases of juvenile transfers and waivers, for example, it is possible that high rates of recommendations of keeping younger offenders in the juvenile justice system (rather than “waiving” them to an adult criminal court) may be acceptable. Continuing this example, it is also likely that some case-specific variables may influence those opinions systematically—age of the offender, type of offense, political landscapes in which the evaluation is conducted, and so on. Other types of psycholegal questions, offenders, and/or evaluation circumstances are likely to disproportionately affect forensic opinions. It is unrealistic to determine what norms are absolutely acceptable versus unacceptable; however, an important step may be to systematically and broadly collect forensic opinions and associated evaluation variables to better determine existing base rates and the variables that affect them. Further research and discussion can then shape the parameters of “acceptable” versus “unacceptable” norms.
Cultural considerations
The bulk of the scientific literature on bias stems from samples found in the United States. These findings may not always be applicable to other systems. The United States operates an adversarial judicial system and holds expert testimony to certain unique standards. Moreover, racial disparities and injustices in the United States differ from those in many other areas of the world. Finally, certain psycholegal referral questions are unique to the United States, such as Miranda-related issues, police confessions, juvenile transfer hearings, and Miller-related issues. It would be short-sighted to assume findings generated exclusively in the United States would generalize to other countries.
Similarly, other countries have systems and psycholegal questions that are unique. A full exploration of these differences is beyond the scope of this article; readers are encouraged to consult the summary in Simon and Ahn-Redding (2008) for more information on international and cultural differences in forensic psychology. However, we highlight an example from South Africa for illustration. In South Africa, when a person alleges that he or she has been a victim of sexual assault or rape, the court must be satisfied that the victim is competent to testify before he or she can be admitted as a witness in the proceedings. A large number of these victims are intellectually disabled and therefore subject to a forensic evaluation to determine their competence to testify (Pillay, 2012). Anecdotal evidence suggests that remaining purely objective in some of these cases is extremely difficult. Assessing the potential for bias in these evaluations, as well as deriving normative base rates, could be an important tool in understanding expectations for evaluators and opinions in these types of cases.
In a larger sense, having different base rates for different countries, jurisdictions, and systems is important to better understand how individual evaluators (or groups of evaluators in a certain system) compare and contrast with meaningful norms. Even within the United States, state statutes vary widely; it would be inappropriate to compare rates of insanity findings in states with two-pronged statutes versus those that utilize a purely M’Naghten standard, for example. To date, forensic psychology has been hampered in its ability to derive differing norms and base rates across these jurisdictions, and so accurately comparing rates of opinions from particular evaluators has been extremely difficult. With the advent of big data, handheld mobile applications and devices, and electronic record-keeping, the potential exists for these types of data comparisons to finally be made.
Conclusion
Given the robust research suggesting that pure objectivity is nearly impossible in the context of a forensic evaluation, it seems unethical for experts in the field to claim otherwise. At the extreme, an expert could risk perjury if he or she claims during testimony to have arrived at a purely objective opinion without possessing an adequately scientific foundation. A more honest and ethical approach is to acknowledge what we now know about our own individual biases and how those biases may have affected our opinions. For example, some experts may choose to excuse themselves from certain cases if their objectivity cannot be empirically supported (e.g., death penalty cases, cases involving child murder).
It is likely that judges would appreciate this information as well, as it assists them in fulfilling their gatekeeping roles for admitting expert testimony. Testimony from experts in any field is often critical in determining legal outcomes. However, offering such testimony under the guise of pure objectivity is misleading to the judge, jury, defendants, attorneys, and also the public. Importantly, as is stated in the language of the Daubert decision, the judge may deem expert testimony admissible even if the technique has known error rates. The Daubert standard is a clear acknowledgment of systematic error. Bias is systematic error—and the emerging evidence shows that experts are not immune to it. Paradoxically, acknowledging known error rates in one’s forensic evaluation work thus improves the scientific credibility of one’s testimony.
Acknowledging bias, however, is not a simple step. Not only does bias exist, but it is also largely undetectable to the person who holds it. Humans are simply not adept at monitoring their own biases through introspection. Unconscious biases even contribute to the erroneous conclusion that we are unbiased. Because bias so often operates unconsciously, we are likewise unaware of how it affects our decisions and behavior. In the context of forensic evaluations, most evaluators would likely be surprised at the influence of bias in their evaluations (if they were brave enough to look for it in the first place). However, in line with Daubert, judges should be able to rely on experts not only to provide a forensic opinion but also to explain how various error rates can affect that opinion. This explanation should include not only the error rates of a testing instrument but also the evaluator’s own base rates of such opinions, any normative base rates available, and research related to the existence of bias in the field.
A forensic evaluator cannot adequately accomplish the above without tracking his or her own evaluations and opinions across a variety of variables (e.g., defendant ethnicities, referral sources, amount of fees charged, charge types). Because introspection has been shown to be a poor approach to identifying and rectifying bias, professionals should instead consider recording and analyzing objective data. In forensic psychology, this means focusing on behavioral evaluation variables and outcomes (i.e., specific evaluation factors, decisions, and opinions). As Parker (2016), Gowensmith et al. (2018), and Gowensmith et al. (2019) have shown, such analyses can be illuminating. Only through such methodology will evaluators be able to accurately calculate rates of opinions and provide such information to the trier of fact. We argue that doing so would strengthen the admissibility of testimony rather than weaken it.
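As one illustration of the kind of record-keeping described above, the following minimal Python sketch tallies how often an evaluator reaches a given opinion, overall and within each level of a tracked variable. Every field name, category label, and record in it is hypothetical and invented purely for illustration; none of it is drawn from the studies cited in this article, and it is not intended as a validated method.

```python
from collections import defaultdict

# HYPOTHETICAL evaluation log. Field names, labels, and values are
# invented for illustration only and do not represent real cases.
records = [
    {"referral_source": "defense", "opinion": "incompetent"},
    {"referral_source": "defense", "opinion": "incompetent"},
    {"referral_source": "defense", "opinion": "competent"},
    {"referral_source": "prosecution", "opinion": "competent"},
    {"referral_source": "prosecution", "opinion": "competent"},
    {"referral_source": "court", "opinion": "incompetent"},
    {"referral_source": "court", "opinion": "competent"},
    {"referral_source": "court", "opinion": "competent"},
]


def overall_rate(records, target_opinion):
    """Proportion of all evaluations that ended in target_opinion."""
    hits = sum(1 for r in records if r["opinion"] == target_opinion)
    return hits / len(records)


def opinion_rates(records, group_by, target_opinion):
    """Map each level of group_by to (target count, total count, rate)."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for r in records:
        level = r[group_by]
        totals[level] += 1
        if r["opinion"] == target_opinion:
            hits[level] += 1
    return {lvl: (hits[lvl], totals[lvl], hits[lvl] / totals[lvl])
            for lvl in totals}


if __name__ == "__main__":
    print(f"Overall: {overall_rate(records, 'incompetent'):.3f}")
    by_source = opinion_rates(records, "referral_source", "incompetent")
    for level, (n_hit, n_total, rate) in sorted(by_source.items()):
        print(f"{level}: {n_hit}/{n_total} = {rate:.3f}")
```

Rates computed from only a handful of evaluations are unstable, so a tally of this kind becomes informative only as an evaluator's records accumulate and as normative base rates for the relevant jurisdiction become available for comparison.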
Unfortunately, this practice is not widely accepted and is certainly not a current standard of practice. If it were, our field would have far greater knowledge about how individual base rates compare with base rates in the field. Furthermore, we would have rich information about how base rates vary with factors such as geographic region, relationships with court personnel, experience, or workplace. Forensic evaluators are encouraged to begin tracking their own evaluations and opinions, and the field of forensic psychology should work to aggregate these data for a broader and deeper understanding of these issues. Only through this sort of approach can forensic psychology truly and objectively mitigate internal and external biases that threaten the impartiality of our forensic expertise.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
