Abstract
A large national sample of 4,775 reports of child physical and sexual abuse made in Israel in 2014 was analyzed in order to examine whether assessments of credibility would vary according to abuse type, physical or sexual, and whether child and event characteristics contributing to the probability that reports of abuse would be determined as credible would be similar or different in child physical abuse (CPA) and child sexual abuse (CSA) cases. Results revealed that CPA reports were less likely to be viewed as credible (41.9%) compared to CSA reports (56.7%). Multigroup path analysis, however, indicated equivalence in predicting factors. In a unified model for both types of abuse, salient predictors of a credible judgment were older age, lack of a cognitive delay, and the alleged abusive event being a onetime less severe act. Over and beyond the effects of these factors, abuse type significantly contributed to the prediction of credibility judgments.
The testimony of children about experiences of abuse plays a critical role in the substantiation of related allegations. Central within this context is the issue of credibility assessment: Professionals experience great difficulties distinguishing between children’s truthful and false statements of abuse, and as of yet, there is no method providing satisfactory levels of accuracy to guide their decisions (Vrij, Granhag, & Porter, 2010). Very few studies, however, have investigated the credibility judgments of such reports, and the research that does exist focused almost exclusively on cases of child sexual abuse (CSA). Yet, cases of child physical abuse (CPA) and the associated forensic investigations are far more common than those of CSA. Extending previous work on credibility assessment in CSA cases, the present study reports a comparative analysis of reports deemed “credible” across abuse types, physical and sexual, examining all investigations carried out in Israel in 2014. The study aims to examine whether CPA reports are evaluated differently than CSA reports in terms of credibility, which characteristics of the child and the event contribute to the probability that reports of abuse are classified as credible, and to what extent those characteristics are shared between the two types of abuse.
Abuse investigations constitute a primary step in the legal procedure aimed at protecting abused children from further harm. In many cases, these investigations, whether of CSA or CPA, rely heavily on children’s testimony because additional evidence is often lacking (Herman, 2010). Assessing the credibility of children’s statements, however, is a challenging task, and accumulated evidence suggests that the ability of professionals to accurately evaluate a child’s testimony is in doubt (e.g., Zajac, Garry, London, Goodyear-Smith, & Hayne, 2013).
Analog studies of staged events (rather than forensic investigations of alleged crimes) employing varied methods to examine professionals’ credibility ratings have shown that on average, a third of the judgments were incorrect (e.g., Leach, Talwar, Lee, Bala, & Lindsay, 2004). These findings suggest that in many cases, professionals have difficulty in distinguishing between events that have actually taken place and those that have not (Leichtman & Ceci, 1995). The few available field studies, using transcripts from real forensic interviews with alleged CSA victims, further support these findings (Jackson & Nuttall, 1993). Hershkowitz, Fisher, Lamb, and Horowitz (2007) have shown that the use of the National Institute of Child Health and Human Development (NICHD) investigative protocol to interview the child produced a major improvement in levels of accuracy, interrater reliability, and the confidence interviewers have in their decisions. Nevertheless, many judgments were still incorrect, and an additional substantial portion (16.7%; excluded from accuracy calculation) was rated as “no judgment possible.” One must bear in mind though that the main objective of child investigators when assessing credibility is to determine whether there is sufficient cause for further investigation rather than to ascertain the truthfulness of children’s statements beyond reasonable doubt. As available findings, from analog studies as well as from field studies, have examined decisions of the latter nature, rather than of the former, they may not fully generalize to real-life decisions made by forensic investigators.
An incorrect assessment, whether it be a true report of abuse considered as false (false negative) or a false report of abuse considered as true (false positive), can have devastating consequences for the reporting child and his or her family as well as for the alleged perpetrator (Sbraga & O’Donohue, 2003). In addition, a no judgment possible assessment on the grounds of insufficient evidentiary material may result in the closing of the case or in leaving vulnerable children unprotected and exposed to continued abuse.
Although false abuse allegations made by children in forensic interviews are considered to be the rare exception (Trocme & Bala, 2005), a considerable percentage of CSA allegations, as high as 42.1% (Haskett, Wayland, Hutcheson, & Tavana, 1995; Melkman, Hershkowitz, & Zur, 2017), is assessed as either “unreliable” or “inconclusive.” These findings seem to suggest that many true cases of sexual abuse are screened out by the welfare and justice systems, leaving little recourse for the victims. Although it has been recommended to strengthen the credibility assessment with independent case evidence (Lamb, Sternberg, Esplin, Hershkowitz, & Orbach, 1997), in fact the police may refrain from gathering evidence when the child investigator rates the case as unreliable or inconclusive.
Comparable data regarding credibility assessment of CPA reports are currently lacking. Although there is evidence to suggest that such reports may be even less likely to be judged as credible than CSA reports, findings are mixed (Ahern, Hershkowitz, Lamb, Blasbalg, & Winstanley, 2014; Hershkowitz & Elul, 1999). Some studies indicate that children may experience greater difficulty in providing detailed accounts of physical abuse (Hershkowitz & Elul, 1999) and do so with less consistency (Ghetti, Goodman, Eisen, Qin, & Davis, 2002). This apparent reluctance to disclose physical abuse, rather than sexual abuse, implied in these findings, has been attributed to the fact that physical abuse is almost always perpetrated by parents (Hershkowitz, Horowitz, & Lamb, 2005), and that compared to sexually abused children, physically abused children are more likely to be disclosing abuse for the first time in the investigative interview (Rush, Lyon, Ahern, & Quas, 2014). However, other (smaller-scale) studies have documented opposite trends, with suspected CSA victims less likely to disclose abuse than suspected CPA victims (Ahern et al., 2014) and, having disclosed the abuse, providing less information about it (Azad & Leander, 2015). Furthermore, Cross, Finkelhor, and Ormrod (2005), in the only study that to our knowledge has examined credibility assessment in both types of abuse, found physical abuse more likely to be viewed as credible, though observed differences were not statistically significant. Clearly, in order to lower the risk of under- (or over-) identifying the physical abuse of children due to incorrect decision-making, there is a need for a better understanding of the factors associated with professionals’ ability to support the credibility of the testimonies.
Child and Event Characteristics Related to Credibility Assessment
Research on credibility assessment of CPA reports is still nascent and has little to offer in terms of factors that may influence such assessments. However, the literature focusing on CSA has tied several characteristics of the child and the abusive event to judgments of credibility. Age is the most powerful and consistent predictor of credible judgments, with reports made by older children, particularly school-aged children, substantially more likely to be judged as credible (Melkman et al., 2017). Although young children, as young as three, have been shown to possess the basic requisite cognitive, attentional, verbal, and communicative skills to provide reliable accounts of abuse experiences (Lamb, Hershkowitz, Orbach, & Esplin, 2008), their developing capacities in these domains limit the amount of new and relevant information they provide (Hershkowitz, Lamb, Orbach, Katz, & Horowitz, 2012). Indeed, compared with older children, preschoolers recall significantly less information, particularly in response to free recall prompts (Hershkowitz et al., 2012), and provide shorter accounts of their experiences (Eisen, Qin, Goodman, & Davis, 2002). Furthermore, although their recall responses are not less accurate than those of older children, they are more likely to omit important information (see Lamb et al., 2008, for a review) and to demonstrate less consistency over time (Ghetti et al., 2002).
Cognitive delay has also been associated with a reduced probability of the report being classified as credible (Melkman et al., 2017) and presumably affects children’s interview-related skills in a similar fashion (Connolly, Price, & Gordon, 2010). Indeed, evidence shows that allegations made by higher cognitively functioning children are more accurate (Eisen, Goodman, Qin, Davis, & Crayton, 2007), and assessments of their credibility are made with greater certainty (Elliott & Briere, 1994).
There is some evidence suggesting that gender may play a role in the evaluation process, though findings are inconclusive. Studies generally indicate that male victims are more reluctant to disclose events of abuse (DeVoe & Faller, 1999), are less consistent in their accounts (Ghetti et al., 2002), and their testimonies are less likely to be viewed as credible by investigators, though reported effect sizes were small (Cross, Finkelhor, & Ormrod, 2005). Other studies, however, found gender differences to be insignificant (Wood, Orsak, Murphy, & Cross, 1996).
The child’s familial context seems to be of importance as well. Allegations involving children of divorced or separated parents, particularly when custody is in dispute, are consistently found to be less likely to be substantiated (Trocme & Bala, 2005). Why this is so, though, is not quite clear. A widespread belief is that allegations made in the context of a divorce or a custody dispute, where the depiction of one parent as a molester may produce an advantage for the other parent in judicial proceedings, are more likely to be intentionally fabricated than when no such conflict is present (Mackay, 2014). However, it has also been suggested that divorce or separation in itself, regardless of ensuing legal disputes, may negatively affect the child and hence impair his or her ability to provide a fully detailed account (Melkman et al., 2017).
The nature of the abusive event may also help explain variations in credibility assessment of abuse allegations. For example, children may be especially reluctant to report alleged abuse by parents and guardians (Hershkowitz, Lamb, Katz, & Malloy, 2015) and experience greater difficulty in recalling details of a specific event following repeated exposure to abuse (Powell & Thompson, 1996), hence reducing professionals’ ability to assess the veracity of their statements. Consistent with these findings, existing evidence suggests higher rates of substantiation of reports involving single abuse, rather than multiple abuse, when the suspected perpetrator was not a biological parent (Melkman et al., 2017). Finally, there are indications that reports of more severe abuse are more likely to be viewed as credible (Cross et al., 2005), although there is some evidence for the opposite effect, with credible judgment more likely in less severe cases of abuse (Melkman et al., 2017).
The literature on the credibility assessment of CSA reports reviewed above suggests that effects of child or event characteristics are often largely attributed to the impact they have on children’s capacity to provide clear and comprehensive accounts of the alleged abuse. It would therefore seem plausible that such factors would have a similar impact on the assessment of CPA cases. Yet, differences between the two types of abuse, such as the greater shame and guilt possibly experienced by sexually, rather than physically, abused children (Hobbs et al., 2014), may suggest otherwise. For example, Hershkowitz, Lamb, and Katz (2014) found that in cases involving intrafamilial abuse, allegation rates in CSA cases were lower compared with CPA cases. Another study found that in such cases of abuse, allegedly perpetrated by the parents, boys were significantly less likely than girls to make allegations when sexual abuse was suspected, whereas there were no gender differences where physical abuse was concerned (Hershkowitz et al., 2015).
Clearly, unravelling the factors related to children’s reports of physical abuse being classified as credible could have important practical implications for the investigation of such allegations and the ensuing decision-making processes. Furthermore, it is of importance to identify whether credibility assessment of CPA cases shares predicting factors with cases of CSA, or rather, different predictors are associated with credibility decisions in the two types of cases; the latter suggesting that decision-making processes in CPA and CSA cases should be seen as essentially distinct.
Credibility Assessment: The Israeli Context
In Israel, forensic evaluations of children, alleged abuse victims, are conducted by their interviewers, immediately following the interviews. Positioned at the gate of the forensic process, interviewers typically have access only to the information provided by the children and the sources filing the complaint. In addition, as interviewers often serve as witnesses on behalf of children deemed unable to testify in court, case information is not available to them. Having interviewed the child, interviewers are required to provide a detailed assessment of the credibility of the child’s testimony, based on the accounts provided, as well as on their overall impression of the child’s conduct. The interviewers summarize their impression of the child and of the statement’s credibility before they report to the police department for further investigation and evidence collection. In addition, they fill out computerized records of the allegations, including characteristics of the child and the alleged events, as well as their credibility assessment in one of the four options: credible, partially credible, difficult to determine credibility, and impossible to determine credibility. These computerized records form the data set for the current study.
The purpose of the present study is to analyze credible judgments in a national sample of all investigated reports of physical and sexual abuse carried out in Israel in 2014, in order to examine whether credible judgments would vary according to abuse type, physical or sexual, and whether child and event characteristics contributing to the probability that reports of abuse are assessed as credible differ between CPA and CSA cases. Drawing on available literature, we expected differences in rates of credible judgments across the two types of abuse, but equivalence in predicting factors. Specifically, we hypothesized that reports of sexual abuse would be more likely to be judged as credible than reports of physical abuse, but that in both types of abuse credible judgments would be equally associated with children’s older age, typical cognitive functioning (as opposed to cognitive delay), female gender, and parents being married (rather than separated), as well as with onetime abuse, more severe forms of abuse, and abuse allegedly perpetrated by an extrafamilial figure.
Method
National data files of all forensic investigations involving 3- to 14-year-old alleged victims of physical or sexual abuse interviewed in Israel in the year 2014 were examined in the present study. Files included data on child and event characteristics and credibility assessments recorded by the interviewers. Interviews were conducted by 96 trained child investigators with a bachelor’s degree in social work using the NICHD investigative protocol. Guidance to perform credibility assessment was part of their basic training, and a substantial proportion of their ongoing supervision focused on doubtful allegations and how to evaluate their veracity. Although the interviewers are generally knowledgeable about tools for credibility assessment such as Criterion Based Content Analysis (CBCA), no structured guidance for their use was provided to them, and such tools are not in formal use. Interviewers were predominantly male (78.3%), aged 26–53 (M = 35.27; standard deviation [SD] = 5.74) and had a median of 3 years’ experience working in the Israeli Child Investigation Service. The data were prepared and updated monthly by the Child Investigation Service. The study was approved by the ethics committee of the authors’ university and by the research committee of the Israeli Ministry of Welfare.
A total of 9,026 cases of suspected child physical or sexual abuse (6,456 and 2,570, respectively) investigated by the Israeli Child Investigation Service in 2014 were analyzed. As repeated interviews with children may substantially differ from one another in amount of detail, described events, or the assessment of credibility, different interviews with the same child were treated as separate cases. Of interest to the current study were cases with indicated reliability assessments of reports of children aged 3 years or older who had disclosed abuse. Therefore, 2,996 cases in which the child did not disclose (33.9% and 31.4% of physical and sexual abuse cases, respectively) and 753 cases (8.3% of all cases) that did not meet the age criteria or where assessments of credibility were missing were omitted from the final analysis. An additional 502 cases (9.5% of relevant cases) with missing information on any of the child characteristics were also omitted from the analysis (cases with missing data on event characteristics may reflect the quality of the child’s report, rather than recording-related issues, and were therefore included in the analysis, as described below). In summary, a total of 4,775 cases in which children disclosed abuse and reliability was assessed were analyzed: 1,398 cases of sexual abuse and 3,377 cases of physical abuse. Compared to those cases left out of the analysis because of missing information, selected abuse cases were significantly more likely to be of unknown rather than of moderate severity (10.4% vs. 6.6%, χ2 = 8.09, df = 1, p = .017), multiple rather than single abuse (77.4% vs. 69.6%, χ2 = 15.50, df = 1, p < .001), perpetrated by the parents or other caretaker (45.2% vs. 34.0% and 13.1% vs. 10.4%, respectively, χ2 = 45.07, df = 1, p < .001), and less likely to be judged as credible (46.3% vs. 64.2%, χ2 = 58.35, df = 1, p < .001).
Given these differences in event characteristics and credibility assessments, the magnitude of associations reported in the results section is likely to be an underestimation.
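The case flow described above can be checked with simple arithmetic. The following is a minimal sketch using only the counts reported in the text; the variable names are illustrative, and reading "relevant cases" as the cases remaining before the final exclusion is an assumption made here, not something the text states.

```python
# Case counts as reported in the text (variable names are illustrative).
total_cases = 6_456 + 2_570        # physical + sexual abuse cases investigated
no_disclosure = 2_996              # child did not disclose abuse
age_or_no_assessment = 753         # outside age criteria or credibility assessment missing
missing_child_data = 502           # missing information on child characteristics

analyzed = total_cases - no_disclosure - age_or_no_assessment - missing_child_data

# Assumption: "relevant cases" means cases still eligible before the last exclusion.
relevant = analyzed + missing_child_data

print(total_cases)                                        # 9026
print(analyzed)                                           # 4775
print(round(100 * age_or_no_assessment / total_cases, 1))  # 8.3
print(round(100 * missing_child_data / relevant, 1))       # 9.5
```

Under these assumptions the reported totals and percentages are mutually consistent, and the analyzed total equals the sum of the 1,398 sexual abuse and 3,377 physical abuse cases.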
The age of the children in this sample ranged from 3 to 14 (M = 9.18, SD = 2.86 for both types of abuse combined; M = 9.99, SD = 2.92 for sexual abuse; M = 8.85, SD = 2.77 for physical abuse). Regarding gender, distribution was quite even for both types of abuse combined, with 49.7% of the sample being males. Boys were slightly more likely to report physical abuse (56.7% of the sample), whereas girls were twice as likely to report sexual abuse (67.2% of the sample).
Credibility assessments of children’s abuse reports were classified by the forensic interviewers into one of the four categories: (1) credible, (2) partially credible, (3) difficult to determine credibility, and (4) impossible to determine credibility. The first two categories indicate that the abuse report was found credible and suggest proceeding with the investigation. Although a report judged to be partially credible may have been limited in detail, or inconsistent with external evidence, the statement was still found to be truthful overall. The latter two categories indicate that the veracity of the report could not be supported due to either insufficient information or a clear conclusion on the part of the evaluator that abuse had not occurred. These classifications do not allow for a finer distinction between these two very different circumstances, though in most cases, and unless external evidence corroborating the allegation is provided, they may both imply the cessation of any further investigation. As the focus of this study is on children’s statements deemed credible, for the purpose of the analysis, statements classified as either one of the first two categories were coded as credible, whereas statements falling within the latter two categories were coded as “unreliable or inconclusive.”
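The recoding of the four interviewer categories into a binary outcome can be illustrated as follows. This is a minimal sketch, not the coding script used in the study; the numeric codes follow the numbering in the paragraph above, and the function name is hypothetical.

```python
# Numeric codes follow the (1)-(4) numbering in the text; names are hypothetical.
CATEGORY_LABELS = {
    1: "credible",
    2: "partially credible",
    3: "difficult to determine credibility",
    4: "impossible to determine credibility",
}

def code_outcome(category: int) -> int:
    """Return 1 for 'credible' (categories 1-2), 0 for 'unreliable or inconclusive' (3-4)."""
    if category not in CATEGORY_LABELS:
        raise ValueError(f"unknown category: {category}")
    return 1 if category in (1, 2) else 0

print([code_outcome(c) for c in (1, 2, 3, 4)])  # [1, 1, 0, 0]
```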
Predictors
The predictors explored in the current study comprised child- and abuse-level measures. Children’s characteristics examined were age (at interview time), gender, cognitive delay (present/absent), and marital status of parents (married, divorced/separated, or in the process of divorce). Features of abusive events examined were severity of abuse (either severe or moderate; [1] physical abuse: cases involving any injury were classified as severe, all other cases classified as moderate; [2] sexual abuse: cases involving sexual touches [i.e., acts of indecency or vaginal, anal, or oral penetration] were classified as severe, whereas cases consisting of exposure or indecent proposals without sexual touches were classified as moderate), frequency (single vs. multiple event), and suspect–child relationship (biological parent, relative or other caretaker [i.e., stepparent, foster parent, or mother’s romantic partner], or other [i.e., teacher, other school personnel, neighbor, babysitter, or stranger]).
Data Analysis
t tests and χ2 analyses were used in order to test differences between sexual and physical abuse reports on the study variables. Multigroup path analysis was performed in order to examine whether the contribution of child and event characteristics differed for sexual and physical abuse. Logistic regression was used in order to test the overall contribution of the variables, and above all, the unique contribution of abuse type, to the prediction of credible judgments of children’s statements of abuse. Because of the nested nature of the data with child- and event-level characteristics nested within investigators, the Mplus cluster option was used.
As noted, cases with missing data on child characteristics have been omitted from the analyses because they were seen to reflect a coding error rather than an issue of substance. Missing data on event characteristics, however, are not uncommon in forensic interviewing and are likely to reflect the overall quality of the report and the investigator’s ability to assess credibility, and are hence of interest to the current analysis. For the model analyses, missing data on the event characteristics, severity of abuse, and suspect–victim relationship were dummy coded as 1 = unknown, 0 = other. Excluding these instances, there were no additional missing data.
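The dummy coding of missing event characteristics (1 = unknown, 0 = other) can be sketched as follows. The record layout and field names are hypothetical and serve only to illustrate the coding rule described above.

```python
def unknown_indicator(value):
    """Dummy code a possibly missing event characteristic: 1 = unknown, 0 = other."""
    return 1 if value is None else 0

# Hypothetical record: severity was not recorded, suspect relationship was.
record = {"severity": None, "suspect_relationship": "biological parent"}

flags = {f"{name}_unknown": unknown_indicator(value) for name, value in record.items()}
print(flags)  # {'severity_unknown': 1, 'suspect_relationship_unknown': 0}
```

Coding missingness as its own indicator keeps such cases in the model instead of dropping them, which matches the stated rationale that missing event information is itself informative.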
Results
Table 1 presents descriptive statistics of the current sample according to abuse type. Overall, 46.3% of abuse allegations were assessed as credible, the remaining 53.7% assessed as unreliable or inconclusive. CPA cases were significantly less likely than CSA cases to be assessed as credible (41.9% vs. 56.7%, respectively). CPA and CSA cases also differed significantly in their child characteristics: CPA cases were more likely than CSA cases to involve males, younger children, children of divorced/separated parents or of parents during divorce processes, and children with normal or high cognitive functioning.
Table 1. Descriptive Statistics of Current Sample by Abuse Type.
Note. Numbers in parentheses indicate row percentages. SD = standard deviation.
a Continuous variable, figures presented are M (SD), t test (df), and Cohen’s d results.
**p < .01. ***p < .001.
Regarding characteristics of the abusive event, differences between CPA and CSA cases were even more notable: Compared with CSA cases, CPA cases were almost 4 times less likely to be single occurrences, more than 4 times more likely to be moderate rather than severe or unknown, and almost 7 times more likely to be allegedly perpetrated by one or both of the parents.
To examine whether the selected child and event characteristics are related to credible judgments and whether these relations differ for sexual and physical abuse, multigroup path analyses were conducted in which the associations between child or event characteristics and credible judgments were estimated for sexual and physical abuse reports. Initially, we tested a constrained model where the path coefficients were set equal across physical and sexual abuse cases. Next, we tested an unconstrained model in which the path coefficients were allowed to vary between the two types of abuse. This unconstrained model was then tested with the addition of interactions between abuse type and all the study variables. Each of these effects was examined in a separate model using a Bonferroni adjusted α level of .0016 (.05/31). No significant interactions were evident, and therefore interactive effects were not included in subsequent analyses.
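The Bonferroni-adjusted threshold quoted above follows directly from dividing the nominal α by the number of separately tested effects. A minimal sketch, with illustrative variable names:

```python
nominal_alpha = 0.05
n_tests = 31  # number of separately tested interaction effects, as reported

adjusted_alpha = nominal_alpha / n_tests
print(round(adjusted_alpha, 4))  # 0.0016
```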
As a preliminary step, possible clustering effects, that is, child and event characteristics nested within child investigators were examined using a univariate analysis of variance with the cluster identifier as the independent variable. The examination yielded significant effects (ICC = .20), F(94, 4,680) = 13.52; p < .001. Therefore, investigator’s identity was controlled for at the cluster level in all the following analyses.
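The text does not state how the ICC was obtained. One common estimator from a one-way analysis of variance is ICC(1) = (F − 1) / (F + k − 1), where k is the average cluster size. The sketch below applies that formula under the assumption of roughly equal cluster sizes, taking the number of clusters from the reported degrees of freedom; it is an illustration, not a reconstruction of the authors' computation.

```python
n_cases = 4_775
df_between = 94                 # numerator degrees of freedom as reported
n_clusters = df_between + 1     # clusters implied by the reported df
f_ratio = 13.52

# Assumption: approximately equal cluster sizes, so k is the mean cluster size.
k = n_cases / n_clusters

icc = (f_ratio - 1) / (f_ratio + k - 1)
print(n_cases - n_clusters)  # 4680, the reported denominator df
print(round(icc, 2))         # 0.2
```

Under these assumptions the formula reproduces the reported value of .20.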
The initial constrained model provided an adequate fit to the data, χ2(11) = 17.20, p = .102; CFI = .974; TLI = .949; WRMR = 1.234; RMSEA = .015 (90% confidence interval [.000, .029]). Next, we estimated a model in which path or regression coefficients were allowed to differ across groups (see Table 2). A χ2 difference test was then applied to examine whether the difference between the constrained and the unconstrained model was significant. The unconstrained model did not have a significantly better fit compared to the fully constrained model, Δχ2(11) = 17.20, p = .102, indicating that the hypothesized model was equivalent in both types of abuse. It should be noted, however, that inspection of the path coefficients in the unconstrained model did indicate that some associations between the study variables varied across the two groups, though taken together these differences were negligible (see Table 2).
Table 2. Path Coefficients of Child and Event Characteristics on “Credible” Judgments for Sexual and Physical Abuse in the Unconstrained Model.
Note. SE = standard error; CI = confidence interval.
a Reference category is parents being married. b Reference category is moderate abuse. c Reference category is suspect being one or both of the parents.
*p < .05. **p < .01. ***p < .001.
In light of these findings, indicating that a unified model for both types of abuse fitted the results best, as a final step, we tested the contribution of child and event characteristics to credible judgments of all abuse cases (sexual or physical) employing a logistic regression and taking account of the nesting of children within investigators. The initial model included the child characteristics, namely (1) age, (2) gender, (3) parents’ marital status, and (4) presence of a cognitive delay, and the event characteristics, namely (5) abuse type, (6) frequency of abuse, (7) abuse severity, and (8) suspect–child relationship. Variables that did not significantly contribute to the prediction model were removed sequentially until an optimal model was reached. The results of the final model are presented in Table 3.
Table 3. Final Model: Testing the Role of Abuse Type in Predicting “Credible” Judgments of Reports of Abuse Above Other Child and Event Characteristics.
Note. R2 = .19, p < .001. SE = standard error; CI = confidence interval.
a Reference category is parents being married. b Reference category is moderate severity. c Reference category is suspect is a parent.
In total, the overall model, controlling for investigator’s identity, explained 19% of the variance in credible judgments. Above and beyond all other child and event characteristics, abuse type was found to have a significant effect on credible judgments, with suspected sexual abuse, rather than physical abuse, having a higher likelihood of being judged as credible, Exp (B) = 1.56. The strongest predictor of credible judgments was age, Exp (B) = 1.27. Although other variables had greater ORs, meaning that per one unit of change they were associated with a greater increase in the probability of a credible judgment, as increments in the child’s age grow larger, the probability of a credible judgment grows exponentially (e.g., a 4- or 6-year increment in the child’s age was associated with a 2.6- or 4.2-fold increase in the probability of a credible judgment, respectively). Next in predictive strength was lack of a cognitive delay, which was positively associated with a credible judgment, Exp (B) = 3.43. Of the child characteristics examined, the child’s parents being married, rather than divorced, also significantly predicted a greater likelihood of a credible judgment, Exp (B) = 1.20.
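The compounding of the age effect can be made explicit. In a logistic model, a per-year odds ratio multiplies across years, so an increment of n years corresponds to Exp (B) raised to the power n; strictly speaking this multiplier applies to the odds of a credible judgment. A minimal sketch using the reported per-year value, with illustrative variable names:

```python
odds_ratio_per_year = 1.27  # Exp (B) for age, as reported

for years in (4, 6):
    multiplier = odds_ratio_per_year ** years
    print(years, round(multiplier, 1))
# 4 2.6
# 6 4.2
```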
The literature often attributes such differences to the belief that allegations made in the context of a custody dispute are more likely to be fabricated as the depiction of one parent as a molester may produce an advantage for the other parent in judicial proceedings. This explanation, however, assumes significant differences, particularly (if not only) in cases where abuse is allegedly perpetrated by one of the parents. Therefore, further χ2 analyses were conducted examining whether effects of the parent’s marital status varied according to the suspect’s identity. Significant differences in credible assessments were found only among children allegedly abused by someone in the “other” category—other than the biological parents or other relative or caretaker, χ2(2, n = 932) = 39.66, p < .001. Within this group, reports of children of married parents were the most likely to be viewed as credible (64.5%), followed by those of children of parents in the process of divorce (61.5%), with reports of children of divorced/separated parents the least likely to be viewed as credible (41.3%).
With regard to the contribution of the characteristics of the abusive event, all variables examined had a significant effect on credible judgments. Specifically, a higher probability of a credible judgment was associated with a single rather than multiple occurrences of abuse, Exp (B) = 1.75, and with moderate rather than severe abuse, Exp (B) = 1.61. Finally, reports where the identity of the suspect was unknown were significantly less likely to be deemed credible than those involving one or both of the parents, Exp (B) = .72.
Discussion
Building on previous research focused almost exclusively on CSA reports, the purpose of the current study was to investigate credible judgments of CPA cases in a large national sample of forensic investigations of child abuse. Our primary aim was to explore to what extent CPA cases were judged differently than CSA cases, and whether child and event characteristics contributing to credible judgments of reports of either type of abuse are similar or different. As expected, the results suggest differences in the evaluation of CPA and CSA reports but similarities in related child and event predictors of such evaluations.
First, we found that only 41.9% of CPA cases were judged as credible, whereas the majority of cases (58.1%) were deemed to be unreliable or inconclusive, most probably resulting in the closing of these cases. The rate of CPA cases classified as credible is substantially lower than the corresponding proportion in CSA cases (56.7%). Therefore, given the fact that false abuse allegations made by children are rare, irrespective of the abuse being of a sexual or a physical nature (Trocme & Bala, 2005), it seems that the risk of underidentifying CPA may be even greater than that of underidentifying CSA.
The various factors found to be associated with credible judgments in the current study may provide valuable insights into the source of professionals’ difficulty in supporting the veracity of some of the abuse reports they evaluate. Multigroup path analysis showed that the characteristics of the child and the abusive event contributing to credible judgments of children’s abuse reports are the same for sexual and physical abuse. Specifically, younger age and, to a lesser extent, cognitive delay were the most powerful predictors of an unreliable or inconclusive classification of a report of abuse, sexual or physical. Similar results have been reported previously in connection with CSA (Melkman et al., 2017). These findings are consistent with the accumulating evidence tying children’s developing cognitive, attentional, verbal, and communicative skills to the quality of their abuse reports (Hershkowitz et al., 2012). As children’s competency in these domains increases, their statements become longer, more accurate, and more consistent (Eisen et al., 2007; Ghetti et al., 2002), making their evaluation less challenging.
As expected, the marital status of the children’s parents, namely, the parents being divorced/separated rather than married, was also associated with a reduced likelihood of the report being assessed as credible. A closer examination revealed that these differences were significant only when the suspect was an extrafamilial figure. Therefore, the common perception that abuse allegations made by children of divorced or separated parents are viewed as less reliable, as they may be generated by one parent hoping to gain an advantage over the other by means of a false allegation (Mackay, 2014), does not seem to be supported by these data. The data, as well as the current scientific literature, though, provide little in the way of explaining this interesting finding, and more research is needed to shed light on this issue.
Concerning the features of the abusive event, we found that previous findings on assessment of CSA reports, demonstrating that reports of multiple incidents of abuse, as well as reports of more severe abuse, are less likely to be regarded as credible (Melkman et al., 2017), may be generalizable across abuse types, physical and sexual. The lower rates of credible judgments assigned to cases of multiple abuse, rather than single abuse, coincide with evidence documenting the greater difficulties children experience isolating individual occurrences of repeated abuse (Powell & Thompson, 1996). Describing a particular occurrence may be quite taxing for children, as they are required to distinguish it from other occurrences, to report details specific to that occurrence, and to avoid confusing details across occurrences (see Roberts & Powell, 2001, for a review). These reporting difficulties are compounded by investigators’ common use of interviewing strategies that may be ill-suited for such circumstances. Consequently, investigators often fail to focus children on specific incidents (Brubacher, Malloy, Lamb, & Roberts, 2013) and to elicit effective markers of multiple occurrences (Powell, Roberts, & Guadagno, 2007). Additionally, studies focusing on expression of emotions during abuse disclosure show that the number of abuse incidents reported is inversely related to negative emotional displays (Sayfan, Mitchell, Goodman, Eisen, & Qin, 2008), possibly violating the expectations of less research-informed evaluators regarding typical behavior of abuse victims (e.g., sadness or distress).
With regard to severity of abuse, in contrast to previous findings (Cross et al., 2005), cases involving more severe physical or sexual abuse were less likely to be validated as truthful. A similar pattern has been reported previously (Elliott & Briere, 1994; Melkman et al., 2017) and may reflect the child’s difficulty recollecting and describing in depth the more intrusive and traumatic forms of abuse. Alternatively, this finding may reflect the investigators’ reluctance to make credible judgments in high-stakes cases, as they will be required to defend their judgments in court.
As hypothesized, the identity of the suspect also significantly contributed to the likelihood of credible judgments, with cases of alleged physical or sexual victimization by an unknown suspect more likely to be classified as unreliable or inconclusive. It is not surprising that evaluators had greater difficulty supporting the veracity of reports lacking such critical detail as the suspect’s identity. More difficult to explain, given the consistent evidence indicating children’s reluctance to disclose intrafamilial abuse (Hershkowitz et al., 2015), is the fact that assessment of credibility was not affected by whether or not the child was allegedly abused by a parental figure. A possible explanation that warrants further investigation is that once children do disclose transgressions committed by their parents, the amount of detail they provide and the quality of their statements are not significantly different from those made by children reporting abuse by other figures.
One conclusion to be drawn from the fact that the factors described thus far were found to have the same effects on credible assessments of children’s reports, irrespective of the type of abuse, is that it is possible to consider a unified category of abuse, whether physical or sexual, for the purpose of understanding the variation in credible judgments and developing interventions that may assist evaluators in overcoming the inherent difficulties of assessing children’s statements.
Nevertheless, the data indicate important differences between the rates of CPA and CSA reports deemed credible that are due to variables unaccounted for in this study. Previous indications that children experience greater difficulty providing detailed accounts of physical abuse (Hershkowitz & Elul, 1999), and do so with less consistency (Ghetti et al., 2002), have led scholars to conclude that children may be more reluctant to report physical abuse compared to sexual abuse. Consistent with these conclusions, we found that the percentage of CPA reports classified as unreliable or inconclusive was substantially higher than the corresponding percentage of CSA reports. This finding may be partially attributable to the considerable variation we observed between the two types of abuse in several of the characteristics of the child and the abusive event examined (e.g., the predominance of repeated exposure to abuse in CPA cases). Nonetheless, a key finding of the present research is that above and beyond the effects of the factors examined, and after controlling for them, abuse type significantly contributed to the prediction of credible judgments. This suggests that additional features not included in the present analysis that distinguish between reports of physical and sexual abuse may be at play. One such feature may be related to the extent to which investigations of either type of abuse are instigated by disclosure from the alleged child victims in the preinvestigative phase. Recent research demonstrates that in CPA cases, alleged victims are less likely to disclose abuse prior to formal questioning, but that in such cases, there are more types of evidence outside of disclosure, compared with CSA cases (Rush et al., 2014). In light of their findings, the authors argue that the reluctance of physically abused children to disclose the abuse in the formal interview could be due to their disclosing it for the first time, the suspicion having arisen because of other evidence.
Finally, in addition to the contribution of the characteristics of the child and the abusive event, we found that substantial variation in credible assessments was explained by the identity of the evaluator. This lends support to claims made regarding the subjectivity of the evaluation process and the role an evaluator’s personal characteristics or professional experience may have in determining whether or not the child’s statement of abuse will be deemed truthful (Herman, 2005). In particular, it has been suggested that evaluators vary significantly in their attitudes toward specificity (a focus on minimizing false positive errors or errors of overcalling) or sensitivity (a focus on minimizing false negative errors or errors of undercalling) when judging the veracity of children’s abuse reports (Everson & Sandoval, 2011). Further, compared with other professionals, child protection workers were found to be significantly more oriented toward specificity (Everson & Sandoval, 2011), which could explain the overall low rate of reports classified as credible in this study. This issue, however, was beyond the scope of the present analysis, and therefore more research is needed to explore the source of variation in evaluators’ judgments.
Conclusions
The strength of the present study lies in the fact that, unlike the vast majority of studies examining credibility assessment, which have been analog in nature and therefore of limited ecological validity, the current analysis was based on a national sample of credibility assessments of investigated reports of abuse. Furthermore, to the best of our knowledge, this study is the first to explore credibility assessment in cases of physical abuse, the most common form of abuse reported by children (U.S. Department of Health & Human Services, 2016). The overall picture portrayed by the findings supports previous notions regarding the great difficulty evaluators encounter in establishing the veracity of children’s reports of physical as well as of sexual abuse. Under the circumstances of extreme uncertainty in which forensic evaluators operate, the risk of underidentification of true cases of abuse seems to be higher than one would expect. This risk appears to be particularly high in cases where physical abuse is reported. More research is needed in order to reveal the unique characteristics of reports of physical abuse that make their evaluation so challenging. At the same time, the current results, taken together with previous research, make it increasingly clear that whether sexual or physical abuse is concerned, reports of the most severe cases of abuse, involving repeated victimization, of younger, or cognitively delayed, children are the ones least likely to be judged credible. It is important to bear in mind, though, that while these factors were significant in their contribution, effect sizes were relatively small. Given the large sample size, statistically significant correlates should be interpreted with caution and demand further replication.
It would appear that dealing with the challenges posed by such difficult cases as described above requires more nuanced investigative strategies. Recently, a revised version of the NICHD investigative protocol has been put forward, incorporating the provision of nonsuggestive emotional support aimed at assisting precisely those children who, due to their developmental deficiencies or the attributes of their abuse (i.e., repeated exposure to severe intrafamilial abuse), experience greater difficulties providing comprehensive accounts of their alleged victimization (Hershkowitz, Lamb, & Katz, 2014). Initial evidence testifies to the effectiveness of the revised version, showing that its use, rather than the use of the standard protocol, increased children’s cooperation and the amount of detail they provided and facilitated reports of alleged abuse made by alleged victims who may have otherwise been unwilling to make allegations (Ahern et al., 2014; Hershkowitz et al., 2015).
Limitations
Several limitations of the study should be acknowledged. First, the credibility judgments that were the focus of the present analysis were not corroborated by external evidence. The results inform us as to the probability that cases of physical or sexual abuse would be deemed credible and as to related characteristics of the child and the abusive events. Any inference about the accuracy of these judgments, though, is speculative. Furthermore, the data on credibility assessments, which served as the basis for the current analysis, did not allow distinguishing between cases classified as unreliable and those classified as inconclusive due to insufficient information or evidentiary material in support of the child’s report. Further research employing a more nuanced distinction between such cases and cases determined as credible would provide a better understanding of the variation in credibility assessment and the possible factors that may explain it. In this study, repeated interviews with the same child were treated as separate cases. We took this approach because repeated interviews may differ in credibility assessment and other related factors, but the downside is that the interviews are not in fact independent of one another. In addition, although we assume that credibility judgments were made solely based on the impression from the child’s conduct during the interview and on the quality of the statement, we have no way of ascertaining that no other information was occasionally available to the interviewers. Using an administrative data file precludes the effective monitoring of the quality of data. However, such data provide a direct observation of large numbers of field cases, allowing research with good ecological validity.
Finally, several important features of children’s reports of abuse, previously suggested to affect the completeness of the accounts and their subsequent evaluation, such as disclosure of abuse prior to the forensic investigation or the existence of additional corroborating evidence, were not available in the current data set. Therefore, such attributes, which may be of particular importance when examining differences between reports of physical and sexual abuse, could not be explored with respect to their contribution to credibility assessment and should be included in future investigations.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
