Abstract
No generally accepted method exists for quantifying the degree of injury in homicide victims. This study explores six different injury severity scores with the goal of recommending a valid method that is reliable and easy to use. To this end, 103 homicides are examined with regard to the correlations between these scores. The study concludes that the Homicide Injury Scale is valid, is easy to use, and has satisfactory inter-rater reliability.
Introduction
An injury severity score is used to summarize a person’s injuries using a single number. These methods are important in trauma research and are mostly used to study mortality on a group level. Injury severity scores can be used to compare the effectiveness of trauma care in different regions and time periods (Holcomb et al., 2007; Nathens, Xiong, & Shafi, 2008). They can be applied to all trauma patients (Tiret et al., 1989) or to subgroups to study specific traumas such as gunshot injuries (Nasrullah & Razzak, 2009). Another often-studied subgroup of trauma is vehicular accidents, the study of which enables evaluation of injury prevention devices, such as seat belts (Hitosugi & Takatsu, 2000; Leth & Ibsen, 2010; Ndiaye, Chambost, & Chiron, 2009). Given the utility of this number, many different scoring methods have been proposed and used during the last few decades. Each method has benefits and limitations, which can vary based upon the purpose for which each is used (Meredith et al., 2002; Tohira, Jacobs, Mountain, Gibson, & Yeo, 2012).
Homicide epidemiology typically includes the total number of homicides, causes of death, weapons used, and characteristics of the victims and offenders (Demetriades et al., 1998). Injury severity scores are seldom included; thus, an important dimension is lost. An injury severity score could be used in homicides to compare the degree of violence between different groups of people, geographical locations, and time periods. It could be used to identify statistical relationships between the severity of injury and the characteristics of the involved persons, such as sex, age, drug use, and relationship. In addition, an injury severity score may help answer whether there has been brutalization of lethal violence or whether decreasing homicide numbers reflect more effective health care (and not a less violent environment; Ericsson & Thiblin, 2002). An injury scoring method could also be included in offender profiling to facilitate the police investigation in cases where the perpetrator is unknown (Safarik & Jarvis, 2005). This study seeks to provide an overview of the various injury severity score systems and assess which system might be most beneficial for those interested in studying homicide.
Background
While not common, a few studies have used single scores to examine injury severity in the homicide context and illustrate the utility of this construct. The following discussion first summarizes the studies and then explains the various injury severity score techniques used.
Previous use of injury scores in homicide context
Two studies utilized the Injury Severity Score (ISS). One study used the score to assess separately the severity of the injuries inflicted by each of two perpetrators (Schmidt, Orlopp, Dettmeyer, & Madea, 2002). Having a quantitative as well as a qualitative way of assessing the injuries in a homicide victim might facilitate court decisions and make them more transparent (e.g., where the line lies between “regular” violence and excessive violence). The second study included homicides among other traumatic deaths (Friedman et al., 1996). When applying the ISS, the authors noted that homicide victims who had been beaten to death received low scores in a much greater proportion than did victims of stabbing or shooting. This finding highlights the problem that scores might be sensitive not only to the degree of violence but also to the type of violence (e.g., blunt, sharp, gunshot).
Some studies have been based on scoring techniques specifically dedicated to lethal violence (Ericsson & Thiblin, 2002; Jordan et al., 2010; Safarik & Jarvis, 2005). Such scores might prove more useful in a homicide context, as most other injury scores are designed to predict mortality, not to assess the overall degree of violence. Ericsson and Thiblin used a fairly objective approach with a simple injury count. This simple construction can be expected to generate good inter-rater reliability but might fall short with respect to validity, as it does not take into account the severities of the specific injuries. Safarik and Jarvis defined a more subjective six-grade scale, in which an overall assessment of the injuries is made. This kind of approach will probably generate a more valid result but might suffer from worse inter-rater reliability. The scale has been used in a study to investigate correlations between injury severity and characteristics of the offender and the victim (Jordan et al., 2010). That study found a higher degree of violence in elderly victims, which might, for example, be related to different vulnerability among the victims or different intentions among the perpetrators. Generally speaking, objectivity is good for inter-rater reliability, whereas subjectivity might increase the validity of the score. A balance between these two is desirable.
Scoring methods have also been applied to both surviving and dead assault victims to study the development of violence over time (Eiskjaer, Schroder, Charles, & Petersen, 1992). Eiskjaer et al. did not find any increase in injury severity over time but concluded that the method they used (Abbreviated Injury Scale [AIS]) was not well suited for assault victims due to its low sensitivity in the case of minor damage. This highlights the issue that injury severity scores that are valid in one context (e.g., predicting mortality in severely injured persons) are not necessarily valid in other contexts (e.g., describing the degree of injury in assault victims).
Studies with quantitative assessment of homicide violence using injury severity scores have thus been performed, but no generally accepted, validated, and reliable quantitative method has been identified (Jordan et al., 2010; Safarik & Jarvis, 2005; Trojan & Krull, 2012). For injury quantification to become a useful tool in research and for practical application to homicide victims, there is a need for standardization.
Injury severity scoring systems
An injury severity score can be based on anatomical injuries only, or in combination with physiological parameters. An example of the latter is the Trauma Score–ISS, which includes the anatomically based ISS, as well as the Glasgow Coma Scale, systolic blood pressure, and respiratory rate (Champion et al., 1990). Physiological parameters are most often missing in the background information of homicide cases, while detailed descriptions of anatomical injuries are included in the autopsy report; therefore, a useful homicide injury severity score must be based exclusively on anatomical injuries.
Two of the most commonly used scoring systems based on anatomical injuries are the ISS and the International Classification of Disease Injury Severity Score (ICISS; Baker, O’Neill, Haddon, & Long, 1974; Osler, Rutledge, Deis, & Bedrick, 1996; Tohira et al., 2012). There is also a modification of the ISS called the New ISS (NISS; Osler, Baker, & Long, 1997). Both the ISS and NISS are based on the AIS (Keller et al., 1971). Another method that has been used to score injuries is to simply count the Total Number of Injuries (TNI; Ericsson & Thiblin, 2002). Finally, the Homicide Injury Scale (HIS) has been proposed as a method for assessing the degree of violence specifically in homicides (Safarik & Jarvis, 2005).
The AIS was introduced in 1971 as a standardized system for classifying the type and severity of injuries resulting from vehicular crashes (Keller et al., 1971). It has since undergone several revisions, and its use has expanded to include most types of trauma. The AIS is a consensus-derived system based on anatomical injuries, where each injury has a unique code number and is assigned an injury severity score from 1 (minor) to 6 (maximal; Gennarelli & Wodzin, 2008). The AIS has been used to produce different types of injury severity scores. One straightforward approach, which we have not seen in the literature, is to simply add all AIS scores together to produce what we refer to as the Sum of AIS (SAIS). The rationale for doing this is that it takes into account both the number of injuries and their individual severities. It is worth emphasizing that the AIS scores are related to the specific injuries, not the outcome in a specific case. An injury is not arbitrarily coded as 6 just because the patient died from it (e.g., the possibly lethal injury of a carotid artery transection has a score of 4). The AIS manual is available under license, and we used the latest revision, AIS 2005 (2008 Update). While the manual contains detailed coding instructions and does require some anatomy knowledge, anyone capable of understanding an autopsy report will most probably be able to use the AIS manual.
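The SAIS described above can be illustrated with a minimal sketch. The function name and the example scores are hypothetical and serve only to show the arithmetic; actual AIS severities must be looked up in the licensed AIS manual.

```python
from typing import Iterable


def sum_of_ais(ais_scores: Iterable[int]) -> int:
    """Sum of AIS (SAIS): add the AIS severity (1-6) of every coded injury."""
    total = 0
    for score in ais_scores:
        if not 1 <= score <= 6:
            raise ValueError(f"AIS severity must be between 1 and 6, got {score}")
        total += score
    return total


# Hypothetical victim: one AIS 4 injury, two AIS 2 injuries, five AIS 1 abrasions.
example_scores = [4, 2, 2, 1, 1, 1, 1, 1]
print(sum_of_ais(example_scores))  # 13
```

Because every injury contributes, the five minor abrasions in this invented example add as much to the sum as the single most severe injury adds beyond a moderate one.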
The ISS was developed in 1974 and is based on the AIS (Baker et al., 1974). The purpose of the ISS is to summarize injury severity, especially in people with multiple traumas. The ISS has become the leading scoring system in trauma studies (Tohira et al., 2012). The body is divided into six regions (head or neck, face, chest, abdominal or pelvic contents, extremities or pelvic girdle, and external), and the single highest AIS score in each of the three most severely injured regions is squared and summed to calculate the ISS. If any injury has an AIS score of 6, the ISS is assigned the maximum value of 75.
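A minimal sketch of the ISS calculation as described in this paragraph is given below. The region labels, the function name, and the example injuries are illustrative assumptions, as is the choice to return 0 when no injuries are coded.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical labels for the six ISS body regions named in the text.
ISS_REGIONS = (
    "head_or_neck",
    "face",
    "chest",
    "abdominal_or_pelvic_contents",
    "extremities_or_pelvic_girdle",
    "external",
)


def injury_severity_score(injuries: Iterable[Tuple[str, int]]) -> int:
    """Compute the ISS from (region, AIS severity) pairs."""
    worst_per_region: Dict[str, int] = {}
    for region, ais in injuries:
        if region not in ISS_REGIONS:
            raise ValueError(f"unknown body region: {region}")
        if not 1 <= ais <= 6:
            raise ValueError(f"AIS severity must be between 1 and 6, got {ais}")
        # Only the single highest AIS score in each region is kept.
        worst_per_region[region] = max(worst_per_region.get(region, 0), ais)

    if not worst_per_region:
        return 0  # assumption: no coded injuries gives a score of 0
    if max(worst_per_region.values()) == 6:
        return 75  # any AIS 6 injury sets the ISS to its maximum

    # Square and sum the highest scores of the three most severely injured regions.
    top_three = sorted(worst_per_region.values(), reverse=True)[:3]
    return sum(score * score for score in top_three)


# Hypothetical case: chest (AIS 4 and 3), head or neck (AIS 3), external (AIS 1, 1).
case = [("chest", 4), ("chest", 3), ("head_or_neck", 3), ("external", 1), ("external", 1)]
print(injury_severity_score(case))  # 4*4 + 3*3 + 1*1 = 26
```

In this invented case the second chest injury does not affect the result, which illustrates why the ISS ignores most injuries in victims with many wounds.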
The NISS is similar to the ISS (Osler et al., 1997), but instead of evaluating the three most severely injured regions, the three highest AIS scores, irrespective of body region, are squared and summed. Similar to the ISS, the maximum value of the NISS is 75.
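The NISS rule can be sketched in the same spirit. The function name and the example scores are hypothetical. Assigning 75 whenever an AIS 6 injury is present, and 0 when no injuries are coded, are assumptions made here by analogy with the ISS rather than rules stated in the text.

```python
from typing import Iterable


def new_injury_severity_score(ais_scores: Iterable[int]) -> int:
    """Compute the NISS: square and sum the three highest AIS scores,
    irrespective of body region."""
    scores = list(ais_scores)
    for score in scores:
        if not 1 <= score <= 6:
            raise ValueError(f"AIS severity must be between 1 and 6, got {score}")

    if not scores:
        return 0  # assumption: no coded injuries gives a score of 0
    if 6 in scores:
        return 75  # assumption: AIS 6 maps to the maximum, as for the ISS

    top_three = sorted(scores, reverse=True)[:3]
    return sum(score * score for score in top_three)


# Hypothetical case with AIS severities 4, 3, 3, 1, 1 (regions are ignored).
print(new_injury_severity_score([4, 3, 3, 1, 1]))  # 4*4 + 3*3 + 3*3 = 34
```

Like the ISS, this score uses at most three injuries, so the two AIS 1 injuries in the example leave the result unchanged.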
The ICISS is not based on the AIS, but instead on the International Classification of Disease (ICD), which is used internationally as a diagnostic system in health care (Osler et al., 1996). In contrast to the AIS, which is consensus-derived, the ICISS is based on empirical patient material. For each injury code in the ICD, a survival risk ratio (SRR) is defined as the number of survivors with that code divided by the total number of patients with that code. Hence, the SRRs have values between 0 (no survivors) and 1 (no deaths). The ICISS is defined as the product of all SRRs associated with a patient’s ICD codes, and ranges from 0 to 1. As the SRRs are derived from patient data sources, they may differ between regions and countries. The SRR can also be termed the diagnosis-specific survival probability (DSP). We used ICISS values obtained by ICD-10 coding together with SRRs (or DSPs) derived from an international data source (Gedeborg et al., 2014).
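The ICISS definition can be sketched as a simple product. The code labels and SRR values below are invented placeholders, not values from Gedeborg et al. (2014) or any other published table, and the function name is hypothetical.

```python
from typing import Iterable, Mapping


def iciss(icd_codes: Iterable[str], srr_table: Mapping[str, float]) -> float:
    """ICISS: product of the survival risk ratios (SRRs) of all injury codes."""
    product = 1.0
    for code in icd_codes:
        srr = srr_table[code]  # raises KeyError if a code has no SRR in the table
        if not 0.0 <= srr <= 1.0:
            raise ValueError(f"SRR for {code} must lie between 0 and 1, got {srr}")
        product *= srr
    return product


# Placeholder codes and SRRs, purely for illustration.
hypothetical_srr = {"CODE_A": 0.9, "CODE_B": 0.8, "CODE_C": 0.5}
score = iciss(["CODE_A", "CODE_B", "CODE_C"], hypothetical_srr)
print(round(score, 2))  # 0.36
```

Note the direction of the scale: each additional injury can only lower the product, so a lower ICISS indicates more severe injury, whereas the AIS-based scores increase with severity.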
The TNI is simply the sum of all injuries resulting from separate traumas. If one trauma evidently caused more than one injury (e.g., a gunshot with entrance and exit wounds and organ damage along its trajectory), these injuries are counted as one injury. This method has been used in a previous homicide study (Ericsson & Thiblin, 2002).
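The counting rule for the TNI can be sketched as follows, assuming that each autopsy finding has already been attributed to a trauma event by the assessor. The event identifiers, the finding descriptions, and the function name are hypothetical.

```python
from typing import Iterable, Tuple


def total_number_of_injuries(findings: Iterable[Tuple[str, str]]) -> int:
    """TNI: count separate traumas, not individual anatomical findings.

    Each finding is a (trauma_event_id, description) pair; all findings that
    share an event id are counted as one injury.
    """
    return len({event_id for event_id, _description in findings})


# Hypothetical case: one gunshot with three findings, one stab wound, one abrasion.
findings = [
    ("gunshot_1", "entrance wound, chest"),
    ("gunshot_1", "lung damage along trajectory"),
    ("gunshot_1", "exit wound, back"),
    ("stab_1", "stab wound, abdomen"),
    ("blunt_1", "abrasion, forearm"),
]
print(total_number_of_injuries(findings))  # 3
```

The judgment of which findings belong to the same trauma is made by the person reading the autopsy report; the sketch only shows how the count follows once that grouping exists.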
The HIS was proposed by co-workers at the Federal Bureau of Investigation (FBI) as a means of quantifying injury severity in homicides of elderly women (Safarik & Jarvis, 2005). Through a review of the autopsy report, an overall assessment of the injury severity is made according to a scale from 1 (least severe) to 6 (most severe).
The injury severity scores used in this study are summarized in Figure 1.

Figure 1. The injury severity scores used in this study.
Research question
The aim of the present study is to identify an injury severity score that is suitable for homicide victims and that is comprehensible, unambiguous, and objective. In addition, the score must be valid and have a good inter-rater reliability. It should be quick to apply, so that large materials can be scored within reasonable time. Finally, it should not require the medical knowledge of a forensic pathologist, so that researchers with different educational backgrounds can easily use it. The six scoring systems described above provide the basis for this quest.
Method
Data
This study initially utilized all 6,715 deaths that were investigated at the forensic department in Stockholm, Sweden, during a 5-year period (2000-2004). These cases were assessed with respect to the cause of death certificate. In all deaths caused by trauma or poisoning, the forensic pathologist classifies the manner of death, irrespective of the judicial judgment, as “accident,” “deliberately self-inflicted,” “deliberately caused by other,” or “unclear whether intention existed.” In all cases where the forensic pathologist had assessed the manner of death as “deliberately caused by other,” the autopsy report was included in the study.
During the 5-year period, 127 deaths were classified as “deliberately caused by other” and, thus, considered for inclusion. A total of 24 cases were excluded: 13 due to circumstances that complicated injury assessment at autopsy (7 due to prolonged hospital care ranging from 4 days to 4 months, 4 due to putrefaction, 1 due to embalming, and 1 due to organ donation), 6 due to autopsy reports containing insufficient data, 3 due to secondary injuries (1 drowning, 1 fall from height, and 1 hit by car), and 2 because they were suspected to have been misclassified as homicides (1 medical mistake and 1 suicide). After exclusion, a total of 103 cases remained and were included in the study.
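The exclusion counts reported above can be checked with a short tally. All numbers are taken from the text; the dictionary layout and variable names are only illustrative.

```python
# Counts taken from the paragraph above.
considered = 127

assessment_complications = {
    "prolonged hospital care": 7,
    "putrefaction": 4,
    "embalming": 1,
    "organ donation": 1,
}

exclusions = {
    "circumstances complicating injury assessment": sum(assessment_complications.values()),
    "insufficient autopsy data": 6,
    "secondary injuries": 3,
    "suspected misclassification": 2,
}

excluded = sum(exclusions.values())
included = considered - excluded
print(excluded, included)  # 24 103
```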
Application of the Injury Severity Scoring Systems
HIS
There are some terms used in the HIS that are ambiguous and, therefore, have to be subjectively assessed. These are “minor,” “moderate,” and “serious” external injuries as well as “overkill.” Safarik and Jarvis (2005) refer to the Crime Classification Manual (Douglas, Burgess, Burgess, & Ressler, 1992) for overkill, where (in a newer edition) overkill is defined as “excessive trauma beyond that necessary to cause death” (Douglas, Burgess, Burgess, & Ressler, 2006). The book also provides examples of homicides containing overkill. Thus, the precise definition of overkill remains elusive, which has forced researchers to make their own interpretations when applying the HIS (Jordan et al., 2010).
We defined “Overkill” as one of the following:
A total of 40 or more skin injuries (blunt, sharp, gunshot)
Three or more sharp wounds located at the head, neck, or trunk with internal organ injuries (including the pleura and large blood vessels)
Three or more gunshot wounds located at the head, neck, or trunk with internal organ injuries (including the pleura and large blood vessels)
The term “skin injuries” includes superficial contusions and abrasions as well as deeper lacerations and penetrations. The limit of 40 injuries was supported by an earlier study, where there seemed to be a cluster of outliers with more than 40 injuries (Ericsson & Thiblin, 2002). The threshold of three or more sharp or gunshot wounds was chosen because it indicates that the attacker continued to exert potentially lethal violence on the victim even after presumably becoming aware that possibly lethal wounds had already been inflicted.
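The three overkill criteria listed above can be expressed as a simple predicate. This is a sketch only: the class and field names are hypothetical, and the counts are assumed to have been extracted from the autopsy report beforehand.

```python
from dataclasses import dataclass


@dataclass
class InjuryCounts:
    # Total skin injuries of any type (blunt, sharp, gunshot).
    skin_injuries: int
    # Sharp wounds to the head, neck, or trunk with internal organ injuries.
    sharp_wounds_with_organ_injury: int
    # Gunshot wounds to the head, neck, or trunk with internal organ injuries.
    gunshot_wounds_with_organ_injury: int


def is_overkill(counts: InjuryCounts) -> bool:
    """Return True if any one of the three overkill criteria is met."""
    return (
        counts.skin_injuries >= 40
        or counts.sharp_wounds_with_organ_injury >= 3
        or counts.gunshot_wounds_with_organ_injury >= 3
    )


# Hypothetical cases illustrating the thresholds.
print(is_overkill(InjuryCounts(39, 0, 0)))  # False
print(is_overkill(InjuryCounts(40, 0, 0)))  # True
print(is_overkill(InjuryCounts(5, 3, 0)))   # True
```

A fixed threshold of this kind is trivially reproducible between raters, which is the property the definition was chosen for, at the cost of borderline cases such as the 39-injury example.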
In the HIS, there is a demarcation between “minor” and “moderate to serious” injuries. We did not further specify these definitions; instead, they were subjectively assessed with the aid of the example given by Safarik and Jarvis (2005). The reason for this was that in cases without an excessive number of injuries (i.e., fewer than the 40 in our definition), our experience is that the type of injury is more important than the number of injuries. For example, two gunshots are more serious than eight abrasions. Setting a fixed number to separate these two categories would probably produce more counter-intuitive results than the 40-injury definition of “overkill.” We reasoned that if a homicide victim has 40 injuries or more, this says something about the degree of violence no matter what type of injuries there are.
In the HIS, the number of “causes of death” is used. We defined “causes of death” as both those explicitly assessed as lethal in the autopsy report and other injuries fulfilling the criteria previously defined as being lethal (Ericsson & Thiblin, 2002). In short, these criteria are injuries to internal organs and large blood vessels. If the forensic pathologist concluded that there were signs of mechanical asphyxia, this was also included as a potentially lethal injury. All injuries were classified with respect to the type of violence as blunt, sharp, gunshot, or asphyxia. Cases with other types of lethal violence, for example, electrocution or burning, were excluded. To fulfill the HIS criterion of having “two or more” causes of death, there had to be lethal injuries in at least two modalities of violence (blunt, sharp, gunshot, asphyxia). Thus, two lethal knife wounds were counted as one cause of death, whereas a lethal knife wound and a lethal gunshot were counted as two.
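The rule for counting causes of death by modality can be sketched as follows. The function names and modality labels are hypothetical, and each lethal injury is assumed to have already been classified into one of the four modalities.

```python
from typing import Iterable

MODALITIES = {"blunt", "sharp", "gunshot", "asphyxia"}


def count_causes_of_death(lethal_injury_modalities: Iterable[str]) -> int:
    """Count distinct modalities of violence among the lethal injuries."""
    distinct = set(lethal_injury_modalities)
    unknown = distinct - MODALITIES
    if unknown:
        raise ValueError(f"unsupported modalities: {sorted(unknown)}")
    return len(distinct)


def has_two_or_more_causes(lethal_injury_modalities: Iterable[str]) -> bool:
    """HIS criterion: lethal injuries in at least two modalities."""
    return count_causes_of_death(lethal_injury_modalities) >= 2


# Two lethal knife wounds count as one cause of death.
print(count_causes_of_death(["sharp", "sharp"]))    # 1
# A lethal knife wound plus a lethal gunshot count as two.
print(count_causes_of_death(["sharp", "gunshot"]))  # 2
```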
ISS, NISS, ICISS, TNI, and SAIS
Besides the above-described modifications of the HIS, the scoring systems were applied in accordance with their original definitions. In the case of the SAIS, the original definition appears in the present study as it has not been used before.
Scoring
One of the authors applied injury severity scores to each case based on a review of the autopsy report. The applied scoring systems were as follows: SAIS, ISS, NISS, ICISS, HIS, and TNI. Another author also assessed the HIS for inter-rater reliability. The HIS was assessed using only the information in the final statement of the autopsy report. The final statement consists of a compilation of the autopsy findings together with the forensic pathologist’s opinion concerning when and how the injuries were inflicted as well as the cause and manner of death. The other scoring systems (SAIS, ISS, NISS, ICISS, and TNI) were assessed using the information from the full report (including the final statement). To test whether HIS scoring depends on experience with forensic autopsies, the two assessors had different professional backgrounds (forensic pathology and psychiatry).
Analyses
The distributions and correlations of the injury severity scores were visualized using scatter plots and histograms. The Spearman rank correlation was used to assess the correlations between different scoring systems as well as the inter-rater reliability, and a p value < .05 was regarded as statistically significant. For inter-rater reliability, Cohen’s kappa and simple agreement were also calculated.
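The three statistics named above can be sketched in plain Python. This is not the software used in the study, which the text does not specify; the function names and the example ratings are hypothetical, and significance testing (p values) is omitted.

```python
from collections import Counter
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    # Undefined (division by zero) if either sequence is constant.
    return cov / (var_x * var_y) ** 0.5


def simple_agreement(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Proportion of cases in which both raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty sequences")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = simple_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    # Undefined (division by zero) if expected agreement equals 1.
    return (observed - expected) / (1 - expected)


# Hypothetical HIS scores from two raters for eight cases.
rater_a = [1, 2, 3, 4, 5, 6, 3, 4]
rater_b = [1, 2, 3, 4, 5, 6, 4, 4]
print(simple_agreement(rater_a, rater_b))        # 0.875
print(round(cohens_kappa(rater_a, rater_b), 3))  # 0.846
print(spearman_rho(rater_a, rater_b))
```

Kappa is lower than simple agreement because it discounts the agreement expected by chance, while the rank correlation rewards near misses on an ordinal scale; this is one reason the three measures can differ for the same pair of raters.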
To assess validity of a method, researchers need to have a “gold standard” with which to compare. As there is no generally accepted method for quantifying the degree of injury in homicide victims, a reference method had to be chosen. Out of the different techniques presented above, we argue that the SAIS can be seen as a “gold standard.” The rationale behind this is that it is an objective method with minimum subjectivity, and that it takes into account both the number of injuries as well as their individual severity. Moreover, the SAIS is based on the AIS, which is a well-known and widely used system of injury severity classification (Meredith et al., 2002). Thus, if a method correlated well with the SAIS (“gold standard”), we considered it as valid (measuring what we want, that is, the degree of violence).
Results
Among the lethal injuries, sharp injuries were the most common, occurring in 44% (n = 45) of the cases. Lethal gunshots and blunt injuries were present in 29% (n = 30) and 26% (n = 27), respectively. In 15% (n = 15), there were signs of asphyxia. Some victims received potentially lethal injuries within several modalities and were counted more than once; therefore, the proportions sum up to more than 100%.
The injury severity scores are summarized in Table 1. Spearman’s rank correlations were calculated for the different methods using the SAIS as a reference. All correlations were statistically significant (p < .05) and were as follows: ISS .31, NISS .23, HIS .72, ICISS −.61, TNI .81.
Table 1. Results of the Injury Severity Scores.
Note. ISS = Injury Severity Score; NISS = New Injury Severity Score; SAIS = Sum of Abbreviated Injury Scale; ICISS = International Classification of Disease Injury Severity Score; HIS = Homicide Injury Scale; TNI = Total Number of Injuries.
The inter-rater reliability of our specified HIS was assessed based on a random selection of cases from 2000 to 2001 (n = 36). In this subset, the inter-rater reliability was .85 using Spearman’s rank correlation and .67 using Cohen’s kappa (unweighted). The two observers agreed in 81% of the cases.
Discussion
The aim of the present study was to identify a method for describing the degree of violence inflicted on homicide victims that is valid, reliable, and easy to use. In the present study, inter-model reliability was used to evaluate the validity, using the SAIS as a “gold standard.” When thinking about how one subjectively assesses the degree of violence in a homicide, most researchers will probably agree that the numbers of injuries as well as their individual severity are two key aspects. This is in part why we believe that the SAIS can be seen as a gold standard for quantifying injury severity in homicide victims. Furthermore, it is based on the AIS, a widely used and accepted method for injury severity classification in trauma research. The SAIS takes into account all injuries, superficial as well as deep. Each injury contributes to the sum with respect to how severe that injury is regarded to be on a group level, without consideration of the actual outcome in the specific case. Taken together, these properties make the SAIS straightforward and objective. As other scoring methods derived from AIS (such as ISS and NISS) have been validated in trauma research, the AIS should be a solid ground for scoring injury severity in homicide victims as well.
Despite these strengths, the AIS and scores derived from it such as the SAIS have weaknesses, including ones that affect their ability to be readily applied by researchers of various backgrounds. Some AIS scores depend on clinical parameters (such as blood loss and neurological deficits), and this information is often missing in autopsy reports. In the present study, if the report did not explicitly mention such parameters, we coded conservatively, which probably made the AIS values falsely low in some instances. In deaths due to physiological disturbances that do not necessarily leave any specific anatomic lesions, such as asphyxia, the related injuries might be minor. In such cases, both the AIS and SAIS scores will be low. Furthermore, some injuries are missing or insufficiently represented in the AIS manual, for example, the effects on inner organs by high velocity bullets.
Correlations between the SAIS and both the ISS and NISS were weak. This finding is not surprising, as the ISS and NISS only take into account a maximum of three injuries. These methods have proven effective in predicting mortality in living patients, but our data suggest that they are of little use in homicide victims. When assessing the injury severity in homicide victims, all injuries are of interest, not only the most severe ones. The observation made by Friedman et al. (1996) that the ISS was lower in people beaten to death than in deaths due to sharp violence or gunshots is less pronounced with the SAIS, as all bruises and abrasions contribute to the sum.
The correlation between the SAIS and ICISS was stronger. A main advantage of the ICISS over the ISS and NISS when using hospital-based data sources is that it can be calculated from information already present in the patients’ records. In homicide victims, this is seldom the case. Instead, researchers must look up the ICD codes, which can be time-consuming. Furthermore, the ICISS depends on SRRs, which may differ between regions and countries, making comparisons of research reports from different centers difficult. The SRRs also change with time due to improved health care, making longitudinal studies more complicated.
The TNI and HIS both strongly correlated with the SAIS. Both the TNI and HIS are much faster to apply than the SAIS, especially when there are large numbers of injuries. The TNI, though, often requires access to the entire autopsy protocol, which may be hard to interpret for those without autopsy experience. The HIS, on the other hand, has the advantage of depending on information solely from the summarized autopsy report statement, which can be reliably assessed without special skills in forensic pathology.
Another advantage of the HIS is that it is easily interpreted as the different scores are explicitly defined in words. This advantage of HIS is also one of its weaknesses, as it contains ambiguous elements. Most importantly, the use of “overkill” needs clarification to attain a good inter-rater reliability. We used a definition of overkill based on the number of injuries. A problem with this kind of definition is that it may (and will) not be valid in all individual cases. One can imagine a victim with 39 skin injuries that most researchers would assess as overkill, as well as one with 41 abrasions and contusions that most researchers would not classify as overkill. The definition, though, is simple and objective, which is good for inter-rater reliability. When specifying the definition of overkill, there is a balance between validity and reliability. We believe that the most important applications of an injury severity score in homicides will be on a group level, and on a group level the number of injuries will probably correlate well with the subjective assessment of “overkill.”
We evaluated inter-rater reliability only for the HIS because it contains subjective definitions, even after our modifications. The AIS-based methods, the ICISS, and the TNI are more clearly defined, and we therefore omitted inter-rater reliability testing of these. The high inter-rater reliability for the HIS might be affected if the overkill definition is changed. If we stick to the same principle of looking at the total number of certain injuries (e.g., number of gunshot injuries in head, neck, or trunk with internal organ injuries), we do not expect this to happen, as such criteria are equally well defined. If more subjective elements were added to the definition, the inter-rater reliability would probably deteriorate. For practical reasons, we used a randomly chosen subset from the first two study years for inter-rater reliability testing. We believe that this subset (28% of the total number of cases) was large enough for the result to be valid for the whole set.
Conclusion
We believe the total SAIS is the closest to a gold standard for injury severity quantification in homicide victims. The HIS does have attractive features for use in homicide cases, but the concept of “overkill” is difficult to define. With our proposed overkill definition, the HIS is valid compared with the SAIS and has a high inter-rater reliability. Thus, the HIS can be used as a valid surrogate method for evaluating injury severity in homicide victims. As the HIS is easier to use and less time-consuming, it is well suited for application to large homicide materials. The use of standardized quantification of injury severity in homicide victims will improve criminology research and hopefully contribute to both homicide prevention and homicide investigations.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was financially supported by the Swedish National Board of Forensic Medicine.
