Abstract
This study examined the internal validity and predictive accuracy of the Guidelines for Stalking Assessment and Management (SAM), a structured professional judgment risk assessment tool for stalking. Interviewers rated 89 stalking offenders on the Psychopathy Checklist: Screening Version (PCL:SV) and SAM Nature (N) and Perpetrator (P) subscales. Researchers obtained stalking and violence outcomes prospectively from several sources, for an average follow-up period of 2.5 years. Cox Proportional Hazard analyses including SAM and PCL:SV scores demonstrated that SAM total and subscale scores were significantly and positively associated with stalking recidivism, whereas PCL:SV scores were negatively associated with recidivism. However, the SAM clinical risk ratings did not significantly predict stalking reoffending. There were also no significant associations between SAM scores and violent outcomes. These findings provide mixed support for the use of the SAM as a risk assessment tool for stalking offenders.
Stalking has gained considerable attention from mental health professionals, criminal justice researchers, and lawmakers. A notoriously persistent behavior, stalking is associated with rates of violence between 30% and 40% (McEwan, Mullen, MacKenzie, & Ogloff, 2009), and evokes considerable distress on the part of its victims (Thomas, Purcell, Pathé, & Mullen, 2008). Victims report high levels of psychopathology, such as anxiety, depression, and posttraumatic stress disorder (PTSD), as well as significant lifestyle changes (e.g., relocating, job changes) following stalking victimization (Abrams & Robinson, 2002; Pathé & Mullen, 1997; Purcell, Pathé, & Mullen, 2004). In response to these concerns, anti-stalking laws have been established in each of the 50 United States and most developed countries (Dennison & Thomson, 2005). As a result, ever-increasing numbers of stalking offenders come into contact with the criminal justice system, which, in turn, is forced to make decisions about how to most effectively intervene to reduce stalking and stalking-related violence. The purpose of this study is to evaluate the predictive validity of a stalking risk assessment instrument, the Guidelines for Stalking Assessment and Management (SAM; Kropp, Hart, & Lyon, 2008), which was designed to help clinicians assess and manage the risk of stalking recidivism.
Risk Factors for Stalking and Stalking-Related Violence
Mental health and forensic researchers have spent the past two decades investigating the phenomenon of stalking. This research has refined our understanding of the real risks associated with stalking. For example, studies of recidivism have demonstrated that more than 25% of offenders persist in their stalking for more than 1 year (McEwan, Mullen, & MacKenzie, 2008), and 50% to 60% of offenders who receive criminal justice sanctions and restrictions reoffend within the first year (Mohandie, Meloy, Green-McGowan, & Williams, 2006; Rosenfeld, 2003). Likewise, estimates of violence in the context of stalking typically range from 30% to 40% (McEwan et al., 2009; Rosenfeld, 2004), although cases of life-threatening violence appear to be uncommon (James & Farnham, 2003).
Researchers have identified a number of variables associated with stalking persistence and violence. For example, ex-intimate partners are at higher risk for continued stalking (i.e., they usually stalk for longer periods of time than offenders who have never had an intimate relationship with the victim; Budd & Mattinson, 2000; Mohandie et al., 2006; Pathé & Mullen, 1997; Purcell, Pathé, & Mullen, 2002; Purcell et al., 2004; Rosenfeld, 2003; Tjaden & Thoennes, 1998) and are more often violent toward their victim (McEwan et al., 2009). Another robust risk factor for both renewed stalking and stalking-related violence is the presence of a personality disorder, and in particular cluster B disorders (i.e., Borderline and Antisocial; McEwan et al., 2008; Rosenfeld, 2003, 2004; Whyte, Petch, Penny, & Reiss, 2008). Stalking offenders whose behaviors persist and escalate into physical approach tend to be seeking intimacy or have a psychotic illness. However, in ex-intimates, approach and escalation behaviors appear more difficult to predict (McEwan, MacKenzie, Mullen, & James, 2012; McEwan et al., 2008). Additional predictors of stalking-related violence have included the presence of threats, substance abuse, younger age, and low education, though the consistency of these predictors has at times been mixed (McEwan, Mullen, & Purcell, 2007; Rosenfeld, 2004). It should be noted, however, that the research on stalking-related violence has largely relied on retrospective or cross-sectional analyses, differentiating offenders who were and were not violent. Only one study to date has evaluated stalking violence prospectively (Eke, Hilton, Meloy, Mohandie, & Williams, 2011).
One putative risk factor for stalking recidivism and violence that has not been clearly established in the existing literature is psychopathy. Psychopathy is widely accepted as an important risk factor for recidivism and violence in general criminal justice and forensic mental health populations (Douglas, Vincent, & Edens, 2006; Hare, 1999; Walsh & Walsh, 2006). However, its relevance for stalking offenders is less clear. Although Reavis, Allen, and Meloy (2008) classified 15% of their sample of 78 stalking offenders as psychopathic, no data were provided regarding the relevance of this classification to stalking recidivism or violence. In contrast, Storey, Hart, Meloy, and Reavis (2009) found extremely low rates of psychopathy among stalking offenders, with only 1 of 61 stalking offenders identified as psychopathic. Despite the low rate of psychopathy, they found significant correlations between psychopathy and other problematic stalking behaviors (e.g., escalating frequency and severity of their stalking behavior). In a more recent study, Kropp, Hart, Lyon, and Storey (2011) found significant correlations, ranging from .20 to .46, between psychopathy and clinician risk ratings of future stalking and stalking-related violence (based on the SAM, Kropp et al., 2008, described below). No research to date has directly evaluated the link between psychopathy and actual stalking recidivism or violence.
Stalking Risk Assessment
As the criminal justice system pays growing attention to stalking, mental health clinicians are increasingly asked to evaluate the risks posed by stalking offenders. Although some clinicians might simply assess for the presence of known predictors (e.g., those variables described above), two assessment tools have now been developed specifically for stalking risk assessment. Both of these instruments rely on the structured professional judgment (SPJ) approach to risk assessment, which operationally defines known risk and protective factors to facilitate risk judgments and the implementation of risk management strategies. This approach, exemplified by the Historical-Clinical-Risk Management-20 (HCR-20; Webster, Douglas, Eaves, & Hart, 1997) and Short-Term Assessment of Risk and Treatability (START; Webster, Martin, Brink, Nicholls, & Desmarais, 2009), combines the advantages of strictly actuarial risk assessments (by relying on empirically supported predictors) with the flexibility of clinical judgment (to determine how risk factors should be weighted).
The Stalking Risk Profile (SRP), developed by MacKenzie et al. (2009), is an SPJ instrument that takes into account five risk domains: (a) the nature of the relationship between the stalking offender and the victim; (b) the offender’s motivations; (c) the offender’s psychological, psychopathological, and social characteristics; (d) the psychological and social vulnerabilities of the victim; and (e) the legal and mental health context in which the stalking is occurring. The SRP guides the clinician to combine risk factors to assess risk for specific outcomes (persistence, escalation, violence), adjusting estimates based on the characteristics of the offender and context. To date, no research has been published documenting the utility of the SRP.
A second SPJ instrument tailored to stalking is the SAM (Kropp et al., 2008). The SAM includes 30 items divided into three subscales (see Table 1): 10 Nature (N) of Stalking items, 10 Perpetrator (P) Characteristics items, and 10 Victim Vulnerability items. Evaluators rate each item as present, possibly/partially present, or absent, both in relation to current stalking and past behavior. After making ratings for each item, the evaluator considers possible risk scenarios and intervention strategies to guide several global ratings (low, moderate, or high) regarding case prioritization, risk for continued stalking, and risk for serious physical harm.
Table 1. SAM Items
Note. SAM = Guidelines for Stalking Assessment and Management (Kropp, Hart, & Lyon, 2008); N = SAM Nature subscale; P = SAM Perpetrator subscale.
To date, three published studies have utilized the SAM with samples of stalking offenders. Belfrage and Strand (2008) asked 41 Swedish police officers to complete the SAM based on cases they had previously encountered. They found a significant association between individual SAM items and the summary risk ratings, but did not assess the accuracy of these assessments. Storey et al. (2009) rated the SAM in a sample of 61 men convicted of stalking offenses, demonstrating both adequate inter-rater reliability (with intra-class correlations [ICCs] ranging from .63 to .77) and significant associations between psychopathy and several SAM items. More recently, Kropp et al. (2011) completed SAM ratings (based on file review) on 109 convicted stalking offenders referred to a forensic assessment clinic. They also demonstrated adequate inter-rater reliability for most SAM items and summary scores (most ICCs ranged from .76 to .82), but reliability was somewhat weaker for summary risk ratings (ICCs ranged from .39 to .71) and victim vulnerability items (ICC = .44). They also observed significant correlations between psychopathy scores, based on the Psychopathy Checklist: Screening Version (PCL:SV; Hart, Cox, & Hare, 1995), and both the SAM subscales and the summary risk ratings (rs ranging from .20 to .46). They found a weak correlation between actuarial risk assessment estimates based on the Violence Risk Assessment Guide (VRAG; Quinsey, Harris, Rice, & Cormier, 1998) and the SAM violence risk rating (r = .25) but no significant association between VRAG risk estimates and SAM ratings of risk for continued stalking (r = .18). Although these findings indicate that the SAM can generally be scored reliably, the utility of SAM scores in differentiating stalking offenders who will reoffend and/or engage in stalking-related violence has not yet been established.
The present study attempts to fill this void, using a prospective evaluation of the SAM’s utility in a group of stalking offenders referred for court-mandated treatment. The specific hypotheses were that SAM summary risk judgments and subscale scores would be significantly and positively associated with subsequent stalking recidivism and any future violence. In addition, we used exploratory analyses to determine which SAM items most strongly predicted stalking and violent reoffending.
Method
Participants
Participants were 89 offenders convicted of stalking or harassment in New York City. All participants were referred (typically by a probation officer or judge) to a university-run treatment program in New York City following an arrest for stalking behavior (see Rosenfeld et al., 2007, for a description). The sample comprised primarily men (94.4%, n = 84) with an average age of 34.9 (SD = 10.7). The average years of education was 11.7 (SD = 2.5), and the sample was varied in terms of race and ethnicity, with 36.0% (n = 32) Hispanic, 30.3% (n = 27) African American, 21.3% (n = 19) Caucasian, and 12.4% (n = 11) Other or mixed race/ethnicity. Based on the Structured Clinical Interview for DSM Disorders (SCID; First, Spitzer, Gibbon, & Williams, 2002), conducted by clinicians at baseline, the most common Axis I diagnosis was major depressive disorder (15.7%, n = 14), followed by psychotic disorders (13.5%, n = 12). Half of the participants (49.4%, n = 44) were diagnosed with a substance abuse disorder. Other diagnoses included PTSD, obsessive-compulsive disorder, and bipolar disorder. The most common personality disorder diagnoses were antisocial (16.7%, n = 15), borderline (8.9%, n = 8), paranoid (8.9%, n = 8), and schizoid (8.9%, n = 8). Other personality disorder diagnoses included narcissistic, schizotypal, and passive–aggressive.
All participants had an index offense that met the definition of stalking described in the New York State statute or met a definition of stalking established by the authors (described below). Although all participants had engaged in stalking behavior, only a subset of participants was formally charged with stalking (13.5%, n = 12) or harassment (20.2%, n = 18; see Table 2); official offense data were missing for a number of participants (n = 34). The victims were predominantly ex-intimate partners (79.8%, n = 71). The most common stalking behaviors, based on official records and self-report of victim allegations, were repeated phone calls or text messages (49.4%, n = 44), threats toward the victim (38.2%, n = 34), and violence toward the victim (29.2%, n = 26).
Table 2. Current Offense and Criminal History
Note. OOP = order of protection.
Procedures
Potential participants were informed of the nature, risks, and benefits of the study (i.e., that they were participating in a treatment study and that the data collected would be published). Those individuals who agreed to participate signed the study consent form, and completed an intake assessment (typically two sessions), that included an interview with a trained graduate student or doctoral level psychologist. The semi-structured interview included information about the participant’s background, current psycho-social functioning, as well as past and present legal involvement. Participants were also administered the SCID (First et al., 2002) and the SCID module designed to assess personality disorders (SCID-II; First, Gibbon, Spitzer, Williams, & Benjamin, 1997). Based on the clinical interview and review of available records, which included recent criminal record information for all participants, the assessing clinician completed a number of measures immediately following the intake evaluation. These measures included the PCL:SV (Hart et al., 1995) and the P subscale, N subscale, and global risk ratings of the SAM (Kropp et al., 2008). The SAM Victim Vulnerability items were not coded due to lack of reliable information about the victims. Participants also completed a battery of self-report measures not described in this study. Six months after the end of treatment (or 1 year after the initial intake appointment, for those who did not complete treatment), participants were invited to return for a follow-up assessment. At follow-up assessment, participants completed an interview regarding their current functioning and any subsequent legal system involvement, along with the self-report measures, and received US$50 for their participation.
Individuals were included in the current study if their recent behavior or index offense was characterized by stalking behavior involving communicating with, following, or approaching an individual or that individual’s family, with the intent to annoy, harass, or cause fear in the victim. For many individuals (n = 68, 76.4%), this determination was based on fulfilling the New York State statutory definition of stalking, which requires the behavior to be perceived as intentional, serve no legitimate purpose, be directed at a specific person, and be likely to cause fear of physical, emotional, or financial damage. Alternatively, participants were classified as “stalking” offenders if their behavior was perceived as willful, repeated, unwanted, and fear inducing (based on definitions commonly used in the stalking literature; for example, Dennison & Thomson, 2005; Kropp, Hart, & Lyon, 2002; Meloy, 2007; Meloy & Gothard, 1995). To ensure that inclusion criteria were applied consistently, two independent raters, both master’s-level psychology graduate students trained to identify elements of the above definitions, reviewed each case and determined the presence or absence of each element. Discrepancies with regard to whether a case fulfilled the definition of stalking were resolved on a case-by-case basis by the first author. Only those cases that satisfied at least one of these two definitions of stalking were included in the current sample.
Outcome data regarding stalking recidivism and violence were collected from four sources. First, participants who returned for a voluntary follow-up interview were asked about outcomes in the 6 months since program completion. Second, research clinicians regularly monitored the public database of current charges in New York (Webcrims; http://iapps.courts.state.ny.us/webcivil/ecourtsMain) to establish whether participants had any new charges since their intake appointment. Third, the researchers obtained official criminal record information (including arrest, conviction, and sentencing information) for 27 (30%) participants from the New York State Division of Criminal Justice Services (i.e., “rap sheets”). Finally, the researchers obtained reports from the study clinicians who were treating the individual participants. Outcome data were monitored until June 2011, resulting in a variable follow-up period (i.e., longer for those participants who were recruited early in the study). Outcomes from all four sources were collapsed into two dichotomous variables representing the occurrence of stalking or harassment (recidivism) or violence following the intake appointment. Thus, outcome data included official charges as well as offenses not known to the criminal justice system.
Measures
The SAM (Kropp et al., 2008) is an SPJ tool designed to help clinicians assess risk of violence and reoffending in cases where there is a known or suspected history of stalking. In this study, only the N and P risk factors were assessed. As it was only possible to interview perpetrators, there was not sufficient information available to reliably code Victim Vulnerability factors. In addition to analyzing the summary risk ratings (case prioritization, risk for continued stalking, and risk for serious physical harm), we analyzed total scores for all SAM items (total score), and for the N and P subscales individually. These scores were calculated based on the highest ratings (either current or past) for each item. Total scores and subscale scores were calculated to allow for comparison with past research on other SPJ instruments. However, it is important to note that summing items for SPJ tools is a common research practice that does not reflect the intended clinical use of SPJ instruments, as summary risk ratings are intended to guide treatment planning rather than simply estimate risk.
Participants were also rated using the PCL:SV (Hart et al., 1995). The PCL:SV is a 12-item version of the Psychopathy Checklist–Revised (PCL-R) scale that assesses psychopathy and has been widely used in forensic mental health research (DeMatteo, Edens, & Hart, 2010). Studies using the PCL:SV have demonstrated strong reliability (Guy & Douglas, 2006; Skeem & Mulvey, 2001) and convergent validity (Edens, Buffington, & Tomicic, 2000; Poythress et al., 2010). Studies using the PCL:SV have also shown moderate predictive accuracy for violence in inmate and psychiatric populations, with better predictive accuracy for more serious violence (Douglas, Ogloff, Nicholls, & Grant, 1999; Ho, Thomson, & Darjee, 2009; Nicholls, Ogloff, & Douglas, 2004). Scores of 18 or greater on the PCL:SV are considered indicative of psychopathy (Cooke, Michie, Hart, & Hare, 1999).
Outcome variables included post-assessment stalking and violent behavior, as measured by self-report and official criminal record information. The definition of stalking included any new stalking or harassment charges, any violations of an existing order of protection, or any behavior that appeared to reflect stalking but did not result in stalking-specific criminal charges (e.g., unwanted following, communicating with or about an individual in a manner that would be fear inducing to a reasonable person). Threats and menacing were counted as stalking if they occurred in the context of a pattern of stalking. The definition of violence included physical assaults both within and outside the context of stalking, but not threats.
Statistical Analysis
To test whether there were any significant differences in SAM results between participants who reoffended versus those who did not, several statistical techniques were used. First, chi-square analyses were used to test whether any difference in SAM summary risk ratings existed between those who reoffended and those who did not. Then, t tests were used to test for significant differences in SAM summary scores (total score, N subscale, and P subscale) between the two groups. Because some individuals had a longer follow-up period than others (those who were consented earlier were followed for more time than those who were consented later in the study), the relationship between SAM ratings and scores and recidivism was also investigated using Cox Proportional Hazard analysis. Cox Proportional Hazard analysis takes into account the length of the follow-up period before the individual reoffends or before the follow-up period ends (this time frame is also known as “time at risk”). Of note, analysis of the assumption of homogeneity of survival functions supported the use of these models, as no significant difference in slopes was observed across levels of SAM scores for any of the analyses reported below (data available upon request). Lack of power was a limiting factor in this study; therefore, multivariate analyses were conducted for exploratory purposes only.
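The notion of “time at risk” described above can be illustrated with a minimal sketch of how each case might be encoded as a duration plus an event indicator before being passed to a survival model. The function name, the dates, and the exact day in June 2011 are assumptions made for illustration; they are not the study's data or code.

```python
from datetime import date

# The text states that outcomes were monitored until June 2011;
# the specific day is an assumption for this sketch.
STUDY_END = date(2011, 6, 30)

def time_at_risk(intake, reoffense=None, study_end=STUDY_END):
    """Return (years_at_risk, event) for one participant.

    event = 1 if a reoffense was observed before monitoring ended,
    event = 0 if the case is censored at the end of monitoring.
    """
    if reoffense is not None and reoffense <= study_end:
        end, event = reoffense, 1
    else:
        end, event = study_end, 0
    years = (end - intake).days / 365.25
    return years, event

# Hypothetical participants (not study data)
recidivist = time_at_risk(date(2008, 1, 15), reoffense=date(2008, 9, 1))
censored = time_at_risk(date(2009, 6, 30))
```

A participant who never reoffends still contributes information: the model treats that case as event-free for the full observed duration rather than discarding it.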
Results
Descriptive Statistics
Summary risk ratings, which were available for 73 (81%) participants, generally indicated a low-risk sample. On the case prioritization rating, 57.3% (n = 51) of participants were rated as low risk, whereas 22.5% (n = 20) were rated as medium risk and 1 participant (1.1%) was rated as high risk. On the risk of continued stalking rating, 59.6% (n = 53) were rated as low risk, 14.6% (n = 13) were rated as medium risk, and 5.6% (n = 5) were rated as high risk. On the risk of serious violence rating, 67.4% (n = 60) were rated as low risk, 11.2% (n = 10) were rated as medium risk, and 1.1% (n = 1) was rated as high risk. The mean total score for the SAM items was 21.87 (SD = 7.42) out of a possible maximum of 40. The mean N subscale score was 11.17 (SD = 4.04) and the mean P subscale score was 10.71 (SD = 4.69). Psychopathy ratings for the sample were low, with a mean PCL:SV score of 7.8 (SD = 4.87). Only 4 participants (4.4%) obtained PCL:SV scores of 18 or above, indicating that they were likely psychopathic, and 17 (19%) obtained scores between 12 and 17 (possibly psychopathic); PCL:SV data were missing for two participants.
The average length of follow-up for the sample was 2.51 years (SD = 1.38), with a range of 11.13 months to 5.90 years. During the follow-up period, roughly one third (34.8%, n = 31) of the participants were known to have reoffended with new stalking behavior or charges after completion of the intake evaluation. Of these 31 participants, 19 (61.3%) reoffended with stalking behavior in the first year following intake. Only 10 participants (11.2%) were identified as having engaged in violent behavior following intake, and 9 (90%) of those 10 offended with violence during the first year.
Reliability of the SAM
Cronbach’s coefficient alpha for the SAM total score was .75, indicating a moderate degree of internal consistency, with slightly lower alpha coefficients for the N (α = .60) and P subscales (α = .70). The range of corrected item-total correlations (CITCs) was from .15 to .48, with a median of .30 for the total score, .28 for the N scale and .37 for the P scale. These findings are consistent with previous research (Kropp et al., 2011), and are not surprising given that the SAM items are intended to capture different aspects of risk rather than one unified construct (Douglas & Reeves, 2010; Douglas, Skeem, & Nicholson, 2011).
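As an illustration of how the internal consistency statistics reported above are defined, the following sketch computes Cronbach's alpha and a corrected item-total correlation (the correlation of one item with the sum of the remaining items). The ratings matrix is hypothetical (rows are participants, columns are items scored 0 to 2) and the function names are our own; this is not the software used in the study.

```python
from statistics import mean, variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-participant item ratings."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def corrected_item_total(rows, j):
    """Correlate item j with the total of all other items."""
    item = [r[j] for r in rows]
    rest = [sum(r) - r[j] for r in rows]
    return pearson(item, rest)

# Hypothetical ratings for five participants on four items
ratings = [
    [2, 1, 2, 0],
    [1, 1, 0, 0],
    [2, 2, 1, 1],
    [0, 0, 1, 0],
    [1, 2, 2, 1],
]
alpha = cronbach_alpha(ratings)
citcs = [corrected_item_total(ratings, j) for j in range(4)]
```

Because alpha rises with inter-item covariance, a scale whose items deliberately tap distinct risk factors would be expected to yield modest values.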
Using a two-way mixed effects intra-class correlation (ICC2) model in a subsample of six (7%) participants, inter-rater reliability on the SAM total score (ICC2 = .77), N subscale (ICC2 = .64), and P Subscale (ICC2 = .88) was moderate, and inter-rater reliability on the PCL:SV was excellent (ICC2 = .97; Landis & Koch, 1977; Shrout, 1998). We found good inter-rater reliability for the case prioritization (ICC2 = .80) and continued stalking (ICC2 = .71) risk ratings, and moderate inter-rater reliability for the serious physical harm risk ratings (ICC2 = .60).
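For readers unfamiliar with the intra-class correlation, the sketch below computes one common two-way form from a cases-by-raters table. It assumes the single-rater consistency variant, often written ICC(3,1); the text does not specify which variant was used, so that choice, the function name, and the example scores are all assumptions.

```python
def icc_consistency(scores):
    """Single-rater, two-way consistency ICC for a cases x raters table."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between cases
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)

# Hypothetical total scores from two raters on six cases
example = [[20, 22], [15, 14], [30, 27], [11, 13], [24, 25], [18, 18]]
icc = icc_consistency(example)
```

With only six cases, any such estimate has a wide sampling error, which is worth keeping in mind when interpreting the coefficients above.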
Predictive Validity
The results of Fisher’s exact tests showed no significant associations between any of the three global risk ratings and post-assessment stalking or violent behavior (see Table 3). Results of a Kaplan–Meier analysis showed no significant difference in reoffense patterns over time when comparing different risk ratings on the SAM case prioritization item (Log-rank χ2 = 0.42, p = .81; see Figures 1 and 2). There were also no differences in SAM scale and subscale scores based on whether the participant engaged in post-assessment stalking or violence (see Table 4), nor was there any association between psychopathy scores and either outcome (post-assessment stalking or violence).
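The Kaplan–Meier curves referred to above are product-limit estimates of the proportion of cases that remain offense-free over time. The following is a minimal sketch of that estimator; the function name and the durations are hypothetical and are not drawn from the study data.

```python
def kaplan_meier(durations, events):
    """Product-limit survival estimate.

    durations: time at risk for each case
    events: 1 if the case reoffended at that time, 0 if censored
    Returns [(time, survival_probability)] at each observed event time.
    """
    n_at_risk = len(durations)
    surv = 1.0
    curve = []
    for t in sorted(set(durations)):
        at_t = [e for d, e in zip(durations, events) if d == t]
        n_events = sum(at_t)
        if n_events:
            surv *= 1 - n_events / n_at_risk
            curve.append((t, surv))
        # Cases with an event or censored at t leave the risk set
        n_at_risk -= len(at_t)
    return curve

# Hypothetical follow-up times in years for a single risk group
curve = kaplan_meier([0.5, 1.0, 1.0, 2.5, 3.0], [1, 1, 0, 0, 1])
```

Comparing such curves across the low, medium, and high rating groups is what the log-rank test formalizes.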
Table 3. Accuracy of SAM Risk Predictions for Recidivism Outcomes (Fisher’s Exact Test)
Note. Percentages shown are column percentages. SAM = Guidelines for Stalking Assessment and Management (Kropp, Hart, & Lyon, 2008).

Figure 1. Stalking Reoffenses Over Time, Divided by SAM “Case Prioritization” Risk Level

Figure 2. Violent Reoffenses Over Time, Divided by SAM “Case Prioritization” Risk Level
Table 4. PCL:SV and SAM Scores, for Stalking and Violent Reoffenders During Follow-Up Period
Note. PCL:SV = Psychopathy Checklist: Screening Version; SAM = Guidelines for Stalking Assessment and Management (Kropp, Hart, & Lyon, 2008); N = SAM Nature subscale; P = SAM Perpetrator subscale.
However, when outcome data were analyzed with time at risk incorporated (i.e., survival analysis with 57 cases censored), both SAM and PCL:SV scores were significantly associated with stalking reoffenses (−2LL = 204.25, Model χ2 = 9.01, df = 2, p = .01). Specifically, a Cox Proportional Hazard analysis including both the PCL:SV total score and the SAM total score indicated that both the SAM (B = .10, Wald = 7.47, p = .006, HR = 1.11, 95% confidence interval [CI] = [1.03, 1.19]) and the PCL:SV (B = −.12, Wald = 5.22, p = .022, HR = 0.89, 95% CI = [0.80, 0.98]) provided a significant and unique contribution to predicting stalking recidivism, with higher SAM scores corresponding to more rapid reoffending and higher PCL:SV scores corresponding to a decreased likelihood of stalking recidivism. The model predicting violence based on PCL:SV and SAM total scores was not significant (see Table 5; −2LL = 85.19, Model χ2 = .33, df = 2, p = .85).
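The relationship among the coefficient (B), the Wald statistic, and the hazard ratio (HR) reported above can be made explicit with a short sketch. It relies on two standard identities, HR = exp(B) and Wald = (B / SE)^2 for a single coefficient; the function name is our own, and because the published B values are rounded to two decimals the recovered figures are approximate.

```python
import math

def hazard_ratio(b, wald, z=1.96):
    """Recover the HR and an approximate 95% CI from B and its Wald statistic."""
    se = abs(b) / math.sqrt(wald)  # Wald = (B / SE)^2, so SE = |B| / sqrt(Wald)
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# Values reported in the text for the stalking recidivism model
sam = hazard_ratio(0.10, 7.47)    # SAM total score
pcl = hazard_ratio(-0.12, 5.22)   # PCL:SV total score
```

Rounded to two decimals, these calls give 1.11 [1.03, 1.19] for the SAM and 0.89 [0.80, 0.98] for the PCL:SV, matching the reported values. Read this way, each additional SAM point corresponds to roughly an 11% higher hazard of stalking recidivism, holding the PCL:SV score constant.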
Table 5. Cox Proportional Hazards Regression Coefficients for Stalking and Violent Outcomes
Note. HR = Hazard Ratio; N Cens = Number of censored cases; CI = confidence interval; PCL:SV = Psychopathy Checklist: Screening Version; SAM = Guidelines for Stalking Assessment and Management (Kropp, Hart, & Lyon, 2008); N = SAM Nature subscale; P = SAM Perpetrator subscale.
To test the predictive validity of the SAM subscales, we conducted additional Cox Proportional Hazard analyses with each subscale in place of the SAM total score. The model including the SAM N subscale and PCL:SV significantly predicted stalking recidivism (−2LL = 206.55, Model χ2 = 7.37, df = 2, p = .025). In this model, the SAM N subscale score significantly predicted stalking recidivism (B = .14, Wald = 6.22, p = .013, HR = 1.15, 95% CI = [1.03, 1.29]), whereas the PCL:SV was significantly negatively associated with stalking recidivism (B = −.10, Wald = 3.95, p = .05, HR = 0.91, 95% CI = [0.82, 0.99]). In a similar model, the SAM P subscale significantly predicted recidivism (B = .11, Wald = 4.24, p = .04, HR = 1.12, 95% CI = [1.01,1.24]); however, the PCL:SV did not (B = −.09, Wald = 3.29, p = .07, HR = 0.92, 95% CI = [0.83, 1.01]), and the overall model was not significant (−2LL = 208.23, Model χ2 = 5.69, df = 2, p = .06). Neither of the SAM subscales significantly predicted violent recidivism (see Table 5).
A final Cox regression model included both the SAM N and P scales entered separately, along with the PCL:SV, to analyze whether the subscales provided unique information to the prediction of stalking recidivism and to gauge the relative contribution of each subscale. This model indicated an overall significant effect (−2LL = 204.07, Model χ2 = 9.23, df = 3, p = .03) with significant effects for both the SAM N and PCL:SV but not the SAM P subscale (see Table 5). This model did not significantly predict violent reoffending.
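The p values attached to the model χ2 statistics in this section follow from the chi-square distribution with the stated degrees of freedom. The sketch below computes the upper-tail probability for integer degrees of freedom using only the standard library; the function name is hypothetical and the approach is an illustration, not the software used in the study.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        q, d = math.exp(-half), 2                  # closed form for df = 2
    else:
        q, d = math.erfc(math.sqrt(half)), 1       # closed form for df = 1
    while d < df:
        # Recurrence: Q(d + 2) = Q(d) + half**(d/2) * exp(-half) / Gamma(d/2 + 1)
        q += half ** (d / 2) * math.exp(-half) / math.gamma(d / 2 + 1)
        d += 2
    return q

# Model chi-square values reported in the text
p_total = chi2_sf(9.01, 2)      # SAM total + PCL:SV
p_n = chi2_sf(7.37, 2)          # SAM N + PCL:SV
p_p = chi2_sf(5.69, 2)          # SAM P + PCL:SV
p_combined = chi2_sf(9.23, 3)   # SAM N + SAM P + PCL:SV
```

These calls return values that round to the reported p values of .01, .025, .06, and .03, respectively.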
Exploratory Analyses
Because past research has found that ex-intimate partners are more likely to persist in their stalking behavior and engage in violence than stalking offenders who did not have an intimate relationship with the victim, we explored the additive utility of this variable in the Cox Proportional Hazard model predicting stalking recidivism. However, there was no main effect of relationship status in this model (B = .38, Wald = 0.58, p = .45, HR = 1.47, 95% CI = [0.55, 3.93]) nor any significant interaction effect between relationship status (entered as a dichotomous variable) and either SAM or PCL:SV total scores (B = .01, Wald = 0.02, p = .89, HR = 1.01, 95% CI = [0.84, 1.22], and B = .47, Wald = 3.21, p = .07, HR = 1.60, 95% CI = [0.96, 2.68], respectively). Prior intimate relationship with the victim was also not a significant contributor to the model predicting violence (data available upon request). In addition, there was no significant difference in offense rate when comparing groups with different sources of available outcome data (i.e., official records vs. self-report; data available upon request).
Discussion
To our knowledge, this study represents the first prospective application of the SAM in a sample of primarily ex-intimate stalking offenders referred for outpatient mental health treatment. The current study provides mixed results regarding the predictive accuracy of the SAM in a prospective, long-term follow-up design. Consistent with study hypotheses, SAM total scores provided a significant and unique contribution to the prediction of renewed stalking. Although univariate analyses revealed no simple mean differences in SAM scores based on stalking recidivism, and the global risk ratings were not associated with stalking outcomes, Cox Proportional Hazard analyses demonstrated a significant relationship between SAM scores and stalking recidivism. In these models, PCL:SV scores were significantly negatively associated with recidivism. Roughly similar results were found for both the N and P subscales, but the N subscale demonstrated a slightly stronger association in the model that included both subscales (i.e., only the N subscale remained significant). Thus, although mixed, these findings provide some support for the use of the SAM as a risk assessment tool for stalking recidivism.
Contrary to our expectations, the SAM global risk ratings, total score, and subscale scores were not associated with violent behavior. Interpreting these null findings is complicated by the relatively low rate of violent reoffense (10 of 89, or 11%) and our inability to determine (based on official records) whether the violent outcome occurred in the context of stalking versus another setting and/or victim. In addition, our outpatient sample was perceived to be relatively low risk, as evidenced by the infrequency of “high-risk” ratings on the SAM. Only 1 of the 71 participants (1.4%) for whom global risk ratings were available was considered to pose a high risk for serious physical harm. Thus, it is unclear whether the null findings for predicting violence reflect the limitations of the SAM or are the result of methodological limitations in this study.
Another relevant finding from this study pertains to the reliability of SAM scores, both internal to the scale and across different raters. We observed moderate levels of internal consistency for the SAM total score and the N and P subscales. These coefficients, while low relative to traditional psychological measures, are typical of those observed with SPJ instruments, which are intended to capture a range of minimally overlapping predictors (Douglas & Reeves, 2010; Douglas et al., 2011). The ratings on the SAM items and global risk ratings had moderate inter-rater reliability. However, because inter-rater analyses for global risk ratings were conducted on a small subsample, further research should be conducted with the global risk ratings before any conclusions are drawn about their usefulness.
An unexpected finding from this study was the negative association between PCL:SV scores and the likelihood of stalking reoffending in the Cox Proportional Hazard models. This association emerged only in models that included the SAM total score and in the model that included the SAM N subscale, suggesting that the impact of psychopathy on stalking recidivism may hinge on the presence of other risk factors. Previous research has identified a very low prevalence rate for psychopathy in samples of stalking offenders (e.g., Storey et al., 2009), and a similarly low rate was observed in our sample (4%). A potential theoretical explanation (to be examined in future studies) is that psychopathic individuals’ shallow emotional and interpersonal tendencies make them unlikely to stalk in the first place, and less likely to reoffend compared with non-psychopathic individuals who experience stronger feelings of attachment toward the victim.
Although there were a number of important strengths in this study (prospective, long-term follow-up, multiple sources of outcome data), several methodological limitations temper these findings. Among the most important limitations was the treatment setting in which data were collected. All participants were evaluated with the SAM prior to treatment and a large proportion completed an intensive treatment program specifically targeted to stalking offenders. This intervention may have (hopefully) affected the accuracy of pre-treatment risk assessments by decreasing risk of reoffending and/or violence. This concern is more significant given the absence of SAM items that target treatment-related changes. Unlike other SPJ instruments (e.g., the HCR-20, the START), the SAM focuses exclusively on characteristics of the stalking behavior, offender, and victim, with no items reflecting amenability or access to treatment. It is unclear whether this reflects a limitation of the SAM, but future research with untreated samples may clarify this question.
There were also limitations in how the SAM was applied in this study. First, we omitted the Victim Vulnerability (V) subscale. As is common in many offender treatment settings, contact with the victim was not feasible and therefore this subscale could not be reliably coded. Although the predictive accuracy of the SAM may have been stronger with the inclusion of the V subscale, clinicians performing a risk assessment are unlikely to have comprehensive and reliable information on the victim. The added benefit of including information from the victim (i.e., the V subscale) should be the subject of future research. In addition, although summary risk ratings were generated for most cases, evaluators did not generate risk scenarios to guide these ratings (as recommended by the scale authors). Indeed, a more thorough formulation that incorporates interactions between risk factors and generates risk scenarios tailored to the individual’s situation may lead to more accurate risk ratings. Given the absence of any research on the additive utility of risk scenarios for the accuracy of risk assessment, more research is needed to determine whether this shortcoming significantly hindered the accuracy of our SAM ratings. However, our approach is typical of many research studies on SPJ instruments (Douglas, Blanchard, Guy, Reeves, & Weir, 2010), including Kropp et al.’s (2011) initial study of the SAM. Nevertheless, summary risk ratings may have demonstrated greater predictive validity had we utilized all of the intended components of the SAM.
As noted above, our sample was composed predominantly of low to moderate risk offenders, with very few rated as high risk for renewed stalking and/or violence. The mean SAM P, N, and total scores were comparable with those reported in Kropp et al. (2011), who drew their sample from criminal justice records, whereas our sample was composed of community-based offenders sentenced to treatment as an alternative to incarceration. However, Kropp et al. (2011) included a larger variety of offenses in their sample (not strictly stalking), included more high-risk individuals, and had larger standard deviations in the SAM scores. In contrast, our sample likely had a restricted range of risk, as the highest risk offenders were unlikely to be referred to a community-based treatment setting, likely hindering the predictive accuracy of the SAM in this study (particularly with regard to future violence). Future research using a more heterogeneous sample (in terms of risk) may help to clarify the strengths and limitations of this instrument. Finally, this study was underpowered for multivariate analysis; therefore, the multivariate analyses in the present study were conducted for exploratory purposes only, and should be replicated with a larger sample.
Despite these limitations, and the mixed findings from this study, these results provide some support for the utility of the SAM with stalking offenders. SAM numeric scores appear to help in the prediction of stalking recidivism, but not violence. Furthermore, SAM risk ratings did not predict either stalking reoffending or violence, though these results may be partly accounted for by methodological limitations. Our data suggest greater validity for predicting stalking recidivism than violence, but the “low-risk” sample and low base rate of violence may have affected these null findings. Given the importance of accurate assessments of renewed stalking and violence risk, further research using the SAM, and perhaps comparing the SAM with other instruments, is certainly warranted.
Footnotes
Acknowledgements
The authors wish to thank all of the clinicians, researchers, and volunteers who made this project possible: Sherif Abdelmessih, George Anderson, Trevor Barese, Katherine Byars, Joanna Cahall, Niki Colombino, Sarah Coupland, Ronald Curtis, Michael Davenport, David Early, Shana Einzig, Joanna Fava, Virginia Fineran, Alexandra Fontanetta, Jacomina Gerbrandij, Haleh Ghanidazeh, Jacqueline Howe, André Ivanoff, Martin Kassen, Sara Kopelovich, Jennifer Loveland, Melissa Miele, Samantha Morin, Christopher Ng, Justin Perry, Ashley Pierson, Brian Pilecki, Lauren Saunders, Rachel Small, Steve Smith, Marissa Stanziani, Stephanie Stern, Matthew Stimmel, Zoe-Turner-Corn, Kyle Ward, Erin Williams, and everybody else who contributed to the project.
This research was supported in part by Grant R34 MH71841 from the National Institute of Mental Health (Barry Rosenfeld, principal investigator).
