Abstract
Suicide occurs in all regions of the world, in both high- and low-income countries, making it a major public health concern worldwide (World Health Organization [WHO], 2014). Specifically, every 40 seconds, an individual somewhere in the world dies by suicide, which translates to over 800,000 annual global deaths (WHO, 2014). Suicide represents 50% and 71% of all violent global deaths in men and women, respectively (WHO, 2014). Life expectancy is an important indicator of a population’s health (Braveman, Egerter, & Mockenhaupt, 2011), and individuals who experience suicidal ideation have disproportionately lower life expectancies, compared with nonsuicidal individuals.
Suicide is responsible for many years of potential life lost (Centers for Disease Control and Prevention [CDC], 2013), in part because it is a multifaceted public health crisis. Suicidal ideation involves a complex interplay between individual, relationship, community, and societal factors (CDC, 2014). These complex factors make predicting suicide risk difficult, which further potentiates vulnerability. Suicide risk is dynamic, fluctuating over time concurrent with both external events and internal experiences, which can change rapidly (Bryan & Rudd, 2012). Although prediction is complex, using a theoretical framework to guide the identification of suicide assessment instruments with the strongest psychometric properties may partially increase the ability to recognize individuals at high risk of suicide.
Theoretical Framework
Joiner’s (2005) Interpersonal-Psychological Theory of Suicidal Behavior (IPTS) was used to guide this review. The theoretical definition of suicide involves desire and ability to enact lethal means. Specifically, the individual will not die by suicide unless she or he has both the desire to die by suicide and the ability to do so. Suicidal desire, according to the theory, emerges when two interpersonal states—perceived burdensomeness (i.e., belief that one’s existence burdens others, and the perception that one’s death is worth more than one’s life to others) and thwarted belongingness (i.e., sense of alienation, disconnection, and social isolation)—are perceived as hopeless and experienced simultaneously. The third component of the IPTS is acquired capability, which refers to habituation to pain and fear, enabling one to more readily engage in self-harm. Specifically, Joiner (2005) posits that in addition to experiencing simultaneous perceived burdensomeness and thwarted belongingness, the capability to initiate suicidal behavior is acquired via exposure to painful and fear-provoking events (e.g., self-injury, previous suicide attempt, physical violence, combat situation) that habituate individuals to the pain and fear associated with death. Repeated exposure to these painful experiences creates the ability for lethal self-injurious behavior, in part by increasing the tolerance for pain and potentiating a fearlessness of death. As a result, there is a three-way interaction between perceived burdensomeness, low belongingness, and the acquired capability for suicide.
Methods
The Whittemore and Knafl (2005) methodology guided the literature search and was used to enhance the rigor of this review. Identification of suicide risk assessment instruments with newly evaluated psychometric properties provided boundaries for the systematic literature search. Data reduction, data comparison, conclusion drawing, and verification permitted thorough interpretation and synthesis of recent psychometric property evidence.
The PsycINFO and PubMed databases were systematically searched using the Boolean operators AND and OR, in combination with the following medical subject heading terms and keywords: suicide, suicidal ideation, self-injurious behavior, self-harm, self-injury, suicide attempted, suicidal behaviors, risk assessment, risk, instrument, scale, questionnaires, psychiatric status rating scales, psychological tests, mental status schedule, screen, mental health, interpersonal psychological theory, interpersonal theory of suicide, psychometric, psychometrics, measure, and validation. The search strategy is further detailed in Figure 1, using the modified PRISMA flow diagram (Moher, Liberati, Tetzlaff, Altman, & The PRISMA Group, 2009).
Figure 1. Modified PRISMA Flow Diagram (Moher et al., 2009).
Inclusion criteria for information databases included: peer-reviewed articles, suicidal ideation or suicidal behavior risk instrument, depression screen with a designated suicide item, sample with adults 18 years and older, and psychometric properties (i.e., reliability, validity, sensitivity, specificity, or factor analysis) evaluated within the past 6 years (2010 to 2015). Year limits were enacted to permit identification of current suicide risk screening trends and newly evaluated psychometrics. International articles with country- and language-specific instrument versions, which included sample psychometric properties, were included to promote evaluation of the populations screened by instrument. Additionally, articles with diverse methodologies were included and integrated into the review, as suggested by Whittemore and Knafl (2005).
Exclusion criteria for information databases included the following: articles not written in the English language, reviews and reports, articles published prior to 2010, and articles reiterating established psychometric properties but not reevaluating psychometric properties in new samples. Given our focus on newly evaluated suicide risk instrument psychometric properties in adult populations, adolescent and pediatric samples (i.e., <18 years) and adolescent-specific suicide risk instruments were excluded. Further, because pregnancy-related depression was not a particular focus, pregnancy-specific screening instruments were excluded, in part because the percentage of nonpregnant adults far exceeds the percentage of pregnant women at any given time. Finally, because the review focuses on screening during adulthood, geriatric-specific instruments were also excluded given that they are applicable only to a particular segment of the adult population (i.e., older adults).
Results
Systematic searches resulted in the inclusion of 51 articles that assessed the psychometric properties of 16 suicide risk assessment instruments. Research participant descriptions, theoretical framework information, sample psychometric properties, and Oxford Center for Evidence-Based Medicine (2011) levels of evidence are detailed in Appendix A and synthesized below.
During original scale development, in studies of psychiatric inpatients, the BSSI demonstrated high concurrent validity coefficients with the SSI (p < .001; Beck & Steer, 1991), was positively correlated with having made a suicide attempt (p < .001; Pinniti, Steer, Rissmiller, Nelson, & Beck, 2002), demonstrated high internal consistency (coefficient alpha .96 to .97; Beck & Steer, 1991; Pinniti et al., 2002), had 1-week test–retest reliabilities ranging from .54 to .88 (Beck & Steer, 1991; Pinniti et al., 2002), and total item-total correlations were significant beyond the .001 level (Pinniti et al., 2002). In recent evaluations, in a sample of U.S. male psychiatric inpatients, convergent validities with the Reasons for Attempting Suicide Questionnaire (RASQ) internal subscale and Adult Suicide Ideation Questionnaire (ASIQ; p < .01; Horon, McManus, Schmollinger, Barr, & Jimenez, 2013) were established. Further, convergent validities were established with the Beck Hopelessness Scale (BHS) in primarily female Asian American young adult volunteers (p < .01; Miranda, Gallagher, Bauchner, Vaysman, & Marroquin, 2012) and incarcerated U.S. psychiatric inpatients (p < .01; Horon et al., 2013). Finally, a Chinese version of the BSSI demonstrated convergent validity with the total Three-Dimensional Psychological Pain Scale, Psychache Scale, and Beck Depression Inventory (BDI), in majority female Chinese outpatients with mood or depressive disorders (p < .01; Li et al., 2014). Internal consistencies (Cronbach’s alpha) in U.S. samples ranged from .85 to .98, in studies that included incarcerated U.S. males (.85, Horon et al., 2013; .94, Smith, Wolford, Mandracchia, & Jahn, 2013), African American mothers with a history of suicide attempt (.91; Woods, Zimmerman, Carlin, Hill, & Kaslow, 2013), primarily Asian American women undergraduates (.96, Miranda, Valderrama, Tsypes, Gadol, & Gallagher, 2013; .98, Polanco-Roman & Miranda, 2013), and primarily female Asian American young adult volunteers (.95; Miranda et al., 2012). In two of these samples, which contained majority Asian American young women, follow-up internal consistencies (Cronbach’s alpha) ranged from .97 (Miranda et al., 2012) to .98 (Miranda et al., 2013). In majority female Chinese samples, the internal consistency (Cronbach’s alpha) of the Chinese version of current ideation was .90 (Li et al., 2014) and .91 (Xie et al., 2014). In these Chinese samples, worst ideation internal consistencies (Cronbach’s alpha) were .94 (Xie et al., 2014) and .95 (Li et al., 2014). Further, Cronbach’s alpha of the Chinese version, in a sample of primarily female Taiwanese adults with obsessive-compulsive disorder (OCD), was .85 (Tzu-Chi et al., 2010). Additionally, in a study including a majority of Pakistani women hospitalized after a suicide attempt, internal consistency (Cronbach’s alpha) of the Urdu translated version was .89 (Husain et al., 2014). Finally, the Korean version of the instrument, in a sample of majority South Korean women at high-risk for suicide, produced a Cronbach’s alpha of .92 (Kim, Ha, Yu, Park, & Ryu, 2014), and in a majority male Korean epileptic sample, Cronbach’s alpha was .82 (Lim et al., 2010).
In a study comparing psychiatric inpatients and nonclinical undergraduate students, during original scale development, the SBQ-R differentiated between suicidal and nonsuicidal groups (p < .001).
A cutoff score of 2 in both clinical and nonclinical samples was most useful in identifying individuals with established suicide status, correctly identifying individuals as positive for suicide ideation or attempts (sensitivity: .80–1.0), and individuals identified as nonsuicidal were correctly identified as nonsuicide ideators or nonattempters (specificity: .96–1.0; Osman et al., 2001). Recent sample internal consistencies (Cronbach’s alpha) in U.S. adults ranged from .79 to .90 in studies that included majority male deployed military personnel (.79; Bryan, Hernandez, Sybil, & Clemans, 2013) and university students and adults (.90; O’Riley & Fiske, 2012). In a sample of primarily female United Kingdom individuals with a history of traumatic event exposure, Cronbach’s alpha of the English version was .87 (Panagioti, Gooding, & Nicholas, 2012). The instrument was also translated to German in another study, which included primarily female German adults, resulting in a Cronbach’s alpha of .76 (Wagner, Klinitzke, Brahler, & Kersting, 2013).
With regard to factor structure, in a study examining clinical and nonclinical young adults, the INQ-10, −12, and −15 demonstrated better fit than the INQ-18 and −25 during confirmatory factor analysis (p < .001; Hill et al., 2014). Additionally, factor analysis confirmed that the INQ consists of two distinct latent factors associated with burdensomeness and belongingness, in deployed U.S. military personnel samples (p < .001; Bryan, 2011), U.S. undergraduate students (p < .001; Freedenthal, Lamis, Osman, Kahlo, & Gutierrez, 2011), and in U.S. samples of younger and older adults (p < .001; Van Orden, Cukrowicz, Witte, & Joiner, 2012). Results indicated 10 (p < .001; Bryan, 2011), 12 (p < .001; Freedenthal et al., 2011), and 15 (p < .001; Van Orden et al., 2012) items provided reliable and acceptable fit.
In an American undergraduate sample, INQ-12 perceived burdensomeness subscale scores were positively correlated with the BDI-II (p < .01), BHS (p < .01), Modified Scale for Suicide Ideation (p < .01), Life Attitudes Schedule-Short Form (p < .01), and the Acquired Capability for Suicide Scale (ACSS; p < .05), and negatively correlated with the Multidimensional Scale of Perceived Social Support (p < .01), and Reasons for Living Inventory for Young Adults (RFL-YA; p < .01; Freedenthal et al., 2011). Similarly, in this sample, INQ-12 thwarted belongingness subscale scores were positively correlated with the BDI-II, BHS, Modified Scale for Suicide Ideation, and Life Attitudes Schedule-Short Form, and negatively correlated with the Multidimensional Scale of Perceived Social Support and RFL-YA (p < .01; Freedenthal et al., 2011). Higher sample BSSI scores, in a U.S. multisample study including undergraduates, young adults, and older adults, were also associated with greater thwarted belongingness and perceived burdensomeness subscale scores (p < .01; Van Orden et al., 2012). In this sample, the INQ also demonstrated predictive validity, as higher summed thwarted belongingness and perceived burdensomeness subscale scores were both associated with higher BSSI scores 1 month later (Belongingness, p < .05; Burdensomeness, p < .01; Van Orden et al., 2012). In majority male deployed U.S. military samples, INQ-10 internal consistency (Cronbach’s alpha) of belongingness was .86 and burdensomeness was .81 (Bryan, 2011; Bryan et al., 2013). In a primarily female U.S. undergraduate sample, INQ-12 Cronbach’s alpha of belongingness was .92 and burdensomeness was .93 (Freedenthal et al., 2011). In a sample containing American Indian or Alaska Natives, INQ-18 Cronbach’s alpha of belongingness was .90 and burdensomeness was .90 (O’Keefe & Wingate, 2013). Finally, in a sample including U.S. undergraduates, young adult outpatients, and older adults, INQ-25 Cronbach’s alpha of belongingness was .85 and burdensomeness was .89 (Van Orden et al., 2012).
The ASIQ (Reynolds, 1991) is a 25-item self-report measure rated on a 7-point item response scale. The instrument assesses frequency of suicidal thoughts, desire to die, suicidal plans, and suicidal behaviors occurring in the previous month. The potential range of total scores is 0 to 150. Higher scores indicate numerous suicidal cognitions occurring with regularity. During original scale development, in a study examining undergraduate students, the ASIQ had high reported reliability (coefficient alpha: .97), test–retest reliability (.86), contrasted groups validity (p < .001), and significant correlations (all p’s < .001) between the ASIQ and depression, hopelessness, anxiety, self-esteem, and history of prior suicide attempts (Reynolds, 1991). In a recent sample including U.S. male incarcerated psychiatric patients, convergent validities with the BHS, BSSI, and RASQ Internal subscale were demonstrated (p < .01; Horon et al., 2013). In this study, which included many men from the lowest socioeconomic statuses, sample internal consistencies (Cronbach’s alpha) ranged from .85 (i.e., cutoff score of 31) to .95 (i.e., no cutoff score; Horon et al., 2013).
Recent sample convergent validity between the SIS and the Cultural Assessment of Risk for Suicide total and subscale scores was established (p < .001; Chu et al., 2013) in a study including majority U.S. women, with inclusion of homosexual, bisexual, and transgender populations. Further, in a majority male U.S. clinical military sample, construct validity was established with the Behavior and Symptom Identification Scale-24 (p < .001; Luxton, Rudd, Reger, & Gahm, 2011). In these studies, internal consistency (Cronbach’s alpha) was .91 (Luxton et al., 2011) and .94 (Chu et al., 2013).
An expansion of the SSI, which includes the SSI-W (worst ideation), uses the same format and scoring as the SSI-C, but focuses specifically on suicidal ideation at the worst point in one’s life (Beck, Brown, & Steer, 1997). During SSI-W scale development, in a study of psychiatric outpatients, internal consistency (Cronbach’s alpha) was .89 and the scale was associated with a history of suicide attempt (p < .001; Beck et al., 1997). In recent studies, English version interrater reliability was .89, in samples including majority male Italian adults with panic disorder (De Baradis et al., 2013) and obsessive-compulsive disorder (De Baradis et al., 2014). Further, in a sample including Finnish adults with major depressive disorder (MDD), the English version produced cutoff score dependent sensitivities ranging from .76 to .81 and cutoff score dependent specificities ranging from .68 to .92 (Vuorilehto et al., 2014). Internal consistency (Cronbach’s alpha) of the Korean version, in a sample of South Korean psychiatric in-patients, was .95 (Jon, Lee, & Park, 2013).
Internal consistency (Cronbach’s alpha) of the English version, in a study examining a majority of hospitalized Scottish females after a recent suicide attempt, was .86 (O’Connor, Smyth, Ferguson, Ryan, & Williams, 2013). Another study used the English version for 15% of participants, and the French version for the remaining 85%, in a sample containing an all-male Canadian incarcerated population (Naud & Daigle, 2010). In this male-incarcerated sample, a receiver operating characteristic (ROC) analysis showed the area under the curve for detecting hopelessness was .63, suicidal ideation was .66, negative self-evaluation was .64, and hostility was .64 (p < .001; Naud & Daigle, 2010). The sensitivity of the SPS in this incarcerated Canadian sample was .36 and the specificity was .85 (Naud & Daigle, 2010).
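To make the area-under-the-curve values reported above easier to interpret, the following is a minimal Python sketch; it is not the analysis used in any of the cited studies. It treats the area under the ROC curve as the proportion of case/non-case pairs in which the case has the higher instrument score, with ties counted as one half. The function name and all scores are hypothetical and chosen only for illustration.

```python
def roc_auc(case_scores, noncase_scores):
    """Area under the empirical ROC curve via pairwise comparison.

    Returns the fraction of (case, non-case) pairs in which the case
    scores higher; tied pairs contribute 0.5.
    """
    wins = 0.0
    for case in case_scores:
        for noncase in noncase_scores:
            if case > noncase:
                wins += 1.0
            elif case == noncase:
                wins += 0.5
    return wins / (len(case_scores) * len(noncase_scores))


# Hypothetical instrument totals (not data from any reviewed study).
cases = [12, 8, 15]
noncases = [3, 5, 11, 9, 2]
auc = roc_auc(cases, noncases)
```

Under this reading, a value near .50 indicates discrimination close to chance, whereas values approaching 1.0 indicate that cases almost always score higher than non-cases.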
The ACSS (Van Orden et al., 2008) is a 20-item self-report instrument, with scores rated on a 5-point Likert scale ranging from 0 to 4. Total scores can range from 0 to 80. The instrument assesses respondents’ fearlessness about death and acquired capability for suicide. Higher scores indicate less fear of death, greater pain tolerance, and exposure to painful and provocative events (creating acquired capability). Factor analysis, in a study of U.S. undergraduate students, demonstrated that the strongest factor loading was associated with item 19 (i.e., I am not at all afraid to die, p = .03), but all items were reasonable indicators of each factor (Ribeiro et al., 2014). In studies examining undergraduates and individuals with a history of suicide attempt, there was a strong correlation between total scores and perceived courage to make a suicide attempt (p < .001; Ribeiro et al., 2014), positive associations with exposure to painful and provocative events (p < .001; Bender, Gordon, Bresin, & Joiner, 2011; Van Orden et al., 2008), and a strong negative correlation with fear of suicide (p < .001; Ribeiro et al., 2014). Additional factor analysis, in a U.S. incarcerated male population, indicated that a four-factor model provided the best statistical and conceptual fit; three of the four factors were interpretable (i.e., general fearlessness and perceived pain tolerance, fearlessness of death, and spectator enjoyment of violence; p < .001; Smith et al., 2013). Sample internal consistencies (Cronbach’s alpha) ranged from .69 to .84 in studies including majority male U.S. deployed military personnel (.69; Bryan et al., 2013), primarily female American Indian or Alaska natives (.84; O’Keefe & Wingate, 2013), and German men exposed to violent video gaming (.84; Teismann, Fortsch, Baumgart, Het, & Michalak, 2014). 
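Because internal consistency is reported for nearly every instrument in this review, a brief illustration of how Cronbach’s alpha is commonly computed may be useful. The Python sketch below uses the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name and the response matrix are hypothetical; the sketch does not reproduce the procedure of any cited study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item scores.

    `responses` holds one list per respondent, each containing that
    respondent's score on every item (same item order throughout).
    Sample variances (n - 1 denominator) are used for items and totals.
    """
    k = len(responses[0])                    # number of items
    items = list(zip(*responses))            # regroup scores by item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 5 respondents, 3 items scored 0 to 4.
example = [
    [0, 1, 0],
    [2, 2, 1],
    [4, 3, 4],
    [1, 1, 2],
    [3, 4, 3],
]
alpha = cronbach_alpha(example)
```

Alpha rises when items covary strongly relative to their individual variances, which is why values are sensitive to both the number of items and the homogeneity of the sample.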
In a more recent study consisting of majority female American Indian or Alaska natives from 27 different tribes, the sample internal consistency (Cronbach’s alpha) was .88 (O’Keefe & Wingate, 2013).
During scale development, the BDI-II had high internal consistency (coefficient alpha .91) and moderate to strong convergent validities with other self-report and clinical rating scales of depression in studies including psychiatric, nonclinical young adult, and undergraduate populations (p < .001; Ball & Steer, 2003; Beck et al., 1996). While validating the scale in a sample of clinically depressed outpatients, two factors representing Somatic-Affective and Cognitive dimensions were found (p < .05), and confirmatory factor analysis supported a model in which the BDI-II reflected one underlying second-order dimension composed of two first-order factors representing cognitive and noncognitive symptoms (p < .001; Steer, Ball, Ranieri, & Beck, 1999). In a sample ROC analysis, including U.S. post-Myocardial Infarction patients, the area under the curve for diagnosing MDD was .962 (i.e., good discrimination between those with and without depression; p < .001; Huffman et al., 2010). Further, sample convergent validity with the BHS was established in a sample including primarily female U.S. young adult volunteers (p < .01; Miranda et al., 2012). Also, convergent validity with the Patient Health Questionnaire (PHQ)-9 was demonstrated in U.S. heart failure patients (p < .01; Hammash et al., 2012) and depressed Australian adults (p < .001; Titov et al., 2011). In a sample of primarily female Norwegian adults, using a cutoff score of 12, sensitivity was .85 and specificity was .88 (Kjaergarrd, Elisabeth, Wang, Waterloo, & Jorde, 2014). With a cutoff score of 14, however, in a sample of primarily male U.S. adults, sensitivity was .88 and specificity was .84 (Huffman et al., 2010). In this U.S. male sample, using a cutoff score of 16, sensitivity was .88 and specificity was .92 (Huffman et al., 2010). In a study examining majority Norwegian women who experienced first stroke, person-separation reliability (i.e., ability of the scale to distinguish at least three distinct groups of depression) was 1.99 (Lerdal, Kottorp, Gay, Grov, & Lee, 2014). Internal consistency (Cronbach’s alpha) ranged from .89 to .90 in studies examining U.S. heart failure patients (.89; Hammash et al., 2012), U.S. young adult volunteers (.90; Miranda et al., 2012), primarily female Norwegian adults (.89; Kjaergarrd et al., 2014), Norwegian adults with a history of stroke (.90; Lerdal et al., 2014), and depressed Australian adults (.90; Titov et al., 2011). The Arabic version of the instrument, in a sample including primarily female college students from Kuwait, demonstrated convergent validity with the Hopkins Symptoms Checklist-25 (p < .001; Al-Turkait & Ohaeri, 2010). The Arabic version, in this college sample from Kuwait, produced an internal consistency (Cronbach’s alpha) of .83 (Al-Turkait & Ohaeri, 2010). Finally, the Chinese version of the instrument, in a sample of Taiwanese outpatients with OCD, had a Cronbach’s alpha of .93 (Tzu-Chi et al., 2010).
For the German version of the instrument, in a study examining majority women medical patients from the German general population, sample internal consistency (Cronbach’s alpha) was .84 (Kliem, Moble, Zenger, & Brahler, 2014). Further, in this study, the BDI-FS was positively correlated with the PHQ-9 (p < .001; Kliem et al., 2014).
During original scale development in primary care and obstetrics-gynecology studies, there was a strong association between increasing PHQ-9 scores and worsening functional status, disability days, and symptom-related difficulty (p < .05; Kroenke et al., 2001). The PHQ-9 was also positively correlated with the Mental Health Inventory (p < .05; Kroenke et al., 2001).
Internal consistency (Cronbach’s alpha) was .89 and .86 in primary care and obstetrics-gynecology studies, respectively, and 48-hour test–retest reliability was .84 (Kroenke et al., 2001). In recent studies, PHQ-9 scores were strongly correlated with BDI-II scores, in samples including U.S. older adult heart failure patients (p < .01; Hammash et al., 2012) and depressed Australian adults (p < .001; Titov et al., 2011). There was also convergent validity with the BSSI in a sample of primarily female U.S. college students (p < .01; Polanco-Roman & Miranda, 2013). Sensitivity for different cutoff scores ranged from .54 to .92, in studies including U.S. cardiac and stroke patients (.54; Razykov, Zieglestein, Whooley, & Thombs, 2012), depressed U.S. adults (.69; Uebelacker, German, Baudiano, & Miller, 2011), U.S. heart failure patients (.70; Hammash et al., 2012), U.S. older adults (.88; Phelan et al., 2010), and U.S. epilepsy patients (.92; Rathore et al., 2014). In these sample studies, specificity for varying cutoff scores ranged from .74 to .92 (.74, Rathore et al., 2014; .80, Phelan et al., 2010; .84, Uebelacker et al., 2011; .90, Razykov et al., 2012; .92, Hammash et al., 2012). Finally, in a sample including depressed U.S. adults, the sensitivity of suicide Item 9 was .69 and the specificity of Item 9 was .84 (Uebelacker et al., 2011). Internal consistency (Cronbach’s alpha) ranged from .74 to .85 in samples including depressed Australian adults (.74; Titov et al., 2011), U.S. college students (.82, Polanco-Roman & Miranda, 2013; .83, Miranda et al., 2013), Hispanic American women (.84; Merz, Malcarne, Roesch, Riley, & Sadler, 2011), and U.S. heart failure patients (.85; Hammash et al., 2012). Follow-up internal consistency (Cronbach’s alpha) was .79 (Miranda et al., 2013) and .83 (Polanco-Roman & Miranda, 2013) in U.S. college student samples, and was .81 in a sample of depressed Australian adults (Titov et al., 2011).
Interrater reliability was .81 in an international sample including advanced cancer patients (Lie et al., 2015). While evaluating the Iranian and Dutch versions of the instrument, in a ROC analysis, the area under the curve for diagnosing MDD was .83 in the Iranian version (p < .001; Khamseh et al., 2011) and was .87 in the Dutch version (p < .001; Zuithoff et al., 2010). The Dutch version produced a sensitivity and specificity of .82 (Zuithoff et al., 2010). In contrast, the Iranian version produced a sensitivity of .73 and specificity of .76 (Khamseh et al., 2011). Internal consistency (Cronbach’s alpha) of the Dutch version was .88 (Zuithoff et al., 2010). Cronbach’s alpha of the Iranian version was .87 (Khamseh et al., 2011). The Spanish version of the instrument, in a sample of Peruvian women, had an internal consistency (Cronbach’s alpha) of .81 (Zhong et al., 2014). Further, the Spanish version, in a sample of Spanish-speaking U.S. women who emigrated from Mexico, had a Cronbach’s alpha of .85 (Merz et al., 2011). Finally, the sensitivity and specificity of suicide Item 9 in the Japanese version of the instrument was .70 and .97, respectively (Inagaki et al., 2013). The sensitivity and specificity of detecting MDD in the Japanese version with cutoff points of 4/5 were .86 and .85, respectively (Inagaki et al., 2013).
Internal consistency (Cronbach’s alpha) was .92 in a sample consisting of majority U.S. women with varying degrees of depression (Farzanfar et al., 2014) and was .84 in a study including majority male U.S. National Guard soldiers (Fine et al., 2013). Test–retest reliability (weighted Kappa) in the depressed U.S. sample was .76 (Farzanfar et al., 2014). Further, in the depressed U.S. sample, sensitivity was .82 and specificity was .90 (Farzanfar et al., 2014). In the U.S. National Guard sample, a cutoff score of 10 produced the optimal balance of sensitivity (.56) and specificity (.86; Fine et al., 2013).
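The cutoff-dependent sensitivities and specificities summarized above follow from a simple classification rule: respondents scoring at or above the cutoff are flagged as screen positive. The Python sketch below illustrates that rule under stated assumptions; the function name, the "at or above" convention, and every score and reference diagnosis are hypothetical and do not come from the reviewed studies.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Sensitivity and specificity of a screen at a given cutoff.

    A respondent is flagged when score >= cutoff (an assumed convention).
    `has_condition` holds the reference-standard status for each score.
    """
    tp = fn = tn = fp = 0
    for score, positive in zip(scores, has_condition):
        flagged = score >= cutoff
        if positive and flagged:
            tp += 1          # case correctly flagged
        elif positive:
            fn += 1          # case missed by the screen
        elif flagged:
            fp += 1          # non-case incorrectly flagged
        else:
            tn += 1          # non-case correctly cleared
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical screening totals and reference diagnoses.
scores = [3, 12, 8, 15, 5, 11, 9, 2]
has_condition = [False, True, True, True, False, False, False, False]

sens_high, spec_high = sensitivity_specificity(scores, has_condition, cutoff=10)
sens_low, spec_low = sensitivity_specificity(scores, has_condition, cutoff=8)
```

In this hypothetical sample, lowering the cutoff from 10 to 8 identifies every case but flags more non-cases, which mirrors the trade-off that the included studies report when they compare cutoff scores.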
During scale development, in a randomized trial including female outpatients with generalized anxiety disorder, the sensitivity of the S-STS in prospectively identifying subjects with suicidal thoughts or behaviors was 100%, as compared with the Hamilton Rating Scale for Depression suicide Item 3, which had a sensitivity of 63% (Coric et al., 2009). Criterion validity was reinforced in a more recent sample including majority Italian female undergraduate students, as individuals endorsing suicide ideation had higher S-STS global, suicide ideation subscale, and suicidal behavior subscale scores (p < .001; Preti et al., 2013). Further, in this sample, convergent validity with the General Health Questionnaire, and discriminative validity with the Rosenberg Self-Esteem Scale and Modified Social Support Survey, were established (p < .001; Preti et al., 2013). Finally, in this sample, internal consistencies (Guttman’s lambda 2 for global scores, suicide ideation scores, and suicide behavior scores) ranged from .83 to .88 and test–retest consistency scores ranged from .46 to .88 (Preti et al., 2013).
During original instrument development, in studies including adolescents (nonsuicidal controls, individuals with suicidal ideation, and individuals with a recent suicide attempt) and psychiatrically distressed individuals, there were large differences on the SI-IAT between nonsuicidal persons and suicide ideators (p < .001; Nock & Banaji, 2007b), and suicide attempters (p < .001; Nock & Banaji, 2007b; Nock et al., 2010), as well as between suicide ideators and suicide attempters (p = .009; Nock & Banaji, 2007b). Further, during instrument validation testing, nonsuicidal adolescents had a negative association between self-injury and oneself, suicide ideators showed a small positive association between self-injury and oneself, and suicide attempters had a large positive association between self-injury and oneself (p < .05; Nock & Banaji, 2007b).
In studies also including adult U.S. psychiatric inpatient individuals, prediction of suicide ideation and suicide attempt was accurate despite age, mood and substance disorders, hopelessness, and total number of psychiatric disorders (p < .001, Ellis, Rufino, & Green, 2016; p < .001, Nock & Banaji, 2007b; p < .05, Nock et al., 2010). Additionally, in this study of U.S. psychiatric inpatients, IAT scores were positively correlated with BSSI, BHS, and PHQ-9 scores (p < .01; Ellis et al., 2016). Moreover, in a sample including Canadian adults with suicidal ideation or recent self-harm, the Death or Life IAT significantly predicted self-harm (p = .02; Randall, Rowe, Dong, Nock, & Colman, 2013). In this Canadian sample, with a high cutoff, the Death or Life IAT sensitivity was 96.6% and specificity was 53.9%; with a low cutoff, the Death or Life IAT sensitivity was 58.6% and specificity was 96.2% (Randall et al., 2013).
The original instrument had 72 items; however, in a validation study including both clinical and nonclinical undergraduates and adults, principal-component factor analysis was applied and total items were reduced to 48 (Linehan et al., 1983). Subsequent factor analyses in nonclinical and psychiatric inpatient participants indicated that there were six primary reasons for living, encompassing the six RFL subscales (i.e., Survival and Coping Beliefs, Responsibility to Family, Child-Related Concerns, Fear of Suicide, Fear of Social Disapproval, and Moral Objections; Linehan et al., 1983). In this validation study, the RFL differentiated suicidal from nonsuicidal individuals (p < .001); specifically, in nonclinical individuals, the Fear of Suicide subscale further differentiated between previous ideators and previous suicide attempters (p < .001; Linehan et al., 1983). Alternatively, in clinical individuals, Child-Related Concerns differentiated between current suicide ideators and current suicide attempters (p < .001; Linehan et al., 1983).
However, in both clinical and nonclinical populations, Survival and Coping, Responsibility to Family, and Child-Related Concerns subscales were most useful in differentiating suicidal and nonsuicidal groups (Linehan et al., 1983).
In recent studies, internal consistency (Cronbach’s alpha) was .94 in a sample including majority female U.S. older adults (Segal, Marty, Meyer, & Coolidge, 2012) and was .96 in a sample containing African American mothers with a history of suicide attempt (Woods et al., 2013). In the study including U.S. older adults, Cronbach’s alpha for survival and coping beliefs was .94, responsibility to family was .86, child-related concerns was .78, fear of suicide was .78, fear of social disapproval was .81, and moral objections was .82 (Segal et al., 2012).
The English version was also translated to Malay in one study, and factor analysis confirmed six primary reasons for living (p < .001; Aishvarya et al., 2014). The Malaysian version was positively correlated with the Positive And Negative Suicide Ideation Inventory (PANSI-Positive), Rosenberg Self-Esteem Scale (RSE), Adult Trait Hope Scale (ATH), Provision of Social Relations (PSR), and Satisfaction with Life (SWL) (p < .001; Aishvarya et al., 2014). Further, in this study, the Malaysian version was negatively correlated with the Depression Anxiety Stress Scale (DASS), BHS, and PANSI-Negative (p < .001; Aishvarya et al., 2014). Internal consistency (Cronbach’s alpha) was .94 (Aishvarya et al., 2014).
Discussion
Approximately 90% of unplanned suicide attempts and 60% of planned first attempts occur within 1 year of the onset of suicidal ideation (American Psychiatric Association [APA], 2010). Thus, identifying suicide risk assessment instruments with the strongest psychometric properties is paramount in recognizing individuals at high risk of suicide. There is no universal set of strategies for detecting suicidal ideation; however, the WHO (2014) recommends assessment of emotional distress, early identification of mental disorders and alcohol misuse, and reduction in access to the most prevalent means. In contrast, U.S. National Guideline Clearinghouse (NGC, 2014) recommendations state that suicide risk assessment involves a clinical interview with subsequent administration of Beck’s Hopelessness, Suicidal Ideation, and Suicide Intent scales, the BDI, and the Hamilton Rating Scale for Depression.
U.S. suicide risk assessment recommendations are based on the following levels of evidence: C (i.e., studies rated as 2+, case control or cohort studies), D (i.e., evidence Level 3 or 4), Q (i.e., qualitative studies with appropriate quality), and Good Clinical Practice (i.e., based on clinical experience; NGC, 2014). Sample articles support a portion of these national guidelines, as the SIS, SSI, and BDI-II had the first, second, and fourth highest internal consistencies, respectively, and administration of these instruments is equally feasible. Moreover, guideline levels of evidence and sample levels are similar, as 7.8% and 92.2% of included studies represent evidence Levels 2 and 3, respectively.
Although there are conflicting theories regarding suicide, and several models were identified in this review, analysis of the sample population partially supports the IPTS. In studies that contained a majority of male participants (n = 11), over 36% of men were in the military and more than 27% were incarcerated, and it is likely these groups of men in particular had experienced or perpetrated traumatic events. Repeated provocative exposures, such as traumatic events, may create less fear of pain, injury, and death. Additionally, provocative exposures may potentiate feelings of low belongingness, particularly if the individual is removed from a familiar environment and placed in a combat or prison environment with others experiencing simultaneous stress.
Moreover, several sample participants indicated family members had attempted suicide or died by suicide. These psychologically painful experiences may also accelerate feelings of social isolation and decrease fear of death. Finally, over 30% of all participants sampled had made a previous suicide attempt, which is likely a conservative estimate because not all studies inquired about previous suicide attempts. Repeated self-injury supports gradual habituation toward increasingly lethal self-harm. This group of individuals, if expressing the desire to die by suicide, as endorsed by positive suicidal ideation on screening instruments, may increasingly develop the ability to die by suicide.
Although several countries have established recommendations for suicide risk screening instruments, current suicide risk assessment tools do not contain guidelines for mental health clinicians on how to tailor risk assessment for diverse patient populations, including populations that differ in sexual orientation, race or ethnicity, and religion. This is problematic because willingness to report suicidal behavior varies by age, sex, race or ethnicity, and religion (American Psychological Association, 2012; WHO, 2014). The IPTS suggests that individuals gradually become more vulnerable, and insensitive screening practices may precipitate feelings of isolation and of being misunderstood by others.
In the current sample, African American, Asian American, American Indian, Hispanic American, Latino(a) American, Pacific Islanders, Mestizos, Peruvian, Malay, Chinese, Korean, Japanese, Taiwanese, Pakistani, Australian, German, United Kingdom, Canadian, Norwegian, Dutch, Italian, Austrian, Finnish, Scottish, Arab, Iranian, and Indian populations were represented. Additionally, in one sample, transgender, homosexual, and bisexual populations were represented. Inclusion of diverse populations permitted preliminary evaluations of psychometric properties in these important populations.
However, although there was inclusion of diverse populations and non-English language versions (i.e., Chinese, Japanese, Korean, Urdu, Arabic, Iranian, Malay, Spanish, French, Dutch, and German), these diverse populations and language versions remain underrepresented. Further, some studies utilized validated non-English versions, whereas other studies used a translator to administer a translated version, and additional evaluations will more firmly establish psychometrics in these non-English versions.
The majority (68.6%) of studies used the English language version (n = 35). Of the 35 studies using the English version, 65.7% included U.S. populations (n = 23). This raises questions about usefulness and generalizability in clinical practice because many of the instruments were tested on nonrepresentative samples and have not been adequately tested in important subpopulations (APA, 2010). Generalizability may be further limited if the English version of an instrument is applied in a country where several languages are spoken and where cultural and religious perspectives are diverse. Thus, it is important to additionally test non-English versions of the instruments in studies that include more diverse populations, to validate psychometrics and improve generalizability.
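The proportions reported above can be checked with simple arithmetic. The short Python sketch below is illustrative only; it assumes a total of 51 included studies, a figure that is not stated in this section but is inferred here from 35 studies representing 68.6% of the sample.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Assumption: total inferred from 35 studies being 68.6% of the sample.
ASSUMED_TOTAL_STUDIES = 51
english_version_studies = 35   # reported in the text
us_population_studies = 23     # reported in the text

# Share of all studies that used the English version.
print(percent(english_version_studies, ASSUMED_TOTAL_STUDIES))  # 68.6
# Share of English-version studies that sampled U.S. populations.
print(percent(us_population_studies, english_version_studies))  # 65.7
```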
Comprehensive search strategies were employed in this review; however, the included suicide risk assessment instruments do not represent an exhaustive list. Although this work adds to the psychometric properties outlined in Brown’s (2001) review, includes additional screening instruments (i.e., INQ, ACSS, HDSQ-SS, BDI-FS, PHQ-9, TLC-PHQ-9, S-STS and Implicit Association Test), includes international populations, and integrates studies with varying methodologies, Brown provides a comprehensive summary of additional instruments. Particularly, one of the U.S. national guideline recommended instruments, the Hamilton Rating Scale for Depression, did not appear as a psychometrically tested suicide risk instrument during the search, but it is described in Brown’s (2001) review.
Another potential limitation involves the low levels of evidence that support the results discussed previously. A further obstacle is the heavy reliance on the individual’s self-report in determining the effectiveness of a suicide risk assessment instrument. Only one sample instrument, the Implicit Association Test, did not involve participants relaying thoughts of self-harm and instead utilized computer-based assessments. Perhaps there is consistent underreporting of suicidal ideation in those who have acquired the ability and strongly desire to die by suicide. The result is a set of practices and recommendations that are largely based on observational studies and validated primarily by participant self-report.
Conclusion
To confirm suicide risk assessment instruments’ psychometric properties and improve generalizability, more diverse population representation and additional representation of non-English versions in studies are required. Including underrepresented groups and non-English instruments will promote more culturally and linguistically sensitive suicidal ideation and suicidal behavior instruments that may better predict risk. Additional research on underrepresented groups may also reduce the feelings of isolation and burdensomeness experienced by those with suicide ideation, if these groups are equally represented and included. Addressing existing research gaps may reduce morbidity (i.e., perceived burdensomeness and thwarted belongingness) and mortality (i.e., the acquired ability for lethal self-harm). Finally, addressing these research gaps will be important in understanding the social, cultural, economic, and political context of suicide.
Footnotes
Appendix A: Literature Table
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
