Abstract
Objective
People with diabetes have a higher risk of suicidal behaviors than the general population. However, few studies have focused on understanding this relationship. We investigated risk factors and predicted suicide attempts in people with diabetes using Least Absolute Shrinkage and Selection Operator (LASSO) regression.
Method
Data were retrieved from Cerner Real-World Data and included over 3 million patients with diabetes. LASSO regression was applied to identify associated factors. Gender-, diabetes type-, and depression-specific LASSO regression models were analyzed.
Results
There were 7764 subjects diagnosed with suicide attempts, with an average age of 45. Risk factors for suicide attempts in diabetes patients were American Indian or Alaska Native race (β = 0.637) and receiving atypical antipsychotic agents (β = 0.704), benzodiazepines (β = 0.784), or antihistamines (β = 0.528). Amyotrophy was negatively associated with suicide attempts in males (β = −2.025); in contrast, amyotrophy significantly increased the risk in females (β = 3.339). Using an MAOI was negatively related to suicide attempts in T1DM patients (β = −7.304). Age less than 20 was positively associated with suicide attempts in depressed (β = 2.093) and non-depressed patients (β = 1.497). The LASSO model achieved a 94.4% AUC and 87.4% F1 score.
Conclusions
To our knowledge, this is the first study to use LASSO regression to identify risk factors for suicide attempts in patients with diabetes. The shrinkage technique successfully reduced the number of variables in the model to improve the fit. Further research is needed to determine cause-and-effect relationships. The results may help providers identify groups at high risk of suicide attempts among patients with diabetes.
Introduction
Diabetes mellitus (DM) is a chronic non-communicable disease characterized by high blood sugar. People with diabetes have higher rates of depression, anxiety, and severe psychological distress than non-diabetic people.1-4 In 2020, the National Survey on Drug Use and Health reported that 1.2 million Americans aged 18 and older, or 0.5% of the US population, attempted suicide in the past year.5 Research has shown that people with type 1 diabetes mellitus (T1DM) are three to four times more likely to attempt suicide than the general population.5 Similarly, people with newly diagnosed type 2 diabetes mellitus (T2DM) have a rate of suicide attempts two times greater than the general population.6 Suicide attempts are significantly associated with the diagnosis of depressive disorders, which are prevalent in people with diabetes.7,8 Studies have found that 38.2% of adults with T2DM were diagnosed with major depressive disorder, and the prevalence of suicide attempts among these adults was significantly higher (21.8%) than that of the general adult population (0.5%).8
While suicide attempts significantly impact people with diabetes, only a few studies have focused on understanding the relationship between suicide attempts and diabetes.9 Previous studies applied logistic regression analysis to identify factors that increased the risk of attempted suicide among people with diabetes, such as female sex,5,10 younger age,9 normal body mass index (BMI),9 depression,5,10,11 a history of childhood abuse,5 a history of alcohol abuse,5 and higher insulin levels in cerebrospinal fluid.12
Comorbid depression is widely acknowledged as a strong risk factor for suicide attempts in people with diabetes (HR = 5.64; 95% CI 4.70–6.77).13-18 However, logistic regression analysis has several limitations. First, the number of observations should exceed the number of features. Second, logistic regression may not work well when many variables with complex relationships are considered in the model, as is common with medical data, because a high number of variables can lead to overfitting.19 LASSO (Least Absolute Shrinkage and Selection Operator) is a regularization technique used to reduce overfitting in a logistic regression model. It works by adding a penalty term, scaled by a tuning parameter (λ), to the negative log-likelihood function, which shrinks the estimates of the coefficients. This process allows LASSO to act as a feature selection method, effectively setting the coefficients of variables that contribute least to the model to zero.20-22
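The shrinkage behavior described above can be illustrated with the soft-thresholding operator, which is the update that coordinate-descent LASSO solvers apply to each coefficient in the simplified case of a single standardized predictor. The sketch below is illustrative only: the coefficient names and values and the choice of λ are hypothetical and are not taken from this study.

```python
def soft_threshold(z: float, lam: float) -> float:
    """Shrink z toward zero by lam; return exactly 0.0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalized coefficient estimates (illustrative values only).
raw = {
    "strong_predictor": 1.20,
    "weak_predictor": 0.05,
    "protective_factor": -0.80,
}
lam = 0.10  # hypothetical penalty strength

# Every coefficient moves toward zero; the weak one is removed entirely.
shrunk = {name: soft_threshold(beta, lam) for name, beta in raw.items()}

for name, beta in shrunk.items():
    status = "dropped" if beta == 0.0 else "kept"
    print(f"{name}: {beta:+.2f} ({status})")
```

Coefficients whose magnitude falls below λ are set to exactly zero, which is why LASSO performs variable selection in addition to shrinkage.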
The aim of this study was to identify factors associated with attempted suicide in individuals with diabetes. To achieve this goal, we first conducted a literature review to identify potential risk factors for inclusion as variables in our study. We then used multivariate and LASSO regression analyses to determine which factors were statistically significant in predicting suicide attempts among people with diabetes, using data from the Cerner Real-World Data™ (CRWD). Our long-term objective is to develop a clinical decision support system for healthcare providers to identify patients with diabetes who are at high risk of attempting suicide. This system would aid in providing suicide prevention measures during outpatient clinic visits.
Method
Risk factor identification via literature review
We conducted a literature review to identify potential risk factors for suicide attempts in people with diabetes and created a list of related data elements. The first objective of the literature review was to identify risk factors for suicide attempts among people with diabetes (Figure 1). We searched three databases (PubMed, Scopus, and PsycINFO) using the MeSH terms “suicide, attempted,” “diabetes mellitus, type 1,” “diabetes mellitus, type 2,” and “risk factors.” The results were limited to articles published up to 31 December 2021. Because few studies focused on identifying risk factors for suicide attempts in diabetes, we expanded the search by using “suicide” instead of “suicide, attempted,” based on the MeSH hierarchy. Articles were included if they met the following criteria: (1) observational studies, including cross-sectional, case-control, and cohort studies; (2) suicidal behaviors diagnosed by standardized suicidal behavior assessment tools; (3) reporting the association between T1DM or T2DM and suicide; (4) reporting specific risk factors for suicide comorbid with T1DM or T2DM; (5) subjects diagnosed with T1DM or T2DM at baseline; (6) published in English. We excluded articles if they met the following criteria: (1) studies reporting risk factors for neither suicide nor diabetes (T1DM and T2DM); (2) unavailable full texts; (3) reviews, case reports, case series, letters to the editor, and abstract publications; (4) gestational diabetes. We excluded gestational diabetes because its pathophysiology is different from that of T1DM and T2DM,23 and because the physiology of pregnant women changes throughout the pregnancy.24
Figure 1. Flow diagram of the literature review to identify risk factors for suicidal behavior among people with diabetes.
The second objective of the literature review was to identify risk factors for depression among people with diabetes (Figure 2). We found that previous studies focusing on the relationship between suicide attempts and diabetes were limited, and that depression is a major risk factor for suicide attempts among people with diabetes. Therefore, we considered possible risk factors from studies of both the relationship between suicide attempts and diabetes and the relationship between depression and diabetes. We searched three databases (PubMed, Scopus, and PsycINFO) using the MeSH terms “depression,” “diabetes mellitus, type 1,” “diabetes mellitus, type 2,” and “risk factors.” The results were limited to articles published up to 31 December 2021. Last, we combined the results from both literature reviews and created a list of candidate risk factors.
Figure 2. Flow diagram of the literature review to identify risk factors for depression among people with diabetes.
Data collection
We extracted data elements from Cerner Real-World Data (CRWD) from 2010 to 2020 using the International Classification of Diseases, Ninth/Tenth Revision, Clinical Modification (ICD-9/10-CM). CRWD is a large data source consisting of approximately 100 million de-identified patients and encompasses 1.5 billion HIPAA-compliant patient encounters from 117 hospitals’ electronic health records (EHR) across the United States. CRWD contains longitudinal data, including demographics, encounters, clinical events, conditions, immunization, labs, measurements, medications, order lists, procedures, and results. Patient diagnoses are coded using systems including ICD-9/10-CM and Systematized Nomenclature of Medicine - Clinical Terms (SNOMED).25 A quantitative analysis was performed to determine the prevalence and risk factors of suicide attempts among diabetes patients. We limited the data to 2010 through 2020 to control data quality because we found that data before 2010 had incomplete information and timestamp errors. To develop our dataset, we extracted CRWD data through HealtheDataLab™, a data science ecosystem built and deployed on Amazon Web Services.26 HealtheDataLab™ uses Jupyter Notebook27 as a web-based interactive interface for analysis and offers open-source tools, such as Structured Query Language (SQL), Python, and R in the Apache Spark environment.28
Statistical analysis
We determined the minimum number of study participants needed to achieve sufficient statistical power. Our calculations were based on an estimated population prevalence of 7.6%, a prevalence of 0.2% in the study group, a significance level (alpha) of 0.05, a type II error rate (beta) of 0.05, and a desired power of 0.95. The analysis indicated that a minimum of 64 participants was required for the study. To identify potential risk factors, we analyzed our dataset using the R programming language, version 4.0.2, on the HealtheDataLab™ platform. We employed Chi-square tests with 95% confidence intervals to assess the association between different variables and suicide attempts among patients with diabetes. Variables with more than 50% missing values were removed from the analysis.29 For the remaining variables, missing values were imputed using the Multivariate Imputation by Chained Equations (MICE) method.30 The dataset was divided into two sets, with 90% used for training and 10% for testing. Random Over-Sampling Examples (ROSE), a bootstrap-based technique, was used to address class imbalance in the training dataset, i.e., the unequal distribution of the suicide attempt class and the non-suicide attempt class. Using an imbalanced dataset may deprive the model of necessary information from the minority class, resulting in biased prediction and misleading accuracy. ROSE over-samples the minority class (suicide attempts) by randomly selecting and duplicating instances until the class distribution is balanced. This process creates a new training dataset with equal numbers of instances for both classes, allowing the classifier to learn both classes equally.31
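The over-sampling step described above can be sketched in a few lines. This is a minimal illustration of random duplication of minority-class records, written in Python rather than R; it is not the ROSE package itself, and the function name `oversample_minority`, the `attempt` label key, and the toy records are all hypothetical.

```python
import random


def oversample_minority(rows, label_key="attempt", seed=0):
    """Duplicate randomly chosen minority-class rows until both classes are equal in size."""
    rng = random.Random(seed)
    positives = [r for r in rows if r[label_key] == 1]
    negatives = [r for r in rows if r[label_key] == 0]
    if len(positives) < len(negatives):
        minority, majority = positives, negatives
    else:
        minority, majority = negatives, positives
    if not minority:
        raise ValueError("minority class is empty; nothing to sample from")
    # Sample with replacement from the minority class to close the gap.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    balanced = majority + minority + extra
    rng.shuffle(balanced)
    return balanced


# Hypothetical training split: 5 records with an attempt, 95 without.
train = [{"id": i, "attempt": 1 if i < 5 else 0} for i in range(100)]
balanced = oversample_minority(train)

n_pos = sum(r["attempt"] for r in balanced)
print(len(balanced), n_pos, len(balanced) - n_pos)
```

Resampling is applied to the training split only, so the held-out test set keeps the original class distribution.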
To improve performance and determine the most suitable model for our study, we developed two predictive models. The first was a logistic regression model fitted using the “glmnet” package and the “glm” function. This model was used to determine risk factors by analyzing the association between variables and suicide attempts among patients with diabetes and estimating adjusted odds ratios (aOR) with 95% confidence intervals (CI). The model was then evaluated for prediction using the test dataset.
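For readers less familiar with how a logistic regression coefficient maps to an adjusted odds ratio, the conversion can be sketched as follows. This assumes a Wald-type interval, exp(β ± 1.96 × SE), which is a common convention but is not stated in the text; the function name and the input values are hypothetical.

```python
import math


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.96):
    """Return (aOR, lower, upper) using a Wald interval on the log-odds scale."""
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


# Hypothetical coefficient and standard error (not values from this study).
aor, lower, upper = odds_ratio_with_ci(beta=math.log(2), se=0.10)
print(f"aOR = {aor:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 corresponds to a coefficient that differs from zero at the 5% significance level under this approximation.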
The second model was a LASSO regression model, which was fitted using the “cv.glmnet” function. This model used feature selection to reduce the overfitting caused by a high number of variables. LASSO regularization was applied by adding a penalty term, scaled by a tuning parameter (λ), to the negative log-likelihood function, which sets the coefficients of variables that contribute least to the model to zero. Ten-fold cross-validation was used to select the largest λ within one standard error of the minimum binomial deviance. The selected λ was 0.00079, which was used in the final model to determine risk factors and their coefficients.
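The one-standard-error rule described above, which cv.glmnet reports as `lambda.1se`, can be sketched as follows. The sketch is in Python for illustration and takes the cross-validation summary as plain lists; the function name and all numeric values are hypothetical and are not the study's cross-validation results.

```python
def lambda_one_se(lambdas, cv_mean, cv_se):
    """Pick the largest lambda whose mean CV deviance is within one SE of the minimum."""
    # Index of the lambda with the lowest mean cross-validated deviance.
    i_min = min(range(len(cv_mean)), key=cv_mean.__getitem__)
    threshold = cv_mean[i_min] + cv_se[i_min]
    eligible = [lam for lam, dev in zip(lambdas, cv_mean) if dev <= threshold]
    return max(eligible)


# Hypothetical cross-validation summary over a small lambda grid.
lambdas = [0.1, 0.01, 0.001, 0.0001]
cv_mean = [0.90, 0.62, 0.60, 0.61]   # mean binomial deviance per lambda
cv_se = [0.02, 0.03, 0.03, 0.03]     # standard error of that mean

chosen = lambda_one_se(lambdas, cv_mean, cv_se)
print(chosen)
```

Choosing the largest eligible λ favors a sparser model whose cross-validated deviance is statistically indistinguishable from the best one.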
To interpret the results from the LASSO regression model, the magnitude of the coefficients (
In order to further understand the potential risk factors for suicide attempts among patients with diabetes, we conducted subgroup analyses specific to gender and diabetes type. Some studies have suggested that women may have a higher risk of suicide attempts than men due to a higher prevalence of certain mental disorders such as anxiety, depression, and eating disorders. 32 However, other studies have presented contradictory findings. 32 Similarly, there is conflicting evidence on whether patients with T1DM or T2DM have a higher risk of suicide. 33 To address these inconsistencies, we utilized LASSO regression analysis to analyze the differences between gender, diabetes type, and depression subgroups.
Results
Patients’ characteristics
After querying data using ICD-9/10 and text searching, we identified 52,217,517 unique patients in the CRWD. A total of 3,266,856 individual diabetes patients were included in our dataset from 01-01-2010 to 31-12-2020. Of 3,266,856 patients diagnosed with diabetes, 7764 (0.2%) were diagnosed with suicide attempts. For the non-diabetic group, 80,035 of 42,324,121 patients (0.2%) were diagnosed with suicide attempts. The average age of diabetes patients with suicide attempts was 45 years (SD = 16; range 8–88), and 47% were aged 41–60. Approximately half of them were female (57%), had never married (52%), and had a BMI ≥30 kg/m2 (54%). Most of them were White (74%) and non-Hispanic (74%). The prevalence of diabetes patients with suicide attempts who had a family history of psychiatric illness was 9%. Fewer than 5% of them had experienced the death of a family member or divorce, or had a history of childhood abuse.
LASSO regression analysis
Risk factors for suicide attempts in diabetes patients comparing between models.
Gender-specific analysis
Coefficients of gender-, type of diabetes-, and depression-specific LASSO regression analysis.
Diabetes type-specific analysis
T2DM-specific LASSO regression model had 87.9% accuracy and 94.4% AUC. Depression had the highest coefficient (
Depression-specific analysis
The LASSO regression model for depressed diabetes patients had 44.0% accuracy and 86.1% AUC. Age ≤20 years had the highest coefficient (
Discussion
To our knowledge, this is the first study to use LASSO regression analysis, with the largest dataset to date, to identify factors associated with suicide attempts in people with diabetes. We conducted a literature review to identify candidate risk factors as variables. Then, we performed multivariate and LASSO regression analyses to determine statistically significant risk factors for suicide attempts among people with diabetes using the dataset from the Cerner Real-World Data™. We compared the performance of the multivariate regression model and the LASSO regression model. The accuracy and AUC of the two models were similar. However, the multivariate model had low precision (1.7%) and a low F1 score (3.3%). The LASSO model had 87.8% precision, reflecting a low false positive rate, and achieved an 87.4% F1 score, the harmonic mean of precision and recall. Overall, we found that the LASSO model performed better than the multivariate regression model, which suggests that LASSO was a better approach for this study. After analyzing data using LASSO regression, we discovered several known and previously unreported risk factors for suicide attempts in people with diabetes. We also identified risk factors for suicide attempts among each subgroup: males with diabetes, females with diabetes, T1DM, T2DM, people with diabetes and depression, and people with diabetes without depression.
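The gap between similar accuracy and very different precision can be made concrete by computing precision, recall, and F1 (the harmonic mean of precision and recall) from confusion-matrix counts. The sketch below is illustrative only: the function name is hypothetical and the counts are invented to show the effect of class imbalance, not taken from the study's confusion matrices.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Compute precision, recall, and F1 from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical rare-outcome test set: most flagged patients are false positives.
p_rare, r_rare, f1_rare = precision_recall_f1(tp=8, fp=492, fn=2)

# Hypothetical balanced case: errors are spread evenly.
p_bal, r_bal, f1_bal = precision_recall_f1(tp=80, fp=20, fn=20)

print(f"rare outcome: precision={p_rare:.3f} recall={r_rare:.3f} F1={f1_rare:.3f}")
print(f"balanced:     precision={p_bal:.3f} recall={r_bal:.3f} F1={f1_bal:.3f}")
```

When the outcome is rare, even a small false positive rate yields many more false positives than true positives, so precision and F1 can be low while accuracy and AUC remain high.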
Significant findings
We discovered additional risk factors for suicide attempts among people with diabetes that were not considered as known risk factors in the literature, such as being an American Indian or Alaska Native (
Amyotrophy and gender
Diabetic amyotrophy is a rare diabetes complication (<1%) characterized by weakness and wasting of the muscles in the legs and hips, as well as pain and numbness in these areas.40 Our study found that amyotrophy had a negative association with suicide attempts in males with diabetes. In contrast, it had a positive association with suicide attempts in females with diabetes. To our knowledge, no study has identified the relationship between diabetic amyotrophy and suicide. Erlangsen et al. studied more than 7 million individuals to determine the relationship between neurological disorders and suicide. The study reported that individuals diagnosed with neurological disorders had a 1.8 times higher incidence rate of suicide than those without neurological disorders, and the incidence rate increased to 4.9 times for individuals with amyotrophic lateral sclerosis.41
MAOI and diabetes
When comparing T1DM and T2DM subgroups, we found that MAOI use was negatively associated with suicide attempts in T1DM patients. Emory and Mizrahi studied MAOI in patients with T1DM and depression.42 They explained that MAOI in T1DM patients increases dopamine levels, insulin signaling, and catecholamine levels, which leads to reduced HbA1c and blood sugar. MAOI also increases serotonin levels, which in turn improves depression symptoms. This suggests that T1DM patients who receive MAOI may have a decreased risk of suicide attempts.
Age and depression
Depression is a known major risk factor for suicide.13-18 Therefore, we compared depressed and non-depressed diabetes patients with suicide attempts. We found that age less than 20 had the strongest association with suicide attempts in both groups. Gómez-Peralta et al. found that younger patients with diabetes had three to four times increased odds of attempted suicide over the older population (44.75 ± 14.01, p = 0.001).9 Robinson et al. also reported that patients with diabetes aged between 15 and 25 had a three-fold increase in attempted suicide compared to non-diabetic people.43 These findings may be explained by the challenges of managing diabetes at a younger age. In addition, factors such as feeling like a burden to family members or caregivers, the daily requirement for insulin injections, blood glucose monitoring, and dietary restrictions may contribute to suicidal behaviors.43-45 However, further research is needed to explore the role of age in suicidal behaviors in people with diabetes.
Limitations
Our study has some limitations that should be noted. Firstly, as a cross-sectional study, our findings cannot establish a cause-and-effect relationship between risk factors and suicide attempts among those with diabetes. Secondly, the majority of participants were White and non-Hispanic, which means that the findings may have limited generalizability. Thirdly, electronic health records were used as the data source, which may contain incomplete, inconsistent, or inaccurate information. However, we controlled data quality by removing risk factors with high amounts of missing or outlier values. Finally, the proportion of suicide attempts among diabetes patients in our study was lower than what has been reported in other studies. However, the sample size of our study has sufficient power (>95%) to detect a significant difference between diabetes patients with and without suicide attempts.
Conclusion
To summarize, our study aimed to identify the factors associated with suicide attempts in people with diabetes by analyzing a large dataset using LASSO regression. We uncovered several associated factors, including new findings such as being an American Indian or Alaska Native and receiving antihistamines, atypical antipsychotic agents, or benzodiazepines. Additionally, amyotrophy was negatively associated with suicide attempts in males with diabetes and positively associated in females with diabetes. Furthermore, our study found that MAOI use was negatively associated with suicide attempts in T1DM patients. Lastly, age less than 20 had the strongest association with suicide attempts in patients both with and without depression. Our findings can be utilized by healthcare providers as clinical decision support to identify high-risk groups for suicide attempts among diabetes patients and provide necessary preventive measures during outpatient clinic visits. However, it should be noted that our study was cross-sectional and could not establish causality. Further research in the form of cohort studies is needed to study cause-and-effect relationships between risk factors and suicide attempts in diabetes patients.
Footnotes
Authors’ Contributions
Ploypun Narindrarangkura: Conceptualization, Methodology, Software, Data curation, Writing- Original draft preparation, Visualization, Investigation. Ploypun Narindrarangkura: Supervision. Patricia E. Alafaireet, Uzma Khan, Kim Min Soon: Writing- Reviewing and Editing.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Agency for Healthcare Research and Quality [R21HS028032]; and the National Institute of Diabetes and Digestive and Kidney Diseases [P30DK092950].
