ABSTRACT
Background: Suicide risk prediction models frequently rely on structured electronic health record (EHR) data, including patient demographics and health care usage variables. Unstructured EHR data, such as clinical notes, may improve predictive accuracy by allowing access to detailed information that does not exist in structured data fields. To assess comparative benefits of including unstructured data, we developed a large case-control dataset matched on a state-of-the-art structured EHR suicide risk algorithm, utilized natural language processing (NLP) to derive a clinical note predictive model, and evaluated to what extent this model provided predictive accuracy over and above existing predictive thresholds.
Methods: We developed a matched case-control sample of Veterans Health Administration (VHA) patients in 2017 and 2018. Each case (all patients who died by suicide in that interval, n = 4,584) was matched with 5 controls (patients who remained alive during the treatment year) who shared the same suicide risk percentile. All sample EHR notes were selected and abstracted using NLP methods. We applied machine-learning classification algorithms to the NLP output to develop predictive models. We calculated area under the curve (AUC) and suicide risk concentration to evaluate predictive accuracy overall and for high-risk patients.
Results: The best performing NLP-derived models provided 19% overall additional predictive accuracy (AUC = 0.69; 95% CI, 0.67, 0.72) and 6-fold additional risk concentration for patients at the highest risk tier (top 0.1%), relative to the structured EHR model.
Conclusions: The NLP-supplemented predictive models provided considerable benefit when compared to conventional structured EHR models. Results support future structured and unstructured EHR risk model integrations.
J Clin Psychiatry 2023;84(4):22m14568
Author affiliations are listed at the end of this article.
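The two evaluation metrics named in the Methods, AUC and risk concentration, can be illustrated with a minimal sketch. This is not the authors' code: the function names `auc` and `risk_concentration` and the toy scores are hypothetical. Risk concentration is assumed here to mean the case rate within the top-scoring tier divided by the overall case rate, which is one common definition. In a matched case-control sample the overall case rate is fixed by the design (1 case per 5 controls), so values from such a sample describe the sample rather than population risk.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win. Pairwise comparison is O(cases * controls),
    which is fine for a toy example but not for a large sample.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def risk_concentration(
    scores: Sequence[float], labels: Sequence[int], top_fraction: float
) -> float:
    """Case rate in the top-scoring tier divided by the overall case rate.

    Assumed definition for illustration; the tier holds at least one patient.
    """
    n = len(scores)
    tier_size = max(1, round(n * top_fraction))
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    top_rate = sum(y for _, y in ranked[:tier_size]) / tier_size
    overall_rate = sum(labels) / n
    return top_rate / overall_rate


# Hypothetical toy sample: 2 cases and 10 controls (a 1:5 ratio).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01]
toy_labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

print(f"AUC: {auc(toy_scores, toy_labels):.2f}")
print(f"Risk concentration, top 25%: "
      f"{risk_concentration(toy_scores, toy_labels, 0.25):.1f}")
```

On the toy data, the two cases outscore 18 of the 20 case-control pairs, and the top quarter of scores holds one of the two cases among three patients, so the tier's case rate is twice the sample rate.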