Doctoral Candidate, Seattle Pacific University, Seattle, Washington, United States
Abstract Text: ChatGPT, a large language model created by OpenAI, has the potential to contribute to the mental health field through its ability to generate sophisticated natural-language responses. The use of ChatGPT in the mental health sphere has raised ethical concerns regarding data privacy, bias, and overreliance on technology. Despite these challenges, ChatGPT's advanced capabilities could transform access to care by reducing clinician burden. One such capability is the assessment of self-directed violence (SDV), which encompasses both suicidal and non-suicidal behavior. Accurate classification is crucial because the two types of SDV call for different treatment and hospitalization-related decisions, and research suggests that half of the people who die by suicide make a health care visit within one month of their death (Ribeiro et al., 2017), underscoring the importance of accurate assessment at the point of care. The use of ChatGPT at the assessment stage may aid clinicians in identifying high-risk clients, allowing for rapid and appropriate evidence-based intervention. However, research on ChatGPT's clinical capabilities is limited; extant work has evaluated ChatGPT's ability to assess suicide risk (Levkovich & Elyoseph, 2023), but no study has evaluated its ability to distinguish between different types of SDV. The aims of this exploratory study were to (a) determine to what extent ChatGPT classifies acts of SDV correctly according to the Self-Directed Violence Classification System (SDVCS) and (b) compare its performance to that of human samples. ChatGPT (version 3.5) was tasked with classifying 13 clinical vignettes describing different acts of SDV and providing a 2-4 sentence rationale for each answer. Each vignette described one of the following: a suicide attempt, a suicide attempt interrupted by self or other, suicidal ideation, non-suicidal SDV, or undetermined SDV (not enough information). We compared ChatGPT's classification accuracy to the performance of four human samples reported by Cwik and Teismann (2017). The answer rationales provided by ChatGPT were coded using deductive content analysis. Criteria were established for all SDVCS categories using the official definitions provided by Crosby et al. (2011); the number of criteria per category ranged from one (suicidal ideation) to five (suicide attempt interrupted by self or other). Vignette response rationales were then systematically coded against these criteria. Results indicated that ChatGPT classified 10 of the 13 vignettes correctly (76.9% accuracy), compared with laypeople (54.5%), undergraduate psychology students (60.4%), psychologists-in-training (68.1%), and licensed psychotherapists (66.7%). ChatGPT's rationales for correct answers met all necessary criteria according to SDVCS definitions. Of the three incorrectly classified vignettes, two responses missed criteria related to identifying the presence or absence of suicidal intent. ChatGPT's higher accuracy rate relative to the human samples indicates that it may be able to augment mental health services, possibly at the initial assessment stage. However, the incorrectly classified vignettes highlight weaknesses in ChatGPT's abilities, suggesting its potential use as a supplement to aid clinicians' informed decision making. Future research plans include assessing clinician trust in ChatGPT-supported SDV risk assessment.
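The abstract does not specify how vignettes were submitted to ChatGPT (the study appears to have used the ChatGPT 3.5 interface directly). As a purely illustrative sketch of how such a classification-and-scoring workflow could be reproduced programmatically, the Python snippet below uses the OpenAI chat completions API with gpt-3.5-turbo as a stand-in for ChatGPT 3.5. The vignette text, gold labels, and prompt wording are placeholders, not the study's actual materials.

```python
# Illustrative sketch only: classify SDV vignettes with a GPT-3.5-class model
# via the OpenAI Python SDK (v1.x) and score accuracy against SDVCS gold labels.
# Vignettes, labels, and prompt wording below are placeholders, not study materials.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

SDVCS_CATEGORIES = [
    "suicide attempt",
    "suicide attempt interrupted by self or other",
    "suicidal ideation",
    "non-suicidal self-directed violence",
    "undetermined self-directed violence (not enough information)",
]

# Placeholder data: (vignette text, gold SDVCS label) pairs.
vignettes = [
    ("Vignette text describing an act of self-directed violence...", "suicide attempt"),
    # ... remaining vignettes would be listed here ...
]

def classify(vignette_text: str) -> str:
    """Ask the model for an SDVCS category plus a 2-4 sentence rationale."""
    prompt = (
        "Classify the following clinical vignette using exactly one of these "
        "Self-Directed Violence Classification System categories: "
        f"{', '.join(SDVCS_CATEGORIES)}. "
        "State the category on the first line, then give a 2-4 sentence rationale.\n\n"
        f"Vignette: {vignette_text}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content

correct = 0
for text, gold_label in vignettes:
    answer = classify(text)
    predicted = answer.splitlines()[0].strip().lower()
    correct += int(predicted == gold_label.lower())

accuracy = correct / len(vignettes)  # e.g., 10 of 13 correct -> 76.9%
print(f"Accuracy: {accuracy:.1%}")
```

Under this hypothetical setup, accuracy is simply the proportion of vignettes whose first-line category matches the SDVCS gold label; the rationale portion of each response would still be coded manually against the Crosby et al. (2011) criteria, as described above.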