Symposia
Technology/Digital Health
Shannon Wiltsey Stirman, Ph.D.
Professor
National Center for PTSD and Stanford University
Menlo Park, California, United States
Elizabeth Stade, PhD (she/her/hers)
Postdoctoral Fellow
Stanford University
Palo Alto, California, United States
Sohayla Elhusseini, B.A.
Student
University of Kentucky
San Carlos, California, United States
Stefanie T. LoSavio, ABPP (she/her/hers)
Assistant Professor
University of Texas Health Science Center at San Antonio
San Antonio, Texas, United States
Bailee Schuhmann, PhD (she/her/hers)
Postdoctoral Fellow
University of Texas Health Science Center at San Antonio
San Antonio, Texas, United States
Shashanka Subrahmanya, M.S. (he/him/his)
Computer Scientist
Stanford University
Palo Alto, California, United States
Johannes Eichstaedt, PhD (he/him/his)
Assistant Professor
Stanford University
Palo Alto, California, United States
Katherine Dondanville, ABPP, Psy.D. (she/her/hers)
Associate Professor
The University of Texas Health Science Center at San Antonio
San Antonio, Texas, United States
Research has demonstrated a link between fidelity to cognitive processing therapy (CPT) and PTSD treatment outcomes. However, evidence suggests that fidelity may decrease over time without ongoing support. Assessing fidelity by reviewing session recordings requires substantial time and training and is not feasible in most clinical settings. Advances in natural language processing (NLP) have led to reliable automated coding of some treatment modalities, paving the way for more scalable fidelity assessment. However, this approach requires intensive training of NLP models on hundreds, if not thousands, of fidelity-coded therapy sessions, and patients are often unaware that their session recordings are used to train models that may ultimately have commercial applications. With the advent of generative artificial intelligence applications built on large language models (LLMs), such as GPT-4, it has become possible to instruct such applications to code sessions through prompt engineering. In this presentation, we describe the process of developing and validating LLM-based fidelity assessment in an ongoing study of asynchronous messaging-based CPT for PTSD. Eighty transcripts were coded by an LLM, and two modes of comparison were used to assess model reliability. Preliminary findings demonstrate good reliability and agreement with human coding for treatment adherence. We will also describe ongoing work to assess competence and to support improved treatment fidelity, as well as additional generative AI approaches to supporting therapist fidelity. Best practices and implications for implementing evidence-based treatments in routine care settings will also be discussed.