Obsessive Compulsive and Related Disorders
Navigating Obsessive-Compulsive Complexity: Harnessing Deep Neural Networks for Predictive Modeling
Alixandra L. Wilens, M.A.
Clinical Research Assistant
Pediatric Anxiety Research Center, Bradley Hospital
Merrick, New York, United States
Gregory N. Muller, Ph.D.
Assistant Professor & Psychologist
University of Texas at Austin
Austin, Texas, United States
Joseph P.H. McNamara, Ph.D.
Division Chief
University of Florida
Gainesville, Florida, United States
Brian A. Zaboski, ABPP, Ph.D.
Assistant Professor
Yale University
Wallingford, Connecticut, United States
Obsessive-compulsive disorder (OCD) affects 2-4% of the United States population. Predicting obsessive-compulsive (OC) severity is challenging because the relationships between clinical variables and OC symptoms are heterogeneous and multidimensional. A growing body of literature has applied machine learning models to OC symptoms to account for these multivariate relationships, with some studies reporting prediction accuracy exceeding 80%. Building on this work, our goal was to predict OC severity from clinical and demographic variables with a deep neural network and to identify an optimal model for prediction. Our neural network is distinct in that it models both the categorical and continuous variables present in clinical data without losing information. Participants were U.S. adults recruited online. Following validity checks, the final sample comprised 229 participants (56% female) with a mean age of 32 years (SD = 5.8). All participants completed measures of OC symptoms, religiosity, spirituality, personality, and demographics (including education level, race, ethnicity, religion, and income). For data analysis, we tested a novel strategy: input data passed through an embedding layer in the neural network that transformed categorical variables into continuous representations, so no information was lost. We compared the predictive accuracy of the neural network with that of three competing models: a linear regression, a decision tree, and a random forest. As predicted, the neural network closely matched the linear regression in root mean squared error (11.31 vs. 10.99, respectively) while still modeling nonlinear relationships and accounting for categorical variables without data loss. Moreover, the neural network outperformed the decision tree and the random forest. The neural network was therefore the best-performing model among those that do not assume linearity between input and output variables.
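The model comparison above rests on root mean squared error (RMSE). As a minimal sketch of how such a comparison can be scored, the Python below computes RMSE for two sets of predictions and picks the lower one. The severity scores, the predictions, and the names `model_a` and `model_b` are hypothetical placeholders, not study data or the study's models.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observed severity scores and predictions (not study data).
observed = [10.0, 20.0, 30.0, 40.0]
predictions = {
    "model_a": [12.0, 18.0, 33.0, 39.0],
    "model_b": [15.0, 25.0, 25.0, 45.0],
}

# Score every model on the same held-out observations; lower RMSE is better.
scores = {name: rmse(observed, preds) for name, preds in predictions.items()}
best = min(scores, key=scores.get)
```

Because RMSE is expressed in the units of the outcome measure, a gap such as 11.31 versus 10.99 can be read directly as a difference in typical prediction error on the severity scale.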
Additionally, the novel inclusion of an embedding layer for categorical variables increased the neural network's predictive power beyond that of the decision tree and random forest. Overall, our study underscores the value of non-linear models in OC prediction and encourages a reevaluation of conventional analytical approaches when prediction is central to the research question. We hope that this work helps demonstrate the importance of deep learning for predictive models that support more actionable clinical decision-making.
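To illustrate the general idea of an embedding layer for categorical variables, the sketch below maps each category to a small dense vector and concatenates it with the continuous features to form one network input. It is not the study's implementation: the category labels, embedding sizes, and the names `EmbeddingTable` and `build_input` are assumptions made for illustration, and the vectors are randomly initialised here, whereas in a trained network they would be learned along with the other weights.

```python
import random


class EmbeddingTable:
    """Lookup table from category label to a dense vector.

    Vectors are random placeholders; a real network would learn them.
    """

    def __init__(self, categories, dim, seed=0):
        rng = random.Random(seed)
        self.index = {label: i for i, label in enumerate(categories)}
        self.vectors = [
            [rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in categories
        ]

    def lookup(self, label):
        return self.vectors[self.index[label]]


def build_input(continuous, categorical, tables):
    """Concatenate continuous features with each categorical embedding."""
    row = list(continuous)
    for name, label in categorical.items():
        row.extend(tables[name].lookup(label))
    return row


# Hypothetical categorical variables and embedding sizes.
tables = {
    "education": EmbeddingTable(["high_school", "bachelors", "graduate"], dim=2, seed=1),
    "religion": EmbeddingTable(["none", "christian", "other"], dim=3, seed=2),
}

# One hypothetical participant: two continuous values plus two categories.
features = build_input(
    continuous=[32.0, 0.5],
    categorical={"education": "bachelors", "religion": "none"},
    tables=tables,
)
```

Unlike collapsing categories into a single ordinal code, each category here keeps its own vector, so the downstream layers receive a continuous representation without merging or discarding levels.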