Abstract
Disproportionalities in school discipline have become a popular research topic in recent decades, particularly at the school, district, and state levels. While a handful of reports use nationally representative data, this study takes a closer look at the factors affecting out-of-school suspensions in 41,339 K-12 U.S. schools. The public-use data file for the 2017-2018 school year used in this study was obtained from the Civil Rights Data Collection, conducted by the U.S. Department of Education's Office for Civil Rights. This study uses two machine learning methods, logistic regression and random forest, both applied as classifiers, to determine which factors raise the risk of disciplinary action such as out-of-school suspension. Our results indicate that random forest outperforms logistic regression in terms of classification accuracy. Moreover, both methods indicate that the number of counselors, retention rates, and the number of harassment and bullying allegations have significant predictive power in this classification problem.
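The comparison described above can be illustrated with a minimal sketch. It is not the study's code: it assumes scikit-learn, uses synthetic data in place of the Civil Rights Data Collection file, and the feature names and all model settings are hypothetical placeholders chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical names standing in for school-level predictors (assumption).
FEATURES = ["counselors", "retention_rate", "harassment_allegations", "enrollment"]

# Synthetic binary-outcome data; the real study uses CRDC 2017-2018 records.
X, y = make_classification(
    n_samples=2000, n_features=4, n_informative=3, n_redundant=1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Logistic regression on standardized inputs, so coefficients are comparable.
log_reg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
log_reg.fit(X_train, y_train)

# Random forest; the number of trees is an arbitrary illustrative choice.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

# Held-out classification accuracy for each method.
acc_lr = accuracy_score(y_test, log_reg.predict(X_test))
acc_rf = accuracy_score(y_test, forest.predict(X_test))
print(f"logistic regression accuracy: {acc_lr:.3f}")
print(f"random forest accuracy:       {acc_rf:.3f}")

# Two different views of predictor relevance: signed standardized coefficients
# versus impurity-based importances (non-negative, summing to 1).
coefs = log_reg.named_steps["logisticregression"].coef_[0]
importances = forest.feature_importances_
for name, coef, imp in zip(FEATURES, coefs, importances):
    print(f"{name:>24}: coef={coef:+.3f}  importance={imp:.3f}")
```

On real data, the two accuracy figures would be compared on the same held-out schools, and predictors that rank highly under both the coefficient and the importance view would be the ones both methods agree on.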