Abstract

Offensive language is a significant detriment to social media environments. Existing research predominantly assumes monolingual expression, overlooking the prevalent practice of code-switching (CS). To address this gap, this study identifies and empirically validates the distinct stylometric characteristics of code-switched (CSed) offensive language. We also develop methods to construct the first social media dataset specifically for CSed offensive content. Our analysis of this dataset shows that CSed offensive language exhibits unique stylometric characteristics, and that these characteristics vary between the language segments involved in the CS. Furthermore, incorporating these features significantly improves the performance of offensive language detection models. These findings have research and practical implications for social media researchers, platforms, moderators, and users.
