Multi-Class Detection of Abusive Language Using Automated Machine Learning
Jorgensen Mackenzie, Choi Minho, Niemann Marco, Brunk Jens, Becker Jörg
Abstract
Detecting abusive language online is a daunting task for moderators. We propose Automated Machine Learning (Auto-ML) to semi-automate abusive language detection and to assist moderators. In this paper, we show that multi-class classification powered by Auto-ML detects abusive language in English and German as well as or better than state-of-the-art machine learning models. We also highlight how we combated the imbalanced data problem in our datasets through feature selection and undersampling methods. We propose Auto-ML as a promising approach to abusive language detection, especially for small companies that may have little machine learning knowledge and limited computing resources.
Keywords
Abusive Language Detection, Automated Machine Learning, Multi-Class Classification