Multi-Class Detection of Abusive Language Using Automated Machine Learning

Jorgensen Mackenzie, Choi Minho, Niemann Marco, Brunk Jens, Becker Jörg


Abstract

Detecting abusive language online is a daunting task for moderators. We propose Automated Machine Learning (Auto-ML) to semi-automate abusive language detection and to assist moderators. In this paper, we show that multi-class classification powered by Auto-ML detects abusive language in English and German as well as, or better than, state-of-the-art machine learning models. We also highlight how we combatted the imbalanced data problem in our datasets through feature selection and undersampling methods. We propose Auto-ML as a promising approach to abusive language detection, especially for small companies that may have little machine learning knowledge and limited computing resources.

Keywords
Abusive Language Detection, Automated Machine Learning, Multi-Class Classification



Publication type
Research article in digital collection (conference)

Peer reviewed
Yes

Publication status
Published

Year
2020

Conference
15. Internationale Tagung Wirtschaftsinformatik (WI 2020)

Venue
Potsdam

Language
English
