Multi-Class Detection of Abusive Language Using Automated Machine Learning

Jorgensen, Mackenzie; Choi, Minho; Niemann, Marco; Brunk, Jens; Becker, Jörg

Abstract

Detecting abusive language online is a daunting task for moderators. We propose Automated Machine Learning (Auto-ML) to semi-automate abusive language detection and to assist moderators. In this paper, we show that multi-class classification powered by Auto-ML detects abusive language in English and German as well as or better than state-of-the-art machine learning models. We also highlight how we addressed the imbalanced data problem in our datasets through feature selection and undersampling methods. We propose Auto-ML as a promising approach to abusive language detection, especially for small companies that may have little machine learning knowledge and limited computing resources.

Keywords
Abusive Language Detection, Automated Machine Learning, Multi-Class Classification

Publication type

Research article in digital collection (conference)

Peer reviewed

Yes

Publication status

Published

Year

2020

Conference

15. Internationale Tagung Wirtschaftsinformatik (WI 2020)

Venue

Potsdam

Language

English

DOI

https://doi.org/10.30844/wi_2020_r7-jorgensen

Full text

R7_Jorgensen-Multi-Class_Detection_of_Abusive_Language_Using_Automated_Machine_Learning-248_c.pdf