
Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection

Wenliang Dai, Tiezheng Yu, Zihan Liu, Pascale Fung


Abstract
Offensive content on social media has become a serious problem, and automatically detecting offensive language is an essential task. In this paper, we build an offensive language detection system that combines multi-task learning with BERT-based models. Using a pre-trained language model such as BERT, we can effectively learn representations for noisy social media text. In addition, to boost the performance of offensive language detection, we leverage supervision signals from other related tasks. In the OffensEval-2020 competition, our model achieves a 91.51% F1 score in English Sub-task A, which is comparable to the first-place result (92.23% F1). An empirical analysis is provided to explain the effectiveness of our approaches.
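As a rough illustration of the multi-task idea in the abstract, the sketch below combines per-task classification losses into one training objective. The abstract does not say how the losses are combined, so the weighted sum, the task weights, and the function names here are assumptions for illustration, not the authors' implementation. The task keys "A", "B", and "C" stand for the three OLID sub-tasks; in a real system the logits would come from task-specific heads on a shared BERT encoder rather than hard-coded lists.

```python
import math


def cross_entropy(logits, label):
    """Negative log-likelihood of `label` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def multi_task_loss(task_logits, task_labels, task_weights):
    """Weighted sum of per-task losses for a single example (hypothetical helper).

    Tasks whose label is None (not annotated for this example) are skipped,
    so an example only contributes to the tasks it is labeled for.
    """
    total = 0.0
    for task, logits in task_logits.items():
        label = task_labels.get(task)
        if label is None:
            continue
        total += task_weights.get(task, 1.0) * cross_entropy(logits, label)
    return total


# Hypothetical example: only the Sub-task A label is available.
example_logits = {"A": [2.0, 0.5], "B": [0.1, 0.3], "C": [0.0, 0.0, 0.0]}
example_labels = {"A": 0, "B": None, "C": None}
weights = {"A": 1.0, "B": 0.5, "C": 0.5}  # assumed weights, not from the paper

loss = multi_task_loss(example_logits, example_labels, weights)
print(f"combined loss: {loss:.4f}")
```

Skipping unlabeled tasks is one simple way to handle a hierarchical dataset in which not every example carries a label for every sub-task; other schemes are possible.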
Anthology ID:
2020.semeval-1.272
Volume:
Proceedings of the Fourteenth Workshop on Semantic Evaluation
Month:
December
Year:
2020
Address:
Barcelona (online)
Editors:
Aurelie Herbelot, Xiaodan Zhu, Alexis Palmer, Nathan Schneider, Jonathan May, Ekaterina Shutova
Venue:
SemEval
SIG:
SIGLEX
Publisher:
International Committee for Computational Linguistics
Pages:
2060–2066
URL:
https://aclanthology.org/2020.semeval-1.272
DOI:
10.18653/v1/2020.semeval-1.272
Cite (ACL):
Wenliang Dai, Tiezheng Yu, Zihan Liu, and Pascale Fung. 2020. Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 2060–2066, Barcelona (online). International Committee for Computational Linguistics.
Cite (Informal):
Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection (Dai et al., SemEval 2020)
PDF:
https://aclanthology.org/2020.semeval-1.272.pdf
Code:
wenliangdai/multi-task-offensive-language-detection
Data:
OLID