Abstract
This paper presents an overview of Shared Task 7, Fine-Grained Dialogue Social Bias Measurement, at NLPCC 2022. We introduce the task, explain the construction of the provided dataset, analyze the evaluation results, and summarize the submitted approaches. The task aims to measure social bias in dialogue scenarios under a fine-grained categorization, which is challenging because bias is often expressed in complex and implicit ways; context-sensitive biased responses in dialogue make the task harder still. We provide 25k instances for training and 3k for evaluation, collected from Zhihu (www.zhihu.com), a Chinese question-answering forum. In addition to the bias attitude label, the dataset is finely annotated with multiple auxiliary labels. There were 11 participating teams and 35 submissions in total. We adopt the macro F1 score to evaluate the submitted results; the highest score achieved is 0.5903. The submitted approaches focus on different aspects of the problem and use diverse techniques to boost performance. All relevant information can also be found at https://para-zhou.github.io/NLPCC-Task7-BiasEval/.
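For concreteness, the macro F1 score used for evaluation averages the per-class F1 over all classes, so every class weighs equally regardless of its frequency. The sketch below is a minimal, self-contained illustration of that computation; the four-way label set in the usage example is hypothetical and stands in for whatever bias-attitude categories the dataset defines.

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then take the unweighted mean."""
    classes = sorted(set(y_true))
    f1_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Hypothetical 4-way bias-attitude labels, e.g. 0..3 for four categories.
gold = [0, 1, 2, 3, 2, 1]
pred = [0, 1, 2, 2, 2, 1]
print(round(macro_f1(gold, pred), 4))  # → 0.7
```

Because each class contributes equally to the average, a submission cannot score well by predicting only the majority class, which matters for the imbalanced label distributions typical of bias-detection data.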
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhou, J., Mi, F., Meng, H., Deng, J. (2022). Overview of NLPCC 2022 Shared Task 7: Fine-Grained Dialogue Social Bias Measurement. In: Lu, W., Huang, S., Hong, Y., Zhou, X. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2022. Lecture Notes in Computer Science, vol. 13552. Springer, Cham. https://doi.org/10.1007/978-3-031-17189-5_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17188-8
Online ISBN: 978-3-031-17189-5