DOI: 10.1145/3366424.3383559

Quantifying Gender Bias in Different Corpora

Published: 20 April 2020

Abstract

Word embedding models have been shown to be effective in performing a wide variety of Natural Language Processing (NLP) tasks such as identifying audiences for web advertisements, parsing résumés to select promising job candidates, and translating documents from one language to another. However, it has been demonstrated that NLP systems learn gender bias from the corpora of documents on which they are trained. It is increasingly common for pre-trained models to be used as a starting point for building applications in a wide range of areas, including critical decision-making applications, and it is easy to use a pre-trained model as the basis for a new application without careful consideration of the original nature of the training set. In this paper, we quantify the degree to which gender bias differs with the corpora used for training. We look especially at the impact of starting with a pre-trained model and fine-tuning with additional data. Specifically, we calculate a measure of direct gender bias on several pre-trained models, including BERT’s Wikipedia and Book corpus models, as well as on several fine-tuned General Language Understanding Evaluation (GLUE) benchmarks. In addition, we evaluate the bias from several more extreme corpora, including the Jigsaw toxic identity dataset, which contains toxic speech biased against race, gender, religion, and disability, and the RtGender dataset, which contains speech specifically labelled by gender. Our results reveal that the direct gender bias of the Jigsaw toxic identity dataset is surprisingly close to that of the base pre-trained Google model, but the RtGender dataset has significantly higher direct gender bias than the base model. When the bias learned by an NLP system can vary significantly with the corpora used for training, it becomes important to consider and report these details, especially for use in critical decision-making applications.
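The "direct gender bias" measure the abstract refers to follows Bolukbasi et al. [1]. A minimal sketch, assuming the gender direction is estimated by averaging difference vectors of gendered word pairs (the original paper uses PCA over several pairs); the vectors below are illustrative stand-ins, not embeddings from the corpora studied:

```python
import numpy as np

def gender_direction(pairs):
    """Estimate a unit-norm gender direction from (she_vec, he_vec) pairs.

    Simplified stand-in: average the pairwise difference vectors.
    (Bolukbasi et al. take the top principal component instead.)
    """
    diffs = [a - b for a, b in pairs]
    g = np.mean(diffs, axis=0)
    return g / np.linalg.norm(g)

def direct_bias(neutral_vecs, g, c=1.0):
    """DirectBias_c = mean over gender-neutral words w of |cos(w, g)|^c.

    `g` is assumed unit-norm; `c` controls how strictly small
    projections onto the gender direction are penalized.
    """
    cosines = [abs(np.dot(w, g)) / np.linalg.norm(w) for w in neutral_vecs]
    return float(np.mean([cos ** c for cos in cosines]))
```

A corpus whose neutral-word vectors are nearly orthogonal to the gender direction scores close to 0; vectors aligned with it score close to 1, which is the sense in which the paper compares RtGender, Jigsaw, and the base pre-trained models.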

References

[1]
Tolga Bolukbasi, Kai-Wei Chang, James Y. Zou, Venkatesh Saligrama, and Adam T. Kalai. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems. 4349–4357.
[2]
J. Dastin. 2018. Amazon scraps secret AI recruiting tool that showed bias against women. Reuters Business News. (2018). https://www.reuters.com/article/amazoncom-jobs-automation/rpt-insight-amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSL2N1WP1RO
[3]
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).
[4]
Ernest Davis, Leora Morgenstern, and Charles Ortiz. [n.d.]. The Winograd Schema Challenge. https://cs.nyu.edu/faculty/davise/papers/WinogradSchemas/WS.html
[5]
Hector J. Levesque, Ernest Davis, and Leora Morgenstern. 2012. The Winograd Schema Challenge (KR’12). AAAI Press, 552–561. http://dl.acm.org/citation.cfm?id=3031843.3031909
[6]
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems. 3111–3119.
[7]
Eric Nalisnick, Bhaskar Mitra, Nick Craswell, and Rich Caruana. 2016. Improving document ranking with dual word embeddings. In Proceedings of the 25th International Conference Companion on World Wide Web. International World Wide Web Conferences Steering Committee, 83–84.
[8]
Nikita Nangia and Samuel R. Bowman. 2019. Human vs. Muppet: A Conservative Estimate of Human Performance on the GLUE Benchmark. arXiv preprint arXiv:1905.10425 (2019).
[9]
Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. 2016. SQuAD: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250 (2016).
[10]
Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D. Manning, Andrew Ng, and Christopher Potts. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. 1631–1642.
[11]
Rob Voigt, David Jurgens, Vinodkumar Prabhakaran, Dan Jurafsky, and Yulia Tsvetkov. 2018. RtGender: A corpus for studying differential responses to gender. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018).
[12]
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. 2018. GLUE: A multi-task benchmark and analysis platform for natural language understanding. arXiv preprint arXiv:1804.07461 (2018).
[13]
Alex Warstadt and Samuel R. Bowman. 2019. Grammatical Analysis of Pretrained Sentence Encoders with Acceptability Judgments. arXiv preprint arXiv:1901.03438 (2019).



        Published In

WWW '20: Companion Proceedings of the Web Conference 2020
April 2020, 854 pages
ISBN: 9781450370240
DOI: 10.1145/3366424


        Publisher

Association for Computing Machinery, New York, NY, United States


        Author Tags

        1. BERT
        2. datasets
        3. gender bias
        4. natural language processing

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

WWW '20: The Web Conference 2020
        April 20 - 24, 2020
        Taipei, Taiwan

        Acceptance Rates

        Overall Acceptance Rate 1,899 of 8,196 submissions, 23%


Cited By

• (2024) Automatically Distinguishing People’s Explicit and Implicit Attitude Bias by Bridging Psychological Measurements with Sentiment Analysis on Large Corpora. Applied Sciences 14:10 (4191). DOI: 10.3390/app14104191. Online publication date: 15-May-2024
• (2024) A Way of Making Smart Health Through Collaborating Machine Learning with the Bibliometrics. 2024 4th International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE) (209-214). DOI: 10.1109/ICACITE60783.2024.10617395. Online publication date: 14-May-2024
• (2024) MarIA and BETO are sexist: evaluating gender bias in large language models for Spanish. Language Resources and Evaluation 58:4 (1387-1417). DOI: 10.1007/s10579-023-09670-3. Online publication date: 1-Dec-2024
• (2023) Characterizing gender stereotypes in popular fiction: A machine learning approach. Online Journal of Communication and Media Technologies 13:4 (e202349). DOI: 10.30935/ojcmt/13644. Online publication date: 2023
• (2023) Mitigating Bias in GLAM Search Engines. Proceedings of the 34th ACM Conference on Hypertext and Social Media (1-5). DOI: 10.1145/3603163.3609043. Online publication date: 4-Sep-2023
• (2023) Large scale analysis of gender bias and sexism in song lyrics. EPJ Data Science 12:1. DOI: 10.1140/epjds/s13688-023-00384-8. Online publication date: 20-Apr-2023
• (2023) Identifying Gender Bias in Online Crime News Indonesia Using Word Embedding. 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation (ICAMIMIA) (774-778). DOI: 10.1109/ICAMIMIA60881.2023.10427911. Online publication date: 14-Nov-2023
• (2023) Natural Language Processing Workload Optimization Using Container Based Deployment. Emerging Trends in Expert Applications and Security (93-103). DOI: 10.1007/978-981-99-1946-8_10. Online publication date: 30-Jun-2023
• (2022) Exploring gender biases in ML and AI academic research through systematic literature review. Frontiers in Artificial Intelligence 5. DOI: 10.3389/frai.2022.976838. Online publication date: 11-Oct-2022
• (2022) Quantification and Mitigation of Directional Pairwise Class Confusion Bias in a Chatbot Intent Classification Model. International Journal of Semantic Computing 16:04 (497-520). DOI: 10.1142/S1793351X22500040. Online publication date: 8-Aug-2022
