Quantifying Gender Bias in the English Language

ECE 4424 Final Project

May 8, 2019

Project Team 1

Nick Gomez

Gurleen Matharoo

Introduction to the Project

For our final project, we chose to reproduce results from the paper “Man is to Computer Programmer as Woman is to Homemaker?”, which studies gender bias in the English language. The paper shows that certain words are more strongly associated with one gender than the other, and examines how these associations reflect societal stereotypes. Along with identifying biased words, the research team developed a method to “de-bias” certain words in order to provide a better way of expressing attributes or roles in society.


Relevant Papers to this Project

Two other papers related to this area of research are “Double/Debiased Machine Learning for Treatment and Structural Parameters” and “Recommendations as Treatments: Debiasing Learning and Evaluation”. These papers cover methods for identifying and correcting various types of bias within a dataset. While this is not exactly what we aim to achieve, we hope to adapt some of their ideas for identifying bias to a new dataset in order to root out gender bias. All three papers referenced are linked in the top bar of this webpage.


Testing

Upon further research into quantifying gender bias and debiasing language, we found that the team behind the “Man is to Computer Programmer...” paper had also made their GitHub repository public. It includes the source code they used to run their tests and generate their original results. This allowed us to simply import a new dataset and run it against their code to see what kind of results we would get.

Our new dataset was a large text file containing over twenty-six thousand words and names, each with an associated score that would be used to determine bias in the same way as the original dataset.
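
As a rough illustration, the following Python sketch shows how a word-list file like ours might be loaded. It assumes one word per line followed by whitespace-separated numeric components, which is the word2vec-style text format the paper's code works with; the filename here is hypothetical.

import numpy as np

def load_vectors(path):
    # Load a word2vec-style text file: one word per line followed by
    # its whitespace-separated numeric components.
    words, vecs = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            words.append(parts[0])
            vecs.append(np.array(parts[1:], dtype=float))
    vectors = np.array(vecs)
    # Normalize rows so dot products equal cosine similarities.
    vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)
    return words, vectors

# Hypothetical filename for our ~26,000-word dataset.
words, vectors = load_vectors("word_list.txt")
index = {w: i for i, w in enumerate(words)}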

The next step was to identify gender-based pairs, called neighbors: the code matches a stereotypically masculine word to its stereotypically feminine counterpart, as in the sketch below.
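
A minimal sketch of this matching, continuing from the loader above: it approximates the gender direction with the single pair she/he (the authors derive the direction from several definitional pairs), and scores candidate counterparts by how well the difference vector aligns with that direction while the two words stay close. The delta threshold is an illustrative choice, and the sketch assumes "she" and "he" appear in the vocabulary.

import numpy as np

def gender_direction(vectors, index):
    # One-pair approximation of the gender subspace: she - he.
    g = vectors[index["she"]] - vectors[index["he"]]
    return g / np.linalg.norm(g)

def feminine_counterpart(word, words, vectors, index, g, delta=1.0):
    # Score each candidate y by the cosine between (y - x) and the
    # she-he direction, restricted to words within delta of x.
    x = vectors[index[word]]
    best, best_score = None, -np.inf
    for w, y in zip(words, vectors):
        diff = y - x
        dist = np.linalg.norm(diff)
        if w == word or dist == 0 or dist > delta:
            continue
        score = g.dot(diff) / dist
        if score > best_score:
            best, best_score = w, score
    return best

g = gender_direction(vectors, index)
print(feminine_counterpart("king", words, vectors, index, g))  # e.g. "queen"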

Another metric assigned a score to a list of professions: the score goes negative if a profession is perceived as masculine and positive if it is perceived as feminine, with zero being neutral.
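
Continuing the sketch, this kind of score can be read off as the projection of each profession's vector onto the gender direction, using the sign convention from our write-up (negative masculine, positive feminine). The sample professions here are illustrative, not our actual list.

def profession_scores(professions, vectors, index, g):
    # Project each profession onto the gender direction: negative
    # values lean masculine, positive lean feminine, zero is neutral.
    return {p: float(vectors[index[p]].dot(g))
            for p in professions if p in index}

scores = profession_scores(["engineer", "nurse", "teacher", "mechanic"],
                           vectors, index, g)
for p, s in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{p:>10s}  {s:+.3f}")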

After these initial tests were run, it was time to run the debiasing code against the dataset. The same tests were then repeated to recalculate neighbors and re-score professions based on gender perceptions.
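
The sketch below shows the "neutralize" step of the authors' hard-debiasing algorithm, which removes a gender-neutral word's component along the gender direction so its bias score drops to zero. The full method also includes an "equalize" step for gendered pairs, which is omitted here; the word choices are again illustrative.

import numpy as np

def neutralize(vectors, g, neutral_indices):
    # For each gender-neutral word, subtract its component along the
    # gender direction and re-normalize, zeroing its bias score.
    out = vectors.copy()
    for i in neutral_indices:
        v = out[i] - out[i].dot(g) * g
        out[i] = v / np.linalg.norm(v)
    return out

neutral = [index[p] for p in ["engineer", "nurse"] if p in index]
debiased = neutralize(vectors, g, neutral)
print(debiased[index["engineer"]].dot(g))  # ~0 after debiasing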


Conclusion

The purpose of this experiment was to see whether it is possible to “de-bias” certain words in order to provide a better way of expressing attributes or roles in society. We went about this by running a Python script against our list of words that first gave each word a "bias value" to judge the words we chose, and then ran a debiasing algorithm to change the relationships between the words so that they became more gender neutral. The results show that there are ample ways to describe a career in a gender-neutral way. The tests also showed that, based purely on the algorithms used to assign jobs a bias value, the program flags some words as "biased" when in reality those words aren't necessarily attributed to one gender or another. This shows that even though the program does a fairly good job of assigning and removing bias, there are still some inconsistencies and errors.