Mining huge datasets of English text reveals stereotypes about gender, race, and class that are prevalent in English-speaking societies. Tessa Charlesworth and colleagues developed a stepwise procedure, Flexible Intersectional Stereotype Extraction (FISE), and applied it to billions of words of English Internet text. FISE allowed them to explore the traits associated with intersectional identities by quantifying how often occupation labels or trait adjectives appeared near phrases referring to multiple identities, such as “Black Women,” “Rich Men,” “Poor Women,” or “White Men.”

The authors first show that the method is a valid way of extracting stereotypes: occupations that are, in reality, dominated by a given intersectional group (e.g., architect, engineer, and manager, which are dominated by White men) are also strongly associated with that group in language, at a rate significantly above chance (about 70%). The authors then turned to personality traits. FISE found that 59% of studied traits were associated with “White Men,” but just 5% were associated with “Black Women.” According to the authors, these imbalances in trait frequency indicate a pervasive androcentric (male-centric) and ethnocentric (White-centric) bias in English. The valence (positivity or negativity) of the associated traits was also imbalanced: 78% of traits associated with “White Rich” were positive, compared with only 21% of traits associated with “Black Poor.”

According to the authors, patterns such as these have downstream consequences in AI, machine translation, and text generation. Beyond clarifying how intersectional bias shapes such outcomes, the authors note that FISE can be used to study a range of intersectional identities across languages and even across historical periods.
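The published FISE pipeline involves more steps (validation against real-world occupation data, valence scoring, and so on), but the core idea of associating trait or occupation words with intersectional group phrases via word embeddings can be illustrated with a small sketch. The Python snippet below is a hypothetical, simplified illustration only: the toy vectors, the group definitions, and the helper names `group_vector` and `assign_trait` are invented for demonstration and are not the authors' code or data.

```python
import numpy as np

# Toy 4-dimensional vectors standing in for embeddings trained on a large corpus.
# All words and numbers here are invented purely for illustration.
embeddings = {
    "black":    np.array([0.10, 0.90, 0.20, 0.00]),
    "white":    np.array([0.80, 0.10, 0.30, 0.10]),
    "women":    np.array([0.20, 0.70, 0.60, 0.20]),
    "men":      np.array([0.70, 0.20, 0.50, 0.30]),
    "engineer": np.array([0.75, 0.15, 0.45, 0.25]),
    "caring":   np.array([0.20, 0.60, 0.65, 0.20]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def group_vector(words):
    """Represent an intersectional group by averaging its component word vectors."""
    return np.mean([embeddings[w] for w in words], axis=0)

# Intersectional groups defined by pairs of identity terms (illustrative only).
groups = {
    "Black Women": ["black", "women"],
    "White Men":   ["white", "men"],
}

def assign_trait(trait):
    """Assign a trait or occupation word to the group whose averaged vector is closest."""
    vec = embeddings[trait]
    scores = {name: cosine(vec, group_vector(ws)) for name, ws in groups.items()}
    return max(scores, key=scores.get), scores

for word in ["engineer", "caring"]:
    winner, scores = assign_trait(word)
    print(f"{word!r} -> {winner}  {scores}")
```

In a sketch like this, tallying which group each trait lands with, and the share of positive versus negative traits per group, would give frequency and valence summaries of the kind the authors report, though their actual procedure is validated against occupational statistics rather than toy vectors.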
Journal
PNAS Nexus
Article Title
Extracting intersectional stereotypes from embeddings: Developing and validating the Flexible Intersectional Stereotype Extraction procedure
Article Publication Date
19-Mar-2024