Gender differences in language
A number of quantitative sociolinguistic studies have revealed significant gender differences in language (Coates, 2015). More recently, social media content has been used to conduct large-scale studies of gender differences in language. Researchers have been able to predict gender from social media content with accuracies exceeding 91% (Schwartz et al., 2013), demonstrating how much gender information is encoded in language.
In the context of personnel selection, gender differences in language have been highlighted as a way bias can be introduced at the application (e.g., résumé or cover letter), screening, and interview stages. Investigating bias at the application level, Foley and Williamson (2018) observed that recruiters used implicit signals and cues to infer the gender identities of applicants from anonymized job applications, reintroducing the possibility of gender bias. The well-publicized Amazon experiment, in which a résumé parser trained on historical data generated gender-biased outcomes, is a clear example of how language bias can influence predictive machine learning models. In this work, we examine gender differences in language observed at the assessment level using data from a text-chat-based automated structured interview. Structured interviews have been shown to reduce bias relative to unstructured interviews (Levashina et al., 2014).
We aimed to quantify the amount of gender information contained in 1) the raw text responses to structured interview questions, and 2) the measures (or features) calculated from the responses according to the interview scoring rubric (e.g., personality traits, behavioral competencies, communication skills). We quantified gender-related information by training machine learning models to predict gender from the raw text and from the derived features, respectively, and using their accuracies as proxies for the gender information1 present in each case. We hypothesized that features derived according to a clearly defined rubric would contain less gender information than raw candidate responses.
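The accuracy-as-proxy idea above can be illustrated with a minimal sketch: build bag-of-words features from text responses, fit a simple classifier to predict a binary gender label, and read its accuracy as an upper-bound signal of how much gender information the representation carries. This is not the paper's actual pipeline or data; the tiny corpus, the tokenizer, and the smoothed log-odds classifier below are all illustrative assumptions.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Minimal feature extractor: lowercase word counts."""
    return Counter(text.lower().split())


def train_word_scores(texts, labels):
    """Score each word by its Laplace-smoothed log-odds between the two classes
    (a naive-Bayes-style stand-in for the paper's machine learning models)."""
    counts = {0: Counter(), 1: Counter()}
    for text, y in zip(texts, labels):
        counts[y].update(bag_of_words(text))
    vocab = set(counts[0]) | set(counts[1])
    total0 = sum(counts[0].values()) + len(vocab)
    total1 = sum(counts[1].values()) + len(vocab)
    return {
        w: math.log((counts[1][w] + 1) / total1)
           - math.log((counts[0][w] + 1) / total0)
        for w in vocab
    }


def predict(scores, text):
    """Predict class 1 if the summed word scores are positive, else class 0."""
    s = sum(scores.get(w, 0.0) * c for w, c in bag_of_words(text).items())
    return 1 if s > 0 else 0


def accuracy(scores, texts, labels):
    """Classification accuracy: the proxy for gender information in the features."""
    correct = sum(predict(scores, t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


# Hypothetical toy corpus: class-distinctive words stand in for gendered language cues.
texts = ["alpha alpha common", "alpha common", "beta common", "beta beta common"]
labels = [1, 1, 0, 0]
scores = train_word_scores(texts, labels)
print(accuracy(scores, texts, labels))  # high accuracy -> much class information
```

In the paper's setup, the same comparison is run twice, once on raw responses and once on rubric-derived features; a markedly lower accuracy on the derived features would support the hypothesis that the rubric filters out gender information.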