The strategic choices that make up the Sapia.ai product
Building any new product requires you to make strategic choices. For regular products, you may not need to be fully transparent about those choices; with AI-based products, especially ones that assist high-stakes decisions such as hiring, you must consider ethics, the law, and the overall experience. Awareness of those strategic choices is key to building trust among all users of an AI product, especially in gauging the impact and risks associated with using AI.
With all the noise and promises surrounding AI, the things you choose not to do are as important as those you do. Here are some of the choices we have made so far:
We don’t use video data for any AI component. That includes not using transcribed text from video data.
Why does it matter? Video data is fraught with bias. Here is just one example of how it can go horribly wrong. We do not use or build any ML models with video data. In fact, we delete all video data 180 days after submission. We don’t keep it for research, analysis, or validation.
We don’t use CV data, data scraped from the web, metadata, or any other kind of third-party data. The only data used in the algorithms is data that the candidate provides with consent.
Why does it matter? CV data is easily gameable and mostly optimised for keyword-matching algorithms. Graduates, for example, have been fine-tuning their CVs to counter the behaviour of matching programs for decades. There is ample advice on the web – including on university career sites – telling graduates how to stand out, generally by manipulating algorithms. Apart from being a historical record of experience, CVs add little value in assessing hard skills, soft skills, or the potential of a candidate. CVs have been shown to lead to biased outcomes, even when demographic-specific terms such as gender-indicating words are removed, as highlighted in this study.
We also believe that using any data collected without the consent of the candidate – such as data scraped from social media or metadata collected in the assessment process (e.g. time to complete or number of attempts) – is ethically wrong. Beyond the question of whether such data has any validity in hiring decisions, using data without the knowledge of the candidate can lead to spurious outcomes. Consider the case of social media data, which puts candidates who are active on social media at an advantage (or even a disadvantage) compared to those who are absent from it or less active on it.