
Question-aware outlier answer detection for fairer AI scoring of interviews

AI-based interview scoring learns from past interview answers. This makes it hard for the model to determine whether a candidate is legitimately answering the question when the response includes context or an example rarely seen in the training data.

Moreover, AI interview platforms may be susceptible to adversarial inputs, where an irrelevant answer receives a high score. Both scenarios raise fairness concerns and can erode trust in AI job interviews (Madaio et al., 2020).

This is why identifying outliers that differ significantly from the majority of answers, and flagging them for manual review, are crucial steps toward responsible and fair use of AI interview software. Simple rule-based methods (Reiz and Pongor, 2011) could help filter out some irrelevant answers based on answer length and regular expressions, but they do not take into account the context and content of the answer and the question. A candidate may describe a unique yet relevant situation in response to an AI interview question, and such an answer should not be disregarded.
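To make the limitation of rule-based filtering concrete, here is a minimal Python sketch of a filter built only on answer length and regular expressions. The thresholds, the patterns, and the function name `rule_based_flag` are hypothetical choices for illustration; they are not taken from Reiz and Pongor (2011) or from any production system.

```python
import re

# Hypothetical thresholds and patterns, chosen only for illustration.
MIN_WORDS = 5
MAX_WORDS = 500
SUSPICIOUS_PATTERNS = [
    re.compile(r"https?://\S+", re.IGNORECASE),  # answer contains a URL
    re.compile(r"(\S)\1{5,}"),                   # same character 6+ times in a row
    re.compile(r"^[^A-Za-z]*$"),                 # no alphabetic characters at all
]


def rule_based_flag(answer: str) -> bool:
    """Return True if the answer trips a length rule or a regex rule."""
    n_words = len(answer.split())
    if n_words < MIN_WORDS or n_words > MAX_WORDS:
        return True
    return any(p.search(answer) for p in SUSPICIOUS_PATTERNS)


# A fluent but off-topic movie review passes every rule, because the
# rules never look at what the answer or the question is about.
relevant = "I once helped a customer find the right product for her daughter."
off_topic = "This movie was a thrilling ride with great acting and a clever plot."
gibberish = "asdf zzzzzzzz qwer uiop hjkl"

print(rule_based_flag(relevant))   # False
print(rule_based_flag(off_topic))  # False
print(rule_based_flag(gibberish))  # True
```

The off-topic review is not flagged, which is the gap that a context-aware detector is meant to close.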

In this study, we introduce an unsupervised, question-aware, multi-context outlier detection model that can detect anomalous answers contextually and semantically. An unsupervised approach is more practical than a supervised model, which requires a large labeled dataset of outlier answers. It helps bootstrap an outlier detector that can then be enhanced through human feedback.
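The idea of "unsupervised and question-aware" can be illustrated with a deliberately simple stand-in. The sketch below is not the model from this study: it swaps the learned multi-context representations for plain bag-of-words counts and scores each answer by cosine similarity to the centroid of unlabeled answers given to the same question. All names, the example data, and the 0.2 threshold are assumptions made for illustration.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased token counts; a crude stand-in for a learned representation."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fit_centroids(answers_by_question: dict) -> dict:
    """Sum token counts per question. No outlier labels are needed.

    Cosine similarity ignores vector length, so the summed counts point
    in the same direction as the mean (centroid) vector.
    """
    centroids = {}
    for question_id, answers in answers_by_question.items():
        total = Counter()
        for answer in answers:
            total.update(bag_of_words(answer))
        centroids[question_id] = total
    return centroids


def is_outlier(question_id: str, answer: str, centroids: dict,
               threshold: float = 0.2) -> bool:
    """Flag an answer that is dissimilar to typical answers for ITS question."""
    return cosine(bag_of_words(answer), centroids[question_id]) < threshold


# Hypothetical unlabeled answers, grouped by the question they respond to.
answers_by_question = {
    "teamwork": [
        "i worked with my team to finish the project on time",
        "my team and i helped a customer together",
        "i supported my team during a busy shift",
    ],
    "availability": [
        "i can work weekends and evenings",
        "i am available on weekdays after school",
        "weekends and evenings suit me best",
    ],
}
centroids = fit_centroids(answers_by_question)

answer = "i helped my team finish a busy shift"
print(is_outlier("teamwork", answer, centroids))      # False: typical for this question
print(is_outlier("availability", answer, centroids))  # True: off-topic for this question
```

The same answer is typical for one question and an outlier for another, which is the behaviour a question-agnostic detector cannot capture. Answers flagged this way would go to human review, and the reviewers' decisions could later serve as labels.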

We tested how well the outlier model separates 177,691 actual interview answers given by hired candidates from outliers such as movie reviews, news articles, nonsensical text, and sentences generated using BERT (Vaswani et al., 2017) with random starting words.

Our model outperformed the baseline One-class SVM outlier detector (Li et al., 2003) in detecting outliers among actual interview answers. This advantage over the baseline unsupervised model can be explained by both question-aware learning and multi-context learning, which yield better contextual representations for separating outlier answers from typical interview answers.

We also conducted a human evaluation on 10,689 interview answers from candidates who were not hired and might have provided outlier answers. Our model predicted 0.16% of the answers as outliers, and only 5.9% of those predictions were false positives. All of these false positives describe contexts related to family and personal life but are relevant to the question. It is reasonable that our model labels these answers as outliers, since they are contextually and semantically different from most interview answers.

While a data-driven AI interviewer can help counter flaws in human interviewers, answers that differ significantly from the training data can lead to spurious predictive outcomes. In this study, we show how a question-aware multi-context outlier detection model could be applied to identify outlier answers. Flagging such answers for human review both enhances fairness and provides a supervised signal to improve the outlier detection model over time.

References:

Dai, Y., Qi, J., & Zhang, R. (2020). Joint recognition of names and publications in academic homepages. In Proceedings of the 13th International Conference on Web Search and Data Mining (pp. 133-141).

Li, K. L., Huang, H. K., Tian, S. F., & Xu, W. (2003). Improving one-class SVM for anomaly detection. In Proceedings of the 2003 international conference on machine learning and cybernetics (IEEE Cat. No. 03EX693) (Vol. 5, pp. 3077-3081). IEEE.

Madaio, M. A., Stark, L., Wortman Vaughan, J., & Wallach, H. (2020). Co-designing checklists to understand organizational challenges and opportunities around fairness in AI. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-14).

Reiz, B., & Pongor, S. (2011). Psychologically Inspired, Rule-Based Outlier Detection in Noisy Data. In SYNASC (pp. 131-136).

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … & Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30.


Blog

Joe & the Juice Partners with Sapia.ai, Scaling an Exceptional Candidate Experience and Cutting Time to Hire

Read the full press release about the partnership here.

Joe & The Juice, the trailblazing global juice bar and coffee concept, is renowned for its vibrant culture and commitment to cultivating talent. From humble roots as a single store in Copenhagen to a presence in 17 markets today, Joe & The Juice has built a culture that fosters growth and celebrates individuality.

But, as their footprint expands, so does the challenge of finding and hiring the right talent to embody their unique culture. With over 300,000 applications annually, the traditional hiring process using CVs was falling short – leaving candidates waiting and creating inefficiencies for the recruitment team. To address this, Joe & The Juice turned to Sapia.ai, a pioneer in ethical AI hiring solutions.

A Fresh Approach to Hiring

Through this partnership, Joe & The Juice has transformed its hiring process into an inclusive, efficient, and brand-aligned experience. Instead of faceless CVs, candidates now engage in an innovative chat-based interview that reflects the brand’s energy and ethos. Available in multiple languages, the AI-driven interview screens for alignment with the “Juicer DNA” and the brand’s core values, ensuring that every candidate feels seen and valued.

Candidates receive an engaging and fair interview experience as well as personality insights and coaching tips as part of their journey. In fact, 93% of candidates have found these insights useful, helping to deliver a world-class experience to candidates who are also potential guests of the brand.

“Every candidate interaction reflects our brand,” Sebastian Jeppesen, Global Head of Recruitment, shared. “Sapia.ai makes our recruitment process fair, enriching, and culture-driven.”

Results That Matter

For Joe & The Juice, the collaboration has yielded impressive results:

  • 33% Reduction in Screening Time: Pre-vetted shortlists from Sapia.ai’s platform ensure that recruiters can focus on top candidates, getting them behind the bar faster.

  • Improved Candidate Satisfaction: With a 9/10 satisfaction score from over 55,000 interviews, candidates appreciate the fairness and transparency of the process.

  • Bias-Free Hiring: By eliminating CVs and integrating blind AI that prioritizes fairness, Joe & The Juice ensures their hiring reflects the diverse communities they serve.

Frederik Rosenstand, Group Director of People & Development at Joe & The Juice, highlighted the transformative impact: “Our juicers are our future leaders, so using ethical AI to find the people who belong at Joe is critical to our long-term success. And now we do that with a fair, unbiased experience that aligns directly with our brand.”

Trailblazing for the hospitality industry

In an industry so wholly centred on people, Joe & The Juice is paving the way for similar brands to adopt technology that enables inclusive, human-first experiences reflecting a brand’s core values.

If you’re curious about how Sapia.ai can transform your hiring process, check out our full case study on Joe & The Juice here.

 

Blog

Sapia.ai Wrapped 2024

It’s been a year of Big Moves at Sapia.ai. From welcoming groundbreaking brands to achieving incredible milestones in our product innovation and scale, we’re pushing the boundaries of what’s possible in hiring.

And we’re just getting started 🚀

Take a look at the highlights of 2024 

All-in-one hiring platform
This year, with the addition of Live Interview, we’re proud to say our platform now covers screening, assessing and scheduling.
It’s an all-in-one volume hiring platform that enables our customers to deliver a world-leading experience from application through to offer.

Supercharging hiring efficiency
Every 15 seconds, a candidate is interviewed with Sapia.ai.
This year, we’ve saved hiring managers and recruiters hours of precious time that can now be used for higher-value tasks. 


Giving candidates the best experience
Our platform allows candidates to be their best selves, so our customers can find the people that truly belong with them. They’re proud to use a technology that’s changing hiring, for good.


Leading the way in AI for hiring 

We’ve continued to push the boundaries in leveraging ethical AI for hiring, with new products on the way for Coaching, Internal Mobility & Interview Builders. 

Join us in celebrating an incredible 2024

Blog

Situational Judgement Tests vs. AI Chat Interviews: A Modern Perspective on Candidate Assessment

Choosing the right tool for assessing candidates can be challenging. For years, situational judgement tests (SJTs) have been a common choice for evaluating behaviour and decision-making skills. However, they come with limitations that can make the hiring process less effective and less inclusive.

AI-enabled chat-based interviews, such as Sapia.ai, provide organisations with a modern alternative. They focus on understanding candidates as individuals and creating a hiring experience that is both fair and insightful while enabling efficient screening and selection. 

This shift raises important questions: Are SJTs still a tool that should be considered for volume hiring? And what do AI assessments offer in comparison?

1. The Static Nature of SJTs

Traditional SJTs use predefined multiple-choice questions to assess behavioural tendencies and situational knowledge. While useful for screening, these static frameworks lack the flexibility to adapt based on real-world performance data or evolving role requirements. 

Once created, SJTs don’t adapt to new data or evolving organisational needs. They rely on fixed scenarios and responses that may not fully reflect the dynamic realities of modern workplaces, and as a result, their relevance may diminish over time.

AI-enabled chat interviews, on the other hand, are inherently adaptive. Using machine learning, these tools can continuously refine their models based on feedback from real-world outcomes such as hiring or turnover data. This ability to evolve ensures the assessments align with organisations’ needs.

2. Richer Data Through Open-Ended Responses

One of the main critiques of SJTs is their reliance on multiple-choice responses. While structured and straightforward, these options may not capture the full scope of a candidate’s thinking, communication skills, or problem-solving ability. The approach is often limiting, reducing complex human behaviour to a few predefined choices.

AI-enabled chat interviews work more holistically and dynamically. These tools provide a more complete picture of a person by allowing candidates to answer questions in their own words. Natural language processing (NLP) analyses their responses, offering insights into personality traits, communication skills, and behavioural tendencies. This open-ended format lets candidates express themselves authentically, giving employers a deeper understanding of their potential.

3. The Candidate Experience: Stressful or Supportive?

SJTs often include time constraints and rigid formats, which can create pressure for candidates. This is especially true when candidates feel forced to choose options that don’t fully reflect how they would actually behave. The process can feel impersonal, even transactional.

In contrast, chat-based interviews are designed to be conversational and low-pressure for candidates. By removing time limits and adopting a familiar chat interface, these tools help candidates feel more at ease. They also frequently include personalised feedback, turning the assessment into a valuable experience for the candidate, not just the employer.

4. Addressing Bias and Fairness

Traditional SJTs are often transparent to candidates, who can identify and select the “best practice” answers without revealing their true tendencies. Static test designs can also unintentionally embed bias: because the tests are timed, SJTs have been found to disadvantage some groups.

AI chat interviews, when developed ethically within a framework like Sapia.ai’s FAIR Hiring Framework, eliminate explicit bias by relying solely on the content of a candidate’s responses. Their machine learning models are continuously validated for fairness, ensuring that hiring decisions are free from subjective judgments or irrelevant demographic factors.

5. An Assessment That Improves Over Time

Workplaces are constantly changing, and hiring tools need to keep up. SJTs’ fixed nature can make them less effective as roles evolve or organisational priorities shift. They provide a snapshot but not a dynamic view of what’s needed.

AI-enabled chat interviews are built to adapt. With feedback loops and continuous learning, they incorporate real-world hiring outcomes—like retention and performance data—into their models. This ensures that assessments stay relevant and effective over time.

Rethinking Candidate Assessment

As hiring demands grow more complex, so does the need for tools that can capture the whole person, not just their response to hypothetical scenarios. While SJTs have played an important role in hiring practices, they are increasingly being replaced by tools like AI-enabled chat interviews.

These modern approaches provide richer data, adapt to changing needs, and create a more engaging experience for candidates. Perhaps most importantly, they emphasise fairness and inclusivity, aligning with the growing demand for unbiased hiring practices.

For organisations evaluating their assessment tools, the question isn’t just which method is “better.” Understanding the specific needs of your roles, teams, and candidates will help you choose tools that support informed and equitable decisions.
