We are often asked by talent leaders and hiring managers whether interviews conducted via a text-based chat disadvantage people who have English as a Second Language (EASL).
While that may seem intuitive, the data tells a different story.
Aggregate results across a variety of Sapia.ai clients that use our AI Smart Interviewer indicate that EASL candidates, in general, perform better than Native English speakers.
While these results may seem surprising, the science that underpins our AI Smart Interviewer has been created to mitigate bias, and we test this constantly.
Standard testing includes the “4/5ths rule”, the industry-standard test for adverse impact. It checks that the selection rate of a minority group is at least four-fifths (80%) of the selection rate of the majority group.
When we compare Native English Speakers with Non-Native English Speakers (EASL), EASL candidates are scored higher on average by our AI Smart Interviewer and are therefore auto-progressed at a higher rate than those whose native language is English, achieving a 4/5ths rule score of 100%.
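To make the 4/5ths rule concrete, here is a minimal sketch of the arithmetic in Python. The function names and the candidate counts are hypothetical and chosen purely for illustration; they are not Sapia.ai data or Sapia.ai's implementation.

```python
def selection_rate(selected: int, applicants: int) -> float:
    """Share of a group's applicants who were progressed."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return selected / applicants


def adverse_impact_ratio(group_rate: float, reference_rate: float) -> float:
    """A group's selection rate divided by the reference group's rate."""
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    return group_rate / reference_rate


def passes_four_fifths(group_rate: float, reference_rate: float) -> bool:
    """True when the ratio is at least four-fifths (0.8)."""
    return adverse_impact_ratio(group_rate, reference_rate) >= 0.8


# Hypothetical counts, for illustration only.
native_rate = selection_rate(selected=300, applicants=1000)  # 30% progressed
easl_rate = selection_rate(selected=160, applicants=500)     # 32% progressed

ratio = adverse_impact_ratio(easl_rate, native_rate)
print(f"ratio = {ratio:.2f}, passes = {passes_four_fifths(easl_rate, native_rate)}")
```

In this made-up example the EASL group is progressed at a slightly higher rate than the reference group, so the ratio is above 1 and the check passes. A ratio below 0.8 would flag potential adverse impact.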
Assessing language using Sapia.ai
When it comes to assessing language skills using Sapia.ai’s proprietary written language assessments, we have developed two aggregate measures called “basic communication skills” and “advanced communication skills”.
– Basic skills look for language fundamentals like spelling, grammar, and readability.
– Advanced skills look at the sophistication of language (e.g. vocabulary).
It is important to note that the dimensions used within each measure such as spelling and grammar are weighted in such a way that not all misspelled words or grammatically incorrect sentences result in a penalty. These aggregate measures are benchmarked and validated using our large interview dataset across multiple role families.
Further, in Sapia.ai assessments, these measures are not always weighted the same and are set depending on how important language skills are for a particular job.
For example, for a customer-facing retail role, “basic skills” might be set as “medium” and “advanced skills” as “low”, or simply ignored. A retail team member may be required to jot down notes or write the occasional report or email. Basic writing skills may be helpful but not essential, hence the “medium” weighting and minimal impact on their overall score. Other personality traits and behavioral competencies may play a stronger role in determining role-fit.
Secondly, the scores are benchmarked within a relevant population. A retail worker’s “basic skills” score is not compared against graduates or call center staff.
Here’s how the scoring might work:
Maria applies for a retail role and receives a basic skills score of 54/100. Within the population of retail candidates, that score places her in the top 20%. This percentile is used in the final score calculation. That way, candidates are only compared within a comparable group and no one is disadvantaged.
In comparison, Michael, a graduate applicant, receives a basic skills score of 72 and is in the top 30% of graduate applicants. Michael has scored higher than Maria in his basic skills, but in their respective populations, Maria has done better than Michael.
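The comparison between Maria and Michael can be sketched in a few lines of Python. This is an illustration only: the `percentile_rank` helper and both score lists are hypothetical, with the populations constructed so that a 54 lands in the top 20% of retail applicants and a 72 in the top 30% of graduate applicants. It is not how Sapia.ai computes its benchmarks.

```python
from bisect import bisect_left


def percentile_rank(score: float, population_scores: list[float]) -> float:
    """Percentage of the population scoring strictly below `score`."""
    if not population_scores:
        raise ValueError("population_scores must not be empty")
    ordered = sorted(population_scores)
    below = bisect_left(ordered, score)  # count of scores lower than `score`
    return 100.0 * below / len(ordered)


# Hypothetical basic-skills scores for two separate applicant populations.
retail_scores = [30, 35, 38, 41, 44, 47, 50, 52, 54, 60]
graduate_scores = [55, 60, 63, 66, 68, 70, 71, 72, 75, 80]

maria = percentile_rank(54, retail_scores)      # 8 of 10 retail scores are lower
michael = percentile_rank(72, graduate_scores)  # 7 of 10 graduate scores are lower

print(f"Maria: raw 54, percentile {maria:.0f}")
print(f"Michael: raw 72, percentile {michael:.0f}")
```

Michael's raw score is higher, but Maria ranks higher within her own population, which is the figure that matters when each candidate is benchmarked against comparable applicants.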
There are also other factors to consider when thinking about smart chat interviews and their impact on EASL candidates.
In a spoken test or video test, candidates have fewer chances to re-record their answers. In our Chat Interview™, we give candidates unlimited chances to refine their answers, allowing them to edit the text until they are ready to submit. An EASL candidate will have as much opportunity as they want to refine their answers with no pressure.
Candidates can do the test at their own pace, so the time taken to complete the test is not a factor that will impact the scores. An EASL candidate will have enough time to work on the language and get it right.
You may still be wondering how we ensure EASL candidates’ personality traits and behavioral competencies are also accurately assessed.
Our Chat Interview™ uses Natural Language Processing, machine learning, and optimization methods to score structured interview responses, fairly and consistently.
Our scoring leverages data from over 1 billion words written by over 3.5 million diverse candidates across many different role families and regions.
Based on one’s use of language, we derive signals that matter, like personality and behavioral competencies, that are then used in a predictive algorithm based on the ideal candidate profile to generate a score and recommendation.
We don’t use simple keyword matching, and we consider more than just the words used. Phrasing, syntax, structure, and context all matter. Perfect grammar and spelling don’t matter for the majority of constructs.
Taken together, our highly tuned assessment models combined with the validity of structured interviews represent a far more enjoyable and reliable assessment experience for EASL candidates, especially when compared to traditional assessments.
Being data-driven means we can constantly and vigilantly check that EASL candidates are not disadvantaged in how they are assessed.
Why neuroinclusion can’t be a retrofit and how Sapia.ai is building a better experience for every candidate.
In the past, if you were neurodivergent and applying for a job, you were often asked to disclose your diagnosis to get a basic accommodation – extra time on a test, maybe the option to skip a task. That disclosure often came with risk: of judgment, of stigma, or just being seen as different.
This wasn’t inclusion. It was bureaucracy. And it made neurodiverse candidates carry the burden of fitting in.
We’ve come a long way, but we’re not there yet.
Over the last two decades, hiring practices have slowly moved away from reactive accommodations toward proactive, human-centric design. Leading employers began experimenting with:
But even these advances have often been limited in scope, applied to special hiring programs or specific roles. Neurodiverse talent still encounters systems built for neurotypical profiles, with limited flexibility and a heavy dose of social performance pressure.
Hiring needs to look different.
Truly inclusive hiring doesn’t rely on diagnosis or disclosure. It doesn’t just give a select few special treatment. It’s about removing friction for everyone, especially those who’ve historically been excluded.
That’s why Sapia.ai was built with universal design principles from day one.
Here’s what that looks like in practice:
It’s not a workaround. It’s a rework.
We tend to assume that social or “casual” interview formats make people comfortable. But for many neurodiverse individuals, icebreakers, group exercises, and informal chats are the problem, not the solution.
When we asked 6,000 neurodiverse candidates about their experience using Sapia.ai’s chat-based interview, they told us:
“It felt very 1:1 and trustworthy… I had time to fully think about my answers.”
“It was less anxiety-inducing than video interviews.”
“I like that all applicants get initial interviews which ensures an unbiased and fair way to weigh-up candidates.”
Some AI systems claim to infer skills or fit from resumes or behavioural data. But if the training data is biased or the experience itself is exclusionary, you’re just replicating the same inequity with more speed and scale.
Inclusion means seeing people for who they are, not who they resemble in your data set.
At Sapia.ai, every interaction is transparent, explainable, and scientifically validated. We use structured, fair assessments that work for all brains, not just neurotypical ones.
Neurodiversity is rising in both awareness and representation. However, inclusion won’t scale unless the systems behind hiring change as well.
That’s why we built a platform that:
Sapia.ai is already powering inclusive, structured, and scalable hiring for global employers like BT Group, Costa Coffee and Concentrix. Want to see how your hiring process can be more inclusive for neurodivergent individuals? Let’s chat.
There’s growing interest in AI-driven tools that infer skills from CVs, LinkedIn profiles, and other passive data sources. These systems claim to map someone’s capability based on the words they use, the jobs they’ve held, and patterns derived from millions of similar profiles. In theory, it’s efficient. But when inference becomes the primary basis for hiring or promotion, we need to scrutinise what’s actually being measured and what’s not.
Let’s be clear: the technology isn’t the problem. Modern inference engines use advanced natural language processing, embeddings, and knowledge graphs. The science behind them is genuinely impressive. And when they’re used alongside richer sources of data, such as internal project contributions, validated assessments, or behavioural evidence, they can offer valuable insight for workforce planning and development.
But we need to separate the two ideas:
The risk lies in conflating the two.
CVs and LinkedIn profiles are riddled with bias, inconsistency, and omission. They’re self-authored, unverified, and often written strategically – for example, to enhance certain experiences or downplay others in response to a job ad.
And different groups represent themselves in different ways. Ahuja (2024) showed, for example, that male MBA graduates in India tend to self-promote more than their female peers. Something as simple as a longer LinkedIn ‘About’ section becomes a proxy for perceived competence.
Job titles are vague. Skill descriptions vary. Proficiency is rarely signposted. Even where systems draw on internal performance data, the quality is often questionable. Ratings tend to cluster (remember the year everyone got a ‘3’ at your org?) and can often reflect manager bias or company culture more than actual output.
The most advanced skill inference platforms use layered data: open web sources like job ads and bios, public databases like O*NET and ESCO, internal frameworks, even anonymised behavioural signals from platform users. This breadth gives a more complete picture, and the models powering it are undeniably sophisticated.
But sophistication doesn’t equal accuracy.
These systems rely heavily on proxies and correlations, rather than observed behaviour. They estimate presence, not proficiency. And when used in high-stakes decisions, that distinction matters.
In many inference systems, it’s hard to trace where a skill came from. Was it picked up from a keyword? Assumed from a job title? Correlated with others in similar roles? The logic is rarely visible, and that’s a problem, especially when decisions based on these inferences affect access to jobs, development, or promotion.
Inferred skills suggest someone might have a capability. But hiring isn’t about possibility. It’s about evidence of capability. Saying you’ve led a team isn’t the same as doing it well. Collecting or observing actual examples of behaviour allows you to evaluate someone’s true competence at a claimed skill.
Some platforms try to infer proficiency, too, but this is still inference, not measurement. No matter how smart the model, it’s still drawing conclusions from indirect data.
By contrast, validated assessments like structured interviews, simulations, and psychometric tools are designed to measure. They observe behaviour against defined criteria, use consistent scoring frameworks (like Behaviourally Anchored Rating Scales, or BARS), and provide a transparent, defensible basis for decision-making. In doing this, the level or proficiency of a skill can be placed on a properly calibrated scale.
But here’s the thing: we don’t have to choose one over the other.
The real opportunity lies in combining the rigour of measurement with the scalability of inference.
Start with measurement
Define the skills that matter. Use structured tools to capture behavioural evidence. Set a clear standard for what good looks like. For example, define Behaviourally Anchored Rating Scales (BARS) when assessing interviews for skills. Using a framework like Sapia.ai’s Competency Framework is critical for defining what you want to measure.
Layer in inference
Apply AI to scale scoring, add contextual nuance, and detect deeper patterns that human assessors might miss, especially when reviewing large volumes of data.
Anchor the whole system in transparency and validation
Ensure people understand how inferences are made by providing clear explanations. Continuously test for fairness. Keep human oversight in the loop, especially where the stakes are high. More information on ensuring AI systems are transparent can be found in this paper.
This hybrid model respects the strengths and limits of both approaches. It recognises that AI can’t replace human judgement but can enhance it, and that inference can extend reach while only measurement can give you higher confidence in the results.
Inference can support and guide, but only measurement can prove. And when people’s futures are on the line, proof should always win.
Ahuja, A. (2024). LinkedIn profile analysis reveals gender-based differences in self-presentation among Indian MBA graduates. Journal of Business and Psychology.
Hiring for care is unlike any other sector. Recruiters are looking for people who can bring empathy, resilience, and energy to the most demanding human roles. Whether it’s dental care, mental health, or aged care, new hires are charged with looking after others when they’re most vulnerable. The stakes are high.
Hiring for care is exactly where leveraging ethical AI can make the biggest impact.
The best carers don’t always have the best CVs.
That’s why our chat-based AI interview doesn’t screen for qualifications. It screens for the skills that matter when caring for others, the traits that define a brilliant care worker, things like:
Empathy, Self-awareness, Accountability, Teamwork, and Energy.
The best way to uncover these traits is through structured behavioural science, delivered through an experience that allows candidates to open up. That means giving candidates space to give real-life, open-text answers, with no time pressure or video stress. Then, our AI picks up the signals that matter, free from any demographic data or bias-inducing signals.
Candidates’ answers to our structured interview questions aren’t simply ticking boxes. They’re a window into how someone shows up under pressure. And they’re helping leading care organisations hire people who belong in care, and who stay.
Inclusivity should be a core foundation of any talent assessment, and it’s a fundamental requirement for hirers in the care industry.
When healthcare hirers use chat-based AI interviews, designed to be inclusive for all groups, candidates complete their interviews when and where they choose, without the bias traps of face-to-face or phone screening. There are no accents to judge, no assumptions, just their words and their story.
And it works:
Drop-offs are reduced, and engagement and employer brand advocacy go up. Building a brand that candidates want to work for includes providing a hiring experience that candidates want to complete.
Our smart chat already works for some of the most respected names in healthcare and community services. Here’s a sample of the outcomes that are possible by leveraging ethical AI, a validated scientific assessment, wrapped in an experience that candidates love:
The case study tells the full story of how Sapia.ai helped Anglicare, Abano Healthcare, and Berry Street transform their hiring processes by scaling up, reducing burnout, and hiring with heart.