We hope that the debate over the value of diverse teams is now over. There is plenty of evidence that diverse teams make better decisions and therefore deliver better business outcomes for any organisation.
This means that CHROs today are being charged with interrupting the bias in their people decisions and expected to manage bias as closely as the CFO manages the financials.
But the use of Ai tools in hiring and promotion requires careful consideration to ensure the technology does not inadvertently introduce bias or amplify any existing biases.
To help HR decision-makers navigate these decisions confidently, we invite you to consider these 8 critical questions when selecting your Ai technology.
You will find not only the key questions to ask when testing the tools but why these are critical questions to ask and how to differentiate between the answers you are given.
Another way to ask this is: what data do you use to assess someone’s fit for a role?
First up: why is this an important question to ask?
Machine-learning algorithms use statistics to find and apply patterns in data. Data can be anything that can be measured or recorded, e.g. numbers, words, images, clicks etc. If it can be digitally stored, it can be fed into a machine-learning algorithm.
The process is quite basic: find the pattern, apply the pattern.
This is why the data you use to build a predictive model, called training data, is so critical to understand.
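To make "find the pattern, apply the pattern" concrete, here is a minimal Python sketch. The records, the `school_a`/`school_b` values and the helper names are invented for illustration, and no real hiring model is this simple. It only shows that a model trained on past decisions replays whatever pattern those decisions contain.

```python
from collections import defaultdict

# Hypothetical historical decisions: (feature value, was the person hired?)
training_data = [
    ("school_a", True), ("school_a", True), ("school_a", False),
    ("school_b", False), ("school_b", False), ("school_b", True),
]

def find_pattern(rows):
    """Learn the historical hire rate for each feature value."""
    counts = defaultdict(lambda: [0, 0])  # value -> [hired, total]
    for value, hired in rows:
        counts[value][0] += int(hired)
        counts[value][1] += 1
    return {value: hired / total for value, (hired, total) in counts.items()}

def apply_pattern(pattern, value, default=0.5):
    """Score a new candidate using only the learned pattern."""
    return pattern.get(value, default)

pattern = find_pattern(training_data)
print(pattern)
print(apply_pattern(pattern, "school_b"))
```

If the historical decisions favoured one group, the learned scores favour it too, which is why the choice of training data deserves scrutiny.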
In HR, the kinds of data that could be used to build predictive models for hiring and promotion are:
If you consider the range of data that can be used as training data, not all data sources are equal. Even on the surface, you can see how some carry the risk of amplifying existing biases and of alienating your candidates.
Using data that is invisible to the candidate may impact your employer brand. And relying on behavioural data such as how quickly a candidate completes the assessment, social data or any data that is invisible to the candidate might expose you to not only brand risk but also a legal risk. Will your candidates trust an assessment that uses data that is invisible to them, scraped about them or which can’t be readily explained?
Increasingly companies are measuring the business cost from poor hiring processes that contribute to customer churn. 65% of candidates with a positive experience would be a customer again even if they were not hired and 81% will share their positive experience with family, friends and peers (Source: Talent Board).
Visibility of the data used to generate recommendations is also linked to explainability which is a common attribute now demanded by both governments and organisations in the responsible use of Ai.
Video Ai tools have been legally challenged on the basis that they fail to comply with baseline standards for AI decision-making, such as the OECD AI Principles and the Universal Guidelines for AI, or that they perpetuate societal biases and could end up penalising non-native speakers, visibly nervous interviewees or anyone else who doesn’t fit the model for look and speech.
If you are keen to attract and retain applicants through your recruitment pipeline, you may also care about how explainable and trustworthy your assessment is. When the candidate can see the data that is used about them and knows that only the data they consent to give is being used, they may be more likely to apply and complete the process. Think about how your own trust in a recruitment process could be affected by different assessment types.
1st party data is data such as the interview responses written by a candidate to answer an interview question. It is given openly, consensually and knowingly. There is full awareness about what this data is going to be used for and it’s typically data that is gathered for that reason only.
3rd party data is data that is drawn from or acquired through public sources about a candidate, such as their Twitter or other social media profile. It is data that was not created for the specific use case of interviewing for a job, but which is scraped, extracted and applied for a different purpose. It stands to reason that an Ai tool that relies on visible, 1st party data is likely to be both more accurate when applied to recruitment and to produce outcomes that the candidate and the recruiter are more likely to trust.
At PredictiveHire, we are committed to building ethical and engaging assessments. This is why we have taken the path of a text chat with no time pressure. We allow candidates to take their own time, reflect and submit answers in text format.
We strictly do not use any information other than the candidate responses to the interview questions (i.e. fairness through unawareness – algorithm knows nothing about sensitive attributes).
For example, we make no explicit use of race, age, name or location; of candidate behavioural data such as how long they take to complete, how fast they type or how many corrections they make; or of information scraped from the internet. While these signals may carry information, we do not use any such data.
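As an illustration of "fairness through unawareness", the sketch below filters an application down to an allowlist before anything reaches a model. The field names and the `to_model_input` helper are hypothetical and are not a description of PredictiveHire's actual pipeline.

```python
# Hypothetical field names; the only allowed input is the answer text.
ALLOWED_FIELDS = {"answers"}

def to_model_input(application: dict) -> dict:
    """Keep only allowlisted fields; everything else never reaches the model."""
    return {k: v for k, v in application.items() if k in ALLOWED_FIELDS}

application = {
    "name": "Sam Example",
    "age": 41,
    "location": "Melbourne",
    "seconds_to_complete": 1260,
    "typing_speed_wpm": 38,
    "answers": ["I once resolved a customer complaint by listening first."],
}

model_input = to_model_input(application)
print(model_input)
```

An allowlist is used rather than a blocklist so that any newly collected field is excluded by default.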
Another way to ask this is: can you explain how your algorithm works, and does your solution use deep learning models?
This is an interesting question especially given that we humans typically obfuscate our reasons for rejecting a candidate behind the catch-all explanation of “Susie was not a cultural fit”.
For some reason, we humans have a higher-order need and expectation to unpack how an algorithm arrived at a recommendation. Perhaps that is because there is not much to say to a phone call that tells you that you were rejected for cultural fit.
This is probably the most important aspect to consider, especially if you are the change leader in this area. It is fair to expect that if an algorithm affects someone’s life, you need to see how that algorithm works.
Transparency and explainability are fundamental ingredients of trust, and there is plenty of research to show that high trust relationships create the most productive relationships and cultures.
This is also one substantial benefit of using AI at the top of the funnel to screen candidates. Subject to what kind of Ai you use, it enables you to explain why a candidate was screened in or out.
This means recruitment decisions become consistent and fairer with AI screening tools.
But if it is not clear why an Ai solution uses certain inputs (called “features” in machine learning jargon) and how they contribute to the outcome, explainability becomes impossible.
For example, when deep learning models are used, you are sacrificing explainability for accuracy, because it becomes very difficult to explain how a particular data feature contributed to the recommendation. This can further erode candidate trust and impact your brand.
The most important thing is that you know what data is being used and then ultimately, it’s your choice as to whether you feel comfortable to explain the algorithm’s recommendations to both your people and the candidate.
Assessment should be underpinned by validated scientific methods and like all science, the proof is in the research that underpins that methodology.
This raises another question for anyone looking to rely on AI tools for human decision making: where is the published and peer-reviewed research that gives you confidence that a) it works and b) it’s fair?
This is an important question given the novelty of AI methods and the pace at which they advance.
At PredictiveHire, we have published our research to ensure that anyone can investigate for themselves the science that underpins our AI solution.
INSERT RESEARCH
We continuously analyse the data used to train models for latent patterns that reveal insights for our customers and show us how to improve outcomes.
It’s probably self-evident why this is an important question to ask. You can’t have much confidence in the algorithm being fair for your candidates if no one is testing that regularly.
Many assessment providers report on studies they have conducted to test for bias. While this is useful, it does not guarantee that the assessment will be free of bias when it is applied to new candidate cohorts.
The notion of “data drift” discussed in machine learning highlights how changing patterns in data can cause models to behave differently than expected, especially when the new data is significantly different from the training data.
Therefore, ongoing monitoring of models is critical in identifying and mitigating risks of bias.
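One way to monitor for drift is to compare the distribution of a score or feature in new data against the training data. The sketch below uses the Population Stability Index (PSI), which is our choice for illustration and is not named in this article. The scores, the bin edges and the function names are made up.

```python
import bisect
import math

def bin_shares(values, edges, floor=1e-6):
    """Share of values in each bin; out-of-range values go to the edge bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = min(bisect.bisect_right(edges, v) - 1, len(counts) - 1)
        counts[max(i, 0)] += 1
    # A small floor avoids log(0) for empty bins.
    return [max(c / len(values), floor) for c in counts]

def psi(expected, actual, edges):
    """Population Stability Index between two samples over shared bins."""
    e, a = bin_shares(expected, edges), bin_shares(actual, edges)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

edges = [0.0, 0.25, 0.5, 0.75, 1.0]
training_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
new_scores = [0.6, 0.7, 0.7, 0.8, 0.8, 0.9, 0.9, 0.9, 1.0, 1.0]

print(psi(training_scores, training_scores, edges))  # identical samples give 0
print(psi(training_scores, new_scores, edges))       # shifted sample gives a large value
```

A commonly cited rule of thumb treats a PSI above roughly 0.25 as a significant shift worth investigating, though the threshold is a convention rather than a guarantee.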
Potential biases in data can be tested for and measured.
These include assumed biases, such as those between gender and race groups, which can be added to a suite of tests. The tests can be extended to other groups of interest where those group attributes are available, such as English as a Second Language (EASL) users.
On bias testing, look out for at least these 3 tests and ask to see the tech manual and an example bias testing report.
INSERT IMAGE
At PredictiveHire, we conduct all the above tests. We conduct statistical tests to check for significant differences between groups in feature values, model outcomes and recommendations. Tests such as t-tests, effect sizes, ANOVA, the 4/5ths rule and chi-squared are used for this. We consider this standard practice.
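The 4/5ths test mentioned above is simple enough to sketch: each group's selection rate is divided by the highest group's rate, and a ratio below 0.8 is flagged for review. The counts and group labels below are invented, and the function names are ours.

```python
def selection_rates(outcomes):
    """outcomes: {group: (recommended, total_applicants)} -> {group: rate}"""
    return {g: rec / total for g, (rec, total) in outcomes.items()}

def adverse_impact_ratios(outcomes):
    """Each group's selection rate divided by the highest group's rate."""
    rates = selection_rates(outcomes)
    best = max(rates.values())
    return {g: r / best for g, r in rates.items()}

# Made-up counts for illustration only.
outcomes = {"group_a": (50, 100), "group_b": (30, 100)}

ratios = adverse_impact_ratios(outcomes)
flagged = [g for g, r in ratios.items() if r < 0.8]
print(ratios)
print(flagged)
```

A flag here is a prompt to investigate, ideally alongside a significance test, since small samples can produce low ratios by chance.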
We go beyond the above standard proportional and distribution tests on fairness and adhere to stricter fairness considerations, especially on error rates at the model training stage. These include following guidelines set by IBM’s AI Fairness 360 Open Source Toolkit (reference: https://aif360.mybluemix.net/) and the Aequitas project at the Centre for Data Science and Public Policy at the University of Chicago.
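To show what a fairness check "on the error rates" can look like, the sketch below compares false negative rates across groups: of the candidates who were actually suitable, what share did the model screen out? This is plain Python written for illustration, not the AI Fairness 360 or Aequitas API, and the rows and group labels are made up.

```python
def false_negative_rate(records):
    """records: list of (actually_suitable, recommended) booleans."""
    recommended_flags = [rec for suitable, rec in records if suitable]
    if not recommended_flags:
        return None  # no suitable candidates, so the rate is undefined
    missed = sum(1 for rec in recommended_flags if not rec)
    return missed / len(recommended_flags)

def fnr_by_group(rows):
    """rows: list of (group, actually_suitable, recommended)."""
    groups = {}
    for group, suitable, rec in rows:
        groups.setdefault(group, []).append((suitable, rec))
    return {g: false_negative_rate(r) for g, r in groups.items()}

# Made-up evaluation rows for illustration only.
rows = [
    ("group_a", True, True), ("group_a", True, True),
    ("group_a", True, True), ("group_a", True, False),
    ("group_a", False, False),
    ("group_b", True, True), ("group_b", True, True),
    ("group_b", True, False), ("group_b", True, False),
    ("group_b", False, True),
]

rates = fnr_by_group(rows)
print(rates)
```

A large gap between groups would mean that suitable candidates from one group are being missed more often, even if overall selection rates look balanced.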
We all know that, despite best intentions, we cannot be trained out of our biases, especially the unconscious ones.
This is another reason why using data-driven methods to screen candidates is fairer than using humans.
Biases can occur in many different forms. Algorithms and Ai learn according to the profile of the data we feed them. If the data they learn from is taken from a CV, it’s only going to amplify our existing biases. Only clean data, like the answers to specific job-related questions, can give us a true bias-free outcome.
If any biases are discovered, the vendor should be able to investigate and highlight the cause of the bias (e.g. a feature or definition of fitness) and take corrective measures to mitigate it.
If you care about inclusivity, then you want every candidate to have an equal and fair opportunity at participating in the recruitment process.
This means taking account of minority groups such as those with autism, dyslexia and English as a second language (EASL), as well as the obvious need to ensure the approach is inclusive for different ethnic groups, ages and genders.
At PredictiveHire, we test the algorithms for bias on gender and race. Tests can be conducted for almost any group in which the customer is interested. For example, we run tests on “English As a Second Language” (EASL) vs. native speakers.
If one motivation for introducing Ai tools to your recruitment process is to deliver more diverse hiring outcomes, it’s natural to expect the provider to have demonstrated this kind of impact for its customers.
If you don’t measure it, you probably won’t improve it. At PredictiveHire, we provide you with tools to measure equality. Multiple dimensions are measured through the pipeline, from those who applied, to those who were recommended, to those who were ultimately hired.
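A minimal sketch of measuring representation through the pipeline follows. It computes each group's share among those who applied, were recommended and were hired. The stage names follow the text above; the candidate records, group labels and function name are invented for illustration.

```python
from collections import Counter

STAGES = ["applied", "recommended", "hired"]

def share_by_stage(candidates):
    """candidates: list of (group, furthest_stage) -> {stage: {group: share}}"""
    result = {}
    for i, stage in enumerate(STAGES):
        # A candidate counts towards a stage if they reached it or went further.
        reached = [g for g, furthest in candidates if STAGES.index(furthest) >= i]
        counts = Counter(reached)
        result[stage] = {g: c / len(reached) for g, c in counts.items()}
    return result

# Made-up candidates for illustration only.
candidates = (
    [("female", "applied")] * 3 + [("female", "recommended")] * 2
    + [("female", "hired")] * 1
    + [("male", "applied")] * 1 + [("male", "recommended")] * 1
    + [("male", "hired")] * 2
)

shares = share_by_stage(candidates)
for stage in STAGES:
    print(stage, shares[stage])
```

Reading the shares stage by stage shows where representation changes, which is the starting point for asking why.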
8. What is the composition of the team building this technology?
Thankfully, HR decision-makers are much more aware of how human bias can creep into technology design. Think of how the dominance of one trait among the human designers and builders has created an inadvertently unfair outcome.
In 2012, YouTube noticed something odd.
About 10% of the videos being uploaded were upside down.
When designers investigated the problem, they found something unexpected: left-handed people picked up their phones differently, rotating them 180 degrees, which led to upside-down videos being uploaded.
The issue here was a lack of diversity in the design process. The engineers and designers who created the YouTube app were all right-handed, and none had considered that some people might pick up their phones differently.
In our team at PredictiveHire, from the top down, we look for diversity in its broadest definition.
Gender, race, age, education, immigrant vs native-born, personality traits, work experience. It all adds up to ensure that we minimise our collective blind spots and create a candidate and user experience that works for the greatest number of people and minimises bias.
What other questions have you used to validate the fairness and integrity of the Ai tools you have selected to augment your hiring and promotion processes?
We’d love to know!
A new study has just confirmed what many in HR have long suspected: traditional psychometric tests are no longer the gold standard for hiring.
Published in Frontiers in Psychology, the research compared AI-powered, chat-based interviews to traditional assessments, finding that structured, conversational AI interviews significantly reduce social desirability bias, deliver a better candidate experience, and offer a fairer path to talent discovery.
We’ve always believed hiring should be about understanding people and their potential, rather than reducing them to static scores. This latest research validates that approach, signalling to employers what modern, fair and inclusive hiring should look like.
While used for many decades in the absence of a more candidate-first approach, psychometric testing has some fatal flaws.
For starters, these tests rely heavily on self-reporting. Candidates are expected to assess their own traits. Could you truly and honestly rate how conscientious you are, how well you manage stress, or how likely you are to follow rules? Human beings are nuanced, and in high-stakes situations like job applications, most people are answering to impress, which can lead to less-than-honest self-evaluations.
This is known as social desirability bias: a tendency to respond in ways that are perceived as more favourable or acceptable, even if they don’t reflect reality. In other words, traditional assessments often capture a version of the candidate that’s curated for the test, not the person who will show up to work.
Worse still, these assessments can feel cold, transactional, even intimidating. They do little to surface communication skills, adaptability, or real-world problem solving, the things that make someone great at a job. And for many candidates, especially those from underrepresented backgrounds, the format itself can feel exclusionary.
Enter conversational AI.
Organisations have been using chat-based interviews to assess talent since before 2018, and they offer a distinctly different approach.
Rather than asking candidates to rate themselves on abstract traits, they invite them into a structured, open-ended conversation. This creates space for candidates to share stories, explain their thinking, and demonstrate how they communicate and solve problems.
The format reduces stress and pressure because it feels more like messaging than testing. Candidates can be more authentic, and their responses have been proven to reveal personality traits, values, and competencies in a context that mirrors honest workplace communication.
Importantly, every candidate receives the same questions, evaluated against the same objective, explainable framework. These interviews are structured by design, evaluated by AI models like Sapia.ai’s InterviewBERT, and built on deep language analysis. That means better data, richer insights, and a process that works at scale without compromising fairness.
The new study, published in Frontiers in Psychology, put AI-powered, chat-based interviews head-to-head with traditional psychometric assessments, and the results were striking.
One of the most significant takeaways was that candidates are less likely to “fake good” in chat interviews. The study found that AI-led conversations reduce social desirability bias, giving a more honest, unfiltered view of how people think and express themselves. That’s because, unlike multiple-choice questionnaires, chat-based assessments don’t offer obvious “right” answers – it’s on the candidate to express themselves authentically and not guess the answer they think they would be rewarded for.
The research also confirmed what our candidate feedback has shown for years: people actually enjoy this kind of assessment. Participants rated the chat interviews as more engaging, less stressful, and more respectful of their individuality. In a hiring landscape where candidate experience is make-or-break, this matters.
And while traditional psychometric tests still show higher predictive validity in isolated lab conditions, the researchers were clear: real-world hiring decisions can’t be reduced to prediction alone. Fairness, transparency, and experience matter just as much, often more, when building trust and attracting top talent.
Sapia.ai was spotlighted in the study as a leader in this space, with our InterviewBERT model recognised for its ability to interpret candidate responses in a way that’s explainable, responsible, and grounded in science.
Today, hiring has to be about earning trust and empowering candidates to show up as their full selves, and having a voice in the process.
Traditional assessments often strip candidates of agency. They’re asked to conform, perform, and second-guess what the “right” answer might be. Chat-based interviews flip that dynamic. By inviting candidates into an open conversation, they offer something rare in hiring: autonomy. Candidates can tell their story, explain their thinking, and share how they approach real-world challenges, all in their own words.
This signals respect from the employer. It says: We trust you to show us who you are.
Hiring should be a two-way street – a long-held belief we’ve had, now backed by peer-reviewed science. The new research confirms that AI-led interviews can reduce bias, enhance fairness, and give candidates control over how they’re seen and evaluated.
It’s time for a new way to map progress in AI adoption, and pilots are not it.
Over the past year, I’ve been lucky enough to see inside dozens of enterprise AI programs as a CEO, founder and, recently, judge in the inaugural Australian Financial Review AI Awards.
And here’s what struck me:
Despite the hype, we still don’t have a shared language for AI maturity in business.
Some companies are racing ahead. Others are still building slide decks. But the real issue is that even the orgs that are “doing AI” often don’t know what good looks like.
The most successful AI adoption strategy does not have you buying the hottest Gen AI tool or spinning up a chatbot to solve one use case. What it should do is build organisational capability in AI ethics, AI governance, data, design, and most of all, leadership.
It’s time we introduced a real AI Maturity Model. Not a checklist. A considered progression model. Something that recognises where your organisation is today and what needs to evolve next, safely, responsibly, and strategically.
Here’s an early sketch based on what I’ve seen:
AI is a capability. And like any capability, it needs time, structure, investment, and a map.
If you’re an HR leader, CIO, or enterprise buyer, and you’re trying to separate the real from the theatre, maturity thinking is your edge.
Let’s stop asking, “Who’s using AI?”
And start asking: “How mature is our AI practice and what’s the next step?”
I’m working on a more complete model now, based on what I’ve seen in Australia, the UK, and across our customer base. If you’re thinking about this too, I’d love to hear from you.
For too long, AI in hiring has been a black box. It promises speed, fairness, and efficiency, but rarely shows its work.
That era is ending.
“AI hiring should never feel like a mystery. Transparency builds trust, and trust drives adoption.”
At Sapia.ai, we’ve always worked to provide transparency to our customers. Whether with explainable scores, understandable AI models, or by sharing ROI data regularly, it’s a founding principle on which we build all of our products.
Now, with Discover Insights, transparency is embedded into our user experience. And it’s giving TA leaders the clarity to lead with confidence.
Transparency Is the New Talent Advantage
Candidates expect fairness. Executives demand ROI. Boards want compliance. Transparency delivers all three.
Even visionary Talent Leaders can find it difficult to move beyond managing processes to driving strategy without the right data. Discover Insights changes that.
“When talent leaders can see what’s working (and why) they can stop defending their strategy and start owning it.”
What it is: The median time between application and hire.
Why it matters: This is your speedometer. A sharp view of how long hiring takes and how that varies by cohort, role, or team helps you identify delays and prove efficiency gains to leadership.
Faster time to hire = faster access to revenue-driving talent.
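As a rough illustration of this metric, the sketch below computes the median number of days between application and hire, overall and by cohort. The records and the `role` cohort field are made up, and this is not how Discover Insights is implemented.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Made-up hires for illustration; "role" stands in for any cohort field.
hires = [
    {"role": "retail", "applied": date(2024, 3, 1), "hired": date(2024, 3, 15)},
    {"role": "retail", "applied": date(2024, 3, 2), "hired": date(2024, 3, 12)},
    {"role": "retail", "applied": date(2024, 3, 5), "hired": date(2024, 3, 25)},
    {"role": "contact_centre", "applied": date(2024, 3, 3), "hired": date(2024, 3, 10)},
    {"role": "contact_centre", "applied": date(2024, 3, 4), "hired": date(2024, 3, 13)},
]

def days_to_hire(row):
    """Whole days between application and hire."""
    return (row["hired"] - row["applied"]).days

def median_by_cohort(rows, key):
    """Median days to hire for each value of the cohort field."""
    cohorts = defaultdict(list)
    for row in rows:
        cohorts[row[key]].append(days_to_hire(row))
    return {cohort: median(days) for cohort, days in cohorts.items()}

overall = median([days_to_hire(r) for r in hires])
by_role = median_by_cohort(hires, "role")
print(overall)
print(by_role)
```

The median is used rather than the mean so that a handful of unusually slow hires do not distort the picture.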
What it is: Satisfaction scores, brand advocacy measures, and unfiltered candidate comments.
Why it matters: Many platforms track satisfaction. Sapia.ai’s Discover Insights takes it further, measuring whether that satisfaction translates into employer and consumer brand advocacy.
And with verbatim feedback collected at scale, talent leaders don’t have to guess how candidates feel. They can read it, learn from it, and take action.
You don’t just measure experience. You understand it in the candidates’ own words.
What it is: The percentage of candidates who exit the hiring process at different stages, and how to spot why.
Why it matters: Understanding drop-off points lets teams fix friction quickly. Embedding automation early in the funnel reduces recruiter workload and elevates top candidates, getting them talking to your hiring teams faster.
Assessment completion benchmarks in volume hiring range from 60% to 80%, but with a mobile-first, chat-based format like Sapia.ai’s, clients often exceed that.
Optimising your funnel isn’t about doing more. It’s about doing smarter, with less effort and better outcomes.
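Drop-off can be computed from stage counts alone. In the sketch below, the stage names and counts are invented, and the rate for a stage is the share of candidates who entered it but did not reach the next one.

```python
def drop_off_rates(funnel):
    """funnel: ordered list of (stage, candidates_entering) -> {stage: share lost}"""
    rates = {}
    for (stage, entered), (_, next_entered) in zip(funnel, funnel[1:]):
        rates[stage] = (entered - next_entered) / entered if entered else None
    return rates

# Made-up funnel counts for illustration only.
funnel = [
    ("applied", 1000),
    ("assessment_completed", 800),
    ("interviewed", 200),
    ("offered", 50),
]

rates = drop_off_rates(funnel)
for stage, rate in rates.items():
    print(f"{stage}: {rate:.0%} did not reach the next stage")
```

Note that this lumps together candidates who withdrew and candidates who were screened out; separating the two needs an exit reason on each record.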
What it is: The percentage of completed applications that result in a hire.
Why it matters: This is your funnel efficiency score. A high yield means your sourcing, screening, and selection are aligned. A low one? There’s leakage, misfit, or missed opportunity.
Hiring yield signals funnel health, recruiter performance, and candidate-process fit.
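Hiring yield as defined above is hires divided by completed applications. A short sketch with made-up counts per role:

```python
def hiring_yield(completed_applications, hires):
    """Share of completed applications that resulted in a hire."""
    if completed_applications == 0:
        return None  # undefined when nobody completed an application
    return hires / completed_applications

# Made-up counts: role -> (completed applications, hires)
by_role = {"retail": (400, 20), "contact_centre": (250, 5)}

yields = {role: hiring_yield(done, hired) for role, (done, hired) in by_role.items()}
for role, value in yields.items():
    print(f"{role}: {value:.1%}")
```

Comparing yield across roles or teams is usually more informative than a single overall figure.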
What it is: Insights into how candidate scores are distributed, and whether responses appear copied or AI-generated.
Why it matters: In high-volume hiring, a normal distribution of scores suggests your assessment is calibrated fairly. If it’s skewed too far left or right, it could be too hard or too easy, and that affects trust.
Add in answer originality, and you can track engagement integrity, protecting both your process and your brand.
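One simple way to check whether a score distribution leans left or right is its skewness. The sketch below uses a plain population skewness formula with made-up scores. It is an illustrative check, not a description of how Discover Insights computes its view.

```python
from statistics import mean, pstdev

def skewness(scores):
    """Population skewness: the mean of cubed standardised deviations."""
    m, s = mean(scores), pstdev(scores)
    if s == 0:
        return 0.0  # all scores identical, so there is no asymmetry
    return mean([((x - m) / s) ** 3 for x in scores])

# Made-up scores for illustration only.
balanced = [1, 2, 3, 4, 5]
bunched_low = [1, 1, 1, 2, 2, 3, 9]

print(skewness(balanced))
print(skewness(bunched_low))
```

A value near zero suggests a roughly symmetric spread. A clearly positive value means most scores bunch at the low end with a long tail of high scores, and a clearly negative value means the reverse.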
To lead effectively, you need more than tracking; you need insights that enable action.
When you can see how AI impacts every part of your hiring, from recruiter productivity to candidate sentiment to untapped talent, you lead with insight, not assumption. And that’s how TA earns a seat at the strategy table.