Building AI to Unlearn Bias in Recruiting

Bias, and its impact on our workforce, is an acute problem in our society. For example, a study by the Ascend Foundation showed that white men and women in major Silicon Valley firms were 154 percent more likely to become executives than their peers of other races. We also know that an unbiased society, and the diverse workforce that results from it, can generate significant financial and cultural upsides. For example, a McKinsey study on diversity concluded that companies in the top quartile of diversity performance were 35 percent more likely to post financial returns above the industry median.

Our society has made significant progress in uncovering and unlearning the causes of bias. However, advances in artificial intelligence could put that progress at risk. An AI-led future is coming, and it is our collective responsibility to get it right.

What’s causing bias today?

In our current recruiting processes, most decisions depend on and are driven by humans. A person’s judgments can often be based on a small number of anecdotal data points (e.g., if three similar applicants perform well in a sales executive interview, we might subconsciously start screening for traits common to them). These judgment calls also vary with the decision-maker’s cultural, social and educational background, leading to low consistency across individuals. Furthermore, we aren’t always aware of all the filters we apply when making judgments. This is commonly referred to as unconscious bias.

With increases in automation over time, humans get involved at fewer points in the decision-making process. While this reduces inconsistency and task-completion errors, the actual risk of bias might not fall much, because most of the automated tasks were quite objective to begin with (e.g., automatically changing statuses in the applicant tracking system, or ATS). As long as humans make subjective decisions without checks and balances, there will be bias in the system.

Can AI reduce bias?

The new era of AI-led automation is replacing humans’ subjective decision-making and is evaluating large amounts of data that were not methodically factored in before. The advantage of using machines is that judgments are based on holistic correlations across a statistically significant sample rather than a handful of anecdotes, leading to high consistency and potentially better outcomes.

However, one needs to be aware of the following gotchas:

  • Without explicit external guidance, machines will take every factor at face value. This means that while they aren’t biased mathematically, they could be using factors that we deem socially inappropriate.
  • The current implementation of AI in the form of machine learning often feels like a black box. It is difficult to dissect these systems and understand how decisions are made within them. The internal guts and flows don’t always translate to real-world notions.
  • Most of these systems are GIGO, or garbage in, garbage out. If incorrect or insufficient data is fed in, the models can be wrongly trained or over-trained, and thus not much good at making sound judgments.

Training a system to reproduce the outcomes that humans would have reached doesn’t take out bias, because it presumes that the original decisions had no bias in them.

Let’s look at examples of recruiting AI technology sold today for subjective decision-making.

    • Personality/skill-based assessment: There are a ton of assessment systems that profile your current high- and low-performing employees to determine the right qualities of a new hire. While the smarter versions of these systems don’t take a subject’s personal information into account, how does one make sure that the qualities they do take into account aren’t closely correlated with factors we would consider inappropriate? How do we know that company performance would actually improve if we hired more of the same type of “high-performing” people instead of the current mix?
    • Automated résumé screening for job matching: Because résumés don’t have a standard template, they are often criticized for representing candidate skills and qualities insufficiently and inconsistently. There are also known differences in how people of different genders describe their skills in résumés. Yet résumé parsing technology often does not account for these factors when scoring candidates.
    • Video assessment: There are video interviewing tools out there that automatically screen for certain qualities, like the number of times a candidate says “please” or the number of times they smile. They also give hiring managers the ability to view the videos before inviting someone for the interview. How do we make sure that this doesn’t lead to unconscious screening for gender or race?

How can we implement AI right to solve the bias perpetuation challenge?

Machine learning systems have algorithms that are sensitive to input and training data. Thus, it’s foundational to ensure that the input data is indeed appropriate and devoid of bias-causing factors, e.g., personally identifiable information and equal employment opportunity data like gender, race, name, military status, etc.
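As a minimal sketch of what removing these factors could look like in practice, the snippet below drops the fields named above from a candidate record before it reaches a model. The field names and the `strip_protected` helper are assumptions made for illustration, not part of any particular ATS or vendor API.

```python
# Hypothetical field names taken from the examples in the text;
# a real ATS export will use its own schema.
PROTECTED_FIELDS = {"name", "gender", "race", "military_status"}


def strip_protected(record: dict) -> dict:
    """Return a copy of the record without protected or identifying fields."""
    return {
        key: value
        for key, value in record.items()
        if key.lower() not in PROTECTED_FIELDS
    }


# Made-up candidate record for illustration only.
candidate = {
    "name": "Jane Doe",
    "gender": "F",
    "race": "Asian",
    "military_status": "veteran",
    "years_experience": 6,
    "skills": ["negotiation", "crm"],
}

model_input = strip_protected(candidate)
print(model_input)
```

Dropping columns is only a first step: the remaining fields can still act as proxies for the removed ones, which is why the checks that follow matter.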

Consider the following before buying and deploying a system:

      • Ensure the system can share how the input parameters are related to outcomes (e.g., correlation strengths).
      • Keep a control data set that is separate from the training set. When an algorithm goes through changes or self-improvement, the system must provide visibility into how the outcomes are changing for the control set and whether the change is a net improvement.
      • Ensure that the system doesn’t produce outcomes based on any and all correlations. Understand patterns of which correlations matter and whether there is a logical explanation (causality). Furthermore, inspect where the human outcome is different from the machine and learn from the differences. In essence, separate causality from correlations.
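To make "correlation strengths" concrete, here is a rough sketch of the kind of report a buyer might ask for. It computes the Pearson correlation of each input with the outcome and with an attribute held back purely for auditing. The data, the column names and the 0.8 review threshold are all made up for illustration, and a strong correlation in such a report is a prompt for human review, not proof of causality or of bias.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def correlation_report(rows, features, outcome, audit_attr):
    """Correlate each feature with the outcome and with an audit-only attribute."""
    outcome_vals = [row[outcome] for row in rows]
    audit_vals = [row[audit_attr] for row in rows]
    report = {}
    for feature in features:
        vals = [row[feature] for row in rows]
        report[feature] = {
            "vs_outcome": round(pearson(vals, outcome_vals), 2),
            "vs_audit_attr": round(pearson(vals, audit_vals), 2),
        }
    return report


# Toy data. "group" is kept only for auditing and is never a model input.
rows = [
    {"years_experience": 2, "gap_months": 0, "advanced": 0, "group": 0},
    {"years_experience": 4, "gap_months": 0, "advanced": 1, "group": 0},
    {"years_experience": 6, "gap_months": 0, "advanced": 1, "group": 0},
    {"years_experience": 3, "gap_months": 12, "advanced": 0, "group": 1},
    {"years_experience": 5, "gap_months": 12, "advanced": 0, "group": 1},
    {"years_experience": 7, "gap_months": 12, "advanced": 1, "group": 1},
]

PROXY_THRESHOLD = 0.8  # hypothetical cut-off for "needs a human look"

report = correlation_report(
    rows, ["years_experience", "gap_months"], "advanced", "group"
)
flagged = [
    feature
    for feature, corr in report.items()
    if abs(corr["vs_audit_attr"]) >= PROXY_THRESHOLD
]
print(report)
print("Review as possible proxies:", flagged)
```

In this toy data, `gap_months` tracks the audit attribute exactly, so it is flagged even though the attribute itself was never fed to the model.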


Show that the system actively discourages conscious bias and only uses factors that matter.

      • Once the system is configured for deployment, create a test data set that has data points representing traditional bias.
      • Test the algorithm against this set and observe where the outcome differs from what was expected.
      • Compare the system’s outcomes on the control set with the outcomes human recruiters reach when they work through the same cases independently. This can train the system better and also help human recruiters uncover their own biases!
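One way to build such a test set, sketched here under stated assumptions, is to pair each baseline candidate with a variant that differs only in a detail associated with traditional bias, then check that the score does not move. The `toy_score` function is a hypothetical stand-in for whatever scoring call a real system exposes, and it contains a deliberately questionable rule so the test has something to catch. The choice of a career gap as a bias-related detail is likewise an illustrative assumption.

```python
def find_bias_failures(score, cases, tolerance=0.0):
    """Return labels of paired cases whose scores differ by more than tolerance."""
    failures = []
    for label, baseline, variant in cases:
        if abs(score(baseline) - score(variant)) > tolerance:
            failures.append(label)
    return failures


def toy_score(candidate):
    """Hypothetical stand-in for a vendor's scoring function."""
    score = 10 * candidate["years_experience"]
    if candidate["employment_gap_months"] > 6:
        score -= 15  # questionable rule, included so the test catches it
    return score


base = {"years_experience": 5, "employment_gap_months": 0, "first_name": "Alex"}

# Each case: (label, baseline record, variant that should score the same).
cases = [
    ("name swap", base, {**base, "first_name": "Maria"}),
    ("career gap", base, {**base, "employment_gap_months": 12}),
]

failures = find_bias_failures(toy_score, cases)
print("Pairs with unexpected score differences:", failures)
```

Each failing pair is a concrete case to put in front of the vendor and of human recruiters, who can then say whether the difference has a defensible explanation.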


Leave room for the unknown and establish processes for external input.

      • Set up a process through which people can submit complaints.
      • Ultimately, bias elimination will take time; it’s an evolution that our society is advancing to uncover and unlearn. A system, just like a human, doesn’t know what it doesn’t know, so there should always be room for feedback that can go in as training data.
      • The test set should grow with unique, complicated cases that the system couldn’t handle. Every time a data point appears that the algorithm didn’t cover well, we should add it to the control set so that future versions are checked against it.
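The feedback loop above can be sketched roughly as follows: a reviewed case joins the control set, and every new version of the algorithm is compared with the previous one on that set. The two toy models, the field names and the expected outcomes are invented for illustration. In a real deployment the expected outcome for each case would come from human review.

```python
def evaluate(model, control_set):
    """Fraction of control cases where the model matches the reviewed outcome."""
    hits = sum(1 for features, expected in control_set if model(features) == expected)
    return hits / len(control_set)


def regressions(old_model, new_model, control_set):
    """Cases the old version got right and the new version gets wrong."""
    return [
        features
        for features, expected in control_set
        if old_model(features) == expected and new_model(features) != expected
    ]


def model_v1(features):
    """Hypothetical earlier version of the screening rule."""
    return "advance" if features["years_experience"] >= 3 else "reject"


def model_v2(features):
    """Hypothetical updated version that also requires a certification."""
    if features["years_experience"] >= 3 and features["certified"]:
        return "advance"
    return "reject"


# Control set kept separate from any training data: (features, reviewed outcome).
control_set = [
    ({"years_experience": 6, "certified": True}, "advance"),
    ({"years_experience": 1, "certified": False}, "reject"),
]

# A complaint surfaces a case the set did not cover; after review it is added.
control_set.append(({"years_experience": 8, "certified": False}, "advance"))

print("v1 agreement:", evaluate(model_v1, control_set))
print("v2 agreement:", evaluate(model_v2, control_set))
print("v2 regressions:", regressions(model_v1, model_v2, control_set))
```

Here the update looks harmless on the original two cases and only shows up as a regression once the complaint-driven case is in the control set.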


Just as we train humans to uncover and reduce bias, AI systems must be taught with quality, untainted training data, and they must be built to provide the visibility and control needed to solve the AI bias challenge.
