By Yezhou Ma and Zhihao Liu | MS Students, Cornell Tech
Based on a Digital Life Seminar talk given by Dr. Benjamin Fish, Microsoft Research Montreal, at Cornell Tech on October 31, 2019.
Background
Artificial Intelligence is increasingly employed in a variety of industries and domains, one of which is recruitment. Automated recruitment based on Machine Learning has become a reality as advanced algorithms and models are rolled out. ML-based recruitment is a data-driven process that saves the tremendous amount of time otherwise spent manually screening resumes and is meant to keep recruiters' personal biases out of candidate assessment. Thisway Global LLC, which developed the award-winning AI platform AI4jobs, summarizes three main advantages of ML-based recruitment over the traditional hiring process as “standardized process for faster results, elimination of exorbitant costs, and data-driven results with increased matching accuracy”.[1] These advantages help explain why the new recruitment process is thriving.
However, the magic behind the ML-based recruitment process, which is driven by data and refers to previous cases, presents issues of its own: it inherits the bias in previous judgements and social discrimination based on race, gender, age, etc. In some less prudent ML models, the bias is even exacerbated, and candidates belonging to minority groups could simply be filtered out in the first phase. ML recruitment is therefore controversial, and the main concern is fairness, which has often been overshadowed by efficiency and cost.
Existing Definitions
Before we try to improve the ML model for the sake of fairness, we first need to understand what type of fairness we are supposed to attain. Previous computational definitions of fairness are framed as equalizing resources across groups (denoted as S). Statistical parity requires that the percentage of candidates with S = 0 who get the label + be the same as the percentage of candidates with S = 1 who get the label +. In recruitment, we also need to consider a feature y that indicates whether a candidate meets the minimum requirement of a qualified candidate. Statistical parity then turns into equalized odds, which means that the percentage of S = 0 getting the label + is the same as the percentage of S = 1 getting the label + among candidates whose y is true. In other words, similar people should get similar results.
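To make the two conditions above concrete, here is a minimal sketch in Python. It is our own illustration, not code from the talk: the function names (`positive_rate`, `statistical_parity_gap`, `qualified_gap`) and the toy data are hypothetical, and we assume S is binary, y is a boolean "meets the minimum requirement" flag, and the label + is a boolean decision.

```python
from typing import Sequence


def positive_rate(decisions: Sequence[bool]) -> float:
    """Fraction of candidates who received the label +."""
    return sum(decisions) / len(decisions) if decisions else 0.0


def statistical_parity_gap(s: Sequence[int], decision: Sequence[bool]) -> float:
    """|P(+ | S=0) - P(+ | S=1)|; a gap of zero means statistical parity."""
    group0 = [d for g, d in zip(s, decision) if g == 0]
    group1 = [d for g, d in zip(s, decision) if g == 1]
    return abs(positive_rate(group0) - positive_rate(group1))


def qualified_gap(s: Sequence[int], y: Sequence[bool],
                  decision: Sequence[bool]) -> float:
    """The same gap, restricted to candidates whose y is true."""
    s_qualified = [g for g, q in zip(s, y) if q]
    d_qualified = [d for d, q in zip(decision, y) if q]
    return statistical_parity_gap(s_qualified, d_qualified)


# Hypothetical toy data: eight candidates, four in each group.
s = [0, 0, 0, 0, 1, 1, 1, 1]
y = [True, True, False, False, True, True, True, False]
decision = [True, True, False, False, True, True, False, False]

parity_gap = statistical_parity_gap(s, decision)
conditional_gap = qualified_gap(s, y, decision)
print(parity_gap, conditional_gap)
```

In this toy data both groups receive the label + at the same overall rate, so the statistical parity gap is zero, yet every qualified candidate in group 0 is selected while only two of the three qualified candidates in group 1 are. The second gap is therefore about 0.33, which shows that the two conditions are not interchangeable.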
Two other existing definitions of fairness are Equality of Outcome (EOO) and Equality of Opportunity (EOP). EOO gives more resources to those who actually need more. In the hiring market, it favors people with special needs and enables them to succeed despite their disadvantages in the competition. However, EOO removes people's incentive to work hard. EOP, by contrast, provides the same amount of resources to everyone regardless of their circumstances. It lets everyone start fresh and play with what they have, kind of like playing a game. However, the difference between reality and a game is that people in this society are intrinsically different, and some people do need extra support as far as fairness is concerned. In the hiring market, EOP puts people with special needs at a disadvantage and makes it very difficult for them to compete with others.
Equality is not about prohibiting an unbalanced distribution of resources, but rather about incentivizing people not to let it happen in the first place. To resolve fairness issues in ML-based automated recruiting, our guest speaker Dr. Fish has proposed a formalized model that aims to realize, or at least approach, the goal of equality described by philosopher Elizabeth Anderson: "the proper negative aim of egalitarian justice is not to eliminate the impact of brute luck from human affairs, but to end oppression, which by definition is socially imposed. Its proper positive aim is not to ensure that everyone gets what they morally deserve, but to create a community in which people stand in relations of equality to others."[2]
Model
Consider a game in which the players are several firms and several candidates. Every player is rational and plays the best strategy based on their beliefs. Every player's action is public to the other players. Players' beliefs are based on both existing information and the outcomes of previous rounds. Each firm has two types of offers, a low offer and a high offer. As a first intuition about the game, a firm only gives the low offer if it believes the candidate has no other option.
This model formalizes the concept of social standing. Informally, social standing is a kind of social differentiation whereby members of society are grouped into socioeconomic strata based on their occupation and income, wealth and social status, or derived power. In this model, social standing is reflected in the firms' beliefs.
The model has a discount factor that weights the utility a candidate would receive by waiting until the next round. When the discount factor is 0, the best strategy for the candidate is to always accept the current offer because there is no potential utility in the future. Apart from the candidates' background information, the most significant factor driving this game is the estimate of what will happen in the "next round". Firms make decisions based on the state of the current hiring market, e.g. how many firms and candidates are waiting for a match and what their strategies would be. Therefore, a candidate's social standing is determined both by his/her background information and by all the players' previous actions in the game.
An enhanced version of this model introduces a new concept named the public option, which is an option available immediately to every candidate if he/she does not work for any of the firms. In this case, the low offer of a firm can never be lower than the public option, otherwise no candidate would accept it. The public option provides a lower bound for the hiring market, like a guarantee. However, whether the idea of a public option is realistic under the current capitalist economy of the United States is still controversial.
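The role of the discount factor and the public option can be illustrated with a small sketch. This is our own simplification under stated assumptions, not Dr. Fish's formal model: we assume a single candidate who compares three utilities (the current offer, the discounted offer expected in the next round, and the public option) and picks the largest. The function `candidate_choice` and all numbers are hypothetical.

```python
def candidate_choice(current_offer: float,
                     expected_next_offer: float,
                     discount: float,
                     public_option: float = 0.0) -> str:
    """Return the action with the highest utility for one candidate.

    Assumptions (ours, for illustration only):
    - accepting yields the current offer,
    - waiting yields the expected next-round offer times the discount factor,
    - the public option is available immediately at a fixed value.
    Ties are broken in favor of accepting the current offer.
    """
    utilities = {
        "accept": current_offer,
        "wait": discount * expected_next_offer,
        "public option": public_option,
    }
    # max() returns the first key with the highest value, so "accept" wins ties.
    return max(utilities, key=utilities.get)


# With a discount factor of 0 the future is worth nothing, so accept now.
print(candidate_choice(50, 100, discount=0.0))

# With a high discount factor, waiting for a better offer pays off.
print(candidate_choice(50, 100, discount=0.9))

# A public option above the low offer makes the low offer unacceptable.
print(candidate_choice(50, 100, discount=0.0, public_option=60))
```

The three calls print "accept", "wait", and "public option" in that order. The last case mirrors the lower bound described above: once the public option is worth more than the firm's low offer, no candidate in this sketch would take that offer.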
Reflection
Since most of us are or have been on the candidate side of the hiring market, there are many examples we can relate to this model. For instance, in the tech industry, certain companies are known for giving offers significantly lower than the market average to candidates without competing offers. Some companies do this quietly to take advantage of candidates, who are normally the weaker side of the game, while others do it openly and deliberately. Some companies directly tell the candidate that the offer can go up as long as he/she has competing offers from other recognized companies. In this case, social standing becomes not only a private judgement by the company but also a public evaluation of the candidate, i.e. higher social standing is achieved by getting more offers. This phenomenon makes sense to a certain extent, as getting multiple offers can reflect a candidate's competitiveness. In reality, however, it has many negative influences on the market dynamic, because candidates expect their competence to be recognized through their present performance instead of being defined by historical beliefs in the market.
This model has opened our eyes to how ML systems can harm fairness in subtle ways. By focusing on a relatable social domain, the hiring market, it gives us deeper insight into the complex mathematical definitions. As Dr. Fish mentioned, achieving social fairness in ML systems is not an issue for computer scientists or social scientists alone, but requires collaboration between both areas of expertise. As a lot of research effort is put into making ML systems perform better, more attention should be paid to the social and moral aspects of these systems. Social values in computational systems should never be taken for granted.
References: