By Krishna Deepak Maram | MA Student | Cornell Tech
Timnit Gebru (pictured above) recently gave a talk at the DLI seminar titled “Limits of AI”. A recurring theme of the talk was how her life experiences have shaped her work. She currently works on the Ethical AI team at Google AI. As a cofounder of Black in AI, she also promotes diversity in the field.
AI is everywhere
Artificial intelligence is already being used in automated decision-making tools. For example, face recognition is being used to automate parts of the hiring process. In one such tool, a video recording of the interviewee is fed into a machine learning model that assesses the candidate automatically. Another example is a natural language processing tool that automates the review of applicants’ resumes. AI is also being used in a wide range of other areas, such as power and water grids, banking, and the financial sector.
Despite the high impact of the decisions made by AI tools, there are no testing requirements to determine the suitability of such tools. In 2016, a ProPublica article found that software used to predict future criminals was biased against Black people. More recently, Amazon scrapped a recruiting tool that showed bias against women. The speaker points out that the frequency of such events has increased dramatically.
Why is it happening?
The speaker presents some of her recent work showing the gender and skin tone bias present in commercial facial analysis systems. The authors find that standard training datasets encode such biases, which then end up in the resulting machine learning models. For example, facial analysis datasets are overwhelmingly composed of lighter-skinned subjects (79.6% for IJB-A and 86.2% for Adience), leading to an unfair bias against darker-skinned females. The important takeaway from this work is that diversity is needed to build fair and transparent ML tools. The speaker also points to the lack of gender and racial diversity in the current tech community as a possible reason for this.
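To make the idea of measuring bias by subgroup concrete, here is a minimal Python sketch of a disaggregated evaluation: instead of reporting one overall error rate, it reports the error rate separately for each group. This is not the method or code from the speaker's work; the function name `error_rate_by_group`, the group labels, and the toy records are all made up for illustration.

```python
from collections import defaultdict


def error_rate_by_group(records):
    """Return {group: error_rate} for (group, true_label, predicted_label) records."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, predicted in records:
        totals[group] += 1
        if truth != predicted:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Hypothetical toy data, NOT real results: (skin-tone group, true label, predicted label).
toy_records = [
    ("lighter", "female", "female"),
    ("lighter", "male", "male"),
    ("lighter", "male", "male"),
    ("lighter", "female", "female"),
    ("darker", "female", "male"),  # the only misclassified record
    ("darker", "male", "male"),
]

rates = error_rate_by_group(toy_records)
overall = sum(truth != pred for _, truth, pred in toy_records) / len(toy_records)

print("overall error rate:", round(overall, 3))
for group, rate in rates.items():
    print(group, "error rate:", rate)
```

In this toy example the overall error rate looks modest, while the per-group breakdown shows that every error falls on the under-represented group. That is the kind of gap a single aggregate number can hide.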
Solution: AI Datasheets
“All is not lost”: the speaker points out that the AI community can take inspiration from other, more ‘developed’ industries. For example, in the hardware industry, each capacitor is accompanied by a datasheet describing its properties in excruciating detail. Taking inspiration from this, the speaker points to the need for datasheets for datasets. There is a similar need for standards and documentation for the machine learning models being used. For example, information about the training data and about the standard operating conditions under which a tool can be used would improve transparency. The speaker ends the talk by calling for more fairness studies to improve the accountability and transparency of machine learning systems.
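As a rough illustration of what a machine-readable datasheet might look like, the Python sketch below models one as a small data class and flags sections that are still empty. The field names, the `Datasheet` class, and the example values are assumptions chosen for this post, not a schema proposed in the talk.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class Datasheet:
    """Hypothetical datasheet for a dataset; the fields are illustrative only."""
    name: str
    motivation: str
    collection_process: str
    composition: dict = field(default_factory=dict)        # e.g. share of each subgroup
    intended_uses: list = field(default_factory=list)
    known_limitations: list = field(default_factory=list)

    def missing_fields(self):
        """Return the names of fields that are still empty."""
        return [key for key, value in asdict(self).items() if not value]


# Made-up example values, not a description of any real dataset.
sheet = Datasheet(
    name="example-faces (hypothetical)",
    motivation="Illustrative only",
    collection_process="",
    composition={"lighter-skinned": 0.8, "darker-skinned": 0.2},
)

print(sheet.missing_fields())
```

A check like `missing_fields` could be run before a dataset is released, so that gaps in the documentation are visible up front, much as a component would not ship without its datasheet.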