Is it the AI That's Racist, or is it the Humans Who Create the AI?

Racism is a poison in our society, and until recently AI was thought to be immune to it. Underlying this assumption is the notion that AI is incapable of conscious thought, so it cannot consciously discriminate. However, much like humans, AI can harbour unconscious bias. Over the last decade there have been countless examples of racial bias in AI algorithms, or of AI learning racism through machine learning. As a mixed-race individual, I want to know where AI has been racist and why.

MIT was embarrassed in July this year when it was forced to take offline an AI training data-set which, following an investigation by The Register, was found to describe people using racist, misogynistic and discriminatory language. The data-set had been used to train machine learning models to identify people and objects in images, but the descriptions of those people were often highly derogatory and contained highly offensive language. The issue here was that, due to a lack of oversight, the models were accidentally trained on discriminatory data. While this problem is easily rectified once identified, it highlights the risk posed by machine learning algorithms trained on poorly constructed data-sets, especially if the “racism” in those data-sets is more subtle, such as a machine learning algorithm that scores negative points for “non-British names” on CVs.
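To make that last point concrete, here is a minimal, entirely hypothetical sketch of how bias baked into historical hiring decisions can carry over into a model that scores CVs. The data, the “non-British name” flag and the weights are all invented for illustration; no real system or data-set is being reproduced.

```python
# Illustrative sketch (not any real system): how historical bias in training
# labels leaks into a CV-scoring model. All data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

# Features: years of experience, degree (0/1), and a flag the model should
# arguably never see: whether the applicant has a "non-British" name.
experience = rng.integers(0, 15, n)
degree = rng.integers(0, 2, n)
non_british_name = rng.integers(0, 2, n)

# Historical "hired" labels reflect human bias: equally qualified applicants
# with non-British names were hired less often.
merit = 0.3 * experience + 1.0 * degree
bias_penalty = 1.5 * non_british_name
hired = (merit - bias_penalty + rng.normal(0, 1, n)) > 2.0

X = np.column_stack([experience, degree, non_british_name])
model = LogisticRegression().fit(X, hired)

# Two identical CVs, differing only in the name flag.
cv_british = [[8, 1, 0]]
cv_non_british = [[8, 1, 1]]
print("P(hire | British name):    ", model.predict_proba(cv_british)[0, 1])
print("P(hire | non-British name):", model.predict_proba(cv_non_british)[0, 1])
```

Even though nobody wrote a “penalise foreign names” rule, the model learns one, because the historical decisions it imitates contained that bias.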

Google was forced to apologise in April after its “Vision AI”, an algorithm which labels images based on their content, was found to produce very different results depending on the skin colour of the people in the image. This is demonstrated by the image below: when a black person holds a thermometer it is labelled a “gun”, but when a white person holds the same thermometer it is labelled a “tool”. This result reinforces the racial stereotype that black people are violent, leading to concerns that the algorithm was racially biased. Yet again, a poor dataset used to train the algorithm unintentionally led to racial bias, which further affirms just how important it is that datasets are properly curated before training.

[Image: the same thermometer labelled “gun” when held by a black hand and “tool” when held by a white hand. Source: bjnagel]

On the 6th of August this year, the Home Office announced that it was scrapping the algorithm it had been using to make visa decisions. The decision followed investigations by the Joint Council for the Welfare of Immigrants and Foxglove, which concluded that the algorithm used to approve or deny visas had a racial bias, both in favour of white people and against black people. Foxglove went as far as to claim that the Home Office kept a blacklist of “suspect nationalities” which were more likely to be denied visas. Furthermore, the nationalities of those denied visas were fed back into the system in a self-reinforcing feedback loop: each denial made the algorithm more likely to deny visas to applicants of the same nationality, exacerbating the bias. If that was the case, then this is not an AI that was accidentally racist, but one deliberately designed with a nationality bias, which became a racial bias, since the blacklist and the feedback loop would have had to be coded directly into the system.
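The mechanism Foxglove describes can be illustrated with a short simulation. To be clear, this is not the Home Office's actual code; it is a toy model with invented numbers, showing only how a small initial difference, amplified by a feedback loop, snowballs into a much larger disparity.

```python
# Hypothetical simulation of the self-reinforcing feedback loop described
# above (NOT the real system): each denial raises the risk score applied to
# the next applicant of the same nationality, so a small initial difference
# keeps widening. All numbers are invented for illustration.
import random

random.seed(42)

risk = {"A": 0.20, "B": 0.30}   # "B" starts with a hypothetical suspect-list nudge
denials = {"A": 0, "B": 0}
applications = {"A": 0, "B": 0}

for _ in range(10_000):
    nationality = random.choice(["A", "B"])
    applications[nationality] += 1
    if random.random() < risk[nationality]:          # visa denied
        denials[nationality] += 1
        # Feedback: every denial makes future applicants of this
        # nationality look slightly riskier.
        risk[nationality] = min(1.0, risk[nationality] + 0.0002)

for n in ("A", "B"):
    print(n, "denial rate:", round(denials[n] / applications[n], 2),
          "final risk score:", round(risk[n], 2))
```

Nationality “B” starts only slightly worse off, but because every denial feeds back into its score, the gap between the two groups grows with every application processed.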

It isn’t the AI that’s racist; it’s the humans who trained it. They might not be racist personally, but through the presence of human biases in a dataset, or a lack of training data for certain ethnicities, the AI is taught to be racist. This has the potential to be an enormous issue if we move towards full automation in the future. Those creating the AI, and the datasets used to train it, must ensure the AI is properly tested and that any potential for racial bias is eliminated before it is used in a professional setting. Otherwise, almost every automated facet of society could become harder for ethnic minorities to navigate without us even realising.
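What might “properly tested” look like in practice? One simple starting point, sketched below, is to compare a model's outcomes across groups on a held-out test set before deployment. The group labels, the threshold and the data are illustrative assumptions, not a standard or a complete fairness audit.

```python
# A minimal sketch of a pre-deployment bias check: compare positive-outcome
# rates across groups and flag large gaps. Groups, threshold and data are
# purely illustrative.
from collections import defaultdict

def outcome_rates_by_group(predictions, groups):
    """Return the positive-outcome rate for each group."""
    totals, positives = defaultdict(int), defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred)
    return {g: round(positives[g] / totals[g], 2) for g in totals}

def flag_disparity(rates, max_gap=0.05):
    """Flag the model for review if any two groups differ by more than max_gap."""
    gap = max(rates.values()) - min(rates.values())
    return gap > max_gap, gap

# Example: hypothetical approval decisions and the group of each applicant.
predictions = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
groups = ["white"] * 6 + ["black"] * 6

rates = outcome_rates_by_group(predictions, groups)
biased, gap = flag_disparity(rates)
print(rates)                                   # {'white': 0.67, 'black': 0.17}
print("needs review:", biased, "gap:", round(gap, 2))
```

A check like this cannot prove a model is fair, but running it, and investigating any gap it flags, before deployment is far better than discovering the bias from the people it harms.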

Thumbnail Credit: phys.org