New Study Shows How Algorithms Trained on Text from the Internet Reproduce Human Biases

That AI systems reproduce, and even amplify, human traits, including prejudices and biases, is not news. But this claim has now been tested and further evidenced by a group of Princeton University researchers who, by training a popular algorithm on text from the internet, reproduced documented human prejudices.

In their study the researchers adapted a classic psychological experiment in which people were asked to rate names as pleasant or unpleasant (white-sounding versus black-sounding names), a task that reveals their prejudices. But instead of human subjects, they used a popular algorithm of the kind normally employed to parse natural human language.

In the abstract of the study, which has not been published yet, the researchers claim to demonstrate, for the first time, that widely used language-processing algorithms trained on human writing from the internet reproduce human biases, such as racist and sexist ones.

“We show that prejudices that reduce the number of interview invitations sent to people because of the racial association of their name, and that associate women with arts rather than science or mathematics, can be retrieved from standard language tools used in ordinary AI products”
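The "retrieval" the authors describe amounts to an association test run on word embeddings: measuring whether a word sits closer, in vector space, to one set of attribute words than to another. The sketch below is only an illustration of that idea, not the researchers' actual code; the tiny three-dimensional vectors are invented for the example, whereas the study works with embeddings learned from large web corpora.

```python
import numpy as np

# Hypothetical toy embeddings, invented for illustration. In practice these
# vectors would come from a model (e.g. GloVe-style) trained on web text.
embeddings = {
    "science": np.array([0.9, 0.1, 0.2]),
    "art":     np.array([0.1, 0.9, 0.3]),
    "he":      np.array([0.8, 0.2, 0.1]),
    "she":     np.array([0.2, 0.8, 0.2]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(word, attr_a, attr_b):
    """How much closer `word` sits to attribute A than to attribute B
    in embedding space. A positive score means a stronger association
    with A; a negative score, with B."""
    v = embeddings[word]
    return cosine(v, embeddings[attr_a]) - cosine(v, embeddings[attr_b])

# With these toy vectors, "science" leans toward "he" and "art" toward
# "she", mirroring the gender bias the study retrieves from real corpora.
print(association("science", "he", "she"))  # positive
print(association("art", "he", "she"))      # negative
```

Run on real embeddings, the same kind of score difference is what exposes the racial and gender associations quoted above.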

They conclude that transparency of the algorithm's design (making the code and the process of its application public), as well as ethically designed algorithms, are not sufficient when the training data (ordinary human language) is already biased:

“Bias should be the expected result whenever even an unbiased algorithm is used to derive regularities from any data; bias is the regularities discovered”.

Abstract of study

Article from Motherboard on the study