AI can be sexist because of internet content, study says

The momentum that artificial intelligence has gained in recent years has brought interesting, useful, and curious advances, and it has also sparked new debates.

One of these debates concerns the biases that these technologies can reflect and replicate. Those biases stem not only from conditions predefined by hand but also from the content of the web itself, including that of some social networks.

These concerns motivated an American research team to analyze two image-based algorithms. When asked to complete a photo automatically, they tend to depict men in suits and ties, while women are given bikinis or low-cut tops.

These biases originate in the content used to train the algorithms. On certain websites, and especially on social networks such as Reddit or Twitter, content that can rightly be described as sexist, offensive, or misinformative circulates unfiltered and, unfortunately, ends up normalized by the algorithms trained on it. The same dynamic occurs in AI systems that work with images.

Ryan Steed and Aylin Caliskan, of Carnegie Mellon University and George Washington University, respectively, reported that when a close-up photo of a person (face only) is given to an algorithm to complete, a man's photo is completed with a body wearing a suit 43% of the time, while a woman's photo is autocompleted with a low-cut top or bikini 53% of the time.

Two popular algorithms under the magnifying glass

Steed and Caliskan's recent research focused on two models: OpenAI's iGPT, a version of GPT-2 that works with pixels instead of words, and Google's SimCLR.

Both algorithms have been widely used in AI solutions that emerged over the past year. Besides their popularity, they share the use of unsupervised learning, which lets them do without human help when classifying images.

Image GPT

With supervised systems, algorithms of this type were trained on classifications predefined by humans. Under this model, an AI recognizes as photos of trees, for example, only those that match the examples it was initially given as instances of the concept.

The main flaw of these supervised systems is that they spread the biases of the people who help build their training databases. The most common biases have been sexist toward women and discriminatory against various minorities.

Google AI Blog: Advancing Self-Supervised and Semi-Supervised Learning with SimCLR

Different formulas, same result

Unfortunately, the analysis of the two algorithms mentioned above does not paint a more encouraging picture. In the absence of predefined guidelines, the reference point for unsupervised systems is internet content. Working from that material, the algorithm begins to form associations between words or images that usually appear together.
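How associations can emerge from things that "usually appear together" can be illustrated with a simple co-occurrence count. The sketch below is only a toy illustration, not how iGPT or SimCLR are implemented: the function name `cooccurrence_counts` and the tag lists are invented for the example, and real systems learn from raw pixels rather than from tags.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count how often each unordered pair of labels appears in the same item."""
    counts = Counter()
    for labels in documents:
        # set() drops duplicates; sorted() gives each pair one canonical order
        for pair in combinations(sorted(set(labels)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: each list stands for the tags seen in one image.
# The skew is deliberate, to mimic an unbalanced training set.
toy_images = [
    ["man", "suit", "tie"],
    ["man", "suit"],
    ["woman", "beach"],
    ["man", "tie"],
]

counts = cooccurrence_counts(toy_images)
for pair, n in counts.most_common():
    print(pair, n)
```

In this toy set, "suit" is only ever seen next to "man", so any model that learns from such counts inherits that skew without anyone having labeled it explicitly.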

[2010.15052] Image Representations Learned With Unsupervised Pre-Training Contain Human-like Biases

Under the same principle, iGPT groups or separates pixels according to how often they appear together in the images used for its training. The results it produces reveal the relationships the algorithm has established. SimCLR uses a different methodology but pursues similar processes and obtains similar results.

Despite their different origins, both algorithms produced similar results. Photos of men tended to be grouped with ties and suits, while photos of women were kept further from those elements and were more closely associated with sexualized photographs.
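The idea that some kinds of images end up "closer together" than others in a model's learned representation can be made concrete with a similarity score. The sketch below is a simplified illustration and not the study's exact procedure. It assumes each image has already been converted into an embedding vector, and the two-dimensional vectors, the function names, and the group names are all invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(target, group_a, group_b):
    """Mean similarity of target to group_a minus mean similarity to group_b.

    Positive values mean the target sits closer to group_a.
    """
    mean_a = sum(cosine(target, a) for a in group_a) / len(group_a)
    mean_b = sum(cosine(target, b) for b in group_b) / len(group_b)
    return mean_a - mean_b


# Hypothetical embeddings (invented numbers, two dimensions for readability).
formal_wear = [[1.0, 0.0], [0.9, 0.1]]
other_images = [[0.0, 1.0], [0.1, 0.9]]
photo_x = [0.8, 0.2]
photo_y = [0.2, 0.8]

score_x = association(photo_x, formal_wear, other_images)
score_y = association(photo_y, formal_wear, other_images)
print(score_x > 0, score_y < 0)
```

A score above zero means the photo's embedding lies nearer the first group. A consistent gap between such scores for photos of men and photos of women is the kind of pattern that would indicate a learned bias.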

Artificial intelligence challenges in the face of sexism

Video-based candidate evaluation, facial recognition technologies, and modern surveillance systems are all under development, and all of them are built on AI algorithms.

Given that potential reach, which extends well beyond the examples cited, the findings of this research raise a warning about the direction this technology is taking.

An AI saw a cropped photo of AOC. It autocompleted her wearing a bikini. | MIT Technology Review

On this point, Aylin Caliskan told MIT Technology Review that "we must be very careful about how we use it (AI), but at the same time, now that we have these methods, we can try to use them for the social good."

The full details of this research are available in the paper.
