An AI saw a cropped photo of AOC. It autocompleted her wearing a bikini.
In their study, Steed and Caliskan once again found that those distances mirror the results of the IAT. Photos of men appear close to photos of ties and suits, while photos of women appear farther away from them. The researchers got the same results with SimCLR, even though it uses a different method for deriving embeddings from images.
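To make the idea of "distance" between embeddings concrete, here is a minimal Python sketch of an association score in the spirit of embedding association tests. It is an illustration under stated assumptions, not the researchers' code: the two-dimensional vectors are invented toy values, and the function names (cosine, association, effect_size) are hypothetical. Real image embeddings from iGPT or SimCLR have far more dimensions.

```python
import math
from statistics import mean, stdev

def cosine(u, v):
    """Cosine similarity: near 1 for vectors pointing the same way, 0 for orthogonal ones."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def association(w, attrs_a, attrs_b):
    """How much closer embedding w sits to attribute set A than to attribute set B."""
    return mean(cosine(w, a) for a in attrs_a) - mean(cosine(w, b) for b in attrs_b)

def effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """Standardized difference in association between two target groups."""
    sx = [association(x, attrs_a, attrs_b) for x in targets_x]
    sy = [association(y, attrs_a, attrs_b) for y in targets_y]
    return (mean(sx) - mean(sy)) / stdev(sx + sy)

# Toy embeddings, invented for illustration only (not data from the study).
men = [(1.0, 0.1), (0.9, 0.2)]
women = [(0.1, 1.0), (0.2, 0.9)]
suits = [(1.0, 0.0), (0.8, 0.1)]
other = [(0.0, 1.0), (0.1, 0.8)]

# A positive score means the "men" vectors sit closer to "suits" than the "women" vectors do.
score = effect_size(men, women, suits, other)
print(f"effect size: {score:.2f}")
```

A score near zero would mean the two groups are about equally close to both attribute sets. A large positive or negative score indicates the kind of lopsided association the study describes.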
These results have concerning implications for image generation. Other image-generation algorithms, like generative adversarial networks, have led to an explosion of deepfake pornography that almost exclusively targets women. iGPT in particular adds yet another way for people to generate sexualized photos of women.
But the potential downstream effects are much bigger. In the field of NLP, unsupervised models have become the backbone for all kinds of applications. Researchers begin with an existing unsupervised model like BERT or GPT-2 and use tailored datasets to “fine-tune” it for a specific purpose. This semi-supervised approach, a combination of both unsupervised and supervised learning, has become a de facto standard.
Likewise, the computer vision field is beginning to see the same trend. Steed and Caliskan worry about what these baked-in biases could mean when the algorithms are used for sensitive applications such as policing or hiring, where models are already analyzing video recordings of candidates to decide whether they’re a good fit for the job. “These are very dangerous applications that make consequential decisions,” says Caliskan.
Deborah Raji, a Mozilla fellow who co-authored an influential study revealing the biases in facial recognition, says the new study should serve as a wake-up call to the computer vision field. “For a long time, a lot of the critique on bias was about the way we label our images,” she says. Now this paper is saying “the actual composition of the dataset is resulting in these biases. We need accountability on how we curate these datasets and collect this information.”
Steed and Caliskan urge the companies developing these models to be more transparent, to open-source the models, and to let the academic community continue its investigations. They also encourage fellow researchers to do more testing before deploying a vision model, such as by using the methods they developed for this paper. And finally, they hope the field will develop more responsible ways of compiling and documenting what’s included in training datasets.
Caliskan says the goal is ultimately to gain greater awareness and control when applying computer vision. “We need to be very careful about how we use them,” she says, “but at the same time, now that we have these methods, we can try to use this for social good.”