
The Majority of the Costs of Data Pollution Are Hidden from Us

This is the intervention of Aimee van Wynsberghe, Director of the Sustainable AI Lab, at the first meeting of the DPP Group of the Data Pollution & Power Initiative.

The DPP Group is set up to examine the data of AI as a human and natural resource in a sensitive “ecosystem”, and “data pollution” as the interrelated adverse effects of the (big) data of AI on the UN Sustainable Development Goals. It is part of the Sustainable AI Lab’s[1] Data Pollution & Power initiative[2]. In 2021-2022, the DPP Group will explore the “data pollution” of AI as the interrelated adverse effects of the data of AI and the power dynamics that shape the field. The group’s meetings are not open to the public. Mini reports from the meetings are made available on the DPP website.

The objective of the first DPP Group meeting was to take an initial dive into the different areas of expertise and interests in the data pollution of AI represented in the group, and to map out themes and power dynamics that are particular to the respective fields of expertise and research focus. Ultimately, the aim was to understand how and where the various perspectives on and analyses of the data pollution of AI intersect, and to scope out the power dynamics of a potential common research field.

Aimee van Wynsberghe:

Although not all AI is based on data-driven models (there are also theoretically driven models, etc.), data is today the driver of most of the AI in use and commercially available. A variety of ethical implications of AI can be mentioned, but one of the ethical issues that is under-researched and undervalued at the moment is the environmental consequences: there is a link between the “pollutant effect” of data and AI. When we train algorithms, there are CO2 emissions, electronic waste, mining of precious minerals, etc. And here we have to consider the environments of the vulnerable demographics that are suffering the consequences of this. For example, on the continent of Africa there are people who are dealing with the electronic waste in their backyards, and people who are working in the mines. In the West, we are so distant from the core problem. The environmental consequences are felt by these vulnerable demographics that do not have a seat at the table when decisions are made. This means that the sustainability of the data of AI is in fact a human rights issue, because vulnerable demographics are suffering the most. It is not equal. Data pollution and AI pollution are about the deterioration of human rights on an intragenerational as well as intergenerational scale.

Under this umbrella of data and AI as a deterioration of human rights, we can think of data ethics. There is a list of ethical issues related to the data life cycle of AI: how data has been collected, how it has been acquired, how it has been sourced, how it is stored, how it has been labelled. There are both social and environmental costs related to the data. Then, when we look at the usage, there are issues related to AI bias, responsibility gaps, and safety and security, as well as the environmental consequences. A core problem here is the rush to use AI for anything. The thinking is that your company will not be successful if you do not use AI. It is a vicious cycle. We accelerate the data ethics issues, because we use the data for AI. It is “unsustainable” on top of “unsustainable”. Data + AI unsustainability.

How can we calculate the environmental impact of the data life cycle, as well as of the AI life cycle (the training, the tuning and the usage)? How do we systematize this? What are the different methodologies to do this? And how do we regulate it? In addition to regulating in terms of privacy and security, we also need to regulate in terms of the environmental effect. Maybe we need a carbon cap? Companies and scientists need to track this, and then they need to put a cap on it.

We also need to think about proportionality; it is of course not all or nothing. Nevertheless, there should be room to explore areas where the data will only be “abusable” (see Pak Hang Wong in the first meeting report) and should therefore be banned.

There are indeed power dynamics that prevent change. Big tech has the power. They are the ones that have the ability to collect data, and because of their data-driven business models governments are afraid of “stifling innovation”. Europe does not have a unicorn, which is a good thing. But the narrative is that it is not. The general public does not understand that there is an environmental impact of every click they make. For example, why do we need autonomous vehicles? Why not build a train instead? There is this idea that some good could always come out of it, and that this is why we have to do it. This is not a good enough argument when we know the severity of the consequences of these technologies.

The majority of the costs are hidden from us. They are in the backyards of vulnerable communities. They are hidden from us because we don’t live with those consequences every day. Here, I think there is also a risk of imperialism. India and Africa are trying to get into the AI space, and they look at the European Commission’s or the United Nations’ ethical guidelines for AI. And they say: You have had all this time to develop your AI and models and data, and now you are telling us that we have to do it in this ethical way. What is the solution to that? Because in this way we are accelerating digital divides all over again.

On a last note, in the scientific field we should ask journals to require that the research they publish tracks the carbon emissions that result from the AI research. Big tech and academics should also do this.

[1] The Sustainable AI Lab at Bonn University’s Institute of Science and Ethics is a new section established in 2021 by Aimee van Wynsberghe.

[2] The Data Pollution & Power initiative is set up at the Sustainable AI Lab by the independent senior researcher Gry Hasselbalch to explore the power dynamics that shape the data pollution of AI across the UN Sustainable Development Goals. The project examines how power dynamics and interests in the data of AI determine how data resources are handled and distributed in our data ecosystem and considers actions and governance approaches that are intrinsically interrelated in systems of power and interests. In addition to the establishment of the DPP Group, a DPP white paper will be published in 2022.