New White Paper: Data Pollution is to the big data age what smog was to the industrial age

A new white paper published by the Bonn Sustainable AI Lab describes a nascent environmental data pollution movement. The white paper is one of the results of the first year of the Lab’s Data Pollution & Power Initiative and its group of interdisciplinary researchers, led by the Lab’s Research Director Gry Hasselbalch. Read the conclusion here.

The entire white paper is available open access on the Data Pollution & Power website. The white paper and initiative will be officially launched at an event in the Autumn.

Conclusion to White Paper: A Data Pollution Movement

By Gry Hasselbalch

This white paper outlines the connections between the different actors and components of a nascent environmental data pollution movement, with ‘sustainability’ as the thread that links its elements in a shared understanding and approach. The main objective is to ensure that data pollution, from AI in particular, is included in the global sustainable development agenda.

Data pollution is an environmental problem with interrelated adverse impacts on our natural, social and personal environments. It is the unsustainable handling and distribution of data resources in a global society whose power dynamics are transformed, affected and even produced by interconnected streams of data. Data pollution reinforces and affects asymmetric power balances between actors on a local, regional and global scale. This is why we need a data pollution movement.

The data pollution movement is already taking form. In the policy and legal space, several policy initiatives have recently been negotiated and put in place to address the sustainability and ethical implications of the adoption and implementation of AI and data-based systems and technologies. Governments worldwide and intergovernmental organisations have presented AI ethics principles and recommendations – several with a special focus on the sustainability of AI and its environmental impact. Since 2017, no fewer than 60 countries worldwide have adopted artificial intelligence policies.[i] The EU in particular has taken the strongest regulatory position. A comprehensive European data protection regulatory framework was adopted in 2016 to curb threats to privacy and individual empowerment in an age of massive collection, storage and use of big data. In 2018 the EU’s AI Strategy was adopted, and in 2021 the world’s first AI law proposal was published, with a risk-based approach.[ii]

As expected, in the tech industry we have also seen the emergence of new AI and data companies with an ethical agenda, such as the Finnish privately-held AI lab Silo.AI,[iii] which builds human-centric AI solutions to support rather than replace humans in various work situations, all with the slogan ‘AI for People’. Larger, more established companies are also increasingly differentiating their business practices with an ethical stance on data. This includes consumer tech giant Apple, whose CEO, Tim Cook, for years used the argument that he ‘sells products, not user data’ to differentiate the brand from its Silicon Valley competitors. In this sphere, we are also seeing the emergence of AI ethics and sustainability claims and initiatives. Unfortunately, however, many activities in this sphere do not account for historical data pollution and the advantages it has conferred. They do not address the core structural power problems of technological dependency creation and data power centralisation. Moreover, while presenting sustainable data and AI practices in one domain, they continue data pollution practices in others, enacting little or no meaningful change.

In terms of technical data infrastructure, the ‘personal data store’, ‘trust’ and ‘stewardship’ movement has been ongoing for a while now among innovative entrepreneurs, with the aim of shifting the data power asymmetries embedded in current data infrastructures. As a result, a range of new services that by default respect people’s privacy and empower individuals with their data have been developed. The MyData global community includes organisations, SMEs, individuals and local networks working with the aim to ‘… help people and organisations to benefit from personal data in a human-centric way. To create a fair, sustainable, and prosperous digital society for all.’[iv] Many of these are challenging the privacy implications and CO2 emissions of an asymmetrical data economy that collects and stores data on central servers. They call for ‘greener data’ with a decentralized data trust model. Much is left to be explored, both in terms of the basic functioning and interoperability of data trusts and cooperatives and, last but not least, their legal framework, but the movement is growing and expanding.

Tides are changing in the sea of big data, and society is starting to understand and act on this shift. There is a sense of urgency to develop and implement an ethical and sustainable approach to data and AI, and the world’s most advanced companies and governments are positioning themselves within this movement. Nevertheless, we are still far from the kind of widespread societal awareness that will lead to real change.

This white paper is a step in that direction. It explores the powers, interests and impacts of data pollution in eight domains, which can be summarised as follows:


Data pollution is a carbon footprint. It can be addressed in the design phase of AI as a component of the competences, practices, education and technological dependencies of AI practitioners. However, the extent and impact of data pollution are increasingly complex to measure, and mitigation strategies are accordingly difficult to design and apply. We need a globally coordinated response that recognises the power players shaping the contexts in which data pollution and its impact on the natural environment can be measured and tackled.

Science & Innovation

Data pollution is part of the culture of big data science and innovation. In a big data economy, the practices of the most powerful technology companies and institutions, and accordingly of AI practitioners, are dictated by the collective imagining of big data as an unlimited resource and opportunity. Tackling data pollution in the science and innovation domain requires a counter-balancing science and technology ‘data sustainability culture’ supported in policy, innovation and education. There is also a need for environmentally sound development strategies for alternative technologies.


Data pollution is an imbalance in the information ecosystems of constitutional democracies. A democracy is founded on sensitive information balances between citizens and the State, stipulated in laws, state governance, institutional procedures and frameworks for the conduct of elected representatives and public servants. Modern democratic societies must ensure a socio-technical infrastructure that reinforces and ensures the democratic ecosystem of information distribution between citizens, States and other powerful actors.

Human Rights

Data pollution is a corrosion of the international human rights system. As big data and AI socio-technical infrastructures (BDSTIs and AISTIs) are integrated in society, individual human rights protections are increasingly challenged and weighed against, for example, the interests of nation States in control and intelligence gathering, or the interests of the data-based business models of internet platforms. Fortunately, data pollution issues that affect people’s rights are also increasingly being challenged in court via human rights legal instruments.


Data pollution is a concentration of data power in socio-technical infrastructure. Just as with air pollution, where human exposure increases with the concentration of pollutants in the air, the negative effects of data pollution increase with the concentration of data pollutants in the socio-technical infrastructure of personal, social and natural environments.


Data pollution is a bias in human decision-making with adverse consequences for individuals and society. Decision-making in domains ranging from civic participation and social networking to judicial practice is increasingly extended with Autonomous Decision-Making Systems (ADM systems). The human impact of data pollution in the various domains of human decision-making is most profoundly expressed as the reinforcement or creation of discrimination in society.

Global Opportunities

Data pollution is colonialism. It reinforces existing social hierarchies and colonial power dynamics that impact the distribution of global opportunities. The big data and AI ‘revolution’ has made the greatest difference in terms of opportunities in the economies of the Global North, while leaving the Global South behind. At the same time, the very experience and impact of data pollution are the most intense in communities and among people that have traditionally been the most exposed in local and global power dynamics.


Data pollution is a disempowering rationalisation of time. The data design and classification models of AI only take into account what is useful to the system, as established by the dominant interests within it. In this way, AI systems ultimately reduce dynamic cultures and multiple experiences. Qualitative pasts and multiple futures do not make sense in and of themselves in AI systems. The data pollution of time disempowers the experiences and voices of less powerful communities, which are reduced to mere instantaneous data to be used and acted on according to dominant interests.

In conclusion, power is currently integrated in very real digital data architectures and, as is increasingly highlighted in public debate, most often upholds the world’s most powerful actors while putting others at a disadvantage. These asymmetries of power are hidden in the narratives shaping AI governance, business strategies, and even science and innovation, resulting in very different experiences of data pollution. This is why we need the data pollution movement.

[i] “The EU and U.S. are starting to align on AI regulation” (February 1, 2022), Brookings.