News. Bad data can seriously affect fundamental rights and liberties. Therefore, the Spanish-based Eticas Foundation has launched the BAD DATA Challenge, a research, mobilization and awareness project that aims to demystify data and question whether the quality and accuracy of the data that feeds algorithms are good enough.
By Gemma Galdon Clavell
All over the world, decision-making processes that affect our day-to-day lives are being shaped by the use, processing and analysis of data. Big data promises to increase our ability to make good decisions, and so both public and private bodies are continuously exploring ways to improve and increase data collection, algorithmic decision-making and machine learning.
At the same time, more and more voices warn against the claim that technology is neutral and highlight how data-based processes often reproduce human bias and perpetuate discrimination. There are many examples of how these voices are exposing data-related problems, from research papers to campaigns, events and news stories. Until now, however, most of these debates have focused on algorithms and the need for transparency and accountability in algorithmic decision-making, and little is said about the quality and accuracy of the data that feeds these algorithms. As a result, the blame is often placed on the decision-making process rather than on the dynamics of data capture and the original analysis and classification.
That is why in early 2018 we at Eticas Foundation launched the BAD DATA Challenge (see video), a research, mobilization and awareness project that aims to demystify data and question whether current data points are a good reflection of who we are and a sound basis for the decisions made from them. These are decisions that may affect our shopping choices, but also our life chances and fundamental rights. It is vital to look at the basics of the problem and understand that BAD DATA, together with a poor understanding of the shortcomings of data, is often the root cause of the algorithmic discrimination and injustice that an increasing number of advocates, engineers and academics are exposing. It is also the root cause of problems in a myriad of small data processes that permeate our daily lives and activities and affect us in invisible ways.
At Eticas, we have often found this to be the case. In our extensive work on the role and impact of technology on migration processes, for instance, we have seen how badly transcribed Thai or Arabic names are fed into databases that can determine whether someone is detained at the border. We have also come across out-of-date databases, even at the national security level. We have seen first-hand, then, how BAD DATA can seriously affect fundamental rights and liberties.
But the negative impact of BAD DATA can also be seen in our day-to-day lives, such as when something we buy online for a relative or friend feeds into our commercial profile and affects what information is sent to us based on what we are expected to like or need. And in the middle of this spectrum, between fundamental rights and commercial practices, there are the many spheres of our lives where data has a significant impact that we often can’t control or hold to account, including banking and insurance, labour and recruiting, health, and education and performance.
BAD DATA can be the result of human error, but also of badly designed data curation processes (poorly maintained databases, or data shared among platforms that don’t interact seamlessly with each other) or of profiling assumptions that were badly conceived from the start. Understanding how data quality works in practice, taking into account both engineering and social processes, is crucial to opening up the black box of the algorithmic society.
In the last few months, we have been gathering examples of BAD DATA from experts and the public to advance our understanding of what constitutes BAD DATA. We are now starting the second phase of the project, in which we move beyond desk research to explore where BAD DATA happens and see what can be done about it. We have teamed up with Visualizar, a project of Medialab Prado Madrid, to spend two weeks with a team of data experts digging deep into BAD DATA problems and investigating the input we’ve had from the community.
If you have examples of BAD DATA or would like to get involved, contact us at email@example.com
Eticas Foundation, an international partner of DataEthics.eu, is the advocacy and awareness wing of Eticas Consulting. Find our projects on privacy in schools, automation and labour, data commons, civic tech and responsible data practices, among others, on our site www.eticasfoundation.org