
Data Predictions Should Only Be on Patterns – Not Individuals

Analysis. Personal data can help develop societies. We use data to predict patterns and manage risks, to invent new vaccines, gain effective access to information, and create new and cleaner technologies. But in the case of predictions based on personal data, we must be very cautious, especially when it comes to predicting human well-being and behaviour. There is a huge difference between predicting patterns across a group of people and predicting individual behaviour. The United States is experimenting with individual predictions, but without much luck. Research also indicates that it increases inequality.

A mind reader in this Belgian bank’s video from 2013 knows much more about you than you might expect.

When the police use data analysis to predict at which locations violence is likely to occur this Saturday, in order to allocate police officers to the area, it may be possible to do so in a data ethical way. This will at least be the case if fully anonymized data is used to find patterns, and not to predict which individuals will cause the violence.
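The distinction can be illustrated with a minimal sketch (hypothetical data and function names, not any real system): the analysis operates only on anonymized place-and-time records, so there is nothing in the data that could score an individual.

```python
from collections import Counter

# Hypothetical, fully anonymized incident records: each entry is just a
# (district, weekday) pair -- no names, IDs, or other personal data.
incidents = [
    ("harbour", "sat"), ("harbour", "sat"), ("harbour", "fri"),
    ("centre", "sat"), ("centre", "sat"), ("centre", "sat"),
    ("suburb", "tue"),
]

def hotspots_for(weekday, records, top=2):
    """Rank districts by historical incident count for a given weekday.
    Works purely on aggregate patterns, never on individuals."""
    counts = Counter(district for district, day in records if day == weekday)
    return [district for district, _ in counts.most_common(top)]

# Saturday patrol allocation is driven by place-level patterns alone.
print(hotspots_for("sat", incidents))  # → ['centre', 'harbour']
```

Because the input contains no personal identifiers, the output can only ever answer the pattern-level question ("where should officers go?") and never the individual-level one ("who will offend?").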

If a public authority uses data to identify what general trends in behavior and life events may cause social problems like homelessness or why people end up on public support, it may be done in a data ethical way, if it concerns groups of citizens and not specifically which individuals are at risk.

If a business authority aims to identify people with a history of business and tax fraud and bar them from registering new companies, it may be done in a data ethical way. This will most likely be possible if tax data is compared with criminal data and publicly available financial data from the business authority. When a suspicion is well founded, it may be both legal and ethical to use data prediction – even on an individual level.

We live in an era where data becomes increasingly valuable. Data can be used positively and constructively to increase quality, service and well-being, but as soon as personal data is involved, we need to be cautious. Personal data, like any other data, has proven to be an extremely valuable ‘tool’ for developing our communities and businesses. But there must be clear limits on how far we are willing to go in using personal data to create specific insights into the lives of individual citizens and consumers.

The EU General Data Protection Regulation, GDPR, has set a clear framework for the use of personal information. Introducing a risk-based approach, its key purpose is the protection of the citizen and the consumer. GDPR emphasizes as a fundamental principle the individual’s right to control his or her personal data and the obligation of transparency. Another key principle is the obligation to minimize the risk to the individual in personal data processing. A risk of discrimination or identity theft increases the obligation to abide by this principle, including through appropriate security and technical measures. The GDPR principles set out fundamental standards for privacy and personal data protection in the digital age.

But what is legal or not is one thing – and of course the processing must be legal.

How far we want to go is another.

Should a company’s or society’s interest in efficiency and optimization take priority over the interests of citizens and consumers – even at the cost of violating basic democratic principles and data ethical values?

There is an increasing tendency to rely blindly on quantitative data predictions to solve society’s various problems. This is problematic. A current example in Denmark is a municipality that wants to analyze personal data from various registers in order to predict social problems in families with children – at an individual level.

If we develop and depend on systems based on individual prediction, society may come to accept, as a control mechanism, the arrest of individuals based on the likelihood of future illegal actions – the arrest being treated as an acceptable societal precaution. It sounds like the science fiction movie Minority Report. However, it is well underway.

Individual Data Prediction
In the United States, individual data prediction is used to predict which customers are pregnant – even before the customers know it themselves (the Target case from 2012, described by the New York Times in ‘How Companies Learn Your Secrets‘).

States in the US use personal data to predict the risk that an inmate will commit a crime within 12 months after being released on probation. And the Chicago Police runs an algorithm that produces a heat list of individuals who may end up committing crimes. The police claim that it reduces crime, but they have not provided sufficient documentation.

Before we embark on similar activities in Scandinavia, we as society must ask ourselves two basic questions:

Are the data on which the algorithm relies true, neutral, updated and relevant?
And equally important: Do we want a society that uses individual prediction as a control mechanism?

The American research institute Data & Society recently released lectures and promoted new research by Virginia Eubanks. In her book ‘Automating Inequality’, she describes several examples of how profiling and predictions punish the weakest and poorest.

The research from the US raises important issues that we must also consider in Europe.

Individual prediction based on data is at its core about probability, and regardless of supercomputers and extreme computing power, it is still too uncertain to decide the fate of human beings. In line with this, it is reasonable to ask: Is it ethical to base today’s actions on what will most likely happen tomorrow, when we are dealing with human lives?
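The uncertainty has a simple arithmetic core. The following sketch uses hypothetical numbers, not figures from any real system: when the predicted event is rare, even a seemingly accurate individual predictor flags mostly innocent people (the base-rate fallacy).

```python
# Illustrative base-rate arithmetic with hypothetical numbers.
population = 100_000
base_rate = 0.01            # 1% will actually commit the predicted act
sensitivity = 0.90          # the predictor flags 90% of true future offenders
false_positive_rate = 0.10  # it also wrongly flags 10% of everyone else

true_pos = population * base_rate * sensitivity                  # 900 people
false_pos = population * (1 - base_rate) * false_positive_rate   # 9,900 people
precision = true_pos / (true_pos + false_pos)

print(f"Flagged: {true_pos + false_pos:.0f}, actually at risk: {precision:.0%}")
# Only about 8% of the flagged individuals would actually offend:
# more than 9 out of 10 people on the list are flagged wrongly.
```

A pattern-level forecast can tolerate this error rate, because decisions are spread across places or groups; an arrest, a benefit cut or a child-welfare intervention aimed at one person cannot.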

European societies must set limits for predictions based on individuals’ personal data collected for entirely different purposes. Prediction might be acceptable when based on patterns and when there is full transparency. It may also be acceptable if the personal data is verifiably anonymized. In the same vein, the use of predictions is accepted, e.g. for weather forecasts, traffic, products, logistics etc., if based on data that is not personally identifiable.

Data predictions, when implemented ethically, are costly, as danah boyd, the founder of Data & Society, has pointed out.

“Algorithmic systems don’t simply cost money to implement. They cost money to maintain. They cost money to audit. They cost money to evolve with the domain that they’re designed to serve. They cost money to train their users to use the data responsibly.”

A democracy, especially a value-based and protective welfare society like the Scandinavian countries, must be willing to accept and pay that price.

A Data Ethical Basis

Predictive algorithms and systems are everywhere in the digital infrastructure. This is challenging but can be controlled if based on some basic data ethical values.

Technology is an integrated part of the support, care and welfare services that characterize the Scandinavian societies. The technologies must support humans and human life, not the other way around. Data ethics is about innovating and creating solutions, technologies and data processes with respect for human values and with a human-centric approach. Fundamental questions must be asked before introducing the numerous IT solutions and data-based processes that offer all sorts of convenient services. The answers to these questions are the foundation of our future society.

We should ask who benefits from predictive computing at the individual level. Are the individual’s needs and rights at the centre? We should ask who controls the data on which the predictions are based. Can we offer full transparency about the data processes? Do all people comprehend what happens to their data? Can the systems be rectified and the data deleted? Can the processing be audited by independent third parties?

These and many other data ethical questions are absent from the debate. However, these fundamental questions must be asked, researched, examined and considered. Approaching new technologies with curiosity but also with caution is crucial for the development of the welfare society. It must – as always – happen with respect for democratic principles and individual freedom.

We must build sustainable systems that are not only capable of meeting today’s demands for efficiency and convenience but also capable of carrying human values into the future. Technological progress is happening at an extreme pace, and we need to get the debate going in order to prioritize humans before technology.