
Algorithmic Models for Detecting Welfare Fraud Are Risky

Algorithms have been used when approving mortgages, hiring employees, predicting exam grades, and investigating welfare fraud. But the underlying data is sometimes biased, and incorrect predictions can have severe consequences for individuals. It is therefore important to have external, independent reviews of algorithmic models used to support decision-making. The city of Rotterdam in the Netherlands has given Wired insight into the algorithmic model it used from 2017 to 2021 to calculate risk profiles for citizens receiving social welfare benefits. Below is a short recap of the analysis. The full analysis can be read here.

Rotterdam used a machine-learning algorithm, developed by the consulting firm Accenture, to help detect welfare fraud by flagging individuals with high-risk profiles for investigation.

The algorithm consisted of 500 decision trees (each a series of yes-no questions) that together determined a citizen's risk profile. The risk profiles were based on variables such as gender, age, children, and language skills:

  • Is the person a female?
  • Is the person young?
  • Does the person have kids?
  • Has the person passed their Dutch language test? 

According to Wired, attributes such as being female, being young, having kids, or not having passed a Dutch language test each individually increased a citizen's total risk score, making it more likely that they would be flagged for investigation.
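To make the mechanics concrete, here is a minimal sketch of how an ensemble of decision trees can turn yes/no answers into a single risk score. The feature names, point values, and two-tree ensemble are invented for illustration only; they are not the actual Accenture model, which reportedly used around 500 trees.

```python
# Hypothetical illustration: an ensemble of decision trees summing
# per-tree contributions into one risk score. All features and point
# values are invented; this is NOT the real Rotterdam/Accenture model.

def tree_1(person):
    # Each "tree" is a series of yes-no questions.
    if person["is_female"]:
        return 3 if person["has_kids"] else 1
    return 0

def tree_2(person):
    if not person["passed_dutch_test"]:
        return 4
    return 2 if person["is_young"] else 0

def risk_score(person, trees=(tree_1, tree_2)):
    # The final score is the sum of every tree's contribution.
    return sum(tree(person) for tree in trees)

person = {"is_female": True, "is_young": True,
          "has_kids": True, "passed_dutch_test": False}
print(risk_score(person))  # 7 -> flagged if above some threshold
```

Each answer nudges the total up or down, so a handful of personal attributes can push someone over an investigation threshold even though no single tree "decides" anything on its own.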

But using decision trees to estimate risk profiles makes it hard to understand how single variables affect the overall risk scores. Take the variable of gender: women and men move down different branches of a decision tree. Because their paths through the tree differ, their risk profiles may be generated from different variables. As a result, some variables might increase the risk profile for women, while others increase the risk profile for men.
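The branch effect described above can be sketched in a few lines. This hypothetical single tree (invented splits and scores, not the real model) splits on gender first, so the age question is only ever asked on one branch:

```python
# Hypothetical tree showing why a variable's effect can differ by
# gender: after the first split on gender, "is_young" is consulted
# only on the female branch. Splits and scores are invented.

def tree(person):
    if person["is_female"]:
        # On this branch, age changes the score.
        return 5 if person["is_young"] else 2
    else:
        # On this branch, age is never consulted.
        return 3 if person["has_kids"] else 1

young_woman = {"is_female": True, "is_young": True, "has_kids": False}
young_man = {"is_female": False, "is_young": True, "has_kids": False}
print(tree(young_woman))  # 5 -> being young raised the score
print(tree(young_man))    # 1 -> being young had no effect here
```

Here being young raises the score for women but not for men, which is exactly the kind of variable-by-group interaction that is hard to spot when hundreds of such trees are combined.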

Evaluating people on different variables can lead to unintended discrimination if a variable (such as age) increases the risk profile of only one gender.

Besides the risk of discrimination, the algorithmic model was built on training data that mixed honest mistakes with deliberate fraud. The system therefore looked for patterns and commonalities across two very different groups: people who deliberately tried to cheat and people who made simple paperwork mistakes.

As the calculated risk profiles do not distinguish between the two groups, citizens who make more paperwork mistakes (perhaps because of lower language skills) are put in the same group as deliberate fraudsters. This can give a skewed picture and give vulnerable groups an undeserved reputation for deliberately taking advantage of the welfare system.

An external ethical review commissioned by the Dutch government also criticised the algorithm used by Rotterdam, which led to the suspension of the system in 2021.

According to Annemarie De Rotte, Rotterdam's Director of Income, the decision to give Wired insight into the algorithm was made to ensure transparency and to learn from the insights of others.

So, what can we learn from the Rotterdam case?

A few general things are worth considering whenever algorithms are used in decision-making:

  • Have transparent systems
  • Have external reviews of the system
  • Evaluate people on a fair foundation
  • Ensure explainability of results

Other European countries, such as Denmark, France, the Netherlands, Ireland, Spain, Poland, and Italy, are using algorithms to detect welfare fraud. The hope is that insights from the flaws in the Rotterdam algorithm will help other countries avoid similar mistakes.

Read more on the Rotterdam algorithm here.

Photo: Chris Anderson

Signe Agerskov is researching blockchain ethics at the European Blockchain Center and is a member of the European Blockchain Partnership’s Expert Group on Blockchain Ethics (EGBE).