
How Should a Data Democracy React to Generative AI?

Analysis. The AI race is on. Thanks to OpenAI and Microsoft, who kickstarted the global race, it is driven by profit without taking the risks to humans, democracy and society into consideration. Part of the solution lies in regulation, where both the EU and now China are leading the way. The other part of the solution is data ethics.

Generative AI – or general purpose AI (GPAI) – is built on hoards of data harvested from all over the web and fed into a mathematical model. This model or algorithm can be prompted by the user and gives us answers in very human-like language or with pretty good illustrations. It is extremely fast, it can be correct – or at least sound so – and it is very fascinating to humans that a machine behaves in a human-like manner. But there are so many risks that we as a society have not yet come to terms with. In other words, we are all lab rats of the big tech companies sucking up even more of our personal data; some are even charging money on top of it, and we train the models for free by giving them our time.

The other day I heard serious journalists from the New York Times podcast Hard Fork explain that people could upload their health records to ChatGPT and get an explanation of them if they did not have the patience to wait for their human doctor to do it the next day. Not a word on the fact that it is not a good idea to upload your health data to a chatbot that is very good at hallucinating.

Here’s a list of risks to humans and society:

  • Misinformation. It hallucinates and makes up lies
  • Privacy (Italy says it is violating GDPR)
  • Violation of copyright
  • Bias and discrimination
  • Anthropomorphisation
  • It sucks up company secrets
  • Crime such as voice scams and deepfakes
  • AGI – Artificial General Intelligence

Misinformation and Distrust
A huge risk is that more and more content on the web will be AI-generated – up to 90% by 2026, according to a Europol report. A lot of the AI-generated content will be false, and it will be hard even for humans, including professional journalists and researchers, to distinguish between facts and falsehoods. The more falsehoods, the more distrust, and distrust is a virus to democracies. Just think of the US, where they cannot even agree on who won the last election. We have tons of examples of how the generative AI models not only distribute existing lies from the web but also make up lies themselves. A recent example is this one from The Washington Post about ChatGPT, which invented a sexual harassment scandal and named a real law professor as the accused.

Data Theft and Data Abuse
Another risk concerns our privacy. Just think of the example above, where people upload health data to the chatbot with no idea of how their privacy will be violated. Italy’s data protection agency believes ChatGPT is violating GDPR and is trying to enforce the regulation. Employers, too, are trying to keep corporate secrets out of the chatbots.

Is it okay to vacuum up all data from the web, use it in a chatbot and make money on it – without getting consent from or remunerating the creators of the content? That is what most of the AI generators out there have done, and artists are fighting back.

Further, many of the chatbots are full of bias. Just look at the early picture from DALL-E prompted in this way: Good Digital Life. It shows a happy, lonely white woman with a PC.

Humans Not Ready
Another risk is that we humans are simply not ready for very human-like technology. We cannot distinguish between content created by humans and machines, and no laws dictate that AI-generated content should be labeled as such – it is only if you practice data ethics that you do that. Humans also tend to anthropomorphise machines that speak, write and look like humans. Some even fall in love with so-called social robots. Or over-trust a chatbot – at least that is what the relatives of a man who killed himself after talking to a chatbot believe.

A pause in Giant AI Experiments, as many prominent people have suggested, including the historian Yuval Noah Harari, is a good idea – but maybe for longer than six months, and instead until we have proper regulation in place, as we cannot rely on the tech giants to behave in an ethically responsible way.

Regulation and Enforcement Is One Answer
The EU AI Act, which has been discussed for over three years and might be in force next year, is definitely one solution. It is a very good draft, though it already needs quite a lot of updating after the releases of generative AI tools. A list of respected (mostly US) AI experts and scientists have offered the EU guidance on how to update the draft.

“GPAI is an expansive category,” they write. “ChatGPT, DALL-E 2, and Bard are just the tip of the iceberg. … For the EU AI Act to be future-proof, it must apply across a spectrum of technologies, rather than be narrowly scoped to chatbots/large language models (LLMs).”

They underline that GPAI models carry inherent risks and have caused demonstrated and wide-ranging harms. Because these risks can carry over to a wide range of downstream actors, the models must be regulated throughout the product cycle, not just at the application layer, in order to account for the range of stakeholders involved.

China Follows the EU – the US Slow to Regulate
It is hard to say anything good about a data dictatorship like China, which uses personal data to suppress and control its population. But the unregulated, surveillance-capitalist society of the US could actually learn something from it when it comes to regulation. And so could the EU.

The EU’s AI draft contains very good provisions, such as declaring whenever something is AI-generated (so humans know the difference), securing data quality, demanding risk assessments, and avoiding discrimination and bias. But it also needs updates, so that e.g. generative AI – chatbots – falls into the high-risk category or another risk group and thus will require risk assessments before launch. What the EU could learn from China is speed. We need to speed up the regulation process, even though we know democracy takes time. See the blue box on the process.

Update 24 April: On how the EU is working with the update, from TechCrunch: asked what is likely to be the parliament’s position on generative AI tech in the Act, Dragos Tudorache, co-rapporteur of the AI Act, suggests MEPs are gravitating towards a layered approach – three layers, in fact – one to address responsibilities across the AI value chain, another to ensure foundational models get some guardrails, and a third to tackle specific content issues attached to generative models, such as OpenAI’s ChatGPT.
The Act is not likely to be in force before 2025.

China very recently released its proposed regulation – definitely inspired by the EU’s AI draft. The Cyberspace Administration of China wants to regulate generative AI, and there are some interesting promises. But as the founder of the Center for AI and Digital Policy, Marc Rotenberg, writes on LinkedIn: “There are provisions intended to limit criticism of the government. That is censorship plain and simple. And there is a real name requirement for users of LLM services that is a threat to privacy.”

Apart from that, he points to the following, and I quote directly:
– Companies should ensure the data used to train these AI models will not discriminate against people based on ethnicity, race, and gender
– Companies should ensure that AI content is accurate, and measures should be taken to prevent the models from producing false information
– The data must not contain information that infringes intellectual property rights. 
– If the model contains personal information, companies are expected to obtain the consent of the subject of the personal information or meet other circumstances required by law.

According to Fortune, “Chinese companies that want to use generative A.I. to serve the public will first have to submit their tech for an official security assessment.”

Instead of totally ignoring anything coming out of China, the US could start listening to both China and the EU when it comes to regulating technologies. At least the Biden administration will now host a public comment period on the regulation of AI, according to the Guardian.


Meanwhile, the AI race will continue. It was initiated by OpenAI and Microsoft (the lead investor in OpenAI), which is inside many EU computers and has managed to label itself as an ethical and responsible company. Now, with profits ahead, it is hard to find even a grain of responsibility in Microsoft. Other big data companies, which have held back on their generative AI for various reasons, are launching services in a non-regulated global landscape. According to Bloomberg, Google just released its Bard AI chatbot despite serious ethical concerns raised by current and former employees.

Data ethics often comes on top of regulation, and often before it. You behave responsibly even though the law is not telling you what you can and cannot do. Of course you cannot release a new product when you have no idea about the risks to humans and society.

None of the big tech AI companies are acting with data ethics in mind.