Big Tech's Generative AI has Given Rise to 'Openwashing'

Photo: Megan Lee, unsplash.com

Openness is widely seen as something positive. Open is a trend, and the decades-old term for open software is open source. But with generative AI, the term ‘open’ has been distorted by some Big Tech players, who have become masters of openwashing.

In April 2024, Meta launched a new version of its large language model, Llama. “Today, we’re introducing Meta Llama 3, the next generation of our state-of-the-art open source large language model,” the company wrote.

Meta is a master of washing. Over the years, it has engaged in ‘privacy washing’ and ‘ethics washing’ at various conferences, such as this one in 2018: ‘How Facebook Builds AI With Privacy And Ethics By Design’. Today, we know that Meta's claims are often far from reality, and with Llama the social media giant is showcasing a perfect example of openwashing.

At the heart of openwashing is a distortion of the principles of openness, transparency, accountability, and reusability. Transparency in AI would entail publicly documenting how models are developed, trained, fine-tuned, and deployed. This would include full access to the data sets, weights, architectures, and decision-making processes involved in the models’ construction. Meta falls short, as do most other AI companies.

There is, however, criticism. A 2024 scientific paper by Andreas Liesenfeld and Mark Dingemanse of Radboud University, ‘Rethinking Open Source Generative AI: Open-washing and the EU AI Act’, analysed 45 generative AI systems (both text and text-to-image). The results showed that while the term open source was widely used, many models were ‘open weight’ at best, and many providers thus sought to evade scientific, legal and regulatory scrutiny by withholding information on training and fine-tuning data.

Meta Continues to Falsely Label Llama as Open Source

Both Llama and the French LLM Mistral were considered openwashers. But while Mistral quickly changed and now calls itself ‘open weight’ (open about the model, closed about what data it is trained on), Meta continues to falsely label Llama as open source, and unfortunately many traditional media outlets have echoed the open source claims uncritically. As the paper puts it: “While a first crop of text generators, including BloomZ and OpenAssistant, clearly aimed at meaningful degrees of openness, soon enough large corporate players started releasing systems billed as open source while they were in fact at best open weight, significantly diluting the term.”

Liesenfeld and Dingemanse’s paper further states that some smaller players have dropped the development of their own models and simply plug in the big players’ foundation models instead.

“In that way, well-funded corporate heavyweights are taking oxygen out of the room for smaller organisations that operate with higher professional and ethical standards,” their paper concluded.

OpenAI is not Open

Another example of openwashing comes from OpenAI. When it was founded in 2015, OpenAI was a non-profit organisation that promoted itself as being open. Yet as it saw the revenue potential in generative AI, first the ‘open’ and later the ‘non-profit’ disappeared. In October 2025, OpenAI restructured its operations in order to receive billions of dollars from investors. The OpenAI Foundation is still in charge and owns part of the commercial group, OpenAI Group PBC, a for-profit public benefit corporation that is required to follow its mission. And guess what, the mission of the OpenAI Foundation and OpenAI Group is the same:

“OpenAI’s mission is to ensure that artificial general intelligence (AGI) – by which we mean highly autonomous systems that outperform humans at most economically valuable work – benefits all of humanity.”

Smaller Models Might Be Both More Private and More Open-Source

Smaller models might be a better choice, though it is hard to tell whether they are fully open source.

Lumo from Proton in Switzerland calls itself open source but is probably closer to open weight, as it is built on various foundation models, one of them being Mistral, which now calls itself open weight. Lumo can still be recommended, however, as it is built by a privacy-first company and based in Europe.

The US-based Olmo, which is number one on the European Open Source AI Index (osai-index.eu/the-index), recently marketed itself as open source, but then changed its wording and called itself open weight. It has since changed the wording again and now describes itself as a ‘fully open language model’.

Then we have HuggingChat, which also calls itself open source. The platform itself, including its user interface and backend infrastructure, is open source, allowing anyone to inspect, modify, and contribute to the code. But it is built on foundation models, primarily from Meta, Mistral, and other community contributors, all of which are only open weight.

Apertus from Switzerland, number three on the European AI open source index, might be the only true open source model and the ‘first fully reproducible large language model from public institutions’. According to its website, it was developed by researchers from EPFL, ETH Zurich, and the Swiss National Supercomputing Center as part of the Swiss AI initiative. It is released under the open-source licence Apache 2.0, which allows for both research and commercial use. As a Medium post explains, it is open on all fronts: training code, data preparation scripts, evaluation tools, intermediate checkpoints, and complete documentation and methodology.

Open Source AI – Preferential Treatment?

In the AI Act, which the EU Commission wants to relax with a Digital Omnibus, there are at this moment special provisions for open source AI models. Several of the paragraphs in the AI Act sound like this:

“The obligations set out in paragraph xxxxx shall not apply to providers of AI models that are released under a free and open-source licence that allows for the access, usage, modification, and distribution of the model, and whose parameters, including the weights, the information on the model architecture, and the information on model usage, are made publicly available. This exception shall not apply to general-purpose AI models with systemic risks.”
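The conditions in the wording quoted above can be read as a checklist. Below is a minimal Python sketch of that reading, assuming each condition can be answered with a plain yes or no. The class, field and function names are hypothetical and invented for illustration; they do not come from the AI Act or from any official tool.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelRelease:
    """Hypothetical record of what a provider has published about a model."""
    licence_allows_access: bool
    licence_allows_usage: bool
    licence_allows_modification: bool
    licence_allows_distribution: bool
    weights_public: bool
    architecture_info_public: bool
    usage_info_public: bool
    systemic_risk: bool  # general-purpose model with systemic risks


def exception_applies(release: ModelRelease) -> bool:
    """Return True if every condition in the quoted wording is met."""
    licence_ok = all([
        release.licence_allows_access,
        release.licence_allows_usage,
        release.licence_allows_modification,
        release.licence_allows_distribution,
    ])
    disclosure_ok = all([
        release.weights_public,
        release.architecture_info_public,
        release.usage_info_public,
    ])
    # The last sentence of the quote removes the exception for systemic-risk models.
    return licence_ok and disclosure_ok and not release.systemic_risk


# Illustrative release: permissive licence, weights and documentation published.
weights_only = ModelRelease(True, True, True, True, True, True, True, systemic_risk=False)
same_but_systemic = replace(weights_only, systemic_risk=True)
no_modification = replace(weights_only, licence_allows_modification=False)
```

Nothing in the quoted wording asks about training data, so under this reading a release that publishes only weights, architecture information and usage information under a suitable licence would pass the check.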

How to Change the Level of Openness

Policy and legislation are two ways to address openwashing. Instead of a watered-down EU AI Act, with corporate interests lobbying in Brussels for easy licence-based decisions and with ‘open weight branded as open source’ proliferating, it is crucial to instil key values such as openness and accountability in AI systems.

One way is ‘datasheets for datasets’, as proposed by Gebru et al. (2021), which can help expose which data the models were actually trained on and, in turn, encourage ethical attribution and legal scrutiny.

Similar to energy efficiency labels for electrical appliances and environmentally green ratings for companies, perhaps having openness scores or labels would be beneficial for users, as Liesenfeld and Dingemanse suggest (see illustration).
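To make the idea of an openness score concrete, here is a small Python sketch. It is an illustration only, not the method used by Liesenfeld and Dingemanse or by the index: the three grades, their point values, the label thresholds and the example dimensions are all assumptions made for this sketch.

```python
from typing import Mapping

# Assumed point values for each grade (hypothetical, chosen for illustration).
GRADE_POINTS = {"open": 1.0, "partial": 0.5, "closed": 0.0}


def openness_score(grades: Mapping[str, str]) -> float:
    """Average the grade points over all assessed dimensions (0.0 to 1.0)."""
    if not grades:
        raise ValueError("at least one dimension must be graded")
    return sum(GRADE_POINTS[grade] for grade in grades.values()) / len(grades)


def openness_label(score: float) -> str:
    """Map a score to a coarse label; the thresholds are arbitrary assumptions."""
    if score >= 0.8:
        return "open"
    if score >= 0.4:
        return "partly open"
    return "closed"


# A made-up 'open weight' release: the model is shared, the training side is not.
open_weight_example = {
    "weights": "open",
    "architecture": "open",
    "training data": "closed",
    "training code": "closed",
}

score = openness_score(open_weight_example)  # 0.5
label = openness_label(score)                # "partly open"
```

A label like this would make visible at a glance that a release shares its weights while staying closed about training data.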

One thing is certain: the label ‘open source’ is evolving, and the EU AI Act and further legislation will evolve with it. That legislation will need to regulate data disclosure for generative AI and follow the guidelines suggested by Liesenfeld and Dingemanse: openness is important for risk analysis (the public needs to know); for auditability (assessors need to know); for scientific reproducibility (scientists need to know); and for legal liability (end users need to know).

Read the scientific paper, ‘Rethinking Open Source Generative AI: Open-washing and the EU AI Act’
Check out the NGI search project
Read ‘Datasheets for Datasets’ by Gebru et al.
Read ‘Why Apertus Matters’
Have a look at the EU’s recent European Open-Source AI Landscape report by the StepUp Startups project

Go to the landing page of our Open Source Democracy

Listen to episode 3: Generative AI Comes With Openwashing

Open Source Democracy is a project, supported by Carlsberg Mindelegat, about why open source is important for democracy. It aims to communicate the ethics and values of open source alternatives to big tech, structured around three overarching topics: education, mobility and information. All articles and podcasts will be freely available at dataethics.eu/opensource

