When Reality Slips Away: Voice Cloning as the Latest Wake-Up Call

New AI-based voice cloning tools are pushing humanity further towards the complete dissolution of the boundary between reality and illusion. This risky development has been underway since the invention of the computer mouse.

In “The Republic,” written around 375 B.C., Plato introduces the allegory of the cave. A group of prisoners are chained in a dark cave with a fire burning behind them. All they see are the shadows cast on the cave wall by objects carried past the fire. Having never seen anything else, they assume these shadows are reality. Plato uses this allegory to suggest that human understanding of reality may be just as limited; we only see “shadows” and assume that’s all there is.

We can never fully escape the cave, as our senses can deceive us. However, through the study of philosophy and science, we can get closer to reality. For instance, we now know the Earth is not flat, despite our senses suggesting otherwise. The more we understand about reality and agree on what that reality is, the less room there is for conspiracy theories, division, and manipulation.

In this context, it’s unfortunate that after decades of progress, we are retreating into Plato’s cave in some areas. Science is thriving, but digitalization has caused our understanding of everyday phenomena to slowly implode.

Blurring the Lines: Reality vs. Digital Illusion

ElevenLabs’ voice cloning tool is the current peak of this unfortunate trend. But before we delve into that, let’s look at some defining moments in the blurring of reality and illusion. The first occurred in the 1960s, when internet pioneer Douglas Engelbart invented the computer mouse. This innovation, inspired by how children learn, allowed us to interact with computers in the same way we interact with the physical world. The mouse was followed by the Windows computer, which popularized its use, and then by Apple’s iPhone, which made touch-based interaction ubiquitous. Within a few decades, we’ve become accustomed to spending large portions of our lives in a virtual space that few of us understand. With accelerating speed, we are entering a world we know as little about as we knew about the Earth when we believed it was flat.

Advancements in artificial intelligence have further accelerated the blurring of physical reality and digital illusion. Generative AI’s ability to create new data from existing data has made it possible to simulate language and visual expressions almost perfectly. A computer can now behave and look like a real, living human being. The result is a digital world filled with synthetically created content that is even harder to discern: Is what I’m seeing on my screen a digital copy of something that exists in the physical world? Is it a blend of physical reality and digital simulation? Or is it pure digital illusion? Was it created by a human or a generative AI model?

The Ethical Quagmire of Voice Cloning

In this slightly dystopian context, ElevenLabs’ voice cloning tool is the last thing the world needs. With just a minute of voice recording, anyone can clone anyone’s voice and make them say anything. In five minutes. For $1 a month. The technology, especially its user-friendliness and easy accessibility, further blurs the lines between the authentic and the fabricated. Anyone can participate, regardless of intentions. And it doesn’t stop at sound; the company has announced a partnership with D-ID, which creates digital humans: on-screen avatars that will now have even more authentic-sounding voices. Whether someone wants to create fake political statements or extort money from parents by making fake kidnapping videos of their children, the field is wide open.

Thomas Telving

The gravity of the situation is hard to overstate. The consequences of spreading doubt about truth and falsehood on a large scale can be severe, with the potential to create polarisation, dissolve democracies, start wars, trigger terrorism, and countless other disasters. The erosion of a solid and shared view of reality gained momentum with the echo chambers of social media. We are now beginning to see the very concrete dangers of more advanced technologies like voice cloning and various forms of generative AI.

Legislation Lagging Behind Technology

The powerlessness of politicians in the face of digital development is not new. It’s an inherent problem that while good legislation takes time, digital technology develops and spreads rapidly. Yet it’s ironic that in many countries, it still requires more approvals to launch a new soft drink than to launch a technology with almost limitless potential for political instability and criminal abuse.

Plato’s Warning: The Need for Ethical Vigilance

In an era where technology is constantly pushing the boundaries of what is possible, it’s more important than ever to remember Plato’s allegory. We need a common ground, shared facts, and a solid foundation for our understanding. Instead, we allow ourselves to be seduced by the shadows that ElevenLabs and numerous other dazzling technologies cast on the cave wall.

Photo: Jason Rosewell