Imagine answering the phone, chatting for a few minutes, and then finding out that the “person” on the other end wasn't human at all. Weird? Impressive? Maybe a little of both.
Exactly that happened at Global Fintech Fest 2025, where SquadStack.ai made waves by claiming that its voice AI had effectively passed the Turing test, the age-old measure of whether a machine can convincingly imitate human intelligence.
The experiment was simple but bold: more than 1,500 participants took part in unscripted live voice calls, and 81% were unable to tell whether they were talking to an AI or a human.
It's a milestone that makes even skeptics sit up straight. We've heard about AI art and chatbots, but this? This is AI that talks, literally, and does it so well that it blurs reality.
It reminds me of when OpenAI presented its Voice Engine, a model that can generate natural speech from just 15 seconds of sample audio.
The internet promptly went wild over the implications: creative, ethical, and downright disturbing.
What SquadStack now seems to be doing is pushing that vision further, showing that convincing conversation isn't just about pitch and tone but also about timing, emotion, and context.
But let's pause for a moment, because not everyone is celebrating. Regulators are starting to tighten the screws.
In Europe, policymakers are already pushing for stricter disclosure requirements for AI-generated voices, reflecting growing concerns about deepfakes and digital impersonation.
Denmark, for example, is drafting a law against AI-based voice deepfakes, citing cases of cloned voices being used for fraud and disinformation.
Meanwhile, the business world is cheering. Companies like SoundHound AI are reporting surging revenue, a sign that voice generation isn't just cool technology; it's good business.
If consumers can't tell the difference, call centers, virtual assistants, and digital sales agents may soon be powered by voices indistinguishable from their human counterparts.
There's also a fascinating parallel here with Subtle Computing's work on AI voice isolation: teaching machines to pick out speech in chaotic environments.
It's almost poetic: one startup makes AI listen better, and another makes it speak better.
When those two threads meet, we'll have artificial intelligence that hears us perfectly, responds naturally, and maybe even argues convincingly.
This, of course, raises a fundamental question: how much of this do we really want? As someone who still enjoys barista chats and phone calls with real people, I find the idea both exciting and unnerving.
The technology is undoubtedly stunning. But part of me misses the stumbles, the awkward pauses and the little imperfections that make people's voices seem alive.
Still, it's hard not to be delighted. Whether you see it as a step towards a seamless digital world or a warning sign of things to come, one thing is undeniable: the voices of tomorrow are already speaking. And if you can't tell who's talking… well, maybe that's the point.