I’m never going to use voice control for my technology, sorry – and I don’t care how much better it is now thanks to AI

By James On Dec 25, 2024

So Google wants me to say “Hey Gemini” now, huh? No thanks, you can get in the sea with that nonsense. I’m not having it. Call me a Luddite, call me a curmudgeon, tell me to move with the times; I don’t care, I’m not going to talk to my technology.

Before I get into the meat of this article, let me first say that I am not against the existence of voice control features in general. They are an essential accessibility feature that many disabled technology users rely on to get the full experience from their hardware. But for those who don’t actually need it, like myself: what the heck is wrong with just pressing a few buttons or tapping a touchscreen?

It irritates me when someone talks too loudly on their phone on public transport. When tech companies like Google tell me that voice control is the future of how we interact with our technology, I’m immediately filled with horror at the idea of traveling through a city where everyone constantly barks commands at their phones and tablets.

How many people really use voice control?

I did some research into the actual statistics behind voice control usage and was surprised by the results. I have literally never seen anyone using their phone to search for something on the internet using a voice command. Sure, I’ve seen people ask their Alexa smart speaker to play music or turn off the lights, something I will probably never do because I always have a phone in my pocket that can do those kinds of things, but search the internet? Really?

Apparently yes: according to a 2018 study from PwC, 32% of voice assistant users ask their chosen digital helper something they would normally use a search engine for at least once a day, while 89% do so at least once a month. Of course, that’s just people who already use a voice assistant, but analysis from Statista claims that nearly half of Americans talk to their phone or smart speaker at least semi-regularly (although on a global scale that figure drops to about 1 in 5).

Talk to the sphere, says Amazon. The sphere always listens. The sphere hears everything. (Image credit: Amazon)

The thing is, as I delved further and further into these statistics, I became less and less convinced by them. For starters, the very first set of statistics I came across (which I won’t link here) claimed that “an estimated 8.4 billion people worldwide use voice assistants” – that’s… more than the current total human population. I started noticing more discrepancies in the data, and also had to throw out some sources due to obvious pro-tech marketing biases.

More confused than informed, I ultimately concluded that much of the statistical research in this area of technology is based more heavily on product sales than on actual, unbiased public opinion polls. That’s a serious flaw, because someone who owns one piece of voice-activated hardware is likely to own more. I have a friend who has three identical Echo Dot smart speakers placed in different rooms in her house, and she uses Siri on her iPhone to make music requests while she’s in the car. Me? I just have a driving playlist that I put on shuffle before I start the engine.

Voice control is getting better – slowly

I have to admit that my usual excuse for why I despise voice-activated technology no longer holds as much weight as it used to. That excuse was, in short: it’s nonsense. The early days of Siri, Cortana and their ilk were plagued by a constant refrain of “I’m sorry, I didn’t quite understand that,” but with the rise of AI, things are starting to improve.

Tools such as Apple Intelligence and Google Gemini provide multimodal input, allowing them to understand both spoken requests and text prompts. Today’s large language model AIs are much better at parsing spoken words than older voice recognition software, and can even adapt to an individual user’s speech patterns over time to provide more accurate answers.


Google Gemini has much more potential as a pocket-sized voice assistant than Cortana ever did. (Image credit: Google)

However, there are still stumbling blocks that need to be overcome. While voice recognition typically supports multiple languages, it often struggles with strong accents and speech impediments (I have a lisp myself, which doesn’t help). This could be due to undetected biases in the training data used: if an American company uses recordings of Americans speaking English to train its speech recognition AI to understand spoken English, it will unsurprisingly have a hard time when it hears a Japanese or Swedish person speaking that language.

I really hope that one day voice controls will work perfectly, because the people who really need them deserve a service that works as well as simply typing a question into Google. But I won’t be using it, and I don’t want to live in a future where everyone does – you can bet I’ll be first in line to criticize any tech company that tries to make voice commands the default way to interact with their product.