@david_chisnall@infosec.exchange
@davidgerard@circumstances.run
I haven’t been able to find them again, but about 20 years ago I read some papers (which were ten to twenty years old then) about voice command. They didn’t have magical AI, so they simulated it by having a person do the task, and then they subtracted the time that the person spent doing the task on behalf of the speaker from the total. They then compared this to the speaker using a direct-manipulation UI to do the same task. The conclusion was, for almost all tasks, a GUI outperformed a voice UI by a massive amount. There are two cases where a voice interface works better:
First, if you don’t have your hands available. If you’re cooking or performing surgery, you don’t want to stop what you’re doing, wash your hands, and then do the computer task and resume. Same thing with changing the music while driving: voice lets you keep your focus on the important task and do a second lower-priority task at the same time.
Second, when the person performing the task had a lot of agency. This is why big ships use voice command from the captain: the captain isn’t telling the pilot to press a couple of buttons, they’re telling the pilot a desired outcome that the pilot will apply a load of domain expertise to achieve.
The second thing is what people want from an “AI” voice interface, but even with AGI it’s largely infeasible to meet users’ requirements.
The problem is that a lot of films and TV series have used voice command as a narrative device. Someone talking to a computer and the computer skipping over twenty steps of ambiguity by magic is great for storytelling. If you showed them using a GUI or CLI, it would be tedious.
There’s an episode of TNG where one of the crew is kidnapped by aliens and isn’t sure whether it’s a dream. He tries to reproduce the dream in the holodeck. The first prompt tells the computer to produce a table, and the table is nothing like the table he wanted, but two or three tweaks later it’s identical to the one in the dream. And this is possible because the script says so, and the props department used the same one in both scenes. But even with a computer that was at least as intelligent as a human, this is unrealistic. Imagine a human completely in control of drawing a 3D image, who took zero time. How many prompts would it take you to draw exactly something that you’d seen? A hundred? More?
@baoigheallain@mastodon.ie
@david_chisnall@infosec.exchange @davidgerard@circumstances.run Which explains why Apple Siri, introduced in 2011 and pushed relentlessly since, is only useful for setting a timer when I’m cooking.
Everything else, EVERYTHING, I've tried on it is too painful to repeat.