David Pogue has an article in The New York Times about a recent 'voice-off':
“Your review was the dumbest thing I’ve ever read. It strains me to avoid profanity in describing how stupid you sound.” That’s the kind of email that brightened my day after I reviewed Google’s Moto X phone two weeks ago. My correspondents seemed especially unhappy with one sentence in that review: “Android’s voice commands are still no match for Siri.”
Rico says that it's amazing that it works at all, and most people don't know what an 'ellipsis' (…) is. But Humus. Compost. Pumice. Silt. Gravel? Now that's funny... (But Rico has never had a fucking problem with avoiding profanity in describing how stupid anyone sounds...)
Man, I really was stupid. Who’d be dumb enough to take sides in a religious war? I’d have been better off writing “Conservatives are better-looking than liberals” or “Pro-life people are worse drivers than pro-choice.”
But the superiority of cellphone speech-recognition technology is not an idle question. Once touch screens became the future of phones, voice recognition became desperately important. Without physical keys or buttons, entering text and manipulating software controls are fussy, multistep procedures.
So I’ve just spent two weeks immersed in voice recognition. I carried an iPhone and a phone running Google’s Android operating system with me everywhere. I spoke to both phones simultaneously. I wanted to get to know the differences, the strengths, the weaknesses.
When people talk about speech recognition, they mean, and often confuse, three different functions. There’s dictation, where the phone converts speech to text; commands, where you operate the phone by talking; and Internet information searches. There are vast differences among the successes of the three.
Dictation, for example, is still fairly poor on both systems. Both Android phones and Siri, the iPhone’s speech feature, make many transcription errors. When you hear people bashing cellphone transcription, declaring, “I gave up on it”, they’re usually referring to dictation.
That’s forgivable, but come on. You’re asking your phone to understand varying accents at varying distances from its microphone, in rooms with varying background noise. It’s a wonder this feature works at all.
The latest Android version doesn’t require an Internet connection to do basic dictation. And, in Android, the words appear on the screen as you utter them; Siri doesn’t transcribe until you stop talking.
On the other hand, Siri understands formatting controls like “capital”, “all caps”, and “no space”, as well as all kinds of punctuation — “colon”, “dash”, “asterisk”, “ellipsis”, and so on. Android understands only the basic symbols, like “period”, “comma”, and “exclamation point.”
The second category, phone-control commands, is far more successful for far more people. This is when you say: Call Mom, Text Emily, Wake me at 7:30, Play some Billy Joel, Remind me to feed the cat when I get home, and so on.
Controlling your phone without touching it is important for safety, of course. If you must interact with your phone while driving, speaking to it certainly seems safer than looking at it. But don’t forget the convenience factor. It’s much faster to say, Open Angry Birds than to flip through home screens full of icons. And Set my alarm for 8 am is about 375 finger-taps quicker than using the clock app.
Here, Siri has the edge. As you’re driving along, for example, and you hear the incoming message sound, you can say, Read my new messages, and Siri reads them aloud. It even invites you to dictate a reply, without ever taking your eyes off the road. Android can’t do that.
Both systems can tap into some of the phone’s own apps. They recognize commands like Make a meeting with Bob Barnett Thursday at noon (a calendar interaction), Make a note to pay back Harold (notes), Send an email to Danny Cooper (mail), and What’s Steve Alper’s home address? (contacts).
Android blows away iOS, though, in Web searches. Both kinds of phones do an amazing job fetching weather updates (What will the weather in Detroit be this weekend?), times (What time is it in Belgium?), stock prices, sports information (When’s the next Cowboys game?), conversions (How many dollars in 32 euros?), calculations (How many days until Valentine’s Day?) and every kind of Web-search query (How many calories are in a Hershey bar?, When is the next solar eclipse?, How do you spell schadenfreude?, Show me pictures of a 1985 Corvette, and so on).
But Google’s bread and butter is Web searches, so Android responses are generally much, much faster. (To try this speak-and-ye-shall-find business on an iPhone, download the Google Search app.)
Android is especially amazing at dialing places without having to look them up (Call the Macy’s on 34th Street) and directions (Get me to La Guardia Airport by public transportation), since its Map app is so unbelievably good. It’s also smarter about connecting questions. If your first question was Who is Hillary Clinton?, you can follow up with Who is her husband?
And Google has a built-in music-recognition feature, like the Shazam app. Tap the voice-recognition icon, let the phone listen to whatever song is playing, and marvel as it instantly identifies the song and singer.
Unfortunately, Android has an Achilles’ heel; actually, more like Achilles’ entire leg. To issue spoken commands, you have to tap the microphone icon on the Google search bar. And it’s only on the home screen or the Google Now screen (swipe up from the bottom). So you can’t speak commands when your phone is locked, or when you’re in another app.
On the iPhone, you hold down the Home button or the clicker on your earbuds cord, so the voice command feature works when the phone is asleep or in any app.
In other words, to use an Android phone’s speech features, you frequently have to pick it up, and you always have to look at it, which defeats much of the purpose. The exception: Motorola’s new phones, like the Moto X, which can be set to listen all the time.
Siri is better with restaurants and movies, too. Both phones understand, Good Indian restaurants around here or Call the Olive Garden on Daleford Road. But Siri can also book reservations, thanks to integration with OpenTable.com. You can say, for example, Make a reservation at an inexpensive Italian restaurant Saturday night at 7.
Similarly, Siri provides attractive, consolidated answer screens for What movies are opening this week?, Give me the reviews for ‘The Way, Way Back’, or What are today’s showtimes for ‘The Smurfs 2’? Android just shows you Google search results.
And then there’s the issue of personality: Siri has it, Android doesn’t. We’re talking about wisecracks, jokes, attitude, addressing you by name. If you ask Siri the question Who’s your daddy?, she replies: You are. Can we get back to work now? Say Beam me up, Siri, and she says: Please remove your belt, shoes, and jacket, and empty your pockets. Say, Talk dirty to me, and she replies Humus. Compost. Pumice. Silt. Gravel.
Now, on the great battlefield of the Apple-Google fanboy war, humor is small potatoes. Apple haters practically claw their eyes out when you mention Siri’s personality. “It’s not useful! It’s a parlor trick! It strains me to avoid profanity in describing how stupid you sound!” And that’s fine. That’s why there’s choice: two camps in this philosophical school. (Well, there’s also Windows Phone and BlackBerry, but their speech recognition is extremely rudimentary.)
And so, put down your swords, fanboys. Both systems are exceedingly useful, once you spend the time to learn them. (Here’s a site with a good list of Android voice commands: j.mp/12kEFDo. And here’s one for Siri: j.mp/16Yy4yy.)
Though Siri has the edge, the gap has closed substantially, and both systems are rapidly improving. For example, until recently Android had no phone-control features at all, only Web searches. And in this fall’s iOS 7 update, Siri will gain a more pleasant speaking voice, faster searches, and the ability to change settings by voice (Turn on Airplane Mode, Turn up the brightness, Turn on Bluetooth), something neither phone can do now.
This much is clear: cellphone speech recognition is getting better fast. Very soon, we’ll do less talking through our phones and more talking to them.