Why your voice is your new password

Google now lets some users verify purchases using voice alone. It’s just the beginning. Welcome to the future of biometric ID and verification.

Face recognition is having a moment.

Apple mainstreamed face unlocking for smartphones with its Face ID. Clearview AI freaked everyone out about the implications of a system that can let you upload a photo and find out anything about anyone in seconds. And several companies in the United States and China have developed face-recognition systems that can ID faces even when most of those faces are covered by a face mask. Everyone is talking about face recognition.

At face value, it appears that face recognition will serve as the dominant biometric system over the next 20 years. But let me voice an alternative prediction: Voice ID will become extremely important over the next decade.

Here comes voice ID

Google is slowly rolling out a new Google Assistant feature called Voice Match, which identifies users by the sound of their voice in order to authenticate purchases.

You can find Voice Match on Android phones and in the Google Assistant app on iOS. It is limited in availability and designated as a pilot project, and it works only for Google Play purchases and restaurant orders.

Voice Match was introduced in 2017 as a new feature in the Pixel 2 smartphone and Google Home smart speakers. Back then, Voice Match enabled Google Assistant to tell who was speaking in order to choose the right calendar, email and media services. Last month Google upgraded Voice Match to become more accurate.

All the major smart speakers recognize the voices of users. Amazon's Alexa recognizes voices to know whose music playlists to reference when songs are requested. Apple's HomePod does something similar.

What's new is the use of Voice Match for purchase verification, rather than mere content customization. While fingerprint ID and face ID are options for authenticating the buyer on smartphones, tablets and laptops, they're not options on smart speakers. Voice ID systems like Voice Match make authentication possible there.

While voice ID for purchases is new in the personal assistant and smart speaker space, it's not new in the world of money generally.

Hundreds of banks and other financial organizations use voice ID services from a company called Pindrop and from Pindrop's competitors. So in the financial industry, voice ID is already mainstream.

The U.K. bank HSBC says that using voice biometrics for call center calls has prevented nearly $500 million in fraud since the bank introduced the feature four years ago. For example, some 17,000 attempted fraudulent calls were identified.

Despite this industry ubiquity, the public is wary about voice ID, and increasingly so in the past year.

Bret Kinsella, the founder and CEO of the analyst and news organization Voicebot.ai, told me that his recent surveys found that "security and privacy concerns have really crept up over the last year." He said that people who don't have voice-based devices cite security and privacy as the reason. The security issue is a barrier for companies wanting to drive more adoption.

A different survey by biometric security startup ID R&D found that two-thirds of U.S. adults are concerned their accounts could be accessed by hackers faking their voice.

The Pew Research Center’s American Trends Panel survey found that more than half of smart speaker owners don’t want better personalization because they’re concerned about personal data security and privacy.

Google is trying to overcome growing public privacy and security concerns with its Voice Match by designing it so that the voice model that enables the identification resides on the local device, not in the cloud. However, at the point of usage, the device uploads both the query and the voice model to process the query, then deletes both the voice model and comparison data. (In general, voice ID works by converting the unique attributes of a recorded spoken voice into an algorithmic template called a voice print or a voice model.)
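To make the idea of a voice model concrete, here is a minimal sketch in Python of how a stored voice print might be compared with a fresh sample. It is an illustration under loud assumptions, not a description of how Voice Match or any real product works: the three-number "voice prints," the function names and the 0.8 threshold are all hypothetical, and real systems derive much larger templates from audio and use more sophisticated scoring.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no usable voice information
    return dot / (norm_a * norm_b)


def matches_voice_print(enrolled, sample, threshold=0.8):
    """Hypothetical check: accept the sample if it is close enough to the enrolled print.

    The threshold is an assumed value chosen for illustration only.
    """
    if len(enrolled) != len(sample):
        raise ValueError("voice prints must have the same length")
    return cosine_similarity(enrolled, sample) >= threshold


# Made-up voice prints; a real template would be derived from recorded speech.
enrolled_print = [0.9, 0.1, 0.4]
same_speaker = [0.85, 0.15, 0.38]
other_speaker = [0.1, 0.9, 0.2]

print(matches_voice_print(enrolled_print, same_speaker))   # True
print(matches_voice_print(enrolled_print, other_speaker))  # False
```

The point of the sketch is that the system never needs to keep the original recording to make the comparison. It needs only the template, which is why where that template is stored and sent matters so much for privacy.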

It's unclear to me why uploading the voice model with the query over the public internet and into the cloud each time the feature is used is more secure than leaving the voice model in the cloud. Google didn't respond to my request for comments.

In any event, the worry that voice ID will threaten security and violate privacy is not irrational.

We're likely facing another arms race between cybercriminals trying to spoof people's voices and technologies that detect whether a voice is live, synthetic or recorded. Already, companies like ID R&D offer what they call "liveness detection" technology.

Voice ID is also being heavily used by police and spy agencies around the world. Interpol, for example, uses a system called the Speaker Identification Integrated Project (SiiP). And the NSA has reportedly deployed advanced voice ID technology since 2006.

As these agencies demonstrate, any voice recordings captured on YouTube, in phone calls or via surveillance microphones can be used to generate the voice model needed for voice ID.

And that's one of the attributes that voice ID has in common with face recognition. The "biometric data" can be captured easily, at a distance and without the knowledge or permission of the person being scanned. And then it can be applied just as easily to identify people.

So it appears that voice ID could prove to be a problematic technology in the future. But I believe it will ultimately be accepted by the public because it will be so incredibly useful and convenient.

Why voice ID has a bright future

The coronavirus pandemic is changing everything. In the tech space, it's accelerating previously slow trends.

Kinsella pointed out to me that in the era of the coronavirus, when people may not want to touch fingerprint readers and where masks foil face identification, the acceptance of voice ID may accelerate.

He also believes that voice ID will become most prevalent "in the car and the home," mostly for convenience.

But I think the biggest boost for voice ID will be the decline and fall of the smartphone as the central device in everyone's lives.

Over time, the importance of smartphones will fade as the importance of wearables grows. Smartwatches will increasingly get their own SIM cards. It seems likely to me that Apple will mainstream smart glasses within the next five years. Hearables -- basically earbuds powered by artificial intelligence -- will become increasingly popular. I'm even bullish about smart rings, which can protect you from diseases and function as pointing devices for our smart glasses.

Voice ID is a better method for identification and authentication than face recognition and fingerprint reading for wearables. Some wearables won't have the space for fingerprint readers. Most won't have cameras or similar sensors pointed at user faces for face recognition. But they'll nearly all have microphones for processing voice ID.

And the best part is, if someone is wearing multiple wearables -- smartwatch, smart ring, smart glasses and smart clothing -- they can authenticate them all at once through voice.

I think the era of smart wearables will also usher in an era of voice ID as the primary way that we are identified and our actions authenticated.