

Future voice interfaces could turn us all into geniuses — or idiots

If voice interfaces are indeed the future, Sophia Yeres — of Brooklyn-based Huge Inc. — today gave us a peek at what that evolution looks like.

Alexa, Siri, and Google’s aptly named ‘Assistant’ are what’s available today, but it’s what’s next that’s worthy of awe. Each of them, like it or not, is more gimmick than workhorse. There’s utility, but none is indispensable, and each gets it wrong nearly as often as it gets it right. But that’s changing, and quickly.

Yeres calls it the ‘quantified mind’ — a future where our ability to communicate with machines takes on nuance similar to how we’d talk to other humans. It’s important work, and it’s not without potential pitfalls. Pitfalls aside, we’re moving toward this faster than most realize.

By 2018, at least 30 percent of all interactions will happen through a voice interface.

Still, for most of us, the voice assistant baked into our phone is merely an extension of the device itself. The problem lies in the secondary language each of us must acquire to talk to our machines. We don’t speak; we command, and each command is almost equally likely to produce the desired result or something else altogether.

Voice interfaces that really get us

In the future, we’ll still talk to our devices. Better yet, we’ll rely on them to talk back and to truly understand not just what we’re saying, but the meaning behind it, based on contextual clues, past interactions, and an innate understanding of the world they live in.

First, conversation. A conversation boils down to three key elements:

  • Dialogue: Conversing, not commanding. Advances in deep reinforcement learning have the potential to improve machines’ ability to understand the subtle nuances of dialogue and the back-and-forth nature required to achieve it. The goal here, according to Yeres, is the Aaron Sorkin-level walk-and-talk made famous by The West Wing.
  • Range: The best dialogue systems we currently have are limited to very specific domains: flights, car controls, shopping, etc. Advances in deep transfer learning and general AI will deepen understanding, allowing voice assistants to shift seamlessly between domains while gaining understanding of other topics, “kind of like that guy who knows something about everything,” says Yeres.
  • Emotion: The ability to understand and express emotion is completely lost on today’s machines. Advances in affective computing could push toward machines capable of understanding not just how we’re feeling, but how stimuli in our current environment could shift that mood (a toy sketch of these three pieces follows this list).
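To make those three pieces a bit more concrete, here’s a minimal Python sketch. It isn’t how Huge (or anyone else) actually builds these systems: the keyword lists and word-count scoring stand in for the learned models a real assistant would rely on, and every name and list in it is invented purely for illustration.

```python
# Toy illustration of the three elements above: dialogue (history), range
# (domain switching), and emotion (a crude mood score). Everything here is
# hypothetical and wildly simplified; real systems use learned models,
# not keyword lists.
from dataclasses import dataclass, field

DOMAIN_KEYWORDS = {
    "flights": {"flight", "airport", "boarding"},
    "shopping": {"buy", "order", "cart"},
    "calendar": {"schedule", "meeting", "tomorrow"},
}

NEGATIVE_WORDS = {"tired", "stressed", "annoyed", "awful"}
POSITIVE_WORDS = {"great", "excited", "happy", "love"}


@dataclass
class DialogueState:
    domain: str = "general"                      # "range": the topic we think we're in
    mood: float = 0.0                            # "emotion": running positive/negative score
    history: list = field(default_factory=list)  # "dialogue": the back-and-forth so far

    def update(self, utterance: str) -> None:
        words = {w.strip(",.?!") for w in utterance.lower().split()}
        # Range: switch domain if the utterance clearly points somewhere else.
        for domain, keywords in DOMAIN_KEYWORDS.items():
            if words & keywords:
                self.domain = domain
                break
        # Emotion: nudge the mood estimate up or down.
        self.mood += len(words & POSITIVE_WORDS) - len(words & NEGATIVE_WORDS)
        # Dialogue: remember the turn so later replies can refer back to it.
        self.history.append(utterance)


state = DialogueState()
state.update("I'm so tired, can you check my flight tomorrow?")
print(state.domain, state.mood)  # -> flights -1.0
```

Even this toy version shows why the three elements are hard to separate: the same sentence carries a topic, a feeling, and a place in an ongoing exchange all at once.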

None of our current devices are adept at handling any of the three key elements of conversation. It’s the last, however, that could prove most revolutionary.

Rather than simply responding to commands, future voice assistants could recognize when you need a bit of a pick-me-up, diving into human understanding at a much deeper level to make us more aware of what causes good moods, bad moods, and everything in between. For example, a future voice assistant might notice you’re in a bit of a mood and offer the following:

You laugh a lot with Laurie, but you haven’t seen her in a while. Want me to schedule a visit?

— or —

When David is in the office, your mood takes a hit. Do you think you might want to work with others today (or from home)?
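How might an assistant arrive at nudges like those? Purely as illustration, here’s a hypothetical Python sketch that compares your logged mood on days you’ve seen a given person against your overall baseline. The data, names, and thresholds are all invented; a real system would draw on far richer signals than a single daily score.

```python
# Hypothetical "quantified mind" nudge: correlate logged daily mood with
# who you spent time with, then surface a suggestion when the gap from
# baseline is large. All values below are made up for illustration.
from statistics import mean

# (person seen that day, mood score logged for that day, on a -1..1 scale)
mood_log = [
    ("Laurie", 0.8), ("David", -0.4), ("Laurie", 0.6),
    ("David", -0.2), ("nobody", 0.1), ("Laurie", 0.7),
]

baseline = mean(score for _, score in mood_log)

def mood_with(person: str) -> float:
    scores = [s for p, s in mood_log if p == person]
    return mean(scores) if scores else baseline

for person in ("Laurie", "David"):
    delta = mood_with(person) - baseline
    if delta > 0.3:
        print(f"You laugh a lot with {person}. Want me to schedule a visit?")
    elif delta < -0.3:
        print(f"When {person} is in the office, your mood takes a hit. Work from home today?")
```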

It’s the quantified mind, and it’s coming — like it or not.

Reducing choice

The interface of the future is all about reducing choice. That’s not a bad thing: we make an estimated 35,000 choices a day; half of those probably came in the form of word selection for this article. Broken down, that’s roughly 1,458 decisions an hour and a staggering 24 a minute. Roughly every two and a half seconds, we’re forced to make another decision.
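For the skeptics, that breakdown is straightforward arithmetic, assuming the 35,000 figure is taken at face value and spread across a full 24-hour day:

```python
# Quick check on the decision arithmetic, taking the 35,000 figure at face
# value and spreading it across a full 24-hour day.
choices_per_day = 35_000
per_hour = choices_per_day / 24       # ~1,458 decisions an hour
per_minute = per_hour / 60            # ~24 a minute
seconds_between = 60 / per_minute     # ~2.5 seconds between decisions
print(round(per_hour), round(per_minute), round(seconds_between, 1))  # 1458 24 2.5
```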

It’s overwhelming.

Fear not: machines are here to help. Rather than pointing you in the direction of an answer, or opening an app so you can complete the task yourself, future voice interfaces exist to solve the problem for you, to eliminate the choice. Do you select the Uber X or the Uber Pool? Do you want your pizza with extra cheese, or would you prefer to skip the $2.99 and put it toward chicken wings? Did you really mean to use “duck” instead of “fuck”?

On the surface, none of these are taxing decisions. Pooled over the course of a few hours, or a day, they often lead to additional stress and anxiety. Ironically, the devices that make our lives easier tend to tax our mental health in the process.

It’s not all sunshine and rainbows

The unfortunate trade-off of the so-called ‘quantified mind’ is, well, becoming a sort of mental zombie. These thousands of micro-transactions between your brain and the world around you, it turns out, are important.

And unfortunately, we haven’t figured out a way to exercise muscle groups by watching someone else put in the work. Humans keep their minds sharp through discovery. In essence, each time we reach into uncharted territory and add a new experience, we’re training our brain the way bodybuilders train their biceps. Each positive new experience also limits the damage caused by bad ones. Over time, these experiences shape our thoughts, personality, and general well-being.

Ironically enough, the way around this dilemma is to leverage the strengths of the very things weakening our brains. Through collective data, the quantified mind could actually lead to more brain-sharpening discovery, or at least that’s the hope.

And that’s the next step: finding symbiosis between convenience and, well, not being a brain-dead tech junkie who couldn’t survive without a smart device.
