Why voice will be the dominant chatbot interface

When consumer internet arrived in the 1990s, it ushered in a new age of communication.

Free internet chat applications like ICQ, MSN and AIM were limitless and meant we no longer had to rely on expensive, metered phone calls. Yet it was text, not voice, that was to dominate in the years to come. Slow dial-up internet meant we ended up typing to each other, not talking to each other.

So in a world of 4G and fibre broadband, why are we still clinging to text interfaces in unsuitable contexts? Nowhere is this clearer than with chatbots. Text-based messaging between humans is useful because it’s asynchronous – we’re happy for the conversation to carry on later. But chatbots are different. They represent the return of the command line interface – call and response – so we expect communication in real-time.

2016 saw thousands of new chatbots emerge, and while texting with them might have seemed fun, they consistently delivered sub-optimal value. Typing on small screens to friends is convenient, but text-based interactions with chatbots are less so. The effort rarely reaps the rewards we expect:

Is Facebook’s 1-800-Flowers bot really making things easier?

Text-based messaging works best with a human on both sides of the conversation

Imagine walking into Starbucks and ordering a “Venti, half-whole milk, one quarter 1%, one quarter non-fat, extra hot, split quad shots, no foam latte, with whip, 2 packets of splenda, 1 sugar in the raw, a touch of vanilla syrup and 3 short sprinkles of cinnamon.”

Now imagine Starbucks insisting that customers write each part of that order on a separate Post-it note. If Starbucks had never employed humans all over the world to parse each and every voice-based order, they would not be the global success they are today.

Therein lies the challenge with today’s chatbots. The way we ask for their help – through text, not speech – is unnatural for humans. Which makes sense. We’ve been muttering and grunting at each other for at least 100,000 years, but writing things down for only 5,000 of them.

Text-based messaging works best with a human on both sides of the conversation. Humans understand the context, intent and sentiment of other humans. A conversation with a text-based chatbot lacks human emotion, clarity, and urgency. Surely there must be a better way?

Drawing inspiration from offline conversations

To explore ways in which voice can make chatbots more useful, let’s use a simple example of a normal everyday conversation between two humans.

It’s Friday afternoon, and Jim calls Jane. They work on opposite sides of New York City. They chat for a few minutes to plan dinner and drinks, and maybe a Broadway play after work.

Jim adds Concierge-Bot to the conversation with a simple command:

Jim: “Hey, Concierge-Bot”

Concierge-Bot: “How can I help?”

Jim: “Jane and I would like to meet near West 47th Street and 9th Avenue for drinks at 5pm, have dinner close by at 6pm, and would love to see a comedy show at around 8pm.”

Concierge-Bot: “Bar Centrale serve cocktails, and Obao serve an Asian fusion menu with availability at 6pm for 2 people. Drunk Shakespeare have availability for their 8pm show. Would you like me to reserve these options for you?”

Jim: “Yes, please.”

Concierge-Bot: “Should I use the Mastercard you have on file?”

Jim: “Yes, please.”

Concierge-Bot: “I’ve booked Bar Centrale, Obao, and Drunk Shakespeare using your Mastercard. Confirmation is on its way to you now. Can I help with anything else?”

Jim: “No, thank you.”

Concierge-Bot then leaves the conversation, leaving Jim and Jane to continue to catch up.

Voice bot technology is here right now

A human-like voice bot conversation like the one described above may sound like a vague futuristic promise, but we’re closer to it than you might think. The technology required exists today.

Google Assistant marks a significant leap towards our voice bot future

Take Google Assistant, a significant leap towards our voice bot future. Two-way, contextually aware conversations are possible, making a variety of day-to-day tasks fast and easy to complete. With one simple voice command, it’s easy to add multiple items to your shopping list. After that, simply ask “When does Whole Foods close?” to know when to pop out to pick up your groceries. And, when cooking dinner, setting separate timers for different dishes is as simple as “set a 12-minute timer for pizza, set a 20-minute timer for lasagna.” Massive utility value, for tiny effort or input.
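To make the timer example concrete, here is a minimal sketch of how a transcribed command might be turned into structured timer requests. It is not how Google Assistant works internally. The pattern, the function name `parse_timers`, and the assumption that requests arrive as comma-separated phrases in a text transcript are all hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical pattern for phrases such as "set a 12-minute timer for pizza".
# Assumes speech has already been transcribed to text and that separate
# requests are divided by commas, as in the example command above.
TIMER_PATTERN = re.compile(
    r"set an? (\d+)[- ]minute timer for ([a-z ]+)",
    re.IGNORECASE,
)


def parse_timers(utterance: str) -> list[tuple[str, int]]:
    """Return a (label, minutes) pair for each timer request found."""
    timers = []
    for minutes, label in TIMER_PATTERN.findall(utterance):
        timers.append((label.strip(), int(minutes)))
    return timers


command = "set a 12-minute timer for pizza, set a 20-minute timer for lasagna"
for label, minutes in parse_timers(command):
    print(f"{label}: {minutes} minutes")
```

A real assistant would rely on far more robust language understanding than a single regular expression, but the sketch shows why one spoken sentence can replace several taps: the structure is already present in what we say.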

Voice commands are useful everywhere. Whether at home through Google Home, or on the go using Google Pixel, the same natural commands are close by to help you day and night.

Remember our earlier example of a complicated coffee order? One example of voice bot value on the go is Starbucks’ initiative to take voice-based ordering out of their stores and into our pockets. Utilizing voice commands and AI, Starbucks will make it easy for customers to call in their order on the way to the store, ready for them to pick up minutes later. The same efficient voice-based process, but with extra time-saving benefits, as customers won’t have to get to the store to place the order.

Just imagine walking to Starbucks to meet your friends, calling them to tell them you’re on your way, and including a Starbucks Barista bot in the conversation to tell them which complicated coffee order to prepare ahead of your arrival.

Voice will transform business communication too

Cortana, Microsoft’s digital voice activated assistant, is a small step towards voice bots being adopted in the world of work.

Voice bots won’t just add value to our personal lives – they’ll quickly add value to our work lives as well, becoming useful “on demand” assistants to make voice conversations between humans more productive whilst making our “concentration time” more productive too. And we’ll be commanding help from voice bots through every device we use, not just our smartphones.

On desktop, voice bots will help us book follow-up meetings, send summary notes, and share files mid-meeting through voice commands in Google Hangouts, Slack, and Microsoft Teams. Requesting an on-demand marketing report will be as simple as asking “How did our demand generation campaigns perform last month?” We’ll filter complex data in analytics tools like Looker quickly and easily, and hold remote daily stand-ups, all through voice commands. Voice-activated digital assistants are already available on desktop computers. Microsoft’s Cortana is available on Windows 10, Siri is available on Apple’s macOS, and Google are adding voice controls to newer versions of Chrome OS. Each is a small step towards voice bots being adopted in the world of work.

Voice has always been our most human, and most natural interface – so let’s make the most of it. Soon, unnecessary typing and tapping will be a memory of the distant past.
