Audio engineering can make computerized customer support lines seem friendlier and more helpful.
Say you’re on the phone with a company and the automated virtual assistant needs a few seconds to “look up” your information. And then you hear it. The sound is unmistakable. It’s familiar. It’s the clickity-clack of a keyboard. You know it’s just a sound effect, but unlike hold music or a stream of company information, it’s not annoying. In fact, it’s kind of comforting.
Michael Norton and Ryan Buell of the Harvard Business School studied this idea: that customers appreciate knowing that work is being done on their behalf, even when the only “person” “working” is an algorithm. They call it the labor illusion.
Now that interactive voice response (IVR) systems are becoming the new normal for customer support lines, and as they’re able to handle increasingly complex transactions, callers are expecting the same service they once received from human operators, if not better. But at the same time, customers still want the benefits of a live interaction, namely personality. Even when we know there’s not a real person on the line, we want to feel “heard” and trust that we’re getting the best results possible.
“Even though technically you shouldn’t really care whether the website shows you its work or not, it really resonates with us,” explains Norton. “It makes us humanize the website, makes us feel like work is being done for us, and then it makes us actually like the product or service more.”
A good IVR system provides customers with a virtual assistant that’s clearly responding to the caller’s needs and keeps him or her informed throughout the process, through both verbal and non-verbal audio cues.
So what goes into designing a successful IVR system?
To put it succinctly: a lot. Like, more than you probably thought.
Because it’s not just about making a working navigation system that gets the job done efficiently. It’s also about composing audio, finding voice talent that reflects the brand, and creating an experience that mimics a natural human-to-human interaction.
Take Delta Air Lines as an example. In 2013, in partnership with Nuance Communications, the company launched its custom IVR system.
The process of designing the IVR started with identifying Delta’s “key brand attributes,” in this case “optimism, determination, leadership, innovation, and passion,” says Gorm Amand, the Director and Global Discipline Leader/User Interface Design at Nuance. “What we wanted to do was come up with audio [for non-verbal cues] that reflected and promoted those attributes” while people navigate the customer service hotline.
How exactly do you take the word “determination” and turn it into song?
That responsibility fell to Nuance Senior Audio Engineer Dan Castellani. He started by studying the music and sounds Delta has used in its advertising campaigns, in order to understand “what Delta wanted in their brand from a musical standpoint.” From there, Castellani sat down at the piano and composed around 30 different iterations of possible filler sound, eventually narrowing it down to the four or five that best aligned with the pre-existing materials and were the least intrusive for customers. The final result is a trance-like sound they call percolation, somewhere in between a piece of music and a basic sound effect.
“It’s really analogous to the process of selecting a voice talent for one of these systems,” says Amand. “The voice embodies the system and automatically conveys brand, and most of us draw conclusions [from it] very quickly.”
When you call Delta, a male voice answers. It’s in the tenor range, seems friendly enough, and inspires a sense of trust, for me anyway. This impression is in keeping with the results of a 2014 study out of the University of Glasgow, Scotland, that looked into how different voices are perceived. Out of two male voice samples, the higher-pitched sample was seen as less threatening. Think about it: Would you rather have a virtual assistant with Vin Diesel’s voice helping you plan your trip, or one with Paul Rudd’s? I’d personally rather give my credit card information to a Ruddbot.
Along with the pitch and gender association of the voice itself, brands need to consider the pacing. If the virtual assistant talks too fast, it sounds like it’s reciting canned responses. Too slow, and people get impatient. Many times, “the technology can go faster than a human could, but it’s often not the right thing for establishing trust,” explains Jane Price of Interactions, a Massachusetts-based speech recognition and virtual assistant technology company. Slowing it down a little “helps to put it on track with [a customer’s] normal expectations and how they would prefer to communicate.”
This brings us back to the automated typing sound. Michael Pell, Interactions’ director of design services, decided that the company’s signature filler audio would be keyboard clacking. They’ve licensed the audio to other companies, including Hyatt, Humana, and LifeLock.
“With the filler you want it to do two things,” explains Pell. “You want people to understand that you’re still there and that you’re doing something for them. In the computer age, you can say work is being done by typing. It has the immediate naturally understood connotation of ‘I’m doing something for you.’”
A well-designed IVR really combines the best of two worlds. It’s a human interaction without a potential attitude problem. Robots can’t have bad days, and they’re never hangry. Plus, reducing the number of human call center assistants saves the companies a substantial amount of money.
Here’s a question though: what will happen to filler sounds when computers can handle complex, context-sensitive tasks in a second or less? Bill Byrne, the inventor of what he calls “fetch audio” for Goog-411, Google’s first speech recognition effort, thinks filler may soon become a thing of the past. Goog-411 and other early speech recognition programs required extra time to execute customer requests, so filler was a necessity. But already, as processes speed up, we’re hearing less of it.
Still, the teams at Nuance and Interactions have faith that filler will never become obsolete. Not because computers won’t get faster, but because innovation in the field of speech recognition will continue. Computers will be tasked with handling even more complex demands as the algorithms’ capabilities evolve, and will once again need additional time to do so.
Plus, as Norton says, “we really like other people doing what we tell them to do,” especially when it’s a no-fun, time-sucking, labor-intensive task. So being able to sit back and hear someone else do the typing will always be a gratifying experience. Even if it’s just a bot.