by John Edwards

Speech Integration Technology Can Improve Customer Service and Cut Costs

News
Jul 01, 2003

To Bob DuPont, vice president of reservations for Dollar Thrifty Automotive Group, speech integration sounds like success. That’s because the car rental company is using the technology to both improve customer service and trim costs.

Speech integration technology is nothing new, as any telephone caller who has ever barked back responses to a seemingly endless series of voice prompts can testify. But an improved generation of speech integration software, based on more powerful processors and emerging Internet-focused standards, promises to make the technology more useful and cost-effective.

Until recently, organizations tended to shy away from speech integration because of the technology’s complexity and cost. “I had one client who had 60 people on its [speech integration] project,” says Elizabeth Ussher, Meta Group’s vice president of global networking strategies, who covers speech technologies.

Today, preconfigured speech templates, drop-in objects and other packaged tools make speech integration development less burdensome. Hardware improvements, particularly speedier processors, also help make speech integration a more practical technology. “Speech recognition is now very widely deployable,” says Ussher. “I’m seeing clients with a return on their investment within three to six months.”

Yet another reason for increased interest in enterprise speech integration is the almost exponential proliferation of mobile phones, PDAs and other portable wireless devices. Speech input/output is an attractive alternative to cramped keyboards and minuscule displays. “If I’m on my mobile phone while driving my car, I’m not going to push buttons for my account number,” says Ussher. “I’m going to wait for an agent, living or virtual.”

Calling for Cars

Dollar Thrifty is using speech integration to handle some of the more than 1 million calls it receives each year from “rate shoppers,” bargain hunters who phone several different car rental companies in search of the best deal. “Many of the folks who call are just interested in checking rates,” says DuPont. “They aren’t interested in making a reservation; they just want to get information for comparison purposes.”

To free its call center staff from the burden of handling routine data lookups, Dollar Thrifty installed SpeechWorks International’s software at its Thrifty division. The system lets callers check rental rates and availability at airport locations by talking with a virtual call center agent. “It’s a very natural, realistic interchange,” says DuPont. The software also automatically adapts to unique requirements, such as providing personalized rates for members of Thrifty’s loyalty program.

After checking rates and availability, callers who decide to make a reservation are seamlessly transferred to a live agent. A screen “pop” automatically appears on the agent’s display, presenting all the information the caller provided during the speech interface dialogue. DuPont estimates that 35 percent of calls to the company’s toll-free number go through the speech integration system.

And speech integration has not hurt Thrifty’s conversion rate (the number of people calling for a quote who ultimately make a reservation), says DuPont.

Deploying the system wasn’t especially difficult, he adds. “Just the normal tweaking of the application and getting the voice recognizer to work better. Once we got through the first 90 to 120 days, it became apparent that we had a very solid application.” Uptime has been more than 99 percent, which is a critical factor, says DuPont. “If it were to go down, we certainly would be understaffed.”

Natural Language

Enterprises looking into speech integration face two basic technology choices. The older and simpler type of speech integration, “directed dialogue” products, prompts callers with a series of questions and recognizes only a limited number of responses, such as “yes” and “no,” specific names, and numbers.
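The directed dialogue pattern described above can be sketched in a few lines of Python. This is only an illustration, not any vendor’s implementation: the prompts, the accepted vocabularies and the `run_dialogue` function are hypothetical, and the recognizer is replaced by a list of already-transcribed responses.

```python
from typing import Dict, Iterable, Optional

# Hypothetical prompts, each paired with the small fixed vocabulary it accepts.
PROMPTS = [
    ("pickup_city", "Which city are you renting in?", {"boston", "new york", "tulsa"}),
    ("car_class", "Compact, midsize or full-size?", {"compact", "midsize", "full-size"}),
    ("confirm", "Shall I look up the rate? Say yes or no.", {"yes", "no"}),
]


def run_dialogue(responses: Iterable[str], max_retries: int = 2) -> Optional[Dict[str, str]]:
    """Walk the prompts in order, consuming one transcribed response per attempt.

    Returns the collected answers, or None when a prompt runs out of retries
    (the point at which a real system would likely transfer to a live agent).
    """
    answers = iter(responses)
    slots: Dict[str, str] = {}
    for name, _question, allowed in PROMPTS:
        for _ in range(max_retries + 1):
            heard = next(answers, "").strip().lower()
            if heard in allowed:  # only exact, expected phrases are understood
                slots[name] = heard
                break
        else:
            return None  # every attempt for this prompt failed
    return slots
```

Calling `run_dialogue(["Boston", "um, a small car", "compact", "yes"])` fills all three slots after one retry on the second prompt. Anything outside the fixed vocabulary is simply rejected, which is the limitation that natural language systems try to remove.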

A newer and more sophisticated approach to speech integration, “natural language,” handles complete sentences and aims to engage callers in lifelike banter with a virtual call center agent. The technology is also more forgiving of word usage. “If a customer calls Thrifty and asks about rates from JFK Airport in New York, they might say ‘JFK’ or ‘John F. Kennedy’ or ‘Kennedy Airport,’” says SpeechWorks cofounder and CTO Michael Phillips. “The system has to be prepared for the different variations that might be used.”
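One small piece of that flexibility, mapping the different names a caller might use onto a single location, can be illustrated with a simple alias table. The sketch below is an assumption-laden toy rather than how SpeechWorks’ software works: the alias lists, the `resolve_airport` function and the Boston entries are hypothetical, and only the three JFK variants come from Phillips’ example.

```python
import re
from typing import Optional

# Hypothetical alias table mapping spoken variants to one airport code.
AIRPORT_ALIASES = {
    "JFK": ["jfk", "john f kennedy", "kennedy airport", "kennedy"],
    "BOS": ["bos", "logan", "logan airport", "boston logan"],
}


def _normalize(text: str) -> str:
    """Lowercase, drop punctuation and collapse runs of whitespace."""
    text = re.sub(r"[^a-z0-9 ]", "", text.lower())
    return re.sub(r"\s+", " ", text).strip()


# Flatten the table into a direct alias-to-code lookup.
LOOKUP = {
    _normalize(alias): code
    for code, aliases in AIRPORT_ALIASES.items()
    for alias in aliases
}


def resolve_airport(utterance: str) -> Optional[str]:
    """Return the code for the longest alias that appears as whole words."""
    padded = f" {_normalize(utterance)} "
    for alias in sorted(LOOKUP, key=len, reverse=True):
        if f" {alias} " in padded:
            return LOOKUP[alias]
    return None
```

With this table, “JFK,” “John F. Kennedy” and “Kennedy Airport” all resolve to the same code, while an utterance that names no known airport returns `None`. A production recognizer would work on audio and statistical language models rather than string matching; the sketch shows only the normalization idea.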

Directed dialogue tools, while less expensive than natural language systems, suffer from their limited recognition capabilities. As a result, they are mostly used for simple applications, such as automated switchboard attendants or credit card activators. Natural language systems, such as the type used by Dollar Thrifty, have a wide range of applications, including product and service ordering, telebanking, and travel reservation booking.

A pair of emerging technologies, VoiceXML and Speech Application Language Tags (SALT), are also helping to advance speech integration. Both specifications rely on Web technology to make it easier to develop and deploy speech integration applications.

VoiceXML is an XML-based markup language for creating telephone-based speech user interfaces. The specification lets developers create directed dialogue speech systems that recognize specific words and phrases, such as names and numbers. That style of interface is well suited to callers who have no screen from which to select options.
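To give a sense of what such markup looks like, the Python sketch below generates a minimal VoiceXML 2.0 document containing one form with a single field. The element names (`vxml`, `form`, `field`, `prompt`, `option`, `nomatch`, `filled`, `value`) come from the VoiceXML 2.0 specification; the form ID, prompt wording, city list and the `build_rate_form` helper are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # VoiceXML 2.0 namespace


def build_rate_form(cities):
    """Return a minimal VoiceXML 2.0 document as a string.

    The form asks one question and accepts only the listed cities,
    which is the directed dialogue style described above.
    """
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "rate_quote"})  # hypothetical ID
    field = ET.SubElement(form, "field", {"name": "city"})
    ET.SubElement(field, "prompt").text = "Which city are you renting in?"
    for city in cities:
        # Each <option> is one phrase the recognizer should accept.
        ET.SubElement(field, "option").text = city
    ET.SubElement(field, "nomatch").text = "Sorry, I did not understand."
    # <filled> runs once the field has a recognized value.
    filled = ET.SubElement(field, "filled")
    prompt = ET.SubElement(filled, "prompt")
    prompt.text = "Looking up rates for "
    value = ET.SubElement(prompt, "value", {"expr": "city"})
    value.tail = "."
    return ET.tostring(vxml, encoding="unicode")
```

Calling `build_rate_form(["Boston", "Tulsa"])` yields markup that a VoiceXML platform could, in principle, interpret over the phone. A real deployment would also need error handling, grammars tuned to the caller population and a back-end lookup, none of which is shown here.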

SALT, on the other hand, provides extensions to commonly used Web-based markup languages, principally HTML and XHTML. It makes such applications accessible from GUI-based devices, including PCs and PDAs. A user, for example, might click on an icon and say, “Show me the flights from San Francisco to Boston after 7 p.m. on Saturday,” and the browser will display the flights.

Both specifications aim to help developers create speech interfaces using familiar techniques. “You don’t have to reinvent the wheel and program a new interface to get speech recognition access to your data,” says Brian Strachman, a speech recognition analyst at technology research company In-Stat/MDR.

Speaking Internally

While most people think of speech integration in terms of customer self-service, the technology can also be used internally to connect an enterprise’s employees and business partners to critical information. Aircraft mechanics, for example, can use speech integration to call up technical data onto a PDA or notebook screen. Likewise, inventory takers can enter data directly into databases via speech-enabled PDAs, without ever using their hands.

The Bank of New York, for example, has tied speech recognition into its phone directory and human resources systems. Using technology supplied by Phonetic Systems, the bank operates an automated voice attendant that lets callers connect to a specific employee simply by speaking that person’s name. But in the event of a major emergency that requires entire departments to move to a new location, the employees can call into the system to instantly create updated contact information. The information then becomes available to anyone calling the bank’s attendant.

The speech-based approach is designed to help bank employees resume their work as soon as possible, even before they have access to computers. “The automated attendant was already connected to our back-end systems,” says Jeffrey Kuhn, senior vice president of business continuity and planning. “We simply expanded the number of data fields that are shared between the Phonetic’s product, our HR system and our phone directory system.”

The biggest challenge Kuhn faced in deploying the technology was getting it to mesh with the bank’s older analog PBX systems. That problem was eventually solved, although the interface ports on the old PBX units must now be manually set, which is a minor inconvenience.

Bottom-Line Benefits

Speech integration’s primary benefit for callers is convenience, since the technology eliminates the need to wait for a live agent. Problems handling foreign accents, minor speech impediments and quirky word pronunciations are largely fading away as software developers give their products the capability to recognize and match a wider array of voice types. “Every four to five years, speech technologies improve by a factor of two,” says Kai-Fu Lee, vice president of Microsoft Speech Technologies.

Dollar Thrifty’s DuPont says his company’s internal research has found an end user satisfaction level of around 93 percent. “It either met or exceeded their need to get information, and they had an improved perception of our company,” he says.

For enterprises, speech integration’s bottom-line benefits include cheaper 24/7 user support and data access. DuPont says his system paid for itself in less than a year, lopping about 45 cents off the cost of each incoming call. Bank of New York’s Kuhn estimates that his system handles the work of five full-time employees.

Despite the potential benefits, CIOs shouldn’t view speech integration as a panacea for their rising call center costs. The technology itself requires constant attention, which adds to its base cost and detracts from potential savings. “It’s labor intensive,” says Meta Group’s Ussher. “It’s not like a washing machine that runs on its own. It’s a technology that requires constant tweaking, pushing and updating.”

DuPont warns potential adopters not to consider speech integration as solely an IT issue. Since the technology affects a wide range of business processes, he believes that it’s vital to garner enterprisewide support. “I would certainly recommend getting all the stakeholders involved,” he says. “When we put our system together, we involved people from many disciplines, including IT, HR, finance and telecom, as well as the reservations group.”

While speech integration will certainly become more capable and self-sufficient in the years ahead, few observers believe the technology will ever fully replace living, breathing call center agents. In-Stat/MDR’s Strachman says that speech integration will primarily be used to eliminate call center grunt work, such as the recitation of fares and schedules, and to give end users a new way to access critical data. The handling of complex issues, such as technical support, will probably always require access to a live expert. “For call center agents to stay employed, they’re going to have to be more highly skilled and trained than they are now,” says Strachman.

That suits Dollar Thrifty’s DuPont just fine. “We want our agents to do something more than just quote rates,” he says. “You can get a system to do that, and we did.”