“AI is the new UI” may be a cliché now, but back in 2011, when Apple first released Siri, the capability to control a mobile device by talking to it through an intelligent assistant was revolutionary. Granted, Siri wasn’t as smart as HAL in the movie 2001: A Space Odyssey or Eddie, the shipboard computer in The Hitchhiker’s Guide to the Galaxy, but it made enough of an impact on consumer technology to spawn a stream of similar intelligent assistants.
Siri was soon followed by Amazon’s Alexa, Microsoft’s Cortana, and Google’s Assistant. And these will likely be joined soon by many others, including Samsung’s Bixby, which is based on technology Samsung acquired when it bought Viv, a company founded by the people behind Siri.
And just as the iPhone took off when Apple opened it up to third-party app makers, the key to the success of these intelligent assistants may well be the ability for third-party developers to access them and employ them as a user interface to their applications.
It’s an idea that’s not been lost on the technology companies behind these assistants. Here’s a look at how the ‘big four’ assistants — Alexa, Siri, Cortana, and Google Assistant — are being used now, where they differ, and what comes next.
Amazon has been particularly successful in driving third-party adoption of Alexa. The company first made its Alexa Skills Kit available to developers in June 2015, and six months later over 130 skills were available. (Skills, in Amazon parlance, are applications that can be accessed on one of Amazon’s Echo devices using Alexa as the user interface.) Since then, the development of Alexa skills has exploded. By September 2016 over 3,000 were available, and in February 2017 Amazon announced that the number of skills had burst through the 10,000 mark. That means over 10,000 applications use Alexa as their user interface.
Perhaps more significant is the availability of Alexa Voice Service (AVS). Released in 2015, AVS allows manufacturers to build Alexa into connected products that have a microphone and speaker. Chinese manufacturer Tongfang has announced plans to integrate Alexa into its Seiki, Westinghouse and Element Electronics smart TVs using a microphone built into the remote controls, enabling owners to use Alexa to carry out actions such as searching channel listings and managing the TV settings.
Chinese phone manufacturer Huawei has also announced plans to build Alexa into its Mate 9 smartphone, and car makers such as Ford are planning to build Alexa into their vehicles to enable drivers to carry out actions such as playing music or setting destinations on the navigation system. Ford owners will also be able to use specially developed skills on Echo devices to carry out functions on their cars such as activating remote start or locking and unlocking doors.
When it comes to voice-enabling third-party devices, James McQuivey, an analyst at Forrester Research, says that Amazon has a huge advantage over its rivals. That’s because it has been working on Alexa for two years, and it can draw on its experiences with its AWS cloud. “If you are working on a washing machine then any problems have probably already been solved, Alexa has been tested, and someone may already have deployed it for that use,” he says. “Amazon has realized that for Alexa to be deployed like this then it has to handle the cloud, security and so on, and it has learned how to do that from AWS. Apple doesn’t have that.”
Of course, both Microsoft and Google do have experience running large-scale clouds, and technically there is probably not a big difference between their intelligent assistant technologies, McQuivey says. Companies’ decisions about which to adopt could therefore hinge on more strategic considerations, such as whether Amazon could become a direct competitor or whether they want to align themselves with Google or Microsoft.
Despite being the first in this wave of intelligent assistants, Apple has been slow to offer Siri as a user interface to third-party applications; it’s only with the release of iOS 10 and SiriKit that it has been possible for external developers to build software that can be controlled by Siri. Even so, the possibilities are severely limited compared to what developers can do with the Alexa Skills Kit.
SiriKit can be used to build apps only in a particular set of “domains” with specific “intents.” For example, a messaging app can register to support the Messages domain and the intent to send a message. Siri then handles all of the user interaction, including the voice and natural language recognition and getting information. As well as messaging, apps can be built that support the following domains: ride booking, photo search, payments, VoIP calling, workouts, and adjusting the climate controls and radio settings in CarPlay apps. But that’s all.
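To make the domain-and-intent idea concrete, here is a minimal sketch in Python. It is not SiriKit code: the `Assistant` class, the domain strings, and the `send_message` handler are all hypothetical names invented for illustration. It shows only the general pattern described above, in which the assistant accepts registrations for a fixed set of domains and routes each recognized intent to the app that registered for it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# Hypothetical labels for the domains listed in the article.
# These are illustrative strings, not real SiriKit identifiers.
SUPPORTED_DOMAINS = {
    "messaging", "ride_booking", "photo_search", "payments",
    "voip_calling", "workouts", "carplay",
}

Handler = Callable[[dict], str]


@dataclass
class Assistant:
    """Toy stand-in for an assistant that routes intents to apps."""
    handlers: Dict[Tuple[str, str], Handler] = field(default_factory=dict)

    def register(self, domain: str, intent: str, handler: Handler) -> None:
        # Apps may only register inside the fixed set of domains.
        if domain not in SUPPORTED_DOMAINS:
            raise ValueError(f"unsupported domain: {domain}")
        self.handlers[(domain, intent)] = handler

    def dispatch(self, domain: str, intent: str, params: dict) -> str:
        # The assistant has already done the speech recognition and
        # parsing; the app only receives the structured intent.
        handler = self.handlers.get((domain, intent))
        if handler is None:
            return "Sorry, no app can handle that."
        return handler(params)


def send_message(params: dict) -> str:
    # Hypothetical app-side handler for the "send a message" intent.
    return f"Sent '{params['text']}' to {params['recipient']}"


assistant = Assistant()
assistant.register("messaging", "send_message", send_message)
print(assistant.dispatch(
    "messaging", "send_message",
    {"recipient": "Ana", "text": "On my way"},
))
```

In this sketch, an attempt to register a handler for an unlisted domain such as weather raises an error, which mirrors the restriction the article describes: developers can plug into the domains the platform offers, and nothing else.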
Of all these intelligent assistants, Siri stands out as the exception because it is confined to devices made by Apple. And although it is the most widely distributed, it is by no means the most commonly used. “A high percentage of Apple users say that they have used Siri once, but they don’t use it often,” says McQuivey. “By contrast, one third of Echo users use Alexa multiple times per day, and another third use it once a day,” he adds.
Microsoft’s Cortana is different from Siri or Alexa in that it is a multi-platform intelligent assistant. It started out on Microsoft’s mobile platform and is now available as an app on iOS and Android, as well as on Windows 10 and Microsoft’s Xbox game platform. It’s a strategy that Microsoft calls making Cortana “unbound,” which the company explains, in an apparent dig at Apple, means that it is “tied to you, not to any one platform or device.”
Microsoft is also developing a Skills Kit for Cortana that will allow developers to take bots created with the Microsoft Bot Framework and publish them to Cortana as a new skill. In addition, developers will be able to repurpose code from existing Alexa skills to create Cortana skills, and they will be able to integrate their web services as skills.
Taking another page from the Amazon playbook, Microsoft has also announced a Cortana Devices software development kit (SDK), which will allow OEMs and ODMs to put Cortana into all kinds of products from cars to televisions to mobile devices, and even an Echo-like connected speaker, which audio equipment manufacturer Harman Kardon plans to release later this year.
Cortana will also work on the IoT Core edition of Windows 10, offering the possibility of IoT devices that can be controlled using Cortana as the user interface.
Google is slightly behind the curve when it comes to intelligent assistants. Google Assistant appeared on the scene only within the last few months: in the Allo messaging app launched in September 2016, the Pixel smartphone launched in October 2016, and the Google Home device launched in November 2016. And it was only in December 2016 that Google launched its Actions on Google program, which allows developers to use Google Assistant as the user interface for Google Home skills. Google has also promised an Embedded Google Assistant SDK that will allow other hardware makers to embed Assistant in their products.
Beyond the big four
There are other companies also developing general intelligent assistants that may be made available to product makers of all sorts. For example, audio recognition and cognition company SoundHound has launched a platform called Houndify that includes large-scale speech recognition and natural language understanding, connections to various data providers to access different types of information, and the capability to add custom phrases and queries.
Once a product is integrated with Houndify, the company promises, the product will instantly understand a wide variety of questions and commands in the same way that Alexa or Cortana can. The difference is that rather than making a feature of having Alexa or Cortana in their product, manufacturers will be able to use Houndify as an ‘own brand’ intelligent assistant. In addition to Android and iOS SDKs, the company has made SDKs available for C++, Web, Python, Java, and C#.
How widespread will the use of intelligent assistants as a user interface to applications become? It seems unlikely they will replace the keyboard and mouse for desktop applications in the foreseeable future. But for applications you need to access while driving or walking, or applications running on mobile devices or on devices with no screens (such as many IoT devices), an intelligent assistant seems to make perfect sense.
What will be interesting to see play out is whether having an intelligent assistant, such as Alexa, becomes a selling point for products, or whether an own-brand solution such as Houndify ends up being all that users demand.
On that question, McQuivey says that the answer will depend on the type of product under consideration. For a television, for example, he says that the use case has been established: People talk to their televisions to find programs and to control the television itself. For that reason, consumers are unlikely to care if the voice interface is provided by Alexa, Cortana, Assistant or an own-brand solution.
But for other types of devices McQuivey says that consumers will want interoperability: At the start of a laundry session you might want to ask the washing machine to turn off the central heating, or turn on the oven. In that case, you would want to buy a washing machine that uses the same voice assistant as your heating system or oven. Today that’s likely to be Alexa, but in theory Cortana and Assistant could catch up very quickly, says McQuivey.
Of course, Apple is the master of creating an ecosystem of products that interoperate, but Siri can be used only in Apple’s closed ecosystem. That means that unless Apple branches out into domestic appliances and other household devices, Siri is unlikely to become an important user interface for the future.