When it comes to making use of AI and machine learning, trust in results is key. Many organizations, particularly those in regulated industries, hesitate to use AI systems because of what is known as AI’s “black box” problem: the algorithms reach their decisions opaquely, with no explanation of the reasoning they follow.
This is an obvious problem. How can we trust AI with life-or-death decisions in areas such as medical diagnostics or self-driving cars if we don’t know how it works?
At the center of this problem is a technical question shrouded in myth. A widely held belief today is that AI technology has become so complex that the systems cannot explain why they make the decisions they do. And even if they could, the explanations would be too complicated for our human brains to understand.
The reality is that many of the most common algorithms used today in machine learning and AI systems can have what is known as “explainability” built in. We’re just not using it, or we’re not getting access to it. For other algorithms, explainability and traceability functions are still being developed but aren’t far off.
Here you will find what explainable AI means, why it matters for business use, and what forces are moving its adoption forward — and which are holding it back.
Why explainable AI matters
According to a report released by KPMG and Forrester Research last year, only 21 percent of US executives have a high level of trust in their analytics. “And that’s not just AI, that’s all analytics,” says Martin Sokalski, KPMG’s global leader for emerging technology risk.
This lack of trust is slowing down AI adoption, he says — in particular, the pace at which companies move AI out of the lab into large-scale deployments.
“You have smart data scientists coming up with these amazing models, but they’re put in a corner because the business leaders don’t have the trust and transparency in this stuff,” he says. “I’m not going to deploy them in a process that’s going to get me in deep with the regulators, or have me wind up on the front page of the newspaper.”
And it’s not just the healthcare and financial services industries that have to consider regulatory scrutiny. Under GDPR, companies that subject people in the EU to solely automated decisions with legal or similarly significant effects must be able to give those people meaningful information about the logic involved.
Moreover, without the ability to analyze how an algorithm comes to its conclusions, companies must resort to blind faith in an AI system’s recommendations when their business may be on the line.
Retail company Bluestem Brands, for example, is using artificial intelligence to offer customized shopping recommendations. But what if the AI system recommends items that aren’t historical bestsellers, or that don’t match a sales expert’s gut sense?
“The inclination is to say, ‘No, that AI is broken, we should be recommending the pants that are our best seller,’” says Jacob Wagner, IT director at Bluestem Brands.
The solution to these issues of trust is to offer an explanation. What factors did the AI system use in making its recommendation? This is where explainable AI comes in — and it’s a feature that’s becoming more in demand.
Explainable AI encompasses the tools and techniques aimed at making the outputs of an AI system more readily understood by humans with domain expertise. It brings humans into the process of deriving decisions, increasing both trust in those systems and accountability for the results. Often this amounts to outputting the rules the AI learns through training and allowing humans to audit those rules to understand how the AI might draw conclusions from future data beyond the training set.
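To make the idea of "outputting rules" concrete, here is a minimal sketch in Python. It is not how any vendor named in this article does it; the data, the feature names (`past_purchases`, `days_since_visit`) and the `learn_rule` helper are all invented for illustration. The sketch learns a single if/then rule from a handful of examples and prints it in a form a domain expert could audit.

```python
# Toy training data (hypothetical): should an item be recommended to a shopper?
FEATURES = ["past_purchases", "days_since_visit"]
ROWS = [(1, 30), (2, 12), (5, 20), (7, 3), (3, 25), (6, 15)]
LABELS = ["skip", "skip", "recommend", "recommend", "skip", "recommend"]


def learn_rule(rows, labels):
    """Find the one 'feature <= threshold' rule with the fewest training errors."""
    best = None  # (errors, feature index, threshold, label if <=, label otherwise)
    classes = sorted(set(labels))
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            for low in classes:
                for high in classes:
                    if low == high:
                        continue
                    errors = sum(
                        (low if row[f] <= threshold else high) != label
                        for row, label in zip(rows, labels)
                    )
                    if best is None or errors < best[0]:
                        best = (errors, f, threshold, low, high)
    return best


def describe(rule, feature_names):
    """Render the learned rule as text a human reviewer can read and challenge."""
    _, f, threshold, low, high = rule
    return f"IF {feature_names[f]} <= {threshold} THEN {low} ELSE {high}"


rule = learn_rule(ROWS, LABELS)
print(describe(rule, FEATURES))
print("training errors:", rule[0])
```

A real system would learn many such rules, which is where auditing gets harder, but the principle is the same: the rule is the explanation, and a person can check it against their own expertise before trusting it on new data.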
For Bluestem Brands, Wagner says he is able to get about 85 percent of the explanation he needs from his current system, provided by Lucidworks, but he would like to see more.
“Walking people over the trust barrier is a challenge,” he says. “The more information we have about why something was recommended, the easier that experience is.”
A question of governance
Much of the AI in use by businesses today is based on statistical analysis. These machine learning algorithms are used for a range of purposes, from improving shopping recommendations and search results, to calculating credit risk, to spotting suspicious behaviors in computer networks.
To make their recommendations, the algorithms analyze particular characteristics, data fields, factors, or, as they’re called in the industry, features. Each feature is given a particular weight for helping the AI group things into categories or identify anomalies. So, for example, when determining whether an animal is a cat or dog, an algorithm might rely on the animal’s weight as the leading factor, then the animal’s size, then its color.
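The cat-or-dog example can be sketched in a few lines of Python. The weights, the bias and the feature names below are assumptions made up for illustration, not values from any real model. The point is only that, for a weighted model like this, each feature's contribution to a decision can be read off directly and ranked.

```python
# Hypothetical weights for a toy linear "cat vs. dog" scorer.
# Positive contributions push toward "dog", negative ones toward "cat".
WEIGHTS = {"weight_kg": 0.30, "height_cm": 0.05, "dark_coat": -0.10}
BIAS = -3.5


def explain(animal):
    """Return the label, the raw score, and each feature's contribution, ranked."""
    contributions = {name: WEIGHTS[name] * animal[name] for name in WEIGHTS}
    score = BIAS + sum(contributions.values())
    label = "dog" if score > 0 else "cat"
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return label, score, ranked


animal = {"weight_kg": 25.0, "height_cm": 55.0, "dark_coat": 1.0}
label, score, ranked = explain(animal)
print(label, round(score, 2))
for name, value in ranked:
    print(f"{name}: {value:+.2f}")
```

For this made-up animal, body weight contributes most to the score, then size, then color, which mirrors the ordering described above.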
Understanding which factors go into a decision should then be a straightforward process. But companies aren’t yet making it a priority to list the factors relevant to their decisions.
“One of the key trends that we’ve noted is the lack of internal governance and effective management over AI,” says KPMG’s Sokalski. “We found that only 25 percent of companies are investing in developing control frameworks and methods.”
It’s a problem of business process maturity more than of technology, he says. “It’s about building internal capabilities, end-to-end governance, and end-to-end management of AI across the lifecycle.”
The state of explainable AI today
All the major AI platform vendors, as well as most of the top open source AI projects, have some form of explainability and auditability built in, Sokalski says.
KPMG has been working with one of those vendors, IBM, to develop its own toolkit, AI in Control, to bring to clients. Such frameworks make it easier for companies to develop AI systems with explainability built in, instead of having to piece together functionality from various open source projects.
In August, IBM released its own toolkit, AI Explainability 360, which contains open-source algorithms for interpretability and explainability for all major types of machine learning in use today, with the exception of recurrent neural networks, which are often used for time-series problems such as stock market predictions.
There are eight algorithms in the toolkit, most of which have not been available publicly in the form of usable code. And the underlying research was published just this year or in late 2018, says Kush Varshney, IBM’s principal research staff member and manager of the IBM Thomas J. Watson Research Center.
“Anyone can use the toolkit, whether an IBM customer or not,” he says.
But adding explainability to an AI system isn’t as simple as providing a list of the factors that went into a decision, he warns. “There are different ways of explaining.”
Take, for example, a decision about whether to give someone a bank loan. The customer wants to know why their application was rejected, and what they can do to increase their chances of getting a loan in the future, Varshney says.
“And the regulator wouldn’t so much care about every single applicant,” he says. “They would want to look over the totality of the decision-making process, get a global explanation of how the process works. They’ll want to be able to simulate for any input how the model makes its decisions and figure out if any problems exist such as fairness or other potential issues.”
As for the bank, it would have a totally different set of questions, he adds, in making sure the system was making accurate predictions.
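The different audiences Varshney describes can be illustrated with a toy loan model in Python. Everything here is an assumption for the sake of the sketch: the weights, the feature names and the approval threshold are invented, and this is not the AI Explainability 360 API. The same model yields a per-applicant breakdown and a "what would need to change" answer for the customer, and a whole-model view for the regulator.

```python
# Hypothetical loan-scoring weights; an application is approved when score >= 0.
WEIGHTS = {"income_k": 0.04, "debt_ratio": -5.0, "years_employed": 0.25}
BIAS = -1.0


def score(applicant):
    return BIAS + sum(WEIGHTS[k] * applicant[k] for k in WEIGHTS)


def local_explanation(applicant):
    """Customer view: how much each feature moved this one applicant's score."""
    return {k: WEIGHTS[k] * applicant[k] for k in WEIGHTS}


def counterfactual(applicant, feature):
    """Change to a single feature that would lift the score to the approval line."""
    gap = -score(applicant)
    return gap / WEIGHTS[feature] if gap > 0 else 0.0


def global_explanation():
    """Regulator view: the model's features ranked by raw weight magnitude.
    Raw weights depend on each feature's units, so this ranking is only a rough guide."""
    return sorted(WEIGHTS, key=lambda k: abs(WEIGHTS[k]), reverse=True)


applicant = {"income_k": 40.0, "debt_ratio": 0.5, "years_employed": 2.0}
print("score:", round(score(applicant), 2))
print("local:", local_explanation(applicant))
print("change debt_ratio by:", round(counterfactual(applicant, "debt_ratio"), 2))
print("global ranking:", global_explanation())
```

In this sketch the rejected applicant learns that the debt ratio did the most damage and roughly how far it would have to fall, while the regulator's view applies to every applicant at once.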
Explaining neural networks
As AI systems get more advanced and rely less on predefined lists of features and weights, explanations get more difficult. Say, for example, the system classifying cats and dogs isn’t working from a tabulated set of data points but is dealing with animal photographs.
Convolutional neural networks, the ones often used for image processing, look at training data and discover the important features on their own. The form those features take can involve very complex mathematics.
“If you have a complicated black-box model that takes all those features and combines them in millions of ways, then the human cannot understand that,” Varshney says.
Declaring a picture as that of a cat and not a dog because of a complex relationship between particular pixels is just as unhelpful as one human telling another that this is a cat because a particular neuron in their brain fired at a particular time. But there are still ways to make the systems explainable, Varshney says, by working on a higher level of abstraction.
“You can find representations that are semantically meaningful,” he says. “For example, if it’s an image of cats, it will figure out that whiskers are an important feature, the shape of the nose, the color of its eyes.”
Then, to explain its decision, the AI can highlight those places in the photograph that show that this is a cat or show a comparison photo of a prototype image of a cat.
“It is really a way to engender trust in the system,” he says. “If people can understand the logic of how these things are working, they can gain confidence in their use.”
This is exactly the approach taken by Mark Michalski, executive director at the Massachusetts General Hospital and Brigham and Women’s Hospital Center for Clinical Data Science.
The hospital uses AI to spot cancers in radiology images. Medical staff must have high levels of trust in the system to use it. To address this, health care providers don’t just get a simple yes or no response to the question of whether the scan shows that the patient has cancer.
“You can provide heat maps over the tops of the images to explain why the machine is looking where it is,” Michalski says.
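One simple way to produce a heat map of this kind is occlusion: blank out one region of the image at a time and record how much the model's score drops. The article does not say which method the hospital uses, so the Python sketch below is only an illustration of the general idea. The 4x4 "image" and the `toy_score` function, which stands in for a trained network, are assumptions.

```python
def toy_score(image):
    """Stand-in for a trained model: responds to brightness in the centre 2x2 region."""
    return sum(image[r][c] for r in (1, 2) for c in (1, 2)) / 4.0


def occlusion_heat_map(image, model):
    """Blank out each pixel in turn and record how far the model's score falls."""
    baseline = model(image)
    heat = []
    for r, row in enumerate(image):
        heat_row = []
        for c in range(len(row)):
            occluded = [list(line) for line in image]  # copy so the input is untouched
            occluded[r][c] = 0.0
            heat_row.append(baseline - model(occluded))
        heat.append(heat_row)
    return heat


image = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.8, 0.1],
    [0.1, 0.7, 1.0, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
heat = occlusion_heat_map(image, toy_score)
for row in heat:
    print(" ".join(f"{value:.2f}" for value in row))
```

Pixels the model ignores get a value of zero, and the pixels it relies on light up. Overlaid on the original scan, that grid is the kind of visual cue that shows a clinician where the model was looking.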
Proprietary systems and reluctant vendors
Full transparency isn’t always to everyone’s benefit. For some AI vendors, giving away the details of how their AIs make decisions is akin to giving up their secret recipes.
“The software companies are somewhat greedy and assume that everyone has bad intent and wants to steal their ideas,” says Justin Richie, data science director at Nerdery, a digital services consultancy that helps companies with their AI projects. “There are some vendors who’ve let customers walk because they won’t expose their weights. Other vendors are showing interpretability directly in their tools.”
It’s a market issue more than one of technical limitations, he adds. And as AI technologies become commonplace, the game will change.
Commercial, off-the-shelf algorithms frequently lack critical explainability features, says Alex Spinelli, CTO at LivePerson, which makes AI-powered chatbots.
“Some of the better ones do come with inspection and audit capabilities, but not all,” he says. “There aren’t a lot of standards. Auditability, traceability, ability to query the algorithm why it made the decision, is an infrequent capability.”
LivePerson writes its own algorithms with explainability built in or uses open source tools that come with those capabilities, Spinelli says, such as Baidu’s ERNIE and Google’s BERT natural language processing models.
AI standards on the rise
But there are widespread industry efforts to make AI systems more transparent, he says. For example, LivePerson is involved in the EqualAI initiative, focusing on preventing and correcting bias in artificial intelligence by developing guidelines, standards and tools.
Existing standards bodies have also been working to address these issues. Red Hat, for example, is working with several standards designed to help AI and machine learning systems be more transparent, says Edson Tirelli, Red Hat’s development manager for business automation.
“These standards help open up the box,” he says.
One such standard is the Decision Model and Notation standard from the Object Management Group.
This relatively recent standard helps close some gaps in understanding all the steps involved in a company’s decision-making process, Tirelli says. “You can have all the tracing of every step of that decision or business process, all the way down to the AI part.”
The standards also make it easier to move processes and models between vendor platforms. But besides Red Hat, only a handful of companies are supporting DMN.
The Object Management Group’s Business Process Model and Notation standard, however, is supported by hundreds of vendors, Tirelli adds.
“Virtually all tools out there support PMML [Predictive Model Markup Language], or the sibling standard, the Portable Format for Analytics,” he says. “It’s supported by basically every tool out there that creates machine learning models.”
These standards connect with each other to provide the functionality for explainable AI, he says.
As AI is used for more complicated tasks, explainability gets more challenging, says Mark Stefik, research fellow at PARC.
“It doesn’t help if the explanation gives you 5,000 rules,” he says.
For example, PARC has been working on a project for DARPA, the Defense Advanced Research Projects Agency, involving training drones for forest ranger rescue missions. For simple tasks, it is easier to know when to trust the system than it is for expert-level missions in complex mountain or desert scenarios.
“We’re making a lot of progress on that, but I wouldn’t say that we have explainability for all kinds of AI,” he says.
The final challenge, and the one that may be the hardest yet, is that of common sense.
“The holy grail is causal reasoning, and that’s the direction I and other people like myself are heading towards,” says Alexander Wong, Canada Research Chair in the field of AI at the University of Waterloo.
Today, he says, it’s difficult for computer systems to decouple correlation and causation. Does the ringing of the alarm clock cause the sun to come up?
“We want to find a way to separate spurious correlations from true causation,” he says. “It’s hard to even train a person to do proper causal reasoning. It is a very, very difficult problem.”
This ability to think through a chain of causation is what people talk about when they talk about general artificial intelligence, he says.
“We’re making good headway in that direction,” Wong adds. “But the ultimate form of causal reasoning, if I had to guess, would be within a time frame.”
But even if explainable AI is still in its infancy, it doesn’t mean that companies should wait.
“Explainability, even in its current form, is still very useful for a lot of business processes,” Wong says. “If you start using it right now, the system you build will be so much ahead of other people, and will also be a lot more fair. When it comes to responsible AI, one of my core beliefs is that explainability and transparency are a key part of it.”