Montserrat is one of the many islands sprinkled among the Caribbean West Indies. In just 39 square miles, it boasts mountains, rain forests, beaches and groves of bananas, mangoes and coconuts. The air temperature rarely dips below 78 degrees and neither does the water.
In short, Montserrat is paradise. Or it would be but for the Soufriere Hills volcano, which erupted for the first time in July 1995 and hasn’t stopped since.
Soufriere Hills has rendered nearly two-thirds of the island uninhabitable; that area is now called the Exclusion Zone. Since 1995, the island’s population has fallen from 11,000 to 4,000. The volcano has buried Plymouth, the former capital. It killed 20 people in one violent belch in 1997. It has suffocated the economy, once driven by tourism and by rock stars like Sting, the Stones and Paul McCartney, who partied and recorded music at Air Studios. That recording facility, once owned by the Beatles’ former producer George Martin, is now buried by the volcano.
This dichotomy—Eden on one side of the island, the fires of Hell on the other—makes Montserrat a perfect laboratory for risk analysis. Just as much of Montserrat is buried in ash, it’s also buried in probabilities. Scientists know, for example, that there’s only a 3 percent chance that Soufriere Hills will stop erupting in the next six months. They also know there’s a 10 percent chance of injury from the volcano at the border of the Exclusion Zone, and they can draw an imaginary line across the island where the threat from the volcano equals the threat from hurricanes and earthquakes.
"Thirty years ago, you needed the biggest computer in the world to do the statistical risk analysis," says Willy Aspinall, who helped develop these figures in the shadow of Soufriere Hills. "Now all you need is a laptop and a spreadsheet." He says the risk calculations get better and more textured all the time. He uses Monte Carlo risk analysis simulation software and spreadsheets to quantify the risk levels that help decision-makers minimize the volcano’s threat to people’s lives.
If this type of risk analysis is good enough for Aspinall, it ought to be good enough for CIOs, especially now that they’re working in an economic environment looming as ominously over their businesses as Soufriere Hills looms over Montserrat. For the most part, though, CIOs have not adopted statistical analysis tools to analyze and mitigate risk for software project management.
Here is why they should.
Experts will tell you that statistical risk analysis is as essential to real portfolio management as a processor is to a computer. Without it, portfolio management is simply a way to organize the view of projects that will almost certainly fail. CIOs who are serious about portfolio management need to be serious about statistical risk management. (For more on portfolio management, see "Portfolio Management: How to Do It Right" at www.cio.com/printlinks.)
"If you don’t succeed with risk management, you won’t succeed with project portfolio management," says Raytheon CIO Rebecca Rhoads, who credits risk management with lowering her project failure rate and helping Raytheon IT achieve its cost-performance targets. Rhoads is ahead of the curve, but despite her engineering background, she has yet to apply the kind of sophisticated statistical analysis that Aspinall uses for his volcano.
Robert Sanchez, senior vice president and CIO of Ryder, credits risk analysis with bringing order to his company’s decision-making process for projects. He would welcome statistical analysis, but he’s not there yet. "Have we really embraced it completely and understood it in all of its detail?" Sanchez asks rhetorically. "No, we haven’t. But we will."
CIOs should become familiar with two statistical tools. They are the colorfully named workhorses of risk analysis: Monte Carlo simulation and decision tree analysis. Probabilities figure heavily into both, which means that risk has to be quantified. CIOs must draw their own line between the Exclusion Zone, where it’s too risky to venture, and the beaches, rain forests and coconut groves, where the living is easy and the threats are manageable.
The Trap of Common Sense
Even a simple task like choosing to drive to work requires a risk assessment, although not a computational one; you can do shorthand probability in your head. Though the cost of being wrong is high, the risk is relatively low (a 5 percent probability of being seriously hurt in a car accident) and easily mitigated by wearing a seat belt.
This sort of informal risk analysis can sometimes be useful. Steve Snodgrass, CIO of construction materials supplier Granite Rock, has the misfortune of managing IT for a company that literally straddles the San Andreas Fault. Snodgrass doesn’t need statistics to tell him that it would be a bad idea to do nothing to mitigate the possibility that a quake will take out his critical applications. So he outsources his applications’ backup far from the fault line.
However, CIOs often use this kind of commonsense reasoning as a way to avoid doing real risk analysis, say Tom DeMarco and Timothy Lister, authors of Waltzing with Bears: Managing Risk on Software Projects, a primer on statistical risk analysis for IT. "It’s been very frustrating to see a best practice like statistical analysis shunned in IT," says Lister. "It seems there’s this enormously strong cultural pull in IT to avoid looking at the downside."
In lieu of choosing projects based on acceptable risk, Ryder’s Sanchez says, IT often uses what he calls the moral argument, in which the greatest risk lies in not doing the project. Therefore, the risk is mitigated by doing the project. This reasoning was particularly valid during the boom years when there was a palpable fear of getting left behind technologically. But it was never called risk analysis. "I came into IT and was never really comfortable with the moral argument," says Sanchez, whose background is in engineering and finance. "I was looking at it thinking, We analyze the risk of building a new office, but we don’t on an ERP system that costs the same amount."
How to Create a Risk Analysis Process
As the director of foreign exchange at Merck, Art Misyan uses statistical risk analysis for evaluating the impact of foreign currency volatility. Like Sanchez, he’s puzzled by IT’s laissez-faire attitude toward risk analysis. "Risk gives you the ability to look at a whole range of outcomes, but IT looks at only two possible outcomes," he says. "Either you hit deadlines or budgets, or you don’t."
IT needs to think in probabilities, Misyan says, not ones and zeros. The best way to start is for the CIO to formalize the risk process. "First you have to set up a process to determine and track risks," he says. The good news is that much of the risk process is built into project management methodologies CIOs have been adopting anyway, so it should be familiar. Here are the basics for developing a risk analysis process.
Gather experts to determine project risks. These brainstorming sessions should be free and creative. "You want the pessimist in the group, the dark cloud," says Anne Rogers, director of information safeguards at Waste Management, who teaches risk analysis. "You want the person that will ask, What if a truck ran into the building?"
When you don’t ask the off-the-wall question, you run the risk of smacking into it. "Motorola gambled on developing Iridium satellite phones and charging $7 a minute," recalls DeMarco. "No one seemed to wonder what would happen if cell phones came along offering similar service for 10 cents a minute and free nights and weekends."
Assign researchers to uncover known risks. "We came up with 20 or 30 risks we knew we’d face by research," says Sandy Lazar, director of key systems for the District of Columbia, who is overseeing a five-year, $71.5 million administrative systems modernization program (see "Get a Grip on Risk" at www.cio.com/printlinks). "If you read up, you realize ERP has failed over and over for the same reasons for 15 years now." In fact, there are five typical risks to software projects that every CIO should include in a risk analysis (see "The Five Universal Risks to Software Projects," Page 62).
Divide risks into two categories—local and global. The risk of staff turnover during a project is a local risk. War is a global risk. Often, those new to risk analysis focus only on the local risks, but they need to consider the global risks and their impact.
Create a template for each risk. The template should include a unique risk number, a risk owner, potential costs (in dollars and other terms), a probability of occurrence (a low-medium-high scale will do at this point), any potential red flags or signs that the risk is materializing, mitigation strategies and a postmortem for noting if the risk factor actually happened. (A good example of such a template can be found in Waltzing with Bears. See "Risk Control Form" at www.cio.com/printlinks.)
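For teams that keep their risk repository in software rather than on paper, the template fields listed above might be captured in a small record type. This is only a sketch: the class name `RiskRecord`, the field names and the sample values are invented for illustration and are not taken from Waltzing with Bears.

```python
from dataclasses import dataclass, field
from typing import List, Optional

ALLOWED_PROBABILITIES = ("low", "medium", "high")


@dataclass
class RiskRecord:
    """One entry in a project risk repository (hypothetical layout)."""
    risk_id: str                      # unique risk number
    owner: str                        # person accountable for the risk
    description: str
    cost_dollars: float               # potential cost in dollars
    other_costs: str                  # potential cost in other terms
    probability: str                  # "low", "medium" or "high"
    red_flags: List[str] = field(default_factory=list)    # signs it is materializing
    mitigations: List[str] = field(default_factory=list)  # mitigation strategies
    occurred: Optional[bool] = None   # postmortem: None until the project ends

    def __post_init__(self):
        # Enforce the coarse scale so every record is filled in the same way.
        if self.probability not in ALLOWED_PROBABILITIES:
            raise ValueError(f"probability must be one of {ALLOWED_PROBABILITIES}")


# Made-up example entry.
example = RiskRecord(
    risk_id="R-001",
    owner="Project manager",
    description="Staff turnover during the project",
    cost_dollars=250_000.0,
    other_costs="Six-month schedule slip",
    probability="medium",
    red_flags=["Two resignations in one quarter"],
    mitigations=["Cross-train developers"],
)
print(example.risk_id, example.probability)
```

Rejecting anything outside the low-medium-high scale is one way to put consistency ahead of accuracy at this stage.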
One important footnote for developing this process: Value consistency over accuracy. If you do things in a consistent manner and the numbers are off, at least they’ll be off in a consistent—and therefore fixable—way. "The process," says Raytheon’s Rhoads, "is so much more important than the math rigor. Mature, consistent processes—you need that first."
How to Use Monte Carlo Simulations
Once you have a repository of project risks, you can get statistical. The most commonly used tool for this is the Monte Carlo simulation. This technique was developed in the 1940s for the Manhattan Project. It’s used today for everything from deciding where to dig for oil to optimizing the process of compacting trash at a waste treatment facility. It’s a deceptively simple but powerful tool for risk analysis. All Monte Carlo really does is roll the dice (hence the name).
Here’s the theory: Roll a die 100 times, and record the results. Each face will come up approximately one-sixth of the time—but not exactly. That’s because of randomness. Roll the die 1,000 times, and the distribution becomes closer to one-sixth. Roll it a million times, and it gets much closer still.
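The die-rolling theory is easy to try for yourself. The short sketch below rolls a simulated fair die and reports how far the most lopsided face strays from one-sixth. The function names and the seed are arbitrary choices, and because the rolls are random, the gap tends to shrink as the roll count grows but will not shrink in lockstep.

```python
import random


def face_frequencies(rolls, seed=0):
    """Roll a fair six-sided die `rolls` times; return the share of rolls per face."""
    rng = random.Random(seed)
    counts = [0] * 6
    for _ in range(rolls):
        counts[rng.randint(1, 6) - 1] += 1
    return [count / rolls for count in counts]


def worst_gap(frequencies):
    """Largest distance between any face's observed share and one-sixth."""
    return max(abs(share - 1 / 6) for share in frequencies)


for rolls in (100, 1_000, 1_000_000):
    print(rolls, "rolls, worst gap:", round(worst_gap(face_frequencies(rolls)), 4))
```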
The die represents risks—albeit evenly distributed, predictable risks—where each side has about a one-sixth probability of occurrence or a five-sixths probability of not occurring. What if each die were a project risk and each side represented a possible outcome of that risk? Say one die was for the risk of project delays due to staff turnover. One side would represent the possibility that the project is six months late because of 20 percent turnover. Another side could represent a two-year delay due to 80 percent turnover. The die could also be unevenly weighted so that certain outcomes are more or less likely. There would, of course, be dice for other risks—sloppy development, budget cuts or any other factor unearthed during preliminary research.
Monte Carlo simulators "roll" all those risks together and record the combined outcomes. The more times the dice are rolled, the more precisely the simulation traces the distribution of possible outcomes. What you end up with resembles an anthill (see "The Shape of Risk," Page 64), where the highest point on the curve is the most likely outcome and the low ends on either side are possible but less likely.
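A bare-bones version of that dice-rolling can be written in a few lines. In the sketch below, each risk is a weighted die whose faces are schedule delays in months. The six-month and two-year turnover delays echo the example above, but every weight and the other two risks' delays are invented, and the sketch assumes the risks are independent and their delays simply add up. Commercial simulators do far more.

```python
import random

# Hypothetical dice: each risk maps to (delay in months, weight) faces.
RISKS = {
    "staff turnover": [(0, 0.6), (6, 0.3), (24, 0.1)],
    "sloppy development": [(0, 0.5), (3, 0.4), (9, 0.1)],
    "budget cuts": [(0, 0.8), (4, 0.2)],
}


def roll(outcomes, rng):
    """Roll one weighted die and return the delay that came up."""
    delays = [delay for delay, _ in outcomes]
    weights = [weight for _, weight in outcomes]
    return rng.choices(delays, weights=weights, k=1)[0]


def simulate(risks, trials=10_000, seed=1):
    """Roll every die once per trial; record the combined delay of each trial."""
    rng = random.Random(seed)
    return [sum(roll(outcomes, rng) for outcomes in risks.values())
            for _ in range(trials)]


def histogram(totals):
    """Count how often each combined delay occurred, ordered by delay."""
    counts = {}
    for total in totals:
        counts[total] = counts.get(total, 0) + 1
    return dict(sorted(counts.items()))


totals = simulate(RISKS)
for months, count in histogram(totals).items():
    print(f"{months:>3} months {'#' * (count // 100)}")
```

The printed bars are a crude text version of the risk curve: the longest bar marks the most likely combined delay.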
Once you determine a project’s risk profile, you can build in extra resources (like money and time) to mitigate the risks on the highest points of the curve. If the distribution says there’s a 50 percent probability the project will run six months late, you might decide to build three extra months into the schedule to mitigate that risk.
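Reading a buffer off the risk profile amounts to counting simulated outcomes. The sketch below works on a made-up list of ten simulated delays; a real run would have thousands. The function names are illustrative, and the rule used here (pick the smallest allowance that covers a chosen share of outcomes) is one reasonable convention, not the only one.

```python
import math


def probability_at_least(totals, months):
    """Share of simulated outcomes that ran at least `months` late."""
    return sum(1 for total in totals if total >= months) / len(totals)


def buffer_for(totals, confidence):
    """Smallest delay allowance that covers `confidence` of the simulated outcomes."""
    ordered = sorted(totals)
    index = max(0, math.ceil(confidence * len(ordered)) - 1)
    return ordered[index]


# Hypothetical simulated delays, in months.
sample = [0, 1, 2, 3, 3, 6, 6, 7, 8, 12]

print("Chance of running 6+ months late:", probability_at_least(sample, 6))
print("Buffer covering half the outcomes:", buffer_for(sample, 0.5), "months")
```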
Monte Carlo simulators also let you run "sensitivity analyses"—rolling only one die while keeping the others fixed on a particular outcome to see what happens when just one risk changes. A health-care company (that requested anonymity) using a Monte Carlo simulator from Glomark ran a sensitivity analysis for a pending software project. Each die was rolled, one at a time, 500 times while the other dice were kept fixed on their most likely outcomes. The exercise showed that three of the nine risks represented 87 percent of the potential impact on the project—allowing the company to focus its energy there.
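A one-die-at-a-time sensitivity analysis can be sketched the same way. The code below is not the Glomark product or the health-care company's model. It uses three invented risks, holds each at its most likely outcome, rolls one die 500 times, and measures impact as the average distance from the baseline schedule, which is one simple choice of yardstick among many.

```python
import random

# Hypothetical dice: each risk maps to (delay in months, weight) faces.
RISKS = {
    "staff turnover": [(0, 0.6), (6, 0.3), (24, 0.1)],
    "sloppy development": [(0, 0.5), (3, 0.4), (9, 0.1)],
    "budget cuts": [(0, 0.8), (4, 0.2)],
}


def most_likely(outcomes):
    """Delay on the most heavily weighted face of a die."""
    return max(outcomes, key=lambda face: face[1])[0]


def sensitivity(risks, rolls=500, seed=2):
    """Roll one die at a time while the others stay fixed on their likeliest face."""
    rng = random.Random(seed)
    baseline = {name: most_likely(outcomes) for name, outcomes in risks.items()}
    base_total = sum(baseline.values())
    impact = {}
    for name, outcomes in risks.items():
        delays = [delay for delay, _ in outcomes]
        weights = [weight for _, weight in outcomes]
        fixed = base_total - baseline[name]  # contribution of the frozen dice
        totals = [fixed + rng.choices(delays, weights=weights)[0]
                  for _ in range(rolls)]
        # Impact: average distance of the project total from the baseline.
        impact[name] = sum(abs(total - base_total) for total in totals) / rolls
    return impact


def impact_shares(impact):
    """Express each risk's impact as a fraction of the combined impact."""
    combined = sum(impact.values())
    if combined == 0:
        return {name: 0.0 for name in impact}
    return {name: value / combined for name, value in impact.items()}


shares = impact_shares(sensitivity(RISKS))
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {share:.0%} of potential impact")
```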
You can (and should) repeat Monte Carlo simulations for all the projects in your portfolio, ranking them from riskiest to safest. This will help you generate an "efficient frontier"—a line that shows the combination of projects that provide the highest benefit at a predetermined level of risk—something like the line across Montserrat. An efficient frontier helps you avoid unnecessary risk. It will help stop you from choosing one project portfolio that has the same risk but lower benefits than another.
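The idea behind the efficient frontier can be illustrated with a simple filter. The sketch below assumes each candidate portfolio has already been boiled down to a single risk score and a single benefit score, which is a big simplification. The portfolio names and numbers are invented. A portfolio stays on the frontier only if no other portfolio offers at least as much benefit for no more risk.

```python
def efficient_frontier(portfolios):
    """Return names of portfolios that no other portfolio beats on risk and benefit.

    `portfolios` maps a name to a (risk, benefit) pair; lower risk and
    higher benefit are better.
    """
    frontier = []
    for name, (risk, benefit) in portfolios.items():
        dominated = any(
            other_risk <= risk and other_benefit >= benefit
            and (other_risk < risk or other_benefit > benefit)
            for other, (other_risk, other_benefit) in portfolios.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    # Order the frontier from safest to riskiest.
    return sorted(frontier, key=lambda name: portfolios[name][0])


# Hypothetical (risk, benefit) scores for five candidate portfolios.
candidates = {
    "A": (2, 10),
    "B": (2, 7),    # same risk as A, lower benefit
    "C": (5, 18),
    "D": (6, 15),   # more risk than C, lower benefit
    "E": (8, 25),
}

print(efficient_frontier(candidates))
```

Portfolios B and D drop out because each is beaten outright by a neighbor. Choosing among A, C and E is then a question of how much risk the business is willing to carry.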
Admittedly, this description glosses over some of Monte Carlo’s dirty work. Someone has to determine which outcomes to put on the faces of each die and how to weight each face. That’s your job. Canvass your experts, mine historical data, and do whatever else you can to come up with the possible outcomes of each risk, and then estimate the probability of each outcome occurring. In other words, each risk is itself a range of outcomes that feeds a further range of possible outcomes for any given project, or even for combinations of projects.