One of the issues we focus on in conversations with companies evaluating moving to cloud computing is the importance — and challenge — of
capacity planning in a cloud environment. The bottom line is that cloud computing is going to make capacity planning much more difficult for CIOs who
intend to maintain all or most of their company’s computing in internal data centers. Moreover, utilization becomes a source of significant risk, because
responsibility for utilization shifts onto the cloud operator.
Why is this?
As a starting point, it’s important to recognize that the scale of computing — the sheer number of applications that an organization runs — is about
to explode. I wrote about
this last week and noted that we in the industry typically underestimate, by a factor of 100 or more, the growth unleashed by new computing platforms.
This recent comment by longtime analyst Amy Wohl on a Google group mailing list reinforced my perspective: “On the day the IBM PC was announced I
had a one-on-one call with IBM about their new product (I couldn’t get to the press announcement) and they assured me the total market for PCs was
5,000.” Which explains why I found this forecast by Bernstein Research analyst Toni Sacconaghi laughable. With all due respect, we are on the cusp of seeing server demand
explode as more and more applications get envisioned, funded, and implemented. The odds of server demand shrinking are vanishingly small.
Which brings us to the issue of capacity planning. The traditional mode of capacity planning, centered on new servers funded through each
application’s capital investment request, is finished off by cloud computing. If an application group assumes that resources will be available on
demand, and can be paid for by assigning an operating budget funding code, far less insight into total future demand is possible. Put another
way, fewer signals about total demand are available, and the window of insight is much shorter.
Some organizations feel they have dealt with this by imposing a limit on the number of servers that can be provisioned at any one time. The thinking is that a
limit of, say, 10 servers is imposed, and any larger request has to go through an exception-handling process. Which is fine, but the assumption underpinning
it is that the number of applications will remain relatively stable; if the total resources each app can request are limited, total resource demand
can be limited, thereby making capacity planning manageable.
However, the assumption that the total number of applications is going to remain stable is tenuous at best. With lower costs, no need for
application-level capital funding, and lower friction in obtaining resources, the total number of applications is undoubtedly going to skyrocket. So even if
each application can only request a limited number of resources, if the number of applications grows dramatically, capacity planning becomes unmanageable anyway.
Simply put, forecasting total demand and planning for sufficient capacity to meet it is going to become much more difficult. And make no mistake about
it: when this cloud demand gets going, and apps groups begin to assume that resources will be available immediately whenever they’re requested, total
demand is going to explode.
One key point in all this is to recognize the difference in how the compute resources are funded. In the past, application groups had to obtain the capital
needed to fund compute resources to operate the application. This led to the “those are HR’s servers, and those are Finance’s” type of situation. This new
cloud world assumes that a central IT group will fund the resources, and then allow apps groups to use them in a shared fashion, paying only for what they use.
The traditional model meant that each application was responsible for its own utilization profile. Buy too little server, well, your apps run slow. Buy too
much server, well, you wasted some money. In the cloud world, however, the responsibility for ensuring appropriate utilization is transferred from the
individual apps groups to the cloud provider.
What You Must Learn From Airlines
This brings us to another challenge — utilization risk. It’s our perspective that the economics of IT are about to undergo a transformation to
something much more like how airlines operate. Airlines focus on being highly efficient — having just enough planes, enough labor, and sufficient
resources like fuel and food in place to meet demand, with a small amount of reserve capacity. And the airline assumes all utilization risk; individual
passengers take on no responsibility for funding the airline’s resources — the passengers only pay for what they use.
It’s the airline’s responsibility to ensure that sufficient capacity is in place to meet demand — without having excess capacity in place, being
wasted because not enough people wanted to fly somewhere at a given time. Airlines use discriminatory pricing (charging different people different prices
based on time of purchase, seat location, etc.) to achieve high utilization. Amazon has done something similar via a mix of on-demand, spot pricing, and
reserved instances to drive high utilization.
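To make the utilization economics concrete, here is a minimal sketch, with entirely hypothetical prices and workload numbers, of how covering a steady baseline with committed (reserved-style) capacity and bursting to a pricier on-demand rate only during peaks can beat provisioning for peak on-demand capacity all month:

```python
# Hedged sketch: how a mix of pricing tiers can lower blended cost.
# All rates and workload figures below are hypothetical, for illustration only.

RESERVED_RATE = 0.05   # $/instance-hour, effective rate with upfront commitment
ON_DEMAND_RATE = 0.10  # $/instance-hour, pay-as-you-go

def blended_cost(baseline, peak, peak_hours, total_hours=730):
    """Cost of covering a steady baseline with reserved capacity and
    bursting to on-demand instances only during peak hours."""
    reserved = baseline * total_hours * RESERVED_RATE        # always-on base
    burst = (peak - baseline) * peak_hours * ON_DEMAND_RATE  # peaks only
    return reserved + burst

baseline, peak, peak_hours = 10, 25, 100  # instances, and peak hours per month
mixed = blended_cost(baseline, peak, peak_hours)
all_on_demand = peak * 730 * ON_DEMAND_RATE  # sized for peak, all month long

print(f"mixed tiers:    ${mixed:,.2f}/month")
print(f"peak on-demand: ${all_on_demand:,.2f}/month")
```

The point is not the specific numbers but the mechanism: pricing tiers let the provider keep its committed capacity busy while charging a premium for flexibility, which is exactly how it sustains high utilization.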
In the context of cloud computing, this means there is a transfer of utilization risk from the application to the cloud provider. It is no longer the
responsibility of the app group to figure out just how much capacity is required to support their app at all times, in both demand peaks and valleys. It’s the
responsibility of the cloud provider (in the case of a private cloud, the responsibility of the operations groups, and, by extension, the CIO) to ensure
sufficient capacity is on tap. And, in the case of a private cloud, it will be the responsibility of whoever runs it to ensure efficient operation (in other
words, high utilization levels that allow low pricing). Inefficient operation will manifest as high resource pricing to application groups.
High resource pricing wasn’t much of a problem in the past; after all, there was a high degree of quasi-monopoly lock-in for application groups; it
wasn’t easy for them to move the hosting of their applications to other providers. Today, however, using an external cloud provider is trivially easy, which
means that in the future, apps groups will have easy access to alternatives to internal quasi-monopoly resource provision.
How can internal IT groups address this issue of capacity planning and utilization risk? What are the options for resource providers in attempting to
ensure demand is met at the moment it’s required? Here are some thoughts:
Three Smart Strategies
1. Implement governance and resource rationing. One method, of course, is to assert that IT resources are limited and overall demand has to
be limited to what’s available. An evaluation process that resource requesters (apps groups, in other words) have to undergo to justify resource access
could be imposed.
For a variety of reasons, this is unlikely to be successful, not least due to the easy availability of external cloud resources. In a recent workshop
attended by middle management trying to figure out the cloud, I demonstrated that Amazon could have a virtual machine up and running in two minutes (it
was especially fast that day). Their surprised responses told me how powerful this example was. As the old song goes, “How you gonna keep ’em down on the farm, after they’ve seen
Paree?” One attendee, however, insisted that end-user self-service is not necessary and that it’s still OK to have all requests evaluated and approved by an
operations group. I think his career prospects, along with the viability of this strategy, are limited.
2. Implement competitive chargeback rates. This strategy is, in essence, the “if you can’t beat ’em, join ’em” approach. All things being equal,
application groups will undoubtedly prefer to work with internal groups and internal resources.
Of course, there are several implications to this strategy. The first is that chargeback needs to be in place. This is a fairly controversial subject within
cloud computing, with many people asserting that chargeback is not necessary or that “showback” (reporting as to total resource consumption) is sufficient.
This seems untenable as an ongoing mode of operation. Price is a very efficient rationing mechanism (certainly more efficient than the process-based
rationing outlined above), and being able to demonstrate true pricing vis-à-vis external providers is table stakes.
The second implication is that the prices need to be fairly competitive with external providers — maybe not as low, but certainly not five, ten, or
twenty times higher. This will require the internal cloud to pare capital and running costs — and achieve very high utilization rates.
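As a back-of-the-envelope sketch (all cost figures here are assumptions, not real quotes), the chargeback rate an internal cloud must levy to recover its costs falls directly out of the utilization it achieves:

```python
# Hedged sketch: the chargeback rate an internal cloud needs at a given
# utilization level. All cost figures are hypothetical assumptions.

MONTHLY_COST_PER_SERVER = 300.0  # amortized capital + power + ops, $/month
HOURS_PER_MONTH = 730
VMS_PER_SERVER = 8               # VM slots per physical server
EXTERNAL_RATE = 0.10             # competing external price, $/VM-hour

def internal_rate(utilization):
    """Chargeback rate ($/VM-hour) needed to recover server costs when
    only `utilization` (0..1) of VM capacity is actually billed."""
    billed_vm_hours = VMS_PER_SERVER * HOURS_PER_MONTH * utilization
    return MONTHLY_COST_PER_SERVER / billed_vm_hours

for u in (0.2, 0.5, 0.8):
    print(f"utilization {u:.0%}: ${internal_rate(u):.3f}/VM-hour "
          f"(external: ${EXTERNAL_RATE:.2f})")
```

Under these assumed numbers, an internal cloud running at 20% utilization has to charge well over double the external rate just to break even, while at 80% it has room to undercut it — which is why utilization, not hardware cost, is the competitive lever.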
3. Develop a meaningful mixed deployment strategy. If an inability to achieve low costs and high utilization rates means that cost
competitiveness is not possible, pursue a strategy that offers both internal and external cloud hosting, with pricing transparency for both options. This allows
the application group to decide whether a higher price for an internal cloud is acceptable, based on factors like local control and better SLAs, or whether a
lower price from an external provider that delivers less service is preferable.
One should not underestimate the challenges of this approach. It requires identification of true internal resource costs and chargeback capability. It also
requires that application groups can make the deployment decision on the fly (after all, if making the decision requires a lengthy discussion and evaluation
process, it’s not cloud computing, it’s just more of the traditional process).
One should also recognize the threat this poses to internal capacity. If too few applications choose to host internally, low utilization will force
costs to be spread across fewer applications, resulting in higher allocation costs to them, which in turn makes it more attractive for those
applications to decamp to outside cloud resources. Frankly, there is no easy answer to this issue, but it will certainly be significant.
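This dynamic can be shown with a toy model (the fixed cost, external price, and churn rate are all hypothetical assumptions): fixed costs divided across a shrinking tenant base push per-application charges up each round, which drives further departures:

```python
# Hedged toy model of the utilization "death spiral": fixed internal costs
# spread over fewer applications raise each app's allocated cost, pushing
# more apps to leave. All numbers below are hypothetical.

FIXED_COST = 100_000.0    # monthly fixed cost of the internal cloud, $
EXTERNAL_PRICE = 1_500.0  # what an app would pay externally, $/month

apps = 60
for round_num in range(1, 5):
    per_app = FIXED_COST / apps  # allocated cost per hosted application
    if per_app > EXTERNAL_PRICE:
        # assume 20% of apps decamp whenever internal beats external pricing
        apps = int(apps * 0.8)
    print(f"round {round_num}: ${per_app:,.0f}/app, {apps} apps remain")
```

Each round the per-application charge rises and the tenant base shrinks, illustrating why there is no easy answer: once internal pricing crosses the external benchmark, the gap widens on its own.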
Overall, one can expect to see much more emphasis on increasing utilization rates in internal data centers, far beyond the levels achieved even by the
best of today’s virtualized environments. The sea change brought by cloud computing and its assumptions of “infinite” capacity and on-demand elasticity,
accompanied by pay-per-use pricing, will galvanize change in IT infrastructure and operations far beyond what most envision today.
Bernard Golden is CEO of consulting firm HyperStratus, which specializes in
virtualization, cloud computing and related issues. He is also the author of “Virtualization for Dummies,” the best-selling book on virtualization to date.
Follow Bernard Golden on Twitter @bernardgolden.