The total public cloud services market was $91 billion in 2011 and will grow to $207 billion in 2016, according to Gartner. Despite this tremendous surge, large, highly publicized cloud outages have everyone thinking about cloud risks. The reality, however, is that outages at large public cloud providers aren't more common than outages in a business's own private infrastructure. In fact, for many organizations, these cloud providers probably deliver better uptime than they could achieve on their own.
The trick is to design for failure (see http://www.wired.com/business/2011/04/lessons-amazon-cloud-failure/). Organizations that take failure into account build a robust and dynamic infrastructure that can withstand a cloud failure. Here are three ways to help you avoid the impact of a public cloud provider's next outage.
* Balance across availability zones. Large public cloud providers build their data centers across availability zones (AZs) and regions. AZs serve similar purposes across providers; Amazon describes its AZs as "distinct locations that are engineered to be insulated from failures in other Availability Zones and provide inexpensive, low latency network connectivity to other Availability Zones in the same Region." The idea is that if your application instances run in separate AZs and one zone goes down, users can be redirected in real time to another. If the secondary zone is far from the end user, performance may be slower, but your service will be up and running.
Okta, an on-demand identity and access management service, is one company that has been fairly vocal about using AZs to avoid business disruption. After Amazon's July outage, Okta wrote a blog post -- Own Your Own Availability: Zero Downtime During the AWS Outage -- explaining that the downtime many businesses experienced didn't need to occur. Because of the software and operational investments Okta had made across its five-availability-zone footprint in the region where the outage occurred (and in two availability zones in another region), its customers weren't affected. Netflix and Zynga are two other companies that have publicized their use of AZs to help avoid service disruption during outages.
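To make the AZ idea concrete, here is a minimal sketch of the redirect decision: send users to the healthy zone with the lowest latency, and fall back to a slower zone when the nearest one is down. The Zone class, the pick_zone function, the zone names and the latency figures are all hypothetical and exist only for illustration; a real deployment would rely on the provider's own load balancing and health checks.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Zone:
    """Hypothetical view of one availability zone as seen by a router."""
    name: str
    healthy: bool
    latency_ms: float  # assumed round-trip latency to the end user


def pick_zone(zones: List[Zone]) -> Optional[Zone]:
    """Return the healthy zone with the lowest latency, or None if all are down."""
    healthy = [z for z in zones if z.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda z: z.latency_ms)


# Illustrative data only: the nearest zone is down, so traffic moves
# to a more distant zone that is slower but still available.
zones = [
    Zone("zone-a", healthy=False, latency_ms=12.0),
    Zone("zone-b", healthy=True, latency_ms=35.0),
    Zone("zone-c", healthy=True, latency_ms=80.0),
]
chosen = pick_zone(zones)
```

In this sketch the outage in the nearest zone costs some latency, but the service stays up, which is the trade-off described above.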
* Cloud balance. Similar to balancing across AZs, you can also balance across multiple cloud providers. This means that, instead of just using Amazon Web Services (AWS), you use a combination of AWS with Joyent, Azure, Rackspace and/or another provider, diverting traffic to an available cloud in the event of a failure.
Cloud balancing assumes that you have application delivery infrastructure in place that is hyper-portable across clouds, so all functionality implemented in your application delivery controller (ADC) is available in all locations. Traffic is routed to individual clouds based on a number of criteria, including the performance currently provided by each cloud, the value of the business transaction, the cost to execute a transaction at a particular cloud, and the relevant regulatory requirements.
Aside from this security blanket, cloud balancing across multiple providers lets you develop an application that is battle-tested across multiple cloud platforms, benefit from different SLAs and data center locations, and keep the option to shift cloud strategies in the future. The ultimate goal of cloud balancing is to ensure high availability with maximum performance.
For managed security service provider AlertBoot, which operates 100% of its business online, website performance is critical and downtime would be devastating to the company. When the company moved to an entirely virtualized IT environment running in the public cloud, a software-based ADC was an obvious choice to control site traffic and maintain performance because of its portability.
According to CEO Tim Maliyil, "This gave us the flexibility to jump between cloud providers as needed. We couldn't be more pleased with our decision to use a software ADC in the cloud, as it has not just improved our site's performance, but provided a guarantee against downtime."
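The routing criteria described above for cloud balancing (current performance, transaction value, cost per transaction and regulatory requirements) can be sketched as a simple selection function. This is an illustration under stated assumptions, not how any particular ADC works: the Cloud and Transaction classes, the choose_cloud function, the high_value_threshold parameter and all the numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass
class Cloud:
    """Hypothetical snapshot of one provider as seen by the traffic router."""
    name: str
    available: bool
    latency_ms: float        # performance currently measured for this cloud
    cost_per_txn: float      # assumed cost to execute one transaction
    regions: FrozenSet[str]  # jurisdictions where this cloud can keep data


@dataclass
class Transaction:
    value: float                           # business value of the transaction
    required_region: Optional[str] = None  # regulatory constraint, if any


def choose_cloud(clouds: List[Cloud], txn: Transaction,
                 high_value_threshold: float = 100.0) -> Optional[Cloud]:
    """Pick a cloud for one transaction, or None if no cloud qualifies."""
    # Availability and regulatory requirements are hard filters.
    candidates = [
        c for c in clouds
        if c.available
        and (txn.required_region is None or txn.required_region in c.regions)
    ]
    if not candidates:
        return None
    # High-value transactions favor performance; the rest favor low cost.
    if txn.value >= high_value_threshold:
        return min(candidates, key=lambda c: (c.latency_ms, c.cost_per_txn))
    return min(candidates, key=lambda c: (c.cost_per_txn, c.latency_ms))


# Illustrative data only; names and figures are made up.
clouds = [
    Cloud("cloud-a", True, 40.0, 0.05, frozenset({"us"})),
    Cloud("cloud-b", True, 90.0, 0.01, frozenset({"us", "eu"})),
    Cloud("cloud-c", False, 10.0, 0.001, frozenset({"eu"})),  # in an outage
]
fast = choose_cloud(clouds, Transaction(value=500.0))
cheap = choose_cloud(clouds, Transaction(value=5.0))
regulated = choose_cloud(clouds, Transaction(value=500.0, required_region="eu"))
```

Treating availability and regulation as filters, and performance and cost as tie-breakable preferences, is one design choice among several; a weighted score across all four criteria would be an equally reasonable starting point.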
* Add another cloud into the mix. For businesses that rely on public clouds, a private cloud can be a secret weapon when designing for failure. These businesses' livelihoods often depend on the Web and require high scalability and elasticity, making the public cloud a fairly easy choice and often causing them to bypass private clouds altogether. When designing for failure, however, adding a private cloud into the mix as a safety net in the event of a public cloud outage is a solid option, assuming you have the infrastructure and technical skills to run it.
These are just a few ways to design for failure, but the key takeaway is that outages will happen. How well your organization is prepared for an outage will determine how much business and service disruption you experience. Hopefully, if you haven't had the opportunity to learn from your own failure, you've learned from others' failures and are well on your way to implementing a successful cloud strategy.
This story, "Learn to Fail and Avoid the Next Cloud Outage," was originally published by Network World.