At my cloud computing consultancy, we’ve been approached several times in the past few weeks by companies that have put their apps up on
Amazon’s cloud infrastructure and are now running into problems. Problems like:
1. Applications are installed on Amazon Machine Images and run just fine, but if the EC2 instance crashes or needs to be terminated, the app
is out of commission until a new instance comes on line.
2. If an EC2 instance gets overloaded, there’s no way to add more resources to improve app performance.
3. No way exists to update the application without taking it completely offline.
4. Performance gets bottlenecked by the database, but there’s no manageable way to move to database replication.
Frustrations with Cloud Computing Mount
In our discussions with these companies, their question is: “Shouldn’t this problem be solved by cloud computing? After all, the cloud offers
resource elasticity, processing power on demand, huge scalability. So why is my application running into these problems?”
The challenge they’ve run into is that they treated cloud computing like Hosting 2.0, and now they’re suffering for it.
The shorthand response to them is “cloud scalability isn’t the same as application scalability, and unless you architect a cloud app, you aren’t
going to garner the benefits of cloud computing.” In our workshops, we phrase this as “build cloud apps, not apps in the cloud.”
So what does “build cloud apps” mean, and how is it different from treating the cloud as Hosting 2.0?
Here are key principles in building a cloud application:
- Recognize that individual compute resources can, and do, fail. In Amazon, individual EC2 instances will occasionally experience
poor performance, stop responding, or crash. At scale, resources fail. And this is true of all cloud providers. Google is well-known for its
philosophy of building ultra-cheap computers with (literally) the disk drives velcro’d onto the naked motherboards (Google’s machines have no
metal shell); when one of its computers fails, Google removes it and puts it in for recycling. With hundreds of thousands of machines running,
failures are common, so Google architects its solutions to remain robust in the face of resource failure. Likewise, one should architect individual
applications that run in cloud environments as though the individual resources (including virtual machines) will fail. So an application should be
written to run on two EC2 instances — at a minimum.
- Understand that the potential for failure means that your application must run on at least two instances in EC2. This means application files
need to be placed on both virtual machines or located in a central location both machines can access. It doesn’t mean that every application must
be segregated onto its own instances — a single EC2 instance can support multiple applications; for example, a single instance can host a
number of different web sites. It does mean that each application must be written so that it can span multiple instances.
- Write your application so that session management is handled properly. This either means that session affinity is handled by, for example, the
load balancer that sits in front of the application, or that the application itself places session information in a shared location. This can be
accomplished by placing session information in a database server that is shared among application servers, although this approach can end up
bottlenecked by the load on the database server. A common fix for this is to move session information into a memcached layer which provides
better performance. In any case, session information must somehow be available for whatever part of the application is going to require it.
- Ensure that additional compute resources can join and leave the application dynamically and gracefully. One key reason to use the cloud is to
enable applications to dynamically access the resources they need, varying the amount of resource according to load. If human intervention is
required to add or subtract resources, the bottleneck has moved from compute resources to human resources, which is not ideal. If the
application is not written so that resource levels can vary dynamically, then one has to assign a fixed level of resource; this ends up returning to the
old tradeoff between availability and investment, i.e., do I waste money or users?
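To make the session-management principle concrete, here is a minimal Python sketch of two application servers that keep no session state of their own and instead read and write a shared store. Everything named here is an assumption for illustration: `SharedSessionStore` is an in-memory stand-in for a shared cache such as memcached (a real deployment would use a network client instead), and `AppServer`, `login`, and `add_to_cart` are hypothetical names, not part of any library.

```python
import copy
import time
import uuid


class SharedSessionStore:
    """In-memory stand-in (an assumption) for a shared cache such as memcached.

    In a real deployment every application server would reach the same
    cache over the network; here one Python object plays that role.
    """

    def __init__(self, ttl_seconds=1800, clock=time.time):
        self._data = {}          # session_id -> (expires_at, payload)
        self._ttl = ttl_seconds
        self._clock = clock

    def create(self, payload):
        """Store a new session and return its id."""
        session_id = uuid.uuid4().hex
        self.save(session_id, payload)
        return session_id

    def save(self, session_id, payload):
        """Write the session back and refresh its expiry."""
        expires_at = self._clock() + self._ttl
        self._data[session_id] = (expires_at, copy.deepcopy(payload))

    def load(self, session_id):
        """Return a copy of the session, or None if unknown or expired."""
        entry = self._data.get(session_id)
        if entry is None:
            return None
        expires_at, payload = entry
        if self._clock() >= expires_at:
            del self._data[session_id]
            return None
        return copy.deepcopy(payload)


class AppServer:
    """One application instance (hypothetical); holds no session state itself."""

    def __init__(self, name, store):
        self.name = name
        self._store = store

    def login(self, user):
        return self._store.create({"user": user, "cart": []})

    def add_to_cart(self, session_id, item):
        session = self._store.load(session_id)
        if session is None:
            raise KeyError("unknown or expired session")
        session["cart"].append(item)
        self._store.save(session_id, session)
        return session["cart"]
```

With this shape, a user who logs in through one `AppServer` can have the next request served by another, because both are constructed with the same store. If the first instance is terminated, the session survives it.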
I don’t want to trivialize the move to “cloud apps.” Writing applications so that they can dynamically scale without human intervention is not
trivial. For one thing, most software components assume manual, not automatic, administration, and follow an “update the config file and restart
the server” approach. This is fine for a fairly static application topology, but a real pain in a dynamically changing application topology.
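One way around the “update the config file and restart” habit is to let instances announce themselves. The Python sketch below is an illustration under stated assumptions, not a description of any particular product: `InstanceRegistry` and `RoundRobinRouter` are hypothetical names, the registry lives in memory (a real one would be a shared service), and the 30-second heartbeat timeout is an arbitrary choice.

```python
import time


class InstanceRegistry:
    """Tracks which instances are currently serving (hypothetical sketch).

    Instances call join() when they start, heartbeat() while healthy, and
    leave() on a graceful shutdown. An instance that crashes simply stops
    sending heartbeats and is dropped once the timeout passes.
    """

    def __init__(self, heartbeat_timeout=30.0, clock=time.monotonic):
        self._timeout = heartbeat_timeout
        self._clock = clock
        self._last_seen = {}     # instance_id -> time of last heartbeat

    def join(self, instance_id):
        self._last_seen[instance_id] = self._clock()

    def heartbeat(self, instance_id):
        self._last_seen[instance_id] = self._clock()

    def leave(self, instance_id):
        self._last_seen.pop(instance_id, None)

    def live_instances(self):
        """Drop instances whose heartbeat is too old; return the rest, sorted."""
        now = self._clock()
        stale = [i for i, seen in self._last_seen.items()
                 if now - seen > self._timeout]
        for instance_id in stale:
            del self._last_seen[instance_id]
        return sorted(self._last_seen)


class RoundRobinRouter:
    """Chooses a target per request from the registry, not from a config file."""

    def __init__(self, registry):
        self._registry = registry
        self._counter = 0

    def pick(self):
        live = self._registry.live_instances()
        if not live:
            raise RuntimeError("no live instances")
        choice = live[self._counter % len(live)]
        self._counter += 1
        return choice
```

Because the router asks the registry on every request, a new instance starts receiving traffic as soon as it joins, and a failed one drops out without anyone editing a file or restarting a server.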
Another issue is deciding how to handle files and objects common to multiple copies of an application. They can be placed on a networked
file system, but performance is often an issue. For cloud environments that support SAN- or NAS-type functionality, the files can be centrally
located, although that may impose latency issues. Copies of the files can be placed on each server, although that may cause a challenge in
distribution and version control. The best approach is to have all the files placed in a central location (e.g., in S3 for Amazon-based applications)
and have the virtual machine download the “official” files and install them on itself as it instantiates. Again, this is a bit out of the ordinary and not
common to a non-dynamic environment. The usual approach in most environments is to emphasize hardware (and virtual machine) robustness
and not plan for dynamic application topologies.
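The download-on-instantiate approach can be sketched in a few lines of Python. To keep the example self-contained, a local directory stands in for the central location; with S3 the copy steps would be object uploads and downloads instead. The function names `publish` and `bootstrap` and the `manifest.json` file are assumptions made for this illustration.

```python
import hashlib
import json
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def publish(release_dir, central_dir):
    """Copy a release into the central store and record a checksum manifest."""
    release_dir, central_dir = Path(release_dir), Path(central_dir)
    central_dir.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for src in sorted(p for p in release_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(release_dir).as_posix()
        dest = central_dir / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dest)
        manifest[rel] = sha256_of(src)
    (central_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest


def bootstrap(central_dir, install_dir):
    """Run when an instance starts: fetch missing or stale files and verify them.

    Returns the list of relative paths that had to be fetched.
    """
    central_dir, install_dir = Path(central_dir), Path(install_dir)
    manifest = json.loads((central_dir / "manifest.json").read_text())
    fetched = []
    for rel, expected in manifest.items():
        dest = install_dir / rel
        if dest.is_file() and sha256_of(dest) == expected:
            continue  # already holds the official copy
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(central_dir / rel, dest)
        if sha256_of(dest) != expected:
            raise RuntimeError(f"checksum mismatch for {rel}")
        fetched.append(rel)
    return fetched
```

The manifest is what addresses the distribution and version-control worry: every instance installs exactly the files the central location lists, and a second run of `bootstrap` on an up-to-date instance fetches nothing.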
As I wrote a couple of weeks ago about the nascent “devops” movement, it’s not clear what percentage of applications will
experience the need for dynamic topologies based on load. So not every application may need to be a “cloud app.” On the other hand, it’s often
difficult to predict what loads an application will experience throughout its lifetime. In his presentation last night at the Hacker Dojo (see my blog
post about this event), Josh McKenty, chief architect of
NASA’s Nebula cloud project, noted that NASA applications often have an odd user load: years of no traffic, with a short period (one to two
days) of massive traffic when the mission does something spectacular (his example was the project that landed on the Moon to check for water).
Because of the unpredictability of load and the odd load patterns that will be increasingly common to future applications, it’s likely that the design
patterns associated with writing dynamic apps will eventually become standard practice—in other words, every application
will be written so that it is robust in the face of highly dynamic loads. For those apps that experience those types of loads, well, they’re ready to
respond; for those apps that don’t experience those types of loads, well, the capability will remain in reserve, unexercised, available in the
eventuality it’s required.
For architects and software engineers, learning those design patterns today is important because the applications being designed and written
now will be in service for years and will, in all likelihood, end up running in cloud environments. This means that applications should be written
with an eye toward being “cloud apps,” even if the current plans don’t call for them being operated in cloud environments.
Bernard Golden is CEO of consulting firm HyperStratus, which specializes in
virtualization, cloud computing and related issues. He is also the author of “Virtualization for Dummies,” the best-selling book on virtualization to