Many companies hesitate to explore cloud computing because of concerns relating to the security, reliability, and cost of an always-on cloud-based application versus one hosted internally. A perfect initial cloud application for this type of company is dev/test. Moreover, because of its unique characteristics, a cloud environment can actually meet dev/test requirements better than the internal option.
Therefore, if you’ve been waiting to explore cloud computing, give some thought to using dev/test as your initial toe in the water.
Many dev/test efforts are poorly served by existing infrastructure operations. If you think about it for a minute, that makes perfect sense for the following reasons:
Dev/test is underfunded with respect to hardware: Operations gets budget priority. Companies naturally devote the highest percentage of their IT budget to keeping vital applications up and running. Unfortunately, that means dev/test is usually underfunded and cannot get enough equipment to do its job.
Infrastructure objectives differ: Dev/test wants to be agile, while operations wants to be deliberate. When a developer wants to get going, he or she wants to get going now. Operations, however (if it’s well managed), has very deliberate, documented, and tracked processes in place to ensure nothing changes too fast and anything that does change can be audited.
Infrastructure use patterns differ: Dev/test use is spiky, while operations seeks smooth utilization to increase hardware use efficiency. A developer will write code, test it out, and then tear it down while doing design reviews, whiteboard discussions, and so on. By its very nature, development is a spiky use of resources. Operations, of course, is charged with efficiency with an aim of lowest total cost of operations.
Operations doesn’t want dev/test to affect production systems: Putting development and test into the production infrastructure, even if quarantined via VLANs, holds the potential of affecting production app throughput, which is anathema to operations groups. Consequently, dev/test groups are often hindered in their attempts to access a production-like environment.
Dev/test scalability and load testing affect production systems: If putting dev/test in a production environment holds the potential of affecting production apps, what happens when dev/test wants to see how well the app under development responds to load testing or to variable demand? One of the most necessary tasks of development, assessing how well an application holds up under pressure, becomes difficult or impossible in many environments. Many of the most important bugs only surface under high system load; if they aren’t found during development, they will surface in production. Moreover, in constrained environments it’s difficult to reproduce a production environment topology, which means it’s hard to assess, prior to going into production, the impact of network latency, storage throughput, etc.
All of these reasons are examples of how current infrastructure management practices optimize toward supporting existing production apps. Left unsaid, most of the time, is how this very reasonable approach can eventually cause more problems for production systems, because bugs are not caught early in the development cycle, and only surface in production, where it is most expensive for them to be fixed.
I want to emphasize that these issues are not the result of personal antagonism or “bad” people; they are the result of the fact that, in most organizations, two groups with conflicting objectives are asked to share a common resource. Naturally, it is hard to mutually satisfy both parties; more important is the fact that, given the very reasonable prioritization for production app availability, dev/test is frequently starved of the resources it needs to do its job effectively.
Of course, the challenges are not limited to the interaction between operations and the development group as a whole. There are even challenges within the development organization itself, specifically between development (one might call it software engineering) and QA. Some of the challenges echo (on a smaller scale) the resource contention between operations and development.
QA gets delayed because engineering has the boxes: With a limited number of servers, QA has to wait until engineering frees up boxes from development. This is essentially a game of musical chairs: when the music stops (that is, when an alpha release is made available), everyone scrambles to get access to machines. Somehow it never seems that there are enough servers to go around. This is especially true in today’s world of multi-tier, horizontally scaled apps.
The machines are available, now let’s do a bunch of software installation: Oftentimes, when a machine shows up, QA has to do a fresh install of the entire application stack. Part of a real testing cycle is ensuring that the test environment mirrors a production environment, which means the machine needs to have a clean install of appropriate versions of all software components. Of course, fresh installs require configuration work as well. In my experience, it can often take a week between a release to QA and QA actually having an environment set up properly to begin testing work. And, of course, many times the available machine comes from a developer working on some other project, so there’s no way to avoid a wipe and install.
Oops, a component not on the manifest needs to be installed, too: Unfortunately, many times development environments have components needed by the app, but not identified as part of the install process. This means that testing is delayed while the reason the app fails to execute properly is tracked down. These kinds of situations happen a lot, and they’re really, really frustrating, not to mention time-wasting.
These intra-development issues slow application development, lengthen time-to-market, and degrade quality. Crucially, they also affect organizational morale. It’s disheartening to be a talented quality assurance engineer and be forced to spend days at a time feeding installation disks into a machine and then configuring components by hand.
One alternative that has been used a lot to address this situation is virtualization. For sure, virtualization ameliorates many of the intra-development issues. Because a single server can support multiple virtual machines, a complex application topology can be mirrored with much less physical hardware. And, of course, the portability and clone-ability of virtual machines means that apps can be handed off to QA without requiring component installation from scratch.
On the other hand, virtualization does nothing, absent sufficient hardware availability in a production environment, to address the ability to evaluate scalability and load stress testing. And putting virtual machines into a production environment confronts the same issues outlined above regarding a very natural reluctance to affect production applications negatively.
How does cloud help?
No contention for resources: Development and QA can each get as much computing resource as it needs. If you look at the UC Berkeley RAD Lab paper on cloud computing, the first characteristic of cloud computing it identifies is “the illusion of infinite computing resources.” This means that development and QA no longer have to contend for a limited pool of boxes. Each can gain sufficient resources to do its job easily.
Agile development and spiky usage are supported: The second characteristic identified by the RAD Lab is that users of a cloud need make no long-term commitment to individual compute resources. One can use just what is needed at one point in time, and then release the resources with no further commitment. The short-duration, variable usage patterns typical of dev/test are well-aligned with this characteristic.
QA can be more productive: Extended installation and configuration efforts are unnecessary. Because cloud computing environments are typically based on virtualization, it is easy to clone virtual machines and recreate an application. Development can hand over a cloned virtual machine (or, indeed, a collection of virtual machines), which QA can begin testing right away. This obviates the issue noted above, where a component residing on a development machine is overlooked in the install instructions, causing the application to fail to run. Of course, when it comes time to put the app into production, should the application target environment be the cloud, the same clone/handover process can be reproduced between QA and production (it’s even possible that, as VMware’s cloud strategy is fleshed out, apps developed in an external cloud could be transferred into an internal data center; that is a ways off, but it is definitely an attractive vision).
Scale and load can easily be tested: Infinite resources mean that setting up a stress test is easy. One company we worked with needed to test its application under circumstances in which a large number of users (approximately 1000) simultaneously initiated contact with the system. This is the kind of test that drives a development organization crazy, since getting enough hardware resources to host the user sessions is extremely difficult. Then, when the test is over a few hours later, there’s a big teardown effort as well. Instead, this company fired up a large number of Amazon EC2 instances, ran them for a weekend, and then shut them down. Total cost? Around $100.
Cloud computing is developer budget-friendly: Pay-by-the-drink for spiky, occasional use is cost-effective. The third characteristic of cloud computing identified by the Berkeley RAD Lab is that cost is tied to actual usage. Given the (relatively speaking) low level of resource use by dev/test groups, cloud computing aligns well with the budget realities outlined earlier in this post.
Dev/test avoids the common concerns about cloud computing: Most of the typically cited issues regarding cloud computing don’t apply to dev/test use. Application use of the cloud is ephemeral, the data is not production information, and the application is sandboxed in a cloud. So security and availability concerns are not impediments to using the cloud for dev/test.
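To make the pay-by-the-drink arithmetic concrete, here is a rough sketch of how one might estimate the cost of a weekend load test like the one described above. Every number in it is an assumption chosen for illustration (the instance count, the hourly rate, and the price of a purchased server are all hypothetical, not figures from the company in question), so treat it as a back-of-the-envelope calculation rather than a pricing guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadTestPlan:
    """A short-lived batch of rented cloud instances (all values hypothetical)."""
    instances: int          # assumed number of instances rented for the test
    hours: float            # how long each instance runs before shutdown
    hourly_rate_usd: float  # assumed per-instance hourly price

    def cost_usd(self) -> float:
        # Pay-per-use: you are billed only for instance-hours actually consumed.
        return self.instances * self.hours * self.hourly_rate_usd


def owned_hardware_cost_usd(servers: int, price_per_server_usd: float) -> float:
    # Buying hardware: the full purchase price is paid no matter how briefly
    # the servers are actually used.
    return servers * price_per_server_usd


# Illustrative assumptions: 20 instances for a 48-hour weekend at $0.10/hour,
# versus buying 20 servers at an assumed $2,000 each.
weekend_test = LoadTestPlan(instances=20, hours=48, hourly_rate_usd=0.10)

print(f"Pay-per-use weekend test: ${weekend_test.cost_usd():.2f}")
print(f"Buying equivalent servers: ${owned_hardware_cost_usd(20, 2000.0):.2f}")
```

Plugging in your own provider’s rates and the instance count your test needs gives a quick sanity check on whether a spiky, occasional workload is cheaper to rent than to own.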
This should give you a perspective on why dev/test is a good initial application for the cloud. Of course, there are a number of things one has to do right to take full advantage of cloud environments for dev/test. In addition, there are things to pay attention to in order to avoid problems in leveraging a cloud environment for one’s dev/test efforts. I’ll address these topics in next week’s post.
Bernard Golden is CEO of consulting firm HyperStratus, which specializes in virtualization, cloud computing and related issues. He is also the author of “Virtualization for Dummies,” the best-selling book on virtualization to date.