by Edward L. Haletky

How to Recover From Virtualization Disasters

Aug 29, 2008 | 3 mins

Virtual infrastructures are designed to recover from minor problems and major disasters, but most big problems fall somewhere in the middle and require different tools.

To many people, disaster recovery means little more than a hot site, but there is much more involved. Exactly what is involved depends on how much money you can put toward the problem.

Fully redundant hot sites cost quite a bit in hardware, software, and licensing. At best, they should be exact duplicates of your current environment; at worst, they should be able to run your most important virtual machines.

However, this is not the only aspect of DR that should be considered. Disasters come in all sizes, from the small-scale application failure to the catastrophic natural disaster. Both of these are fairly well understood.

But what about the middle-of-the-road business-continuity and disaster issues, which fall somewhere between those extremes in scope but are specific to virtualization infrastructures: single machine failures, SAN failures, VM failures, and so on?

For these there are a few tools, mostly from VMware, that will help. VMware High Availability tops the list, but any VM-to-VM clustering service will also address these issues.

To help with storage server issues, there are also the LeftHand Networks VSA and Xtravirt XVS products. These products use software to mirror data between systems using each machine's local disk. This way, if one system fails, the data is not lost. These technologies add redundancy to the software stack and can replace redundant SANs in smaller shops.
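The mirroring idea can be illustrated with a toy sketch. This is not how LeftHand VSA or Xtravirt XVS are implemented (those products work at the storage level and are far more involved); the class name `MirroredStore` and the use of plain directories as stand-ins for each system's local disk are assumptions made purely for illustration.

```python
import os
import shutil
import tempfile


class MirroredStore:
    """Toy synchronous mirror: every write is sent to all replicas."""

    def __init__(self, replica_dirs):
        # Each directory stands in for the local disk of one system (hypothetical).
        self.replica_dirs = list(replica_dirs)

    def write(self, name, data):
        # Write the same bytes to every replica before returning.
        for d in self.replica_dirs:
            os.makedirs(d, exist_ok=True)
            with open(os.path.join(d, name), "wb") as f:
                f.write(data)

    def read(self, name):
        # Try each replica in turn; a failed system is simply skipped.
        for d in self.replica_dirs:
            try:
                with open(os.path.join(d, name), "rb") as f:
                    return f.read()
            except OSError:
                continue
        raise FileNotFoundError(name)


def demo():
    root = tempfile.mkdtemp()
    node_a = os.path.join(root, "node_a")
    node_b = os.path.join(root, "node_b")
    store = MirroredStore([node_a, node_b])
    store.write("vm01.img", b"disk blocks")
    shutil.rmtree(node_a)  # simulate losing one system
    data = store.read("vm01.img")  # still served from the surviving copy
    shutil.rmtree(root)
    return data
```

Calling `demo()` writes a file to both stand-in nodes, removes one node, and still reads the data back from the other, which is the property the paragraph above describes.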

Even good backup tools add to this concept of redundancy through their replication features (Vizioncore vReplicator and Veeam Backup, for example). These allow you to replicate VMs from one storage device to another and place them in locations where they are ready to power on at a moment's notice, which is another good way to keep things running if your SAN or NAS device fails.
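As a rough illustration of replicating VM files from one storage device to another, the sketch below copies only new or changed files to a standby location. It is not the vReplicator or Veeam approach; the function name `replicate` and the directory-per-storage-device layout are hypothetical, and real products replicate at a much finer granularity than whole files.

```python
import filecmp
import os
import shutil


def replicate(source_dir, standby_dir):
    """Copy new or changed files from source_dir to standby_dir.

    Returns the names of the files that were copied.
    """
    os.makedirs(standby_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(source_dir)):
        src = os.path.join(source_dir, name)
        dst = os.path.join(standby_dir, name)
        if not os.path.isfile(src):
            continue  # this sketch ignores subdirectories
        # shallow=False compares file contents, not just size and mtime.
        if os.path.exists(dst) and filecmp.cmp(src, dst, shallow=False):
            continue  # standby copy is already current
        shutil.copy2(src, dst)
        copied.append(name)
    return copied
```

Run on a schedule, a function like this keeps the standby copy close to current, so the VM files are already in place if the primary storage fails. How stale the standby is allowed to get depends on how often the replication runs.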

VMware Site Recovery Manager (SRM) works with various SAN and NAS devices so that the device's own mirroring software works better with virtualization.

As we put more and more VMs on a system, we need to consider adding more and more redundancy to those systems. Some hardware solutions already exist, such as RAID Blade and RAID memory technologies, and redundant switching fabrics are also available.

The software storage technologies described above build on existing RAID-level redundancy and extend it across multiple systems.

While hot sites are the end goal for natural disasters, don’t forget to plan for the middling disasters by increasing your local redundancy, using these or other tools.

Virtualization expert Edward L. Haletky is the author of “VMware ESX Server in the Enterprise: Planning and Securing Virtualization Servers,” Pearson Education (2008). He recently left Hewlett-Packard, where he worked on the Virtualization, Linux, and High-Performance Technical Computing teams. Haletky owns AstroArch Consulting, providing virtualization, security, and network consulting and development. Haletky is also a champion and moderator for the VMware discussion forums, providing answers to security and configuration questions.