Hancock Bank, a century-old institution headquartered on Mississippi’s hurricane-prone Gulf Coast, likes to boast that it will be the last to close and the first to open when stormy weather shuts down area businesses. That claim got the severest test imaginable when Hurricane Katrina roared ashore in 2005. “We were hurt badly,” says Ron Milliet, the bank’s director of IT services.
Hancock’s IT department, which serves 150 sites across four states, took a major hit, of course, but it could have been worse. The bank found that the relatively small number of servers it had virtualized (the project had just begun when Katrina hit) could be recovered in hours, while the physical servers took days, says Milliet. Many critical services were up within 24 hours.
Virtualization steals the spotlight, but it’s just one of the innovative tools now available to CIOs who are rethinking their disaster recovery and business continuity strategies. Techniques including WAN optimization and appliance-based e-mail backup are reducing recovery times, lowering costs and, most importantly, raising confidence that business will continue even after a major disaster. As for good old tape, it’s still a backup mainstay, but CIOs are looking for supplementary technologies that can overcome the venerable medium’s limitations.
Not only are CIOs adopting new disaster recovery technologies, “they are asking themselves what disaster recovery will do to improve business as a whole,” says Michael Croy, director of business continuity solutions for the Forsythe Solutions Group. That could mean, for example, leveraging IT assets acquired during a merger by putting the excess capacity to work as a backup or mirror site, or making underutilized resources part of a disaster recovery arsenal.
And because there is a wealth of new disaster recovery strategies available, customers are now in a stronger-than-ever position to cut affordable and flexible deals with vendors running offsite recovery services such as SunGard and IBM, says Croy.
The Virtual Solution
Gamblers look at a casino and see slot machines, roulette wheels, bars and restaurants. But for an IT exec, the same casino is a river of data and applications that must keep flowing 24 hours a day, no excuses accepted.
The Borgata Hotel Casino and Spa in Atlantic City, N.J., had been using a traditional tape backup solution, but it was “slow and inconsistent. We were in a labor-intensive manual world,” says John Forelli, the resort’s VP of information technology.
What’s more, the tape system gobbled a significant amount of network resources, and since the 2000-room hotel is a 24/7 business, it was difficult to find a time to back up a server without sacrificing overall performance, Forelli says.
In 2006, three years after the resort opened, management decided to virtualize its Windows servers using VMware and speed backup and recovery tasks with replication software from Double-Take Software.
Double-Take replicates application data from 77 virtual production machines to a single physical disaster recovery target and will fail over to the target (automatically switch over to the backup system) in the event of an outage. When the reserve system is activated, the appropriate application services are started within a corresponding virtual machine at the disaster recovery site and users are automatically redirected, says Forelli.
Because the software looks at data on the byte level and replicates incrementally, there’s less bandwidth pressure on the network. “It’s automatic, it’s quick, it’s under the covers,” he says.
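To see why incremental replication eases the load on the network, consider a minimal Python sketch of the general idea. It is not Double-Take's actual algorithm, which the article does not describe; the function names, the fixed block size and the sample data are all assumptions made for illustration. The source side compares the old and new copies block by block and ships only the blocks that changed, and the target applies them to its own copy.

```python
BLOCK_SIZE = 4  # hypothetical and deliberately tiny so the demo is easy to follow


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE):
    """Return (offset, data) pairs for the blocks of `new` that differ from `old`."""
    deltas = []
    for offset in range(0, len(new), block_size):
        new_block = new[offset:offset + block_size]
        old_block = old[offset:offset + block_size]
        if new_block != old_block:
            deltas.append((offset, new_block))
    return deltas


def apply_deltas(target: bytes, deltas, new_length: int) -> bytes:
    """Bring the target copy up to date using only the changed blocks."""
    # Resize the target copy to the new length (truncate or pad with zero bytes).
    buf = bytearray(target[:new_length].ljust(new_length, b"\0"))
    for offset, data in deltas:
        buf[offset:offset + len(data)] = data
    return bytes(buf)


# Illustrative data: only the middle block changes between versions.
source_old = b"AAAABBBBCCCC"
source_new = b"AAAAXXXXCCCC"

deltas = changed_blocks(source_old, source_new)
replica = apply_deltas(source_old, deltas, len(source_new))
bytes_sent = sum(len(data) for _, data in deltas)
```

In this toy case only 4 of the 12 bytes cross the wire, which is the effect Forelli describes: the less that changes between replication passes, the less bandwidth the backup consumes.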
That simplicity is one reason why virtualization is becoming so popular for disaster recovery. “Windows systems are miserable to recover,” says Donna Scott, an analyst with Gartner.
At Hancock, Katrina’s lesson that virtualization equals faster recovery, along with a corporate desire to cut hardware and power costs, convinced the company to move much of its operations to a virtualized environment (with the exception of a mainframe-based banking system). The bank replaced 55 physical servers with five blade servers running VMware infrastructure, saving $150,000 in server hardware capital costs alone, says Milliet. There is, however, a potential downside. “We have a lot of eggs in one basket. One bad motherboard can take out a lot of virtual machines at one time,” he says. To avoid that disaster, Hancock uses software that will automatically switch the VM workload to another physical server if trouble is detected.
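The article does not say how Hancock's software decides where to move a failed server's workload, so the following Python sketch is only one plausible policy, written under stated assumptions: each host has a fixed limit on the number of VMs it can run, and every VM stranded on the failed host is restarted on the least-loaded surviving host that still has room. The function name, host names and capacities are hypothetical.

```python
def fail_over(placement, capacity, failed_host):
    """Reassign VMs from a failed host to surviving hosts.

    placement: dict mapping VM name -> host name (hypothetical inventory)
    capacity:  dict mapping host name -> maximum number of VMs it can run
    Returns a new placement dict; raises if the survivors lack spare capacity.
    """
    healthy = [h for h in capacity if h != failed_host]

    # Count how many VMs each surviving host is already running.
    load = {h: 0 for h in healthy}
    for host in placement.values():
        if host in load:
            load[host] += 1

    new_placement = dict(placement)
    for vm, host in sorted(placement.items()):
        if host != failed_host:
            continue
        candidates = [h for h in healthy if load[h] < capacity[h]]
        if not candidates:
            raise RuntimeError(f"no spare capacity to restart {vm}")
        # Pick the least-loaded host; break ties by name for repeatability.
        target = min(candidates, key=lambda h: (load[h], h))
        new_placement[vm] = target
        load[target] += 1
    return new_placement


# Illustrative scenario: blade1 suffers the "bad motherboard" Milliet mentions.
placement = {"vm1": "blade1", "vm2": "blade1", "vm3": "blade2"}
capacity = {"blade1": 3, "blade2": 3, "blade3": 3}
after = fail_over(placement, capacity, "blade1")
```

The sketch also makes Milliet's "eggs in one basket" concern concrete: automatic failover only helps if the surviving hosts keep enough headroom to absorb the displaced virtual machines, which is why the function raises an error rather than silently overloading a host.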
Smart WAN Tricks
For companies struggling to ship large amounts of data across the network, WAN optimization can improve day-to-day performance and speed backup and recovery operations as well.
Cubist Pharmaceuticals was using a traditional disaster recovery model that involved backups to tape, a day or more of travel time to the recovery site, at times a wait for available machines and then a cumbersome restore. “Boring, static, not flexible,” comments Michael Geldart, senior manager of computer operations at the company’s headquarters, in Lexington, Mass.
Geldart was not only concerned about his disaster recovery strategy, he was also struggling with the large amount of data the company needed to move between headquarters and its facility in Italy.
Moreover, management wanted to use the same WAN link for video conferencing and VoIP. Increasing the bandwidth, says Geldart, “would have been a very expensive proposition.”
Cubist had already introduced virtualization, “so one of the benefits that we wanted to get was the ability to do a snapshot of these [virtualized] machines and replicate them to other sites,” he says.
The company decided to move forward with a Riverbed Steelhead WAN optimization and application acceleration implementation. The major applications it needed to speed up over the link to Italy were Exchange 2003 and Microsoft networking/CIFS, and, for the disaster recovery link, FTP and NFS, says Geldart. With its own equipment in place at a third-party vendor’s out-of-state recovery site, backup and recovery times have been reduced dramatically. That’s because the data is now replicated and sits on a live disk array, eliminating the need to restore from tape, which is one of the most time-consuming parts of disaster recovery, says Geldart.
Tape is still useful, he adds, noting that it provides the ability to retrieve historical data and can also be a backup should replication fail.
Interestingly, deploying its own equipment at an offsite disaster recovery facility run by a third party involved some struggle with that vendor. “The initial reaction [from the vendor] was a blank stare,” says Geldart. But the vendor came around, and Geldart reports that “they are absolutely changing their model.” (For security reasons, Cubist prefers not to reveal the name of the recovery site’s vendor.)
Croy, the Forsythe consultant, agrees. Vendors in this arena, such as SunGard, are becoming more flexible and competitive, he says. However, he argues those companies still need to lower costs, become even more flexible and broaden the scope of offerings “to better meet business needs.”
E-mail Appliances Deliver
Backing up e-mail in case of disaster has been a costly and time-consuming problem for years, says Gartner’s Scott. But now appliances are making it much easier to replicate Exchange and other major mail servers, she says.
Ken Adams, CIO of the Baltimore-based law firm of Miles & Stockbridge, says his company tried clustering Exchange servers but found the strategy too complicated to engineer. It required dedicated personnel to manage, as well as hefty outlays for hardware and licensing. “We’re a law firm, not a technology company,” he says.
But the company’s 600 or so e-mail accounts are considered mission-critical, so a solution was mandatory. Adams eventually turned to Teneros, which sells continuity appliances designed to replicate Exchange servers. The Teneros appliances are IP-based and easy to install at production and disaster recovery sites, says Adams.
Should one of the firm’s Exchange, BlackBerry or Goodlink servers go down, the appliance takes over. And since Teneros monitors and maintains its appliances, there’s little overhead for Adams’ IT group.
While disaster planning needs to be high on your to-do list, that doesn’t mean you’ve got to bust your budget. In Katrina’s wake, Hancock “opened the checkbooks for DR,” says Milliet. “But now, we want to rationalize our spending to be in line with business value.”
One way to do that is to integrate disaster recovery needs with day-to-day operations, as Cubist did by optimizing its WAN.
On a larger scale, Hancock’s management realized that having a single, centralized call center in hurricane country was courting disaster, so it opened a second. Score one for disaster recovery, and chalk up a win for customer service: The new facility reduces caller wait times for customers during normal operations.
Bill Snyder is a freelance writer based in California.