For Monster.com, the initial benefits of virtualization in the data center were easy to see: With 500 virtual machines (VMs) running on 17 servers, Monster cut power and hardware spending and improved efficiency, since virtual machines can be deployed much faster than standard hardware. But as Monster’s virtual environment got big, and got big fast, management problems arose. The worst one: The company didn’t have enough visibility into which applications were competing with each other across storage and server resources—and this was affecting IT’s ability to meet service-level goals, says Pete King, manager of monitoring and analysis at Monster.
“We ran into a lot of contention,” says King.
So King turned to BalancePoint, a workload balancing and applications service-level-management tool from startup Akorri, to ease the pain. BalancePoint shows when and why a particular VM is not performing up to standard, and based on that data, King can redistribute the load to increase efficiency. It analyzes performance on the VMware side and storage area network side to avoid virtual fights for resources.
Now that Monster has been using BalancePoint for a little more than a year, “there’s less trial and error,” says Paul Neilson, senior vice president of technology services. Monster no longer has to move VMs around based on “intuition,” he adds.
Almost everyone using server virtualization will bump up against one or more of the common management problems, including workload balancing, “VM sprawl” and disaster recovery plan complications, says IDC analyst Stephen Elliot. Tools from VMware and a growing number of third-party vendors can help.
Keep Your Balance
Workload balancing can be a tough problem to get your arms around. One key benefit of virtual machines is the ability to move them easily from one physical server to another. Problem is, it’s hard to know how many VMs on a particular server are too many—since the answer may depend on the applications, plus factors like memory and attached storage. In an environment where critical applications compete for the same server, it becomes difficult to see which applications are contending with each other, and this affects a company’s ability to prevent slowdowns.
For Monster, managing this challenge required multiple tools, a situation that’s not uncommon. Monster uses Akorri’s BalancePoint to augment the capabilities of VMware’s two main management products, VMotion (which increases hardware utilization by migrating VMs on failing or underperforming servers to another machine) and Distributed Resource Scheduler (which couples with VMotion to allocate resources to high-priority VMs based on rules you set in advance).
A key point: DRS and VMotion show where to balance workload, but they aren’t analytical and don’t see contention with other apps outside of VMware, King says. Since BalancePoint isn’t tied to the OS, it can see if VMware performance is impacted by other apps residing on the same SAN resources, he says. “DRS just sees what it sees for performance through the host (CPU, memory and storage), but it can’t see what the database server that’s on the same side as the SAN is doing,” says King.
The more VMs you move into production, the more critical predictability becomes, says Rick Knode, director of computing and communications infrastructure for San Diego Data Processing Corp. (SDDPC), a nonprofit provider of government IT solutions that serves customers like state agencies. Knode needed help managing resources in the company’s current environment (50 VMs on three servers) and in the future: Approximately 100 additional VMs will be added to production in the next fiscal year, Knode says. He looked to Vizioncore’s esxCharter tool to obtain performance information on SDDPC’s VMware ESX servers in real time. This tool looks at performance levels and processes running inside the virtual machine. Being able to adjust the CPU power and memory allocated to VMs is critical when you need to make on-the-fly adjustments and terminate or move processes that are adversely affecting environments, Knode says. “It gives you more visibility into what’s going on.” For example, if a specific VM is eating away at one of his processors and affecting other VMs on that processor, he can use DRS and VMotion to move the VM onto another processor. But he says he wouldn’t know which VMs to move without Vizioncore.
At Wachovia, the fourth largest bank in the United States, Tony Bishop, chief architect, turned to Scalent for help balancing workloads for his 1,000 VMs running on a few hundred servers used in development, testing and back-office roles. Scalent, which may be used independently or in concert with VMware, helps Bishop repurpose servers quickly. “Some of the other [management] tools we looked at also have forms of provisioning, but they don’t have the ability to act in as near real-time as possible, like Scalent can,” says Bishop. Scalent’s software gives him management flexibility when apps are competing for resources, he says.
Among the vendors offering server virtualization management software:
VMotion and Distributed Resource Scheduler (DRS) are part of the VMware Infrastructure 3 suite’s enterprise edition. DRS handles dynamic workload balancing, while VMotion migrates VMs across physical servers.
Scalent’s Virtual Operating Environment (V/OE) tools, which may be used with or without VMware, maintain network and storage connections while moving servers. Scalent also redeploys servers in case of failure or load change.
Vizioncore’s esxCharter tool augments the capabilities of VMware, letting you compare the performance of individual VMs, spot bottlenecks and create long-term performance reports.
Akorri’s BalancePoint bridges the gap between server and storage components, providing insight into virtualized machines and the SAN, locating points of contention and providing troubleshooting analysis.
You’ll find about 50 other vendors tackling virtualization management, says Cameron Haight, a research VP at Gartner, including: Platespin (disaster recovery and migration); Aurema (VMware resource management, recently acquired by Citrix); Cirba (data center consolidation planning); BMC (capacity planning); and CA (performance monitoring across multiple infrastructures, including VMware, Sun and AIX).
Masters of Disaster
Flexibility also pays with regard to disaster recovery, an area where CIOs are increasingly looking to virtualization. Nate Stuyvesant, CTO of Genilogix, an IT consultancy, says disaster recovery is his company’s biggest IT management issue, period. He’s not alone.
According to Gartner data, 70 percent to 75 percent of Gartner’s clients who are using virtualization for x86 servers are also using it for disaster recovery. Genilogix runs 60 VMs on four servers across development, testing and production environments. Stuyvesant relies on VMotion to move a server over to another physical box and effectively eliminate downtime. VMware’s DRS tool alone is a cogent reason to consider virtualization in the first place, he says.
Eric Miller, president and CEO of Genesis Multimedia, a Web hosting company that also designs its customers’ Web applications, uses VMotion to increase uptime and improve reliability in his environment of 55 virtual machines running on three hosts, where some customers need higher utilization than others. Miller relies on VMotion, driven by DRS, to move the virtual machines around.
Genesis is no stranger to virtualization—it has been operating in a virtual server environment since VMware made its debut—but management isn’t always easy. The initial move to consolidate 12 servers used for Web hosting, and two larger servers for database systems, helped Genesis manage its physical servers, but moving virtual machines around, implementing patches and performing BIOS upgrades without experiencing downtime was difficult, Miller says. As an infrastructure provider, Genesis must provide high service levels, so uptime is critical. “We couldn’t maintain those without VMotion and DRS,” says Miller.
Add-on tools can help address the problem of “VM sprawl” by keeping track of how many VMs you have and where.
“It’s somewhat ironic that the benefit of virtualization is resource optimization, but it encourages messy behavior,” says Cameron Haight, a research vice president at Gartner, noting that almost all his clients cite VM sprawl as a big worry. “You can spin these things up so quickly that you lose track of what you have,” Haight says.
SDDPC’s Knode says Vizioncore helps him prevent VM sprawl in the first place. “By watching the metrics of the virtual environment, we plan ahead. So by using VMware and Vizioncore I can see how many additional resources are available on an ESX host, and when is a good point to move machines or purchase additional servers or storage. We’re using the product as a preventative measure.”
Monster’s King and Wachovia’s Bishop both say they’d like virtualization management vendors to take the next step—better integration of their tools with existing management software. For example, King would like to see the tools in HP’s Mercury Business Availability Center suite (which Monster uses for transaction and infrastructure monitoring) integrated with BalancePoint.
Bishop agrees: “We’ve achieved very good results, but we’re trying to create an integrated management capability with all the tools in one view.” Bishop, who uses HP’s Mercury BAC suite, OpTier CoreFirst and Symantec i3, would like to see these tools better integrated with Scalent, VMware and DataSynapse, which he uses for application virtualization. After all, he says, virtualization tools can solve manageability issues, but CIOs want a holistic management picture.
Reach Associate Staff Writer Katherine Walsh at firstname.lastname@example.org.