Are Zombies Sucking the Life Out of Your Data Center?

If your organization operates its own servers and switches, it's likely that a percentage of them are zombies, steadily gobbling up resources but doing no work. Cleaning up a zombie server infestation takes good management and stringent documentation.

Even if your organization has gone the virtualization route or is leveraging the cloud, chances are you're still operating at least some of your own infrastructure. And that means there's a good chance you're operating servers and other equipment that do nothing but consume resources. That's right; you've got zombies in your data center.

"This is a very expensive issue for a lot of data centers," says Paul Goodison, CEO of Cormant, an infrastructure management company. "A server can cost something like $2,000 a year, and somewhere between 10 and 30 percent of your servers are dead. In a 4,000-server enterprise, if 400 of them are dead, you're looking at a bill for servers that are doing nothing of $800,000 a year. That's a very significant amount of money."
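Goodison's arithmetic is easy to rerun with your own numbers. The following is a minimal sketch in Python, not a Cormant tool; the function name `annual_zombie_cost` and its parameters are hypothetical, and the $2,000 per-server figure is simply the rough estimate from the quote above.

```python
def annual_zombie_cost(total_servers: int, zombie_percent: float,
                       cost_per_server: float) -> float:
    """Estimate yearly spend on servers that do no work.

    zombie_percent is a percentage (10 means 10 percent of the fleet).
    """
    if not 0 <= zombie_percent <= 100:
        raise ValueError("zombie_percent must be between 0 and 100")
    zombie_count = total_servers * zombie_percent / 100
    return zombie_count * cost_per_server


# Figures from the quote: 4,000 servers at roughly $2,000 a year each,
# with 10 to 30 percent of them dead.
low = annual_zombie_cost(4000, 10, 2000)
high = annual_zombie_cost(4000, 30, 2000)
print(f"${low:,.0f} to ${high:,.0f} a year")  # $800,000 to $2,400,000 a year
```

At the 10 percent end of Goodison's range, the result matches his $800,000 figure; at 30 percent, the same fleet would be wasting three times that.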

Goodison points to one Cormant customer that thought it had 900 pieces of equipment. When Cormant performed an inventory, it found 1,300 pieces, some of which had no LAN connections but were still hooked up to power.

Reasons for Zombie Server Infestations

Goodison says zombies tend to happen for one of two reasons. The first is that a server is rightly commissioned by the business for a period of time and becomes a line item in a spreadsheet somewhere. Over time, the need for the application on that server goes away, but nothing ties that change back to a physical process for decommissioning the server, or if the decommissioning process does take place, it is only partially completed.

"The decom doesn't happen because they're not absolutely sure it's the right server," he says. "They say, 'We'll leave that one for now and come back to it.' Then they never come back to it."

The second reason, which Goodison says is at least as likely, is that the user of the service on the server was never recorded. Eventually the server just stops being used and no one knows. This is common when there's no IT management solution in place, Goodison says. Servers tend to be commissioned in an ad hoc fashion, especially as part of skunkworks projects. In time, the organization knows there's a server physically there, but it doesn't know what the server does or who provisioned it.

Getting the Data Center Under Control

To get your data center under control, Goodison says you need to start with good documentation. And that doesn't mean just another spreadsheet, he warns. It starts with an accurate record of your physical equipment, along with owner information and a record of network and data connections. Switches can be zombies too, so you need to include them in your records as well. Going forward, you also need a data center infrastructure management (DCIM) tool that provides a common, real-time monitoring and management platform across your IT and facility infrastructures. And your records must be updated regularly, including owner information, as new equipment gets added.

"You shouldn't be able to put a new server into a live state unless you can identify who owns it," Goodison says. "Part of the change process is making people document what they do in a more structured fashion. You need to look at deployment slightly more holistically. It's no longer about just physically deploying the server. You need to make sure you've got the documentation there as well."
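The rule Goodison describes, no owner, no go-live, can be expressed as a simple check in whatever system tracks your equipment. The sketch below is illustrative only: `ServerRecord`, its fields and `can_go_live` are hypothetical names, not part of any DCIM product's API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ServerRecord:
    """One documented piece of equipment (hypothetical schema)."""
    asset_tag: str
    location: str                       # e.g. room, rack and position
    owner: Optional[str] = None         # person or team accountable for it
    network_ports: list[str] = field(default_factory=list)


def can_go_live(record: ServerRecord) -> bool:
    """Allow deployment only when an owner has been recorded."""
    return bool(record.owner and record.owner.strip())


documented = ServerRecord("SRV-0042", "Room 2 / Rack 14 / U7",
                          owner="payments-team", network_ports=["sw3:12"])
skunkworks = ServerRecord("SRV-0043", "Room 2 / Rack 14 / U8")

print(can_go_live(documented))   # True
print(can_go_live(skunkworks))   # False
```

A real change process would validate far more than one field, but even a gate this small closes the gap that lets unowned skunkworks servers slip into production.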

Once the documentation and tools are in place, you need to begin regularly querying and analyzing the logical information at your disposal: power draw, CPU utilization, network traffic and so on. This is especially important for servers you aren't immediately able to identify. These metrics can often help you pick out a zombie, as power utilization, CPU cycles and traffic are often flat for dead servers.
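One way to act on that observation is to flag any server whose metrics barely move over a sampling window. The following is a rough sketch under stated assumptions: the metric names, the sample values and the 2 percent tolerance are all invented for illustration, and `is_flat` and `zombie_candidates` are hypothetical helpers rather than functions from any monitoring tool.

```python
from statistics import mean, pstdev


def is_flat(samples, rel_tolerance=0.02):
    """Return True when a series varies by no more than rel_tolerance
    (standard deviation relative to the mean)."""
    if len(samples) < 2:
        return False                    # not enough data to judge
    avg = mean(samples)
    if avg == 0:
        return all(s == 0 for s in samples)
    return pstdev(samples) / abs(avg) <= rel_tolerance


def zombie_candidates(metrics, rel_tolerance=0.02):
    """Return names of servers whose every metric series is flat."""
    return sorted(
        name for name, series in metrics.items()
        if series and all(is_flat(s, rel_tolerance) for s in series.values())
    )


# Illustrative samples only: watts, CPU percent and network kbps.
metrics = {
    "db-01":  {"power": [210, 260, 305, 240], "cpu": [35, 80, 55, 20],
               "net": [120, 900, 400, 60]},
    "app-17": {"power": [95, 95, 96, 95], "cpu": [1.0, 1.0, 1.0, 1.0],
               "net": [0, 0, 0, 0]},
    "web-03": {"power": [150, 150, 151, 150], "cpu": [5, 40, 12, 60],
               "net": [30, 700, 90, 850]},
}

print(zombie_candidates(metrics))  # ['app-17']
```

A flat profile only makes a server a candidate. Some legitimate machines, such as standby systems, may look just as quiet, which is why the next step is to log in and see whether the server is actually doing work.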

"You want to get down to a smaller list of servers that you don't know what they are doing," Goodison says. "And then you want to ask yourself, 'Is this server properly labeled? Let's log in and take a look. Is it doing work? Are things changing over time?'"

"You need to take a look at a complete physical picture and supplement that with query data," he adds. "Is that server actually plugged in and what power is it drawing?"

In the end, it's about making sure that you have complete visibility into your data center and how your equipment is performing.

"Good management pays dividends in terms of ROI," Goodison says.

Thor Olavsrud is a senior writer for CIO.com. Follow him @ThorOlavsrud.
